Calling convention for vector arguments #38
Comments
Hi @PkmX thanks for the proposal. Seems reasonable. In the EPI project we implemented the following interim calling convention: Registers The process is as follows (parameters would be processed in the order of the C declaration):
We initially believed it made sense to have callee-saved registers, hence the very limited range from If we take the above algorithm and we use I wouldn't be very worried about the 3 One thing missing from the algorithm above is that segment vectors. In that case I understand |
I think that the merger of proposals resulting in Therefore, methinks that it's rather premature to nail down the calling convention. I'd prefer to have a working implementation upstream so that we can then model the best calling convention, especially on the issue of callee saved registers. |
@PkmX @rofirrim please take a look at the calling convention PR: riscv-non-isa/riscv-elf-psabi-doc#171
This discussion is ongoing in the PSABI working group and is out of scope for this repository. Closing the issue.
Currently the vector spec only defines all vector registers/CSRs as caller saved, but it does not specify how to pass vectors as arguments.
We propose a calling convention where named vector arguments are passed in v1 to v31. A vector type with LMUL > 1 must be allocated to the next vector register that is aligned to its LMUL. Vector types with fractional LMULs and vector mask types (vbool*_t) are treated as occupying one register. Segment vector types should be passed in consecutive vector registers aligned to the base vector's LMUL. Vector types are returned in the same manner as the first vector argument. If all vector registers for argument passing are exhausted, the remaining vector arguments are passed on the stack as whole vector registers, by pointer.

Some examples (the argument name corresponds to the vector register it uses):
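The allocation rule described above can be sketched as a small Python simulation. This is only an illustration under stated assumptions, not part of the proposal: the function name allocate and the (lmul, nf) tuple encoding are made up here, mask types are modelled as lmul = 1, and the sketch assumes that once one argument no longer fits in registers, every later vector argument also goes to the stack (one reading of "the rest of the vector arguments").

```python
from typing import List, Optional, Tuple

FIRST_ARG_REG = 1   # v0 is skipped because it serves as the mask register
LAST_ARG_REG = 31


def allocate(args: List[Tuple[float, int]]) -> List[Optional[int]]:
    """Return the base vector register of each argument, or None for stack.

    Each argument is a hypothetical (lmul, nf) pair: lmul may be fractional
    (e.g. 0.5), and nf is the number of fields (1 for non-segment types).
    """
    next_free = FIRST_ARG_REG
    spilled = False  # assumption: after the first spill, all later args spill
    bases: List[Optional[int]] = []
    for lmul, nf in args:
        # Fractional LMUL (and masks, passed as lmul=1) occupy one register.
        align = max(1, int(lmul))
        # A segment type takes nf consecutive groups aligned to the base LMUL.
        count = align * nf
        # Round next_free up to the next multiple of the alignment.
        base = -(-next_free // align) * align
        if spilled or base + count - 1 > LAST_ARG_REG:
            spilled = True
            bases.append(None)  # passed on the stack, by pointer
            continue
        bases.append(base)
        next_free = base + count
    return bases


if __name__ == "__main__":
    # An m1 argument, then an m8 argument, then another m1 argument.
    print(allocate([(1, 1), (8, 1), (1, 1)]))  # [1, 8, 16]
```

Under these assumptions three m8 arguments land in v8, v16 and v24, and a fourth one is passed on the stack.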
We avoid allocating v0 in the calling convention because of its ubiquitous role as the mask register, so the callee does not have to move the first argument out of v0 if it needs to use masked instructions.

The spec already defines all vector registers as caller-saved, so any of them may be allocated either as argument-passing registers or as temporary registers. The proposal currently uses all of them (other than v0) for passing arguments, so that up to 3 m8 arguments can be passed in registers, but this is open to debate.

There is also a small optimization opportunity where smaller-LMUL arguments can fill holes left by the alignment of previous larger-LMUL arguments, for example:
Since v2 to v7 remain unused in the example, the m1 argument following the m8 argument may be packed into v2 instead of using the next register, v16. This uses the registers more efficiently at the cost of slightly more complexity in the calling convention.

Any thoughts?
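The hole-filling variant can be sketched as a first-fit search over a free-register map. This is again a hedged illustration rather than the proposed wording: allocate_packed and the (lmul, nf) encoding are hypothetical names, and the call shape (m1, m8, m1) used below is an assumed stand-in that matches the description of v2 to v7 being left unused.

```python
from typing import List, Optional, Tuple

NUM_VREGS = 32


def allocate_packed(args: List[Tuple[float, int]]) -> List[Optional[int]]:
    """First-fit allocation that lets small arguments fill alignment holes.

    Each argument is a hypothetical (lmul, nf) pair. Returns the base
    register per argument, or None if it is passed on the stack.
    """
    # free[i] is True while vi is available; v0 is never handed out.
    free = [False] + [True] * (NUM_VREGS - 1)
    spilled = False  # assumption: after the first spill, all later args spill
    bases: List[Optional[int]] = []
    for lmul, nf in args:
        align = max(1, int(lmul))  # fractional LMUL occupies one register
        count = align * nf         # segment types take nf consecutive groups
        base = None
        if not spilled:
            # Scan aligned candidates from low to high register numbers.
            for cand in range(0, NUM_VREGS, align):
                if cand + count <= NUM_VREGS and all(free[cand:cand + count]):
                    base = cand
                    break
        if base is None:
            spilled = True
            bases.append(None)
            continue
        for reg in range(base, base + count):
            free[reg] = False
        bases.append(base)
    return bases


if __name__ == "__main__":
    # The trailing m1 argument is packed into v2 rather than v16.
    print(allocate_packed([(1, 1), (8, 1), (1, 1)]))  # [1, 8, 2]
```

The extra cost mentioned above shows up here as the free-register map and the scan, compared with a single running "next free register" counter in the simpler scheme.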