esimd::xmx::dpas produces seemingly unnecessary dpas instruction #13878
Unanswered
elliottbinder asked this question in Q&A
When compiling code that uses the dpas and dpasw operations for an A770, I'm seeing that some instruction sequences begin with an instruction of the form:
dpas.8x8 (8|M0) r106:f null:f r106:bf r106.0:bf {Atomic} // $271
where dst, src1, and src2 are all the same register and src0 is null. The subsequent instruction is closer to the format I would expect, with two distinct tiles of input registers being multiplied and accumulated into that dst register:
dpas.8x8 (8|M0) r106:f r106:f r210:bf r242.0:bf {Atomic} // $271
Semantically, the first instruction doesn't make sense to me: it reads the accumulator register (r106, typed :f as the destination) as :bf input for both src1 and src2, which is the wrong data type. Why are these instructions being generated, and why do they not cause errors in the computation? I have not found any reference to this pattern in the instruction manual.
Later dpas instructions in the same sequence, which operate on different input and output registers, do not follow this pattern, but most dpas sequences appear to begin with it.
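To make the operand comparison above easier to repeat across a full disassembly, here is a minimal Python sketch that splits a dpas line into its operands and flags the pattern in question. It assumes the operand order is dst, src0, src1, src2, which is what the two instructions quoted above suggest; the helper names `parse_dpas` and `is_self_referential` are made up for this illustration and are not part of any Intel tool.

```python
import re

# Assumed operand order (inferred from the two lines quoted in this post):
#   dpas.<shape> (<exec>) dst src0 src1 src2 {...}
DPAS_RE = re.compile(
    r"^(?P<op>dpasw?)\.(?P<shape>\d+x\d+)\s+"
    r"\((?P<exec>[^)]*)\)\s+"
    r"(?P<dst>\S+)\s+(?P<src0>\S+)\s+(?P<src1>\S+)\s+(?P<src2>\S+)"
)


def parse_dpas(line):
    """Return a dict of operands as (register, dtype) pairs, or None if no match."""
    m = DPAS_RE.match(line.strip())
    if m is None:
        return None
    fields = m.groupdict()
    for name in ("dst", "src0", "src1", "src2"):
        reg, _, dtype = fields[name].partition(":")
        # Drop any sub-register suffix such as ".0" so r106.0 compares equal to r106.
        fields[name] = (reg.split(".")[0], dtype)
    return fields


def is_self_referential(fields):
    """True when src0 is null and dst, src1, src2 all name the same register."""
    return (
        fields["src0"][0] == "null"
        and fields["dst"][0] == fields["src1"][0] == fields["src2"][0]
    )


first = parse_dpas("dpas.8x8 (8|M0) r106:f null:f r106:bf r106.0:bf {Atomic} // $271")
second = parse_dpas("dpas.8x8 (8|M0) r106:f r106:f r210:bf r242.0:bf {Atomic} // $271")
print(is_self_referential(first), is_self_referential(second))
```

Running something like this over every dpas line in the dump would show whether the pattern really appears only at the start of each sequence.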