esimd::xmx::dpas produces seemingly unnecessary dpas instruction #13878

elliottbinder · 2024-05-22T19:44:49Z

When I compile code that uses the dpas and dpasw operations for an A770, I see that some sequences of dpas instructions begin with an instruction of the form:

dpas.8x8 (8|M0) r106:f null:f r106:bf r106.0:bf {Atomic} // $271

where dst, src1, and src2 are all the same register (r106) and src0 is null. The subsequent instruction is closer to the format I would expect, with two distinct tiles of input registers being multiplied and accumulated into that dst register:

dpas.8x8 (8|M0) r106:f r106:f r210:bf r242.0:bf {Atomic} // $271

Semantically, this doesn't make sense to me, as the first instruction would interpret the accumulator register as the wrong data type (bf rather than f). Why are these instructions being generated, and why do they not cause errors in the computation? I have not found any reference to this pattern in the instruction manual.
Subsequent dpas instructions in the same sequence, which operate on different input and output registers, do not follow this pattern, but most sequences of dpas instructions appear to begin with it.
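
For reference, the arithmetic I would expect from the second instruction can be sketched with a simplified scalar model. This is only a sketch under stated assumptions: it reads dpas.8x8 as systolic depth 8 and repeat count 8, takes the execution size of 8 from (8|M0), assumes two bf16 values per 32-bit channel, and ignores the real register layout and bf16 rounding. The name `dpas_model` and the constants are hypothetical and are not part of any ESIMD or compiler API.

```python
# Simplified scalar model of dpas.8x8 with bf16 inputs and an f32 accumulator.
# Assumptions: register layout/packing and bf16 rounding are ignored.
SYSTOLIC_DEPTH = 8    # first "8" in dpas.8x8 (assumed meaning)
REPEAT_COUNT = 8      # second "8" in dpas.8x8 (assumed meaning)
EXEC_SIZE = 8         # from (8|M0)
OPS_PER_CHANNEL = 2   # two bf16 values fit in one 32-bit channel
K = SYSTOLIC_DEPTH * OPS_PER_CHANNEL


def dpas_model(acc, a, b):
    """Return acc + a @ b for one tile.

    acc: REPEAT_COUNT x EXEC_SIZE accumulator, or None to model a null src0
    a:   REPEAT_COUNT x K input tile (plays the role of src2)
    b:   K x EXEC_SIZE input tile (plays the role of src1)
    """
    out = [[0.0] * EXEC_SIZE for _ in range(REPEAT_COUNT)]
    for m in range(REPEAT_COUNT):
        for n in range(EXEC_SIZE):
            # A null src0 means accumulation starts from zero.
            total = 0.0 if acc is None else acc[m][n]
            for k in range(K):
                total += a[m][k] * b[k][n]
            out[m][n] = total
    return out
```

In this model the second instruction corresponds to `dpas_model(acc, a, b)` with `acc` holding the previous contents of r106, whereas the first instruction would correspond to `dpas_model(None, x, x)`-style usage in which the same register supplies both input tiles. That is the part I cannot reconcile with the expected semantics.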

Replies: 0 comments