Gather/Scatter instructions can set RF #7897

Open: khuey wants to merge 1 commit into master from iX-gather-scatter-fWR

@khuey khuey commented May 12, 2026

Contributor

While looking at the SDM for the other gather/scatter issue today I noticed this language:

"This instruction can be suspended by an exception if at least one element is already gathered/scattered (i.e., if the exception is triggered by an element other than the rightmost one with its mask bit set). When this happens, the destination register and the mask register are partially updated. If any traps or interrupts are pending from already scattered elements, they will be delivered in lieu of the exception; in this case, EFLAG.RF is set to one so an instruction breakpoint is not re-triggered when the instruction is continued."

This applies to both VEX gathers and EVEX gather/scatters.
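
The quoted behavior can be sketched as a toy model. This is only an illustration under stated assumptions: `gather_state_t`, `gather_model`, the 8-element width, and the `fault_at`/`trap_pending` parameters are hypothetical names invented here, not hardware or DynamoRIO interfaces, and RF handling for an ordinary fault on the first masked element is deliberately left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_ELEMS 8

/* Hypothetical toy state for one 8-element dword gather. */
typedef struct {
    uint32_t dest[NUM_ELEMS]; /* destination vector register */
    bool mask[NUM_ELEMS];     /* true = element still has to be gathered */
    bool rf;                  /* models EFLAGS.RF */
} gather_state_t;

typedef enum { GATHER_DONE, GATHER_FAULT, GATHER_TRAP_DELIVERED } gather_result_t;

/* Gathers elements starting at index 0 (the "rightmost" element).
 * fault_at:     index whose load faults, or -1 for no fault (assumption).
 * trap_pending: whether an already-gathered element left a trap pending.
 */
gather_result_t
gather_model(gather_state_t *st, const uint32_t *mem, const size_t *idx,
             int fault_at, bool trap_pending)
{
    bool any_done = false;
    for (int i = 0; i < NUM_ELEMS; i++) {
        if (!st->mask[i])
            continue; /* already gathered on an earlier attempt */
        if (i == fault_at) {
            if (any_done && trap_pending) {
                /* The pending trap is delivered in lieu of the fault, and RF
                 * is set so an instruction breakpoint does not re-trigger
                 * when the instruction is continued. */
                st->rf = true;
                return GATHER_TRAP_DELIVERED;
            }
            return GATHER_FAULT; /* RF for plain faults is not modeled here */
        }
        st->dest[i] = mem[idx[i]];
        st->mask[i] = false; /* partial update of destination and mask */
        any_done = true;
    }
    st->rf = false; /* RF is cleared after successful completion */
    return GATHER_DONE;
}
```

Calling `gather_model` a second time on the same state models continuing the suspended instruction: elements whose mask bit was already cleared are skipped.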

@khuey khuey requested a review from derekbruening May 12, 2026 23:47
 */
-{OP_vpgatherdd,0x66389018, catSIMD, "vpgatherdd",Vx,Hx,MVd,Hx,xx, mrm|vex|reqp,x,tevexwb[189][0]},
-{OP_vpgatherdq,0x66389058, catSIMD, "vpgatherdq",Vx,Hx,MVq,Hx,xx, mrm|vex|reqp,x,tevexwb[189][2]},
+{OP_vpgatherdd,0x66389018, catSIMD, "vpgatherdd",Vx,Hx,MVd,Hx,xx, mrm|vex|reqp,fWR,tevexwb[189][0]},

Contributor

Question: Is RF written with 0 all the other times, or is it only written in this exception case and otherwise left untouched? If the latter, how do we represent that? I recall pretending it does a read in some such cases, but I don't remember exactly when we did that, and I also don't remember all the details of things like instr_predicate_writes_eflags().

Contributor Author

The processor sets RF to 0 after every successful instruction execution, so yes, 0 is written all other times.

The more I think about it, the less sure I am that we want this. What's unique about this instruction is not that it sets RF (anything that triggers a page fault can do that); it's that it can "partially complete" and be interrupted. The repeating string instructions can do that too (and they also don't have fWR).

Contributor

For the partially complete attribute: tools that themselves take action for each subpart of these instructions will be using drx_expand_scatter_gather() (just like they'd expand rep string instructions into explicit loops), so interruption in the middle will also interrupt the instrumentation in the middle.

For the flags: drreg and tools in general don't touch RF for normal instrumentation.

If setting RF seems more of a general mechanism with pending faults, then maybe, like you said, it's not worth marking here? OTOH, it may not cause any harm to put it here since, as noted, most flag analyzers only look at the arithmetic flags or maybe DF.

I would be ok either way.
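
To make the point about flag analyzers concrete, here is a minimal sketch of why an extra RF write bit would be invisible to a tool that masks only the arithmetic flags. The bit layout and the names `FLAG_*`, `WRITES`, `writes_arith_flags`, and `writes_any_flag` are hypothetical and chosen for illustration; they are not DynamoRIO's actual `EFLAGS_*` constants or query functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits: low byte = flags read, next byte = flags written. */
enum {
    FLAG_CF = 1 << 0,
    FLAG_PF = 1 << 1,
    FLAG_AF = 1 << 2,
    FLAG_ZF = 1 << 3,
    FLAG_SF = 1 << 4,
    FLAG_OF = 1 << 5,
    FLAG_DF = 1 << 6,
    FLAG_RF = 1 << 7
};

#define WRITE_SHIFT 8
#define WRITES(f) ((uint32_t)(f) << WRITE_SHIFT)
#define ARITH_FLAGS (FLAG_CF | FLAG_PF | FLAG_AF | FLAG_ZF | FLAG_SF | FLAG_OF)

/* What a typical arithmetic-flag liveness analysis would ask. */
bool
writes_arith_flags(uint32_t usage)
{
    return (usage & WRITES(ARITH_FLAGS)) != 0;
}

/* What a stricter "does this touch any flag at all" check would ask. */
bool
writes_any_flag(uint32_t usage)
{
    return (usage >> WRITE_SHIFT) != 0;
}
```

Under this model, changing a gather's usage from `0` to `WRITES(FLAG_RF)` leaves `writes_arith_flags` false but flips `writes_any_flag` to true, so only tools doing the stricter check would notice the change.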

Contributor

Why do these instructions need to set RF while rep string instructions don't?

Contributor

> Why do these instructions need to set RF while rep string instructions don't?

I.e., wouldn't a rep string fault where a pending trap/interrupt exists also want to deliver the trap/interrupt first and set RF?

Contributor

I checked, and repeated string instructions do set RF.
From the Intel manual, Vol 3 (https://cdrdv2.intel.com/v1/dl/getContent/671447):

  • For any interrupt arriving after any iteration of a repeated string instruction but the last iteration, the value pushed for RF is 1.
  • For any trap-class exception generated by any iteration of a repeated string instruction but the last iteration, the value pushed for RF is 1.

So we should probably add fWR to OP_rep_ as well in decode_table.c?
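
The two quoted rules can be written down as a small predicate. This is a sketch only: `event_kind_t` and `rep_string_rf_forced_to_one` are hypothetical names, iterations are assumed to be counted from 1, and a `false` result means just that these two rules do not force RF to 1, not that the pushed value is 0.

```c
#include <stdbool.h>

typedef enum { EVENT_INTERRUPT, EVENT_TRAP, EVENT_OTHER } event_kind_t;

/* iteration: 1-based iteration after which (interrupt) or by which (trap)
 *            the event occurred.
 * total:     total number of iterations the rep string instruction performs.
 * Returns true when the quoted rules say the value pushed for RF is 1.
 */
bool
rep_string_rf_forced_to_one(event_kind_t kind, unsigned iteration, unsigned total)
{
    bool is_last = (iteration >= total);
    if (kind != EVENT_INTERRUPT && kind != EVENT_TRAP)
        return false; /* not covered by the two quoted rules */
    return !is_last;  /* any iteration but the last */
}
```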

Contributor Author

Yes, if I land this it should change the repeating string instructions too.

@derekbruening derekbruening left a comment

Contributor

Marking Approve but also fine if you decide not to submit as noted.
