-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[TKW] Work on buffer ops #492
Conversation
Hardcode84
commented
Feb 11, 2025
•
edited
Loading
edited
- Split read/write ops indexing on thread-dependent and thread-independent.
- Use symbolic vals for strides instead of extracting them from memref.
- For now only replace gather/scatter ops with buffer ops.
6355a62
to
eac4567
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor requests for comments and refactors, but overall looks good!
@@ -75,15 +76,29 @@ def _get_start_indices( | |||
return start_indices | |||
|
|||
|
|||
def _split_index(src: IndexExpr) -> tuple[IndexExpr, IndexExpr]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a comment explaining why it is not sufficient to just substitute in subs_th , why we need to compute the difference and the need for safe_subs as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks
IndexingContext.current(), symbolic_shape, allow_mixed_shapes=True | ||
) | ||
if ( | ||
emitter.params.get("use_buffer_load_ops", False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you refactor this into a variable/function so something like
use_buffer_ops = emitter.params.get("use_buffer_load_ops", False)
has_integer_strides = all(isinstance(s, int) for s in strides)
can_emit_buffer_ops = use_buffer_ops and has_integer_strides and is_gather
if can_emit_buffer_ops:
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
elements_per_thread: int, | ||
mask: Optional[Value], | ||
offsets_vec: Optional[Value], | ||
) -> Value: | ||
if mask is None and offsets_vec is None: | ||
return vector_d.load(vector_type, mem, start_indices) | ||
|
||
is_gather = offsets_vec is not None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we also need to check the address space since we can have gathers from shared memory? In this case, we cannot use buffer ops right? (Also same for scatters from shared memory)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Signed-off-by: Ivan Butygin <[email protected]>
Signed-off-by: Ivan Butygin <[email protected]>
Signed-off-by: Ivan Butygin <[email protected]>
Signed-off-by: Ivan Butygin <[email protected]>
Signed-off-by: Ivan Butygin <[email protected]>
Signed-off-by: Ivan Butygin <[email protected]>
Signed-off-by: Ivan Butygin <[email protected]>
Signed-off-by: Ivan Butygin <[email protected]>
Signed-off-by: Ivan Butygin <[email protected]>
This reverts commit cfef902. Signed-off-by: Ivan Butygin <[email protected]>
Signed-off-by: Ivan Butygin <[email protected]>
Signed-off-by: Ivan Butygin <[email protected]>
Signed-off-by: Ivan Butygin <[email protected]>
Signed-off-by: Ivan Butygin <[email protected]>
Signed-off-by: Ivan Butygin <[email protected]>
Signed-off-by: Ivan Butygin <[email protected]>
670e275
to
f0cacae
Compare
Signed-off-by: Ivan Butygin <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm!
* Split read/write ops indexing on thread-dependent and thread-independent. * Use symbolic vals for strides instead of extracting them from memref. * For now only replace gather/scatter ops with buffer ops. --------- Signed-off-by: Ivan Butygin <[email protected]> Signed-off-by: xintin <[email protected]>