-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Support shared encoding defined with linear layout #5720
base: main
Are you sure you want to change the base?
Conversation
+cc @lezcano AMD team wants to experiment with non standard swizzling patterns(for example #4984), so I had an idea to use What do you think about renaming |
I'm interested in this discussion as well. The NV lowering path heavily relies on the legacy encoding of the SMEM layout, so going all in on using LL for SMEM at the IR level is difficult. Moreover, how are you going to query swizzling properties of SMEM represented only via LL? |
We can query the first few bases of the offset dimension to see if they are contiguous and check if the rest bases do not overlap with them. There's indeed an algorithm we plan to implement for the ldmatrix path that won't rely on the standard shared encoding |
AMD path might be much actually simpler since it doesn't slice shared memory and doesn't have ldmatrix/stmatrix instructions. The only concern is that checking if base names have "offset" to determine a shared encoding isn't a solid solution. |
@masahi just bumped into this quite recently as well, and as @Jokeren has mentioned, we'll start using new shmem layouts sooner than later. The issue with SharedLayouts is that they don't have an API like the one DistributedLayouts have. They just have very few attributes that are rather unique to their own structure. In general, a characterisation of the shared memory layout that we care about may be given by is a LinearLayout for which al its bases have at most 2 bits equal to one ( Taking this structure into account, the tricky part here is to create an API that's returns the relevant properties we need to work with this layout. This would be the equivalent to triton/lib/Dialect/TritonGPU/IR/Dialect.cpp Lines 1292 to 1464 in 47c730b
This API is going to be very different to the one from This is not the easiest task, but it's what we'll need to tackle if we want to support generic swizzled layouts. If this is a bit too much in one go, that's fine. Another way to go is to define another subclass of SharedEncodings that are less general than the one I described above, and start implementing that one and adding support across the codebase for it. That is what @masahi is set to do with I'll probably make some strides towards tackling the general case next month anyway. |
It seems currently I do not fully understand all issues related to general SMEM linear layout encoding. Since we have #5764, which I suppose will be merged soon, I will try go in two directions simultaneously:
|
That PR is being reworked following Thomas' advice. The final state will be rather similar to what you are saying: a common class with no methods. In particular, you can implement |
This PR enables basic support for linear layout used as a shared encoding.
@binarman Yes, the rework is going to take some time. As Thomas suggested, I'll first add |
a91f8e9
to
c24e6bf
Compare
Actually let me take this piece, I'll try to send something to unblock both of you soon |
ok I ended up doing a bit more refactoring, here is the PR: That should allow extending the new nivida mma shared layout as well as exposing linear layout the same way we do for distributed layout |
This PR enables basic support for linear layout used as a shared encoding.
This change is needed to support custom memory layout introduced in #4984