[QST] Ambiguous error message when tuning the Tileshape #376
Comments
Sorry, but why is this change needed?
Thanks for the prompt response. I think I confused this with the definition of NVIDIA's TiledMMA. Should this be 4 in this case?
Hello, just to briefly explain:
This is a Layout describing how many sub-groups we have in our work-group, and how they are arranged in M,N,K dimensions. In this example we have 8x4x1 (32) subgroups, with 8 in the M dimension, 4 in the N dimension and 1 in the K dimension. The Stride describes how these are arranged; in this case, N is the fastest moving dimension. So, SG0,SG1,SG2,SG3 are the first row, and then SG4 will be directly 'below' SG0 in the M direction. The stride ( Setting For your example, because you've dropped the tile size from
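To make the subgroup arrangement described above concrete, here is a minimal plain-C++ sketch of the coordinate-to-ID mapping. It deliberately does not use the CuTe API: `SubgroupLayout`, `subgroup_id`, `subgroup_count`, `subgroup_coord` and `kLayout` are hypothetical names introduced only for illustration. The K stride of 0 is an assumption; since the K extent is 1 it never affects the result.

```cpp
#include <cassert>

// Plain-C++ model of a 3-mode (M, N, K) subgroup layout; NOT the CuTe API.
struct SubgroupLayout {
    int shape[3];   // number of subgroups along M, N, K
    int stride[3];  // step in linear subgroup ID along M, N, K
};

struct Coord { int m, n, k; };

// Coordinate -> linear subgroup ID (dot product of coordinate and stride).
constexpr int subgroup_id(const SubgroupLayout& l, int m, int n, int k) {
    return m * l.stride[0] + n * l.stride[1] + k * l.stride[2];
}

// Total number of subgroups in the work-group.
constexpr int subgroup_count(const SubgroupLayout& l) {
    return l.shape[0] * l.shape[1] * l.shape[2];
}

// 8x4x1 subgroups with N as the fastest-moving dimension.
// The K stride (0) is an assumption; with a K extent of 1 it is never used.
constexpr SubgroupLayout kLayout{{8, 4, 1}, {4, 1, 0}};

// Inverse mapping for this specific N-fastest layout: ID -> (m, n, k).
inline Coord subgroup_coord(int id) {
    assert(id >= 0 && id < subgroup_count(kLayout));
    return {id / kLayout.shape[1], id % kLayout.shape[1], 0};
}

static_assert(subgroup_count(kLayout) == 32, "8 * 4 * 1 subgroups");
static_assert(subgroup_id(kLayout, 0, 3, 0) == 3, "SG3 ends the first row");
static_assert(subgroup_id(kLayout, 1, 0, 0) == 4, "SG4 sits below SG0 in M");
```

With these strides, stepping one position in N changes the ID by 1 and stepping one position in M changes it by 4, which is why SG0 to SG3 form the first row and SG4 lands directly below SG0.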
If we have a certain TileShape ( I believe the reason your change doesn't work is because you have So, you either need to:
Unfortunately, for the second solution, you are limited by the available block load operations. I think there isn't a |
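One plausible way to reason about the constraint discussed here is to compute the slice of the work-group tile that each subgroup owns and compare it with the shape of the block load. The sketch below is plain C++, not library code, and rests on assumptions the thread does not confirm: that the tile is split evenly across the subgroup grid, that a block load must fit inside one subgroup's slice, and that the A operand is split only along M. The tile sizes 256 and 128 and the names `Shape2D`, `per_subgroup` and `tiles_evenly` are hypothetical; only the 8x4 subgroup grid and the 32x32 load shape come from the thread.

```cpp
#include <cassert>

struct Shape2D { int rows, cols; };

// Slice of a work-group tile owned by one subgroup, assuming an even split.
constexpr Shape2D per_subgroup(Shape2D wg_tile, Shape2D sg_grid) {
    return {wg_tile.rows / sg_grid.rows, wg_tile.cols / sg_grid.cols};
}

// True when blocks of shape `block` cover `region` exactly, with no overhang.
constexpr bool tiles_evenly(Shape2D region, Shape2D block) {
    return region.rows >= block.rows && region.cols >= block.cols &&
           region.rows % block.rows == 0 && region.cols % block.cols == 0;
}

// A operand (M x K): 8 subgroups along M, K not split (assumption).
constexpr Shape2D kSgGridA{8, 1};
// Block-load shape taken from the copy atom name in the thread (32x32).
constexpr Shape2D kLoadA{32, 32};

// Hypothetical work-group tiles, given as (M, K).
constexpr Shape2D kBigTileA{256, 32};
constexpr Shape2D kSmallTileA{128, 32};

// 256 / 8 = 32 rows per subgroup: one 32x32 load fits exactly.
static_assert(tiles_evenly(per_subgroup(kBigTileA, kSgGridA), kLoadA), "fits");
// 128 / 8 = 16 rows per subgroup: a 32-row load overhangs the slice.
static_assert(!tiles_evenly(per_subgroup(kSmallTileA, kSgGridA), kLoadA), "overhang");
```

Under these assumptions, shrinking the tile without also reducing the number of subgroups along M (or choosing a smaller block load, where one exists) leaves each subgroup with fewer rows than a single load covers.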
Hi @joeatodd, thanks a lot for the detailed explanation!
If the tile shape is So, would it be okay to use a
Thanks for the tip! Prima facie, it seems this approach may lead to better performance on the GPU I'm using (Intel GPU Max 1550) with a smaller tile shape, since the hardware has 8 EUs per Xe core.
I highly appreciate the detailed explanations, especially regarding the split block load!
This happened when I was trying to change the tile shape of the example 08_pvc_gemm_fp8.cpp.
I modified lines 363 and 368 with the following code:
Then it gave the following error message:
The shape of the specified Gmemtile is 32x32 (XE_2D_U8x32x32_LD_V), and that is larger than the MMA_Atom size (XE_8x16x16_F32F16F16F32_TT).
What is the true meaning of this error message?
It would also be very helpful if you could shed some light on the complex relationship between TileShape, GMemTiledCopy and TiledMMA. Right now I am not sure how to define the remaining parameters once a tile shape has been specified.