questions about slice_col_par #19

Open
Lenan22 opened this issue Apr 7, 2024 · 2 comments

Comments

Lenan22 commented Apr 7, 2024

```
int slice_col_par = (iters * blockIdx.x) / k_tiles;
int slice_col = slice_col_par; //
int slice_iters; // number of threadblock tiles in the current slice
int slice_count = 0; // total number of active threadblocks in the current slice
int slice_idx; // index of threadblock in current slice; numbered bottom to top

if (slice_col_par >= n_tiles) {
```
I have some questions about the code above. For example, if there are 108 SMs on the GPU and the calculated iters is 19, with blockIdx.x ranging from 0 to 127, is slice_col_par directly calculated based on iters=19? For instance, when blockIdx.x=5 or others, this thread block might not iterate 19 times.

Member

efrantar commented Apr 7, 2024

If the batch size is larger than 64, we essentially process multiple batch-size-64 matmuls in a single kernel invocation (to allow better partitioning). This is done by virtually replicating the matrix. Consider this example: parallel = 2, a matrix that partitions into 4 tiles, and 3 SMs:

SM -> tile assignment:

00 12
01 12

01 01 // slice_col
01 23 // slice_col_par

slice_col points to the actual column in the matrix and slice_col_par to the column in the virtually replicated version.
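
The relationship between the two indices in the example above can be sketched as a division and a remainder. This is only an illustration inferred from the diagram (virtual columns 0 1 2 3 map to real columns 0 1 0 1 when `n_tiles` is 2); the `VirtualColumn` struct and the `split_virtual_column` helper are hypothetical names and do not appear in the kernel.

```cpp
#include <cassert>

// Hypothetical helper type for illustration only (not part of the kernel).
struct VirtualColumn {
  int replica;    // which virtual copy of the matrix (0 .. parallel-1)
  int slice_col;  // actual column of tiles inside the real matrix
};

// Assumption: the virtual copies are laid out side by side, so the real
// column is the remainder and the copy index is the quotient by n_tiles.
inline VirtualColumn split_virtual_column(int slice_col_par, int n_tiles) {
  assert(n_tiles > 0 && slice_col_par >= 0);
  return VirtualColumn{slice_col_par / n_tiles, slice_col_par % n_tiles};
}
```

For the diagram above, `split_virtual_column(3, 2)` yields replica 1 and `slice_col` 1, i.e. the last virtual column is the second real column of the second copy.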

Yes, it can happen that a few SMs (here SM 2) process fewer tiles than others; however, the distribution should usually be quite even since our partitioning is designed so that one SM can partially process multiple columns (see SM 0 or SM 1 above).
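
To see why the last SM can end up with fewer tiles, here is a rough host-side simulation of the partitioning. It is a sketch under assumptions, not the kernel's code: only the `slice_col_par` formula is taken from the snippet in the question, while the ceiling division for `iters`, the `slice_row` formula, the column-by-column tile numbering, and the names `BlockWork` and `partition` are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical struct for illustration; the kernel keeps these as locals.
struct BlockWork {
  int slice_row;      // first tile row (assumed: linear index % k_tiles)
  int slice_col_par;  // first virtual column (formula from the snippet)
  int num_tiles;      // tiles this block actually processes
};

// Assumption: tiles of the virtually replicated matrix are numbered column
// by column, and block b owns the range [b * iters, (b + 1) * iters),
// clipped to the total number of tiles.
std::vector<BlockWork> partition(int k_tiles, int n_tiles, int parallel,
                                 int num_blocks) {
  assert(k_tiles > 0 && n_tiles > 0 && parallel > 0 && num_blocks > 0);
  const int total = k_tiles * n_tiles * parallel;
  const int iters = (total + num_blocks - 1) / num_blocks;  // ceiling division
  std::vector<BlockWork> out;
  for (int b = 0; b < num_blocks; ++b) {
    const int first = iters * b;  // linear index of the block's first tile
    const int count = std::max(0, std::min(iters, total - first));
    out.push_back({first % k_tiles, first / k_tiles, count});
  }
  return out;
}
```

With `partition(2, 2, 2, 3)` (two tile rows, two tile columns, `parallel = 2`, 3 SMs) the blocks start at virtual columns 0, 1 and 3 and process 3, 3 and 2 tiles, which matches the assignment drawn above. Every block computes its start position from the same `iters`, and only the trailing blocks are clipped.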

Author

Lenan22 commented Apr 10, 2024

Thanks a lot
