
Data-Tiling: Migrate round_dims_to to iteration_sizes after encoding specialization is on by default #19897

Open
hanhanW opened this issue Feb 4, 2025 · 2 comments
Assignees
hanhanW
Labels
codegen (Shared code generation infrastructure and dialects), enhancement ➕ (New feature or request), good first issue 🌱 (Good for newcomers)

Comments

hanhanW (Contributor) commented Feb 4, 2025

The round_dims_to field in the encoding was useful for the data-tiling late-materialization path because it provides a hint to both host and device code. The host side can allocate the storage buffer based on the hint, and the device knows the bound on the padding space. (Otherwise, the device could access the buffer out of bounds.)
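
For illustration, here is a rough sketch of how the hint shows up in IR today; the field list and syntax are simplified and may not match the exact attribute definition:

```mlir
// Illustrative sketch only (other encoding fields elided). round_dims_to hints
// how far each dimension may be rounded up for padding, so the host can size
// the storage buffer and the device knows the padding bound.
#lhs_encoding = #iree_encoding.encoding<operand_index = 0 : index,
                                        op_type = matmul,
                                        element_types = [f32, f32, f32],
                                        round_dims_to = array<i64: 16, 16, 16>>
!lhs_type = tensor<100x250xf32, #lhs_encoding>
```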

However, it is not an ideal solution because the device could request larger tile sizes in some cases (e.g., matvec), which leads to an inefficient strategy. Also, the host could allocate a huge buffer that is not fully used by the device. Sometimes the device just needs a little more storage, rather than unconditionally padding each dimension to a large size.

Today, we have encoding specialization, which is not yet on by default. The encoding implements the interface methods, so it can propagate the request from the executable target to the host; i.e., the host can allocate the exact storage buffer for the encoded tensor. Once we turn the pass on by default, we no longer need the round_dims_to field in the encoding. The next question is what information we want to carry in the encodings. I think the answer is the iteration size of each dimension. On CPU, we can generate more efficient code if we recognize that there is a narrow matrix (e.g., matvec/vecmat/etc.). Today, we abuse the round_dims_to field to provide that information, which is bad. If we are going to deprecate the round_dims_to field, we'll need to introduce an iteration_sizes field to carry the information.
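
As a hypothetical sketch of the replacement (the iteration_sizes name and syntax below are only for illustration; the actual design is still open):

```mlir
// Hypothetical: record the iteration-space size of each dimension instead of a
// padding hint. A narrow op like matvec is then recognizable from N == 1, and
// the host no longer needs a padding hint once encoding specialization
// resolves the exact layout.
#lhs_encoding = #iree_encoding.encoding<operand_index = 0 : index,
                                        op_type = matmul,
                                        element_types = [f32, f32, f32],
                                        iteration_sizes = array<i64: 128, 1, 256>>
```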

Note that this task depends on the encoding specialization pass; we should implement it after the encoding specialization pass is on by default.

@hanhanW added the codegen label on Feb 4, 2025
@hanhanW self-assigned this on Feb 4, 2025
@hanhanW added the enhancement and good first issue labels on Feb 4, 2025
pashu123 (Contributor) commented Feb 5, 2025

@hanhanW Shall I work on this?

hanhanW (Contributor, Author) commented Feb 5, 2025

> @hanhanW Shall I work on this?

SGTM, but you need to wait a bit. I think I can turn the specialization on by default next week. Let's also chat more once I have my data-tiling RFC ready.
