sycl: refactor quantization to q8_1 #14815
Open
+169
−170
The current implementation of how some mul_mats do 8-bit quantization is not very flexible. While exploring other possibilities for a different gemv kernel, I ran into the need for a q8_1 tensor in a slightly different format, which wasn't supported by the current `convert_src1_to_q8_1` bool.

The PR refactors quantization kernels to a separate header and:
- `sycl::nd_item<1>`
- `quantize_q8_1` to have the same structure as the reorder q8_1 kernel

Performance is unaffected.
Pinging @AD2605 as author of the reorder q8_1 kernel.
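
For readers unfamiliar with the format, here is a minimal single-threaded C++ sketch of what quantizing one block to a q8_1-style layout involves. It is not the code from this PR and not a SYCL kernel. The names `block_q8_1_sketch` and `quantize_block_q8_1_sketch` are hypothetical. The block size of 32 and the "scale plus sum plus 32 int8 values" layout follow the usual ggml q8_1 convention. Storing `d` and `s` as `float` (ggml uses half precision) and storing `s` as the plain sum of the inputs are simplifying assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Number of values per block (assumed to match ggml's QK8_1).
constexpr int QK8_1_SKETCH = 32;

// Hypothetical, simplified stand-in for a q8_1 block.
// Assumption: d and s are kept as float here; ggml stores them in half precision.
struct block_q8_1_sketch {
    float  d;                 // scale: x[i] is approximated by qs[i] * d
    float  s;                 // sum of the block's inputs (assumed convention)
    int8_t qs[QK8_1_SKETCH];  // quantized values in [-127, 127]
};

// Quantize exactly one block of QK8_1_SKETCH floats.
inline block_q8_1_sketch quantize_block_q8_1_sketch(const float * x) {
    assert(x != nullptr);

    float amax = 0.0f;  // largest absolute value in the block
    float sum  = 0.0f;  // running sum of the inputs
    for (int i = 0; i < QK8_1_SKETCH; ++i) {
        amax = std::max(amax, std::fabs(x[i]));
        sum += x[i];
    }

    block_q8_1_sketch out{};
    out.d = amax / 127.0f;
    out.s = sum;

    // Guard against an all-zero block, where the scale is 0.
    const float inv_d = out.d != 0.0f ? 1.0f / out.d : 0.0f;
    for (int i = 0; i < QK8_1_SKETCH; ++i) {
        out.qs[i] = static_cast<int8_t>(std::round(x[i] * inv_d));
    }
    return out;
}
```

A value is recovered approximately as `qs[i] * d`. A "reordered" variant would typically keep the same per-block data but lay out the `qs` and the scale/sum pairs in separate contiguous regions; the exact layout used by the reorder q8_1 kernel is not shown here.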