Velocity SIMD CPU Runtime #891
Closed
m4rs-mt force-pushed the velocity branch 2 times, most recently from 35e7f72 to 7b0d4dd (December 3, 2022 03:05)
m4rs-mt force-pushed the velocity branch 6 times, most recently from 872582e to 353fbfd (January 26, 2023 21:24)
…y-specific operations.
m4rs-mt force-pushed the velocity branch 2 times, most recently from 298e489 to 699fb91 (March 2, 2023 19:30)
…velocity accelerators.
…argeted by Velocity kernels.
…y generated parameter instances to Velocity kernels.
Closing this PR in favor of a newly designed Velocity backend.
This PR adds the announced SIMD-based CPU runtime, fully implemented in managed code. The new Velocity accelerator supports most GPU kernels (except those using dynamic shared memory) and can utilize SIMD hardware acceleration on modern CPUs, allowing you to run your ILGPU kernels efficiently on the CPU by leveraging the implemented automatic vectorization engine.

It supports the following hardware configurations:
- 128bit-based X64 SSE and ARM64 Neon instructions (also supports M1 Macs - Mac M Series Support #769)
- 256bit-based X64 AVX instructions
- 512bit-based X64 AVX2 instructions (limited feature set; some functions will fall back to 256bit registers)

Note that this PR is a preview of the current development state that will be extended with additional features in the near future.