shared memory accumulation #298
toaster-robotics asked this question in Q&A
Answered by daedalus5
-
This is one way to handle the 2D case. Under the hood, all 1D, 2D, 3D, etc. arrays are treated as linear arrays. The multi-dimensional arrays just provide a convenient way to organize thread indices, but you are always free to compute your own 1D <--> ND index conversions.
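The answer's original snippet is not preserved in this excerpt, but the row-major 1D <--> 2D index conversion it refers to can be sketched in plain Python. The function names and the example shape below are illustrative, not from the thread:

```python
# Row-major 1D <-> 2D index conversion for a 2D array of shape
# (rows, cols) stored as a flat linear buffer.

def flatten_index(i, j, cols):
    # 2D coordinates (i, j) -> linear offset
    return i * cols + j

def unflatten_index(idx, cols):
    # linear offset -> 2D coordinates (i, j)
    return idx // cols, idx % cols

# Round-trip check over every element of a 4 x 5 array
rows, cols = 4, 5
assert all(
    unflatten_index(flatten_index(i, j, cols), cols) == (i, j)
    for i in range(rows)
    for j in range(cols)
)
```

The same arithmetic works for any number of dimensions: multiply out the strides to flatten, and use repeated division/modulo to unflatten.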
Answer selected by shi-eric
-
Original question (toaster-robotics):

hello,

would anyone have code examples of how to use func_native to speed up atomic add? I've read that block-wise accumulation using shared memory, followed by an atomic add between blocks, can speed things up versus just using atomic_add. However, I'm unsure how to implement this using what's available in the docs. I have the following questions after reading through that example code:

- ... A[i, j]?

this is roughly what my code looks like:
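For reference, the control flow of the pattern being asked about (each block privately accumulates its slice, then contributes a single atomic add to the global result) can be illustrated on the CPU. This is a sketch of the idea only, not Warp's func_native API: on the GPU, "partial" would live in __shared__ memory, the per-block loop would be a parallel tree reduction separated by __syncthreads(), and the final addition would be a CUDA atomicAdd issued by one thread per block. The function name and block size are made up:

```python
# CPU illustration of block-wise accumulation. Names and BLOCK size are
# hypothetical; the real GPU version keeps "partial" in shared memory
# and replaces "total += partial" with one atomicAdd per block.

BLOCK = 128

def block_wise_sum(values):
    total = 0.0
    for start in range(0, len(values), BLOCK):
        # Each "block" accumulates its slice privately (shared memory)
        partial = 0.0
        for v in values[start:start + BLOCK]:
            partial += v
        # One atomic add per block instead of one per element
        total += partial
    return total

vals = [float(i) for i in range(1000)]
assert block_wise_sum(vals) == sum(vals)
```

The potential speedup comes from contention: with a block size of 128, the global accumulator sees roughly one atomic operation per 128 elements rather than one per element, while the per-block work happens in fast on-chip shared memory.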