jacobbaron
No description provided.

Compared with the code in the link above, this version is missing an if statement.
This causes two issues:

  1. Some threads may read data beyond the end of the input array.
  2. The output may differ. Within a single iteration, given two threads A and B (where B_id = A_id + offset), B may update its value before A reads it, so A picks up the new value instead of the old one.
    This may also explain why you saw an error when using more than 32 threads, or when the global size was larger than 4096.
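
To make both issues concrete, here is a minimal sketch that simulates one work-group of a tree reduction sequentially in Python. It is not the kernel from this PR: the function name `tree_reduce`, the `guarded` flag, and the assumption that the missing check has the form `if (tid < offset)` are all hypothetical, chosen only for illustration. Threads are visited from highest id to lowest, which is one legal interleaving in which B runs before A.

```python
def tree_reduce(values, guarded):
    """Sequentially simulate a work-group tree reduction (sum).

    Assumes len(values) is a power of two. Returns the value left in
    slot 0 and the number of reads that would have gone out of bounds.
    """
    data = list(values)
    n = len(data)
    out_of_bounds = 0
    offset = n // 2
    while offset > 0:
        # Highest id first: thread B = A + offset runs before thread A.
        for tid in reversed(range(n)):
            # Hypothetical form of the missing check: only the lower
            # half of the active range takes part in this iteration.
            if guarded and tid >= offset:
                continue
            partner = tid + offset
            if partner >= n:
                # A real kernel would read past the end of the buffer here.
                out_of_bounds += 1
                continue
            data[tid] += data[partner]
        # In a real kernel a barrier would separate the iterations.
        offset //= 2
    return data[0], out_of_bounds


if __name__ == "__main__":
    values = [1, 2, 3, 4, 5, 6, 7, 8]
    print("expected sum:", sum(values))
    print("with guard:   ", tree_reduce(values, guarded=True))
    print("without guard:", tree_reduce(values, guarded=False))
```

With the guard, each slot is written by at most one thread per iteration and is never read by another thread in that same iteration, so the result does not depend on thread ordering. Without it, upper-half threads both index past the array and overwrite values that lower-half threads have yet to read. If the hardware happens to run a small group of threads in lockstep (32 on some GPUs, which is an assumption about the device used here), every read in an iteration completes before any write, which would hide the second issue for small thread counts.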
