Reduction operation being executed in parallel with read-modify-write operation in onesource::generatePDF() #197

hdante · 2024-09-03T21:10:21Z

Hello, there's a reduction loop inside a parallel region that every thread executes in full: each thread iterates over the whole vector, including the data produced by all the other threads. The reduction should instead be executed in a single thread (or, for example, be done as a tree reduction). I think this also means that there's a race, because the loop performs a read-modify-write sequence on the chi2 and ind vectors.

Because the reduction is parallelized this way, every thread executes the same code and, if the race is confirmed, the chi2 vector might not be completely minimized. The race could be confirmed by finding a dataset that exposes it and then comparing the output of a multi-threaded build of the library with that of a single-threaded build.

lephare/src/lib/onesource.cpp

Line 1002 in dbe015b

for (int i = 0; i < dimzg; i++) {
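To illustrate the first point, here is a minimal sketch, not taken from lephare, of what happens to a plain loop placed inside an OpenMP parallel region without a worksharing or single construct. The function name `count_body_executions` is hypothetical; only `dimzg` is borrowed from the loop above. The counter is updated atomically so that the sketch itself is race-free.

```cpp
// Counts how many times the body of a plain loop runs when the loop sits
// inside a parallel region with no worksharing or single construct.
// Hypothetical helper for illustration only; not part of lephare.
long count_body_executions(int dimzg) {
  long executions = 0;  // shared by all threads of the team
#pragma omp parallel
  {
    // Every thread in the team runs this entire loop by itself.
    for (int i = 0; i < dimzg; i++) {
#pragma omp atomic
      executions++;  // atomic update, so the count itself is not racy
    }
  }
  // With T threads this returns T * dimzg; when compiled without OpenMP
  // the pragmas are ignored and it returns dimzg.
  return executions;
}
```

If the loop body wrote to shared vectors instead of an atomic counter, those same repeated executions would be unsynchronized read-modify-write sequences, which is the concern raised here.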

Before submitting
Please check the following:

I have described the situation in which the bug arose, including what code was executed, information about my environment, and any applicable data others will need to reproduce the problem.
I have included available evidence of the unexpected behavior (including error messages, screenshots, and/or plots) as well as a description of what I expected instead.
If I have a solution in mind, I have provided an explanation and/or pseudocode and/or task list.


hdante · 2024-09-04T19:49:55Z

Hello, I imagine there are two ways to fix the reduction operation. One is to use OpenMP's single construct, so that only one thread executes the loop:

#pragma omp single
      for (int i = 0; i < dimzg; i++) {
(...)

The second is to move the loop outside the omp parallel region and execute it as a standard single-threaded C++ loop.

I'm not sure how to compare the performance impact of the two fixes; it might be easiest to try both.
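Both options can be sketched side by side. This is only an illustration under assumptions: the `Partial` struct, the per-thread layout, and all function names are hypothetical and do not reflect the actual data structures in onesource.cpp. The merge keeps, for each grid point, the smallest chi2 found by any thread together with its index.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical layout: one partial chi2/index vector per thread.
struct Partial {
  std::vector<double> chi2;
  std::vector<int> ind;
};

// Merge step shared by both variants: for each grid point i, keep the
// smallest chi2 seen in any partial result and the index that produced it.
static void merge_partials(const std::vector<Partial>& partials,
                           std::vector<double>& chi2, std::vector<int>& ind) {
  for (std::size_t i = 0; i < chi2.size(); i++) {
    for (const Partial& p : partials) {
      if (p.chi2[i] < chi2[i]) {
        chi2[i] = p.chi2[i];
        ind[i] = p.ind[i];
      }
    }
  }
}

// Variant 1: stay inside the parallel region, but let one thread merge.
void reduce_with_single(const std::vector<Partial>& partials,
                        std::vector<double>& chi2, std::vector<int>& ind) {
  assert(chi2.size() == ind.size());
#pragma omp parallel
  {
    // ... per-thread work that fills the partial results would go here ...
#pragma omp barrier  // make sure all partial results are complete
#pragma omp single
    merge_partials(partials, chi2, ind);  // executed by exactly one thread
  }
}

// Variant 2: close the parallel region first, then merge serially.
void reduce_after_region(const std::vector<Partial>& partials,
                         std::vector<double>& chi2, std::vector<int>& ind) {
  assert(chi2.size() == ind.size());
#pragma omp parallel
  {
    // ... per-thread work that fills the partial results would go here ...
  }
  merge_partials(partials, chi2, ind);  // plain single-threaded C++ loop
}
```

Both variants should give identical chi2 and ind vectors; what differs is whether the other threads wait at the end of the single construct or the team is joined before the merge, which is why timing both seems like the simplest way to choose.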

hdante added the "bug" (Something isn't working) label Sep 3, 2024

johannct linked a pull request Sep 12, 2024 that will close this issue

196 Potential race condition in read-modify-write PDF vectors in parallel loop in onesource::generatePDF() #206

Closed

21 tasks

This was referenced Sep 12, 2024

196 Potential race condition in read-modify-write PDF vectors in parallel loop in onesource::generatePDF() #206

Closed

197 reduction operation being executed in parallel with read modify write operation in onesourcegeneratepdf #209

Merged

johannct removed a link to a pull request Sep 17, 2024

196 Potential race condition in read-modify-write PDF vectors in parallel loop in onesource::generatePDF() #206

Closed

21 tasks
