
Potential race condition in read-modify-write PDF vectors in parallel loop in onesource::generatePDF() #196

Closed
3 tasks done
hdante opened this issue Sep 3, 2024 · 4 comments
Labels
bug Something isn't working

Comments

hdante commented Sep 3, 2024

Hello, there seem to be some output variables that are updated with a read-modify-write operation in the auto-parallelized loop without being made private to each thread, potentially causing a race condition. The observable effect would be PDF vectors in which some points have lower probability than the sequentially computed result (if the race condition really happens in practice). The variables are:

  • PDFldustloc
  • PDFcol1loc
  • PDFcol2loc
  • PDFmrefloc

This could be confirmed with a dataset that makes the race likely to occur and two builds of the library, one multi-threaded and one single-threaded (plus some way to decide whether the floating-point results differ significantly).

The affected loop in onesource::generatePDF() is:

for (size_t i = 0; i < va.size(); i++) {
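
For illustration, here is a minimal, self-contained sketch of the racy pattern (hypothetical bin count and weights, not the actual lephare code). Several threads perform a read-modify-write on the same shared bins, so two threads can read the same old value, each add to it, and store back, losing one update:

#include <cstdio>
#include <vector>

// Compile with: g++ -fopenmp race_demo.cpp
int main() {
    std::vector<double> pdf(100, 0.0);  // shared accumulator, like PDFldustloc

    #pragma omp parallel for
    for (int i = 0; i < 1000000; i++) {
        // Read-modify-write on shared memory without synchronization:
        // updates from different threads can be lost.
        pdf[i % 100] += 1.0;
    }

    double total = 0.0;
    for (double v : pdf) total += v;
    std::printf("total = %.0f (expected 1000000)\n", total);
    return 0;
}

With enough threads and iterations the printed total falls below 1000000, which is the same mechanism that would lower some PDF points relative to the sequential result.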

lephare version: main branch dbe015b

Before submitting
Please check the following:

  • I have described the situation in which the bug arose, including what code was executed, information about my environment, and any applicable data others will need to reproduce the problem.
  • I have included available evidence of the unexpected behavior (including error messages, screenshots, and/or plots) as well as a description of what I expected instead.
  • If I have a solution in mind, I have provided an explanation and/or pseudocode and/or task list.
hdante added the bug label on Sep 3, 2024

hdante commented Sep 4, 2024

Hello, looking at the code, I'd say it's a simple matter of adding the variables to the list of reduction variables already handled by the OpenMP loop, which makes them private to each thread automatically. The resulting pragma would look like this:

#pragma omp for schedule(static, 10000)                                  \
    reduction(+ : PDFzloc, PDFzqloc, PDFmassloc, PDFSFRloc, PDFsSFRloc,  \
                  PDFAgeloc, PDFLdustloc, PDFcol1loc, PDFcol2loc,        \
                  PDFmrefloc)                                             \
    nowait

This might cause a slight slowdown in the algorithm due to the extra per-thread copies and the final merge required by the private arrays, but it is necessary if the race condition really exists.
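
For context, here is a self-contained sketch of how the reduction removes the race (hypothetical names; a plain C array is used here because OpenMP 4.5 and later reduces arrays natively, while lephare's PDF*loc variables, if they are C++ objects, may additionally need a user-defined declare reduction):

#include <cstdio>

// Compile with: g++ -fopenmp reduction_demo.cpp (needs OpenMP >= 4.5)
int main() {
    const int nbins = 100;
    double pdf[nbins] = {0.0};

    // reduction(+ : pdf) gives each thread a private, zero-initialized copy;
    // OpenMP sums the copies into the shared array when the loop finishes,
    // so no thread ever does a read-modify-write on shared memory.
    #pragma omp parallel for schedule(static, 10000) reduction(+ : pdf)
    for (int i = 0; i < 1000000; i++) {
        pdf[i % nbins] += 1.0;
    }

    double total = 0.0;
    for (int b = 0; b < nbins; b++) total += pdf[b];
    std::printf("total = %.0f (always exactly 1000000)\n", total);
    return 0;
}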

hdante commented Sep 4, 2024

A possible way to automatically check for data races is ThreadSanitizer. It works like AddressSanitizer and is integrated with the compiler. I have never used it, so I can't confirm it always catches the races.

https://github.com/google/sanitizers/wiki/ThreadSanitizerCppManual
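
For reference, enabling it is just a compiler flag in GCC and Clang (the file name below is hypothetical):

g++ -fsanitize=thread -fopenmp -g -O1 generatePDF_test.cpp
./a.out    # races are reported as "WARNING: ThreadSanitizer: data race"

One caveat: for OpenMP programs, the OpenMP runtime itself may need to be built with ThreadSanitizer support, otherwise the runtime's internal synchronization can produce false positives.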

johannct (Member) commented

I am pretty sure that these have just been forgotten by mistake. I will add them in a branch soon to be merged.

hdante commented Sep 17, 2024

Fixed by #209

hdante closed this as completed on Sep 17, 2024