forked from llvm-mirror/openmp
-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ompt_get_task_info - bug fix #10
Open
vladaindjic
wants to merge
10
commits into
jmellorcrummey:ompt-final-barrier
Choose a base branch
from
vladaindjic:ompt-get-task-info
base: ompt-final-barrier
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
ompt_get_task_info - bug fix #10
vladaindjic
wants to merge
10
commits into
jmellorcrummey:ompt-final-barrier
from
vladaindjic:ompt-get-task-info
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
1. parallel_data at level 0 present, but not at outer levels (1, 2, etc.), even though there are few levels of nested regions. 2. The same region descriptor may be passed multiple times to ompt_callback_parallel_end.
1. ompt_state_wait_barrier_implicit_parallel has been used instead of deprecated ompt_state_wait_barrier_implicit 2. In function __ompt_get_task_info_internal, check if prev_team exists before getting master_id.
…urrently active task doesn't match the team of the currently active region.
…ould be the value of the thread_num argument when thread tries get the information about the outer region in which it is not involved.
…region, then assert thread_num == 0.
…a, before proclaiming the taskdata as th_current_task.
…alid. This is the hack that will make hpcrun work, since that tool is not interested to know the specific thread_num, but whether the thread is the master of the region at specified level.
…llback_sync_region.
…inside thr. TODO: Avoid using for loop in order to find thread_num in case when worker thread is creating the parallel region.
noted. we'll consider it after the merge. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
According to standard specification, parallel_data should belongs to the region that contains task at specified ancestor level.
However, this is not always true in the current implementation of the ompt_get_task_info entry point.
The following scenario should explain when this can happen:
In order to prevent this, check if the team of the currently active task matches the team of the currently active region. If this is not true, then the previously described scenario occurred, so try to access to the parent region team and use its parallel_data word.
Also, standard isn't clear about the case when the worker thread of the innermost parallel region calls this function with ancestor_level 1 (or any value greater than 0). Since the worker of the innermost parallel region doesn't belong to the outer parallel region, then the question is what should be the return value of thread_num argument? The old implementation would find the master thread of the innermost region and use thread_num of this thread inside outer region. This obiuosly isn't right. The proposal is to use some invalid value, as indication to the tool that this thread is not the part of the region at ancestor_level 1 (or greater).
It should be checked whether the ompt_get_task_info handles properly the case when the task (implicit or explicit) doesn't exists on the specified ancestor_level (e.g there's 3 nested implicit tasks, and thread calls the function with ancestor_level 10).