Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug: Empty process offset computation in t8_cmesh_gather_treecount_ext is wrong in some cases #1082

Open
holke opened this issue Jun 7, 2024 · 2 comments
Assignees
Labels
Bug For a bug in the Code

Comments

@holke
Copy link
Collaborator

holke commented Jun 7, 2024

When constructing a cmesh.
If t8_cmesh_load is used and the current process is empty, then the cmesh sets its num_local_tree to 0, its first_tree to -1 and first_tree_shared to 0.

This can break the offset computation for empty processes in t8_cmesh_gather_treecount_ext, because t8_offset_next_nonempty_rank may identify an empty rank as nonempty in the case that multiple empty are in a row.

Note, that this behaviour may also happen in potential new cmesh construction methods (that do not exist yet, but use a similar logic to set the empty processes).

Here is how we can fix the behaviour:

  1. Write a test case that covers this beahviour and currently fails (i.e. load (or simulatie a load) a cmesh such that multiple empty processes are in a row).
  2. Do not use an offset value to indicate empty processes in t8_cmesh_gather_treecount_ext. Instead use a second array of bool/int with values 0 and 1 for empty or non-empty processes and Allgather that array as well.
  3. In the empty correction starting from use the new array to determine the next non-empty process instead of the function t8_offset_next_nonempty_rank
  4. Make the execution of this empty correction optional controlled by a function parameter, which set true by cmesh_load but set false by other methods (i.e. uniform_bounds)
  5. Outsource the code starting from in an own function t8_cmesh_offset_empty_correction
@holke holke added the Bug For a bug in the Code label Jun 7, 2024
@holke
Copy link
Collaborator Author

holke commented Jun 7, 2024

@Davknapp @lukasdreyer ping

@Davknapp Davknapp self-assigned this Jun 17, 2024
@lukasdreyer
Copy link
Collaborator

@Davknapp: I think there is a similar problem, for t8_forest_partition_create_tree_offsets after calling t8_forest_partition_given, when you have 2 empty procs next to each other.

In t8_forest_partition_given, the following lines set first and last local tree in a way, that does not fit the offset rule needed to compute num_local_trees correctly at a different place. Therefore, offset is computed to be 0, and not the treeid of the next tree.

This bug can be seen by executing
mpirun -n 8 ./test/t8_forest/t8_gtest_forest_commit --gtest_filter=*prism_cake*

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug For a bug in the Code
Projects
None yet
Development

No branches or pull requests

3 participants