Issues with t8_shmem #1109

Open
DamynChipman opened this issue Jul 2, 2024 · 5 comments · May be fixed by #1176

@DamynChipman

When running make test and some of the tutorials, I see repeated errors and warnings about t8_shmem. One of the tests fails and any tutorial example with parallelism outputs a warning. Besides the failed test, it looks like all of the tutorials are successful regardless of the warning. I get these issues when building/running on my laptop and on a Linux cluster.

The main warning I see is the following:

[t8] WARNING: Trying to used shared memory but intranode and internode communicators are not set. You should call t8_shmem_init before initializing a shared memory array.

I guess my question/issue is: does this affect accuracy or performance?

I am reviewing t8code for JOSS: openjournals/joss-reviews#6887

For reference, here is some information on building, testing, and running the tutorials:

Build Info

cmake -B build-main -S . -DCMAKE_INSTALL_PREFIX=./build-main/local -DCMAKE_C_COMPILER=mpicc -DCMAKE_CXX_COMPILER=mpicxx

Testing

cd build-main
make -j
make -j test

See the attached test-output.txt for the output of make -j test ARGS="--rerun-failed --output-on-failure"

test-output.txt

Tutorials

Here is the output for the 3rd tutorial:

➜  tutorials git:(main) ✗ mpirun -n 4 ./t8_step2_uniform_forest
[libsc] This is libsc 2.8.5.999
[t8] This is t8 2.0.0-396-g758cb9903
[t8] CPP                      /opt/homebrew/bin/mpicxx
[t8] CPPFLAGS                 -Wall
[t8] CC                       /opt/homebrew/bin/mpicc
[t8] CFLAGS                   -Wall
[t8] LDFLAGS                  
[t8] LIBS                     Not available with CMake builds
[t8]  [step2] 
[t8]  [step2] Hello, this is the step2 example of t8code.
[t8]  [step2] In this example we build our first uniform forest and output it to vtu files.
[t8]  [step2] 
[t8]  [step2] Constructed coarse mesh with 2 prism trees.
[t8] WARNING: Trying to used shared memory but intranode and internode communicators are not set. You should call t8_shmem_init before setting the shmem type.
[t8] WARNING: Trying to used shared memory but intranode and internode communicators are not set. You should call t8_shmem_init before initializing a shared memory array.
[t8] WARNING: Trying to used shared memory but intranode and internode communicators are not set. You should call t8_shmem_init before setting the shmem type.
[t8] WARNING: Trying to used shared memory but intranode and internode communicators are not set. You should call t8_shmem_init before initializing a shared memory array.
[t8] WARNING: Trying to used shared memory but intranode and internode communicators are not set. You should call t8_shmem_init before setting the shmem type.
[t8] WARNING: Trying to used shared memory but intranode and internode communicators are not set. You should call t8_shmem_init before initializing a shared memory array.
[t8] Constructed uniform forest with 1024 global elements.
[t8]  [step2] Created uniform forest.
[t8]  [step2] Refinement level:			3
[t8]  [step2] Local number of elements:		256
[t8]  [step2] Global number of elements:	1024
[t8]  [step2] Wrote forest to vtu files:	t8_step2_uniform_forest*
[t8]  [step2] Destroyed forest.
@holke
Collaborator

holke commented Jul 4, 2024

Hi @DamynChipman, thank you for reporting the issue.
Are you by any chance using an M1 or M2 Mac processor, possibly with OpenMPI?
We recently noticed that this combination seems to have issues with the MPI shared memory implementation.

Shared memory not being active does not result in accuracy loss.
It does increase memory usage, though. For us, the real benefit of shared memory only kicks in when running on a cluster with more than 1000 CPUs.

@DamynChipman
Author

Yeah, my laptop is an M1 MacBook and I have OpenMPI installed. However, I also ran into the same warnings and the same failed test when building, testing, and running on a Linux cluster.

Sounds good, if something else shows up, I'll let you all know, thanks!

@holke holke reopened this Jul 5, 2024
@holke
Collaborator

holke commented Jul 5, 2024

Thank you.

I want to keep the issue open anyway so that we do not forget about it.
We should investigate and address these warnings in the future.

@maelk3
Collaborator

maelk3 commented Jul 15, 2024

The combination of OpenMPI and libsc's CMake build system seems to be the culprit for the warnings. Libsc's CMake build system uses the function check_symbol_exists to look for the symbol MPI_COMM_TYPE_SHARED in the header file mpi.h. MPICH defines this symbol as a macro, but OpenMPI defines it as a member of an anonymous enum, which check_symbol_exists does not detect. As a result, the compile definition SC_ENABLE_MPICOMMSHARED is missing, which causes the warnings.
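To illustrate the macro-versus-enum difference described above, here is a minimal sketch in plain C that does not use MPI at all. The names MACRO_STYLE_SHARED and ENUM_STYLE_SHARED are hypothetical stand-ins for the two ways a header could provide MPI_COMM_TYPE_SHARED; they are not taken from MPICH, OpenMPI, or libsc. The sketch assumes the check behaves like a preprocessor test, which matches the CMake documentation's note that check_symbol_exists does not recognize enum values.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins, NOT the real MPI headers. */
#define MACRO_STYLE_SHARED 1    /* symbol provided as a preprocessor macro */
enum { ENUM_STYLE_SHARED = 1 }; /* symbol provided as an anonymous enum constant */

/* Returns 1 if the preprocessor can see the macro-style symbol. */
int macro_visible_to_preprocessor(void)
{
#ifdef MACRO_STYLE_SHARED
  return 1;
#else
  return 0;
#endif
}

/* Returns 1 if the preprocessor can see the enum-style symbol.
 * Enum constants only exist after preprocessing, so this yields 0. */
int enum_visible_to_preprocessor(void)
{
#ifdef ENUM_STYLE_SHARED
  return 1;
#else
  return 0;
#endif
}

/* Returns 1 if ordinary C code can still use the enum constant. */
int enum_usable_in_code(void)
{
  int value = ENUM_STYLE_SHARED;
  return value == 1;
}

/* Prints the three results and asserts the expected pattern. */
void report(void)
{
  printf("macro seen by #ifdef: %d\n", macro_visible_to_preprocessor());
  printf("enum seen by #ifdef:  %d\n", enum_visible_to_preprocessor());
  printf("enum usable in code:  %d\n", enum_usable_in_code());
  assert(macro_visible_to_preprocessor() == 1);
  assert(enum_visible_to_preprocessor() == 0);
  assert(enum_usable_in_code() == 1);
}
```

Calling report() from a main function shows that the enum constant is perfectly usable in compiled code even though a preprocessor-based check cannot see it. A detection step that compiles a small test program using the symbol would catch both variants; whether libsc's develop branch takes that route is not stated in this thread.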

@holke
Collaborator

holke commented Jul 22, 2024

Next step: Update the sc version to develop.

@holke holke added the labels "priority: high" (should be solved as soon as possible) and "workload: low" (would take half a day or less) on Jul 22, 2024
@maelk3 maelk3 linked a pull request Jul 23, 2024 that will close this issue
14 tasks