-
Notifications
You must be signed in to change notification settings - Fork 64
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PCU: Add nompi stub #425
base: develop
Are you sure you want to change the base?
PCU: Add nompi stub #425
Conversation
@cwsmith, build is failing for this PR. I think we should add |
@flagdanger and I are working on a very large update of PCU that should make this all easier by getting rid of the singleton. Pull request for that coming within the next week. |
@bobpaw sorting out a few things to make the pull request ready to merge, but #388 the draft pull request for replacing the singleton in PCU. Although we replaced the interface used in the test cases, this pull request is designed to work with no modification to existing codes by using a global PCU state when the old style interface is used with |
@jacobmerson, we will plan to use your PR for the future of core, but the sponsor wants this functionality soon, so we will use this branch until yours is finished. |
@bobpaw Does |
No, at the moment it only checks if PCU has been initialized using a global state variable. We could check with |
@bobpaw OK. I'm wondering if it will make sense to make the MPI check in that call, and if it fails call MPI_Init. But... handling MPI_Finalize could be a bit more tricky; other libraries may still be running and using MPI beyond the lifetime of PCU. I'd prefer symmetry in the interface so your original suggestion may be best. |
Added pcu_pnompi.c. Added SCOREC_NO_MPI CMake option. Removed direct MPI calls in parma, phasta, and PUMI. Added incomplete reimplementations of those MPI calls.
Add stubs for MPI_Init and MPI_Finalize. Add pcu_pnompi_types.h for MPI typedefs. Add status message about SCOREC_NO_MPI flag. Signed-off-by: Aiden Woodruff <[email protected]>
Signed-off-by: Aiden Woodruff <[email protected]>
Replace MPI_Comm_free with PCU_Comm_Free_One. Signed-off-by: Aiden Woodruff <[email protected]>
Signed-off-by: Aiden Woodruff <[email protected]>
Signed-off-by: Aiden Woodruff <[email protected]>
This will avoid object collisions when linking to libraries that already use MPI. - i.e. the intended use case for SCOREC_NO_MPI. Signed-off-by: Aiden Woodruff <[email protected]>
Changed add_nompi_msg to record receiver. - Since there's no MPI, sender == receiver but I want to quiet the warnings. Signed-off-by: Aiden Woodruff <[email protected]>
Add return type for pcu_pmpi_free, pcu_pmpi_split. Signed-off-by: Aiden Woodruff <[email protected]>
Add linked list traversal to find a matching message. Removed error message that did not actually indicate errors. - If no message is found it's perfectly reasonable and in fact correct to return 0 and that's how pmpi knows to return false. Signed-off-by: Aiden Woodruff <[email protected]>
Run smoke tests without mpirun given SCOREC_NO_MPI. Add SCOREC_NO_MPI to the GitHub Actions test matrix. Signed-off-by: Aiden Woodruff <[email protected]>
Signed-off-by: Aiden Woodruff <[email protected]>
I use clock_gettime, which is one of the many methods mpich uses depending on the system. Signed-off-by: Aiden Woodruff <[email protected]>
Also update TriBITS package file. Signed-off-by: Aiden Woodruff <[email protected]>
- Add mpi_test_depends function to replace set_test_properties. This function checks if SCOREC_NO_MPI is defined and assumes nonexistent tests were multiproc ones. Signed-off-by: Aiden Woodruff <[email protected]>
Signed-off-by: Aiden Woodruff <[email protected]>
- parma/group/parma_group.cc(runInGroups): remove PCU/MPI free that will be carried out by pcu::PCU::~PCU. - test/xgc_split.cc: remove direct mpi.h inclusion - test/pumiLoadMesh.cc: remove direct mpi.h inclusion. - test/modelInfo.cc: remove direct #include <mpi.h>. - phasta/phstream.cc(::getTime): use new pcu::Time. - test/fieldReduce.cc: use correct PCUObj.Peers. - pumi/pumi_sys.cc(pumi_sync): use PCU Barrier instead of MPI Barrier. - pcu/PCU.h: make PCU_Comm (holdover) functions extern C and PCU_Wtime. - pumi/pumi_ghost.cc(do_off_part_bridge): fix typo (forgot parens). - pumi/pumi_mesh.cc: capitalize GetMPIComm. - pumi/pumi_numbering(pumi_numbering_print): use pcu::PCU::Barrier instead of WORLD barrier. - pcu/PCU_C.h: optionally include mpi or stub. - pcu/pcu_c.cc: add split stub and include pcu_pmpi.h. - pcu/pcu_pmpi.c: remove leftover conflict markers. - pcu/pcu_pnompi.h: update function signatures, make extern C, and remove obsolete stubs. - pcu/pcu_pnompi.c: remove globals and use pcu_mpi_t*. - clean up formatting. - remove unused or (void)ed argument names. Signed-off-by: Aiden Woodruff <[email protected]>
- pcu/PCU.cc(PCU::~PCU): clean up message order if it exists (valgrind is happy now). Signed-off-by: Aiden Woodruff <[email protected]>
- pcu/pcu_pmpi.c: remove pcu_pmpi_switch, pcu_pmpi_comm which were used by old nompi stubs. - (pcu_pmpi_allgather, pcu_pmpi_allreduce): break long lines and clean up indentation. Signed-off-by: Aiden Woodruff <[email protected]>
- pcu/PCU.h: remove undefined and unused PCU_GetGlobal. - pcu/PCU_C.h: remove undefined and unused PCU_Get_Global_Handle. Signed-off-by: Aiden Woodruff <[email protected]>
@cwsmith I have made the changes you and Jacob requested. Can you review when you get the chance? |
@bobpaw Here is a comment that discusses running the delta wing regression test I mentioned earlier: |
Do I need to worry about about timing? Or should I just compare the quality file? |
We should check timing and quality. |
@cwsmith, adapt completed in 5761.497560 seconds. How do I measure the resulting length_proc.txt and quality_proc.txt files? I see there's a histogram utility but it wants min, max, nbins. |
@bobpaw Hmmm. I'm guessing that was a serial run? If so, I only have prior results for a 32 processes (see #433 (comment)). Regarding quality, if the final entity counts are similar we should be OK. We could also render the adapted mesh to sanity check. |
No, this was a 32proc run on wilbur. |
number of tet 5934317 hex 0 prism 0 pyramid 0 |
But it was a debug build without optimization so let me rerun with release optimizations. |
For the record, wilbur and piglet have the same hardware. |
Wow, I'm surprised how much optimizations improved the runtime: MeshAdapt: mesh adapted in 308.636508 seconds |
That time looks good. Can you please paste/post your cmake config command and the output log? |
config/deltaWing.sh: #!/bin/sh
build_id=deltaWing
if [ $# -gt 0 ]; then
build_id=$1
fi
cmake -S . -B build/$build_id \
-GNinja \
-DCMAKE_INSTALL_PREFIX="$(realpath install)/$build_id" \
-DCMAKE_BUILD_TYPE="Release" \
-DCMAKE_C_COMPILER="mpicc" \
-DCMAKE_CXX_COMPILER="mpicxx" \
-DSCOREC_CXX_SYMBOLS=OFF \
-DSCOREC_CXX_OPTIMIZE=ON \
-DSCOREC_CXX_WARNINGS=ON \
-DBUILD_SHARED_LIBS=ON \
-DBUILD_EXES=ON \
-DIS_TESTING=ON \
-DENABLE_ZOLTAN=ON \
-DENABLE_SIMMETRIX=ON -DSKIP_SIMMETRIX_VERSION_CHECK=ON \
-DENABLE_FIELDSIM=OFF -DSIM_MPI=mpich4.1.1 -DSIM_ACIS=ON -DSIM_PARASOLID=ON \
-DPCU_COMPRESS=ON run_adapt.log
|
The timing and counts look good. Here are the final counts from my old run on piglet:
|
Ok, great! Anything else I need to do before PR is approved? |
All set from my end. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Thank you. I left a couple comments below.
Do you think I should rename pcu::PCU_Init to pcu::Init? same suggestion for finalize. The C api I would keep the same. I think it's still namespaced and sorter is nicer. |
Either way is fine with me. Another change request: we need to increase the pumi version number to 4.0.0 in CMakeLists.txt ( Line 9 in 17ad033
|
Is there anything we want to bundle with this major release? |
- pcu/PCU.h (pcu::PCU_Init): change return type to void. - remove @return clause. - (pcu::PCU_Finalize): change return type to void. - remove @return clause. - pcu/PCU.cc (pcu::PCU_Init): only call MPI_Init if not previously initialized. - (pcu::PCU_Finalize): only finalize if not previously finalized. Signed-off-by: Aiden Woodruff <[email protected]>
- pcu/PCU.h: rename PCU_Init to Init and PCU_Finalize to Finalize. - pcu/PCU.cc: rename functions. - pcu/PCU_C.h: add PCU_Init and PCU_Finalize to call pcu::Init and pcu::Finalize. - pcu/pcu_c.cc: add PCU_Init and PCU_Finalize definitions. - test/*.cc phasta/*.cc: replace PCU_Init with Init and PCU_Finalize with Finalize. Signed-off-by: Aiden Woodruff <[email protected]>
Not that I can think of. |
/runtests |
@bobpaw Once the pumi version is increased we should be good to go. Thanks for all the work on this. |
Signed-off-by: Aiden Woodruff <[email protected]>
Build Log |
@bobpaw The CI tests caught what I'm guessing is a missing header?
|
Add nompi stub
Add a stub implementation of MPI in
pcu/pcu_pnompi.c
.Remove direct calls to MPI functions.
This adds a
SCOREC_NO_MPI
build option for users running on single-process, single-machine setups that do not want to compile with MPI. Message passing is simulated in memory.