Provide `KokkosComm::initialize`/`finalize` #88

dssgabriel · 2024-06-14T09:09:23Z

This PR implements a very basic initialization and finalization for KokkosComm (closes #82).

Users should now call only KokkosComm::initialize and KokkosComm::finalize; manual MPI and Kokkos initialization is no longer needed. These functions ensure that MPI is correctly initialized (with multiple-thread support) before Kokkos is initialized, and that both are finalized correctly.

To-do before merging:

- add documentation on these functions
- make it clear to users that they should only call KokkosComm when initializing/finalizing a Kokkos + MPI application

cedricchevalier19

Some thoughts below. Please also be precise in the documentation.

src/KokkosComm.hpp

aprokop

I think it would be nice to have KokkosComm::ScopeGuard that is similar to Kokkos::ScopeGuard in addition to finalize and initialize. That way, a user won't have to remember to call finalize.
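A minimal sketch of the RAII idea suggested here, not KokkosComm's actual API. The `ScopeGuard` class and the `initialize`/`finalize` stubs in the hypothetical `guard_sketch` namespace are illustrative assumptions: the stubs only record calls, whereas a real guard would call the library's initialization in its constructor and its finalization in its destructor.

```cpp
#include <string>
#include <vector>

namespace guard_sketch {

// Hypothetical stand-ins for a library's initialize/finalize pair.
// They only record the call order so the behaviour can be observed.
inline std::vector<std::string>& call_log() {
  static std::vector<std::string> log;
  return log;
}
inline void initialize() { call_log().push_back("initialize"); }
inline void finalize() { call_log().push_back("finalize"); }

// RAII guard: initializes on construction, finalizes on destruction.
class ScopeGuard {
 public:
  ScopeGuard() { initialize(); }
  ~ScopeGuard() { finalize(); }

  // Non-copyable and non-movable, so finalize() runs exactly once.
  ScopeGuard(const ScopeGuard&)            = delete;
  ScopeGuard& operator=(const ScopeGuard&) = delete;
  ScopeGuard(ScopeGuard&&)                 = delete;
  ScopeGuard& operator=(ScopeGuard&&)      = delete;
};

// Example use: the guard finalizes automatically when the scope ends,
// including on an early return or an exception.
inline void run_application() {
  ScopeGuard guard;
  call_log().push_back("work");
}

}  // namespace guard_sketch
```

With a guard like this, the user cannot forget the finalization call, because it is tied to the lifetime of a local object.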

dssgabriel · 2024-06-14T11:14:34Z

> I think it would be nice to have KokkosComm::ScopeGuard that is similar to Kokkos::ScopeGuard

Should this and other utilities (e.g. is_initialized) be added here or in another PR?

aprokop · 2024-06-14T11:53:25Z

> Should this and other utilities (e.g. is_initialized) be added here or in another PR?

Easier in a different PR to keep the PR size small.

dssgabriel · 2024-06-14T13:49:20Z

Last thing we need to think about is: do we abort if we failed to get the desired thread support level?

devreal · 2024-06-14T13:57:11Z

src/KokkosComm.hpp

+  MPI_Initialized(&flag);
+  // Eagerly abort if MPI has already been initialized
+  if (0 != flag) {
+    MPI_Abort(MPI_COMM_WORLD, -1);


Why not allow MPI to be initialized already as long as the requested thread-level matches?

aprokop · 2024-06-14T15:09:39Z

src/KokkosComm.hpp

+  // Forbid calling this function if Kokkos has already been initialized
+  if (Kokkos::is_initialized()) {
+    std::abort();
+  }


You don't need this, Kokkos::initialize() will do that for you.

I am not sure I understand. We want to abort if Kokkos::initialize has already been called. Why try to initialize MPI and Kokkos if we can eagerly prove we got illegally called?

Because Kokkos will do that for you, I believe. I'm not sure MPI will.

I find it weird that a subproject subsumes the initialization of the main project. I should be able to initialize main Kokkos and KokkosComm from my application. The KokkosComm initializer should initialize main Kokkos if it's not already initialized and otherwise handle that gracefully. Same with MPI above.

I don't understand what you mean by "handle that gracefully". Kokkos initialization will abort if it has already been initialized or finalized:
https://github.com/kokkos/kokkos/blob/892e13c8c49fa9d4ef9f8dcb69f90d525a6baa58/core/src/impl/Kokkos_Core.cpp#L1056-L1061
The code inside KokkosComm::initialize() that checks Kokkos initialization/finalization does not add anything, unless you decide to skip initializing Kokkos, which I would argue is the wrong thing to do.

> Kokkos::initialize does some special checks for MPI

Are you sure about this? I only see detection of the local MPI rank based on environment variables, but not calls to any MPI routines.

Kokkos docs mention one should call MPI_Init before Kokkos::initialize: https://kokkos.org/kokkos-core-wiki/ProgrammingGuide/Initialization.html?highlight=initialize#initialization

> We do not prevent users from doing things manually.

You are telling people not to initialize your project if they initialize its dependencies manually. That is flawed.

> However, the herein proposed API provides a straightforward and foolproof way of initializing/finalizing their KokkosComm application.

I am not arguing to take that away. If users want to shift all initialization to KokkosComm they can do so.

> Moreover, I don’t understand why we would let people write fundamentally wrong code and silence it.

Why is that fundamentally wrong? You can still bail if they did it wrong but there is no reason to always bail even if the configuration they chose is correct (e.g., MPI was initialized with the same thread-level.)

> Finally, I'd like to point out that this is only a first step towards KokkosComm initialization/finalization. Once #68 gets merged, we will have an isolated environment (thanks to MPI Sessions) that lets us lift most of, if not all, the constraints required for now.

Then let's not tell them one thing now and another then. We can get close to the semantics we will have then and users will not have to change their code. Keep in mind that what you are proposing now forces users to change their code such that later when KokkosComm uses Sessions the WPM will be broken and no communication outside of KokkosComm will work anymore. Or KokkosComm will be broken because users stuck to manually initializing MPI to get the WPM. We can do better than that.

> This PR does not have the state, only free functions. Introducing the state has its own challenges, I think.

The approach I would use is an inline accessor to avoid explicit global variables that need to go into a compilation unit. In some header file:

inline bool& get_state_ref() { static bool state = false; return state; }
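A slightly expanded sketch of the accessor idiom above. The function-local static is initialized once, on first use, and because the function is `inline`, every translation unit that includes the header shares the same object. The `state_sketch` namespace and the `mark_initialized`/`is_initialized` helpers are hypothetical names for illustration and are not part of this PR.

```cpp
namespace state_sketch {

// Header-only state: the function-local static is created on the first call
// and shared by every translation unit that includes this header.
inline bool& get_state_ref() {
  static bool state = false;
  return state;
}

// Hypothetical helpers built on top of the accessor.
inline void mark_initialized() { get_state_ref() = true; }
inline bool is_initialized() { return get_state_ref(); }

}  // namespace state_sketch
```

This avoids defining a global variable in a dedicated compilation unit, which is the point made in the comment.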

This PR aimed at unifying two calls — MPI_{Init,Init_thread} and Kokkos::initialize — into a single one.

> You are telling people not to initialize your project if they initialize its dependencies manually. That is flawed.

It means that instead of having two calls, users may now need a third one.

> Why is that fundamentally wrong? You can still bail if they did it wrong but there is no reason to always bail even if the configuration they chose is correct (e.g., MPI was initialized with the same thread-level.)

OK, that's fair.
Once MPI has been initialized, is it possible to retrieve the current thread support level? I can't find anything in Open MPI's docs, but this is needed to check that the user's requirements are met in KokkosComm::initialize and to error out if needed, as you've suggested.

> Keep in mind that what you are proposing now forces users to change their code such that later when KokkosComm uses Sessions the WPM will be broken and no communication outside of KokkosComm will work anymore.

Once #68 gets merged, existing code will most likely have to change anyway. E.g., all of our tests use MPI_COMM_WORLD, which won't be valid in the SPM (the Sessions Process Model, as opposed to the World Process Model, WPM). Besides, we will want to provide some object (KokkosComm::Context or Handle) that wraps the MPI_Comm, Kokkos::ExecutionSpace, and maybe the associated stream if comms happen on a device. Our API will then use this object instead of a raw MPI_Comm.
All of our current interfaces are still subject to change. IMO it's ok if we tweak the semantics for now, although I agree it's best to get this PR closer to what we're aiming for in the future. 👍

> This PR does not have the state, only free functions. Introducing the state has its own challenges, I think.

> The approach I would use is an inline accessor to avoid explicit global variables that need to go into a compilation unit.

I don't think we need to carry any state for the moment; we're only calling MPI_Init_thread and Kokkos::initialize. Adding KokkosComm::is_{initialized,finalized} should be enough for now, I think.

> Once MPI has been initialized, is it possible to retrieve the current thread support level?

The function you're looking for is MPI_Query_thread (MPI_QUERY_THREAD in the MPI standard's language-independent notation) :)
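A hedged sketch of the check discussed in this thread, written without MPI so that it stays self-contained. In a real implementation, `already_initialized` would come from `MPI_Initialized` and `provided` from `MPI_Query_thread`. The `thread_sketch` namespace, the `ThreadLevel` and `InitAction` enums, and `decide_init_action` are hypothetical names, not KokkosComm's API; the enum order mirrors the ordering that the MPI standard guarantees for the `MPI_THREAD_*` constants.

```cpp
namespace thread_sketch {

// Mirrors MPI_THREAD_SINGLE < MPI_THREAD_FUNNELED < MPI_THREAD_SERIALIZED
// < MPI_THREAD_MULTIPLE; the MPI standard guarantees this ordering.
enum class ThreadLevel { Single = 0, Funneled = 1, Serialized = 2, Multiple = 3 };

enum class InitAction {
  InitializeMpi,  // MPI not yet initialized: the library would call MPI_Init_thread
  ReuseMpi,       // MPI already initialized with a sufficient thread level
  Fail            // MPI already initialized, but the thread level is too low
};

// already_initialized: what MPI_Initialized would report.
// provided: what MPI_Query_thread would report (ignored if not initialized).
// required: the thread level the caller asked for.
inline InitAction decide_init_action(bool already_initialized,
                                     ThreadLevel provided,
                                     ThreadLevel required) {
  if (!already_initialized) {
    return InitAction::InitializeMpi;
  }
  return (provided >= required) ? InitAction::ReuseMpi : InitAction::Fail;
}

}  // namespace thread_sketch
```

How the `Fail` case is reported (an abort, an error code, or an exception) is left open here, as it is in the discussion.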

> It means that instead of having two calls, users may now need a third one.

I don't see the problem, especially if parts of my application rely on the WPM anyway and I have to ensure that the WPM is initialized. I cannot rely on KokkosComm to do that if I know that at some point the behavior will change.

Anyway, maybe this PR isn't ready yet and needs to be tied to #68. Let's discuss this during the Monday call, I will try to join (the MPI Forum potentially conflicts with the call).

aprokop · 2024-06-14T15:10:06Z

src/KokkosComm.hpp

+  // Forbid calling this function if Kokkos has already been finalized or isn't yet initialized
+  if (Kokkos::is_finalized() || !Kokkos::is_initialized()) {
+    MPI_Abort(MPI_COMM_WORLD, -1);
+  }


Don't need it, Kokkos::finalize() will do that for you.

Same but for finalization. Also, this lets us cleanly abort on all ranks.

cedricchevalier19 · 2024-06-24T14:54:40Z

It was decided not to continue with this PR and instead to provide stubs that help users initialize correctly.

dssgabriel added the enhancement label Jun 14, 2024

dssgabriel requested review from masterleinad, devreal and cedricchevalier19 June 14, 2024 09:09

dssgabriel self-assigned this Jun 14, 2024

cedricchevalier19 reviewed Jun 14, 2024

aprokop reviewed Jun 14, 2024

dssgabriel force-pushed the simple-init branch from 40f7aae to 64c4fd4 June 14, 2024 12:01

aprokop reviewed Jun 14, 2024

dssgabriel added 8 commits June 14, 2024 15:23

feat(init/fini): add basic initialization and finalization

6167b91

KokkosComm should now be initialized (and finalized) with these new functions. Users must now _not_ initialize/finalize MPI or Kokkos themselves.

test(init/fini): update tests with new initialization/finalization APIs

e055c01

docs(init/fini): add documentation + tiny fixes

d098a72

fix(init): remove checks for printing stuff

0baa4ca

feat(init): let users provide their thread level + simplify signature

5c5ed7e

Also keep a version without required MPI thread level support for simpler usage

fix(init): fix tests to match new function prototype

8987ce5

fix: only filter all --kokkos- flags + format

a68902e

docs(init): update with new prototypes

4868c87

dssgabriel force-pushed the simple-init branch from 2cf2cbc to 4868c87 June 14, 2024 13:25

cedricchevalier19 reviewed Jun 14, 2024

devreal reviewed Jun 14, 2024

aprokop reviewed Jun 14, 2024

dssgabriel added 5 commits June 14, 2024 17:03

feat: add wrapper for MPI_THREAD_* support levels

f3c838c

fix: revert filter on --kokkos-* to only filter --kokkos-help flags

cfdd01e

fix: prevent overlap w/ MPI_{Init,Finalize} & `Kokkos::{initialize,finalize}`

ee700c6

docs: clarify valid usage and document ThreadSupportLevel wrapper

cfefa4e

fix: suggestion from Andrey Prokopenko (@aprokop)

d3b00de

aprokop reviewed Jun 14, 2024

dssgabriel requested a review from cedricchevalier19 June 14, 2024 18:45

dssgabriel marked this pull request as draft June 17, 2024 16:39

cedricchevalier19 closed this Jun 24, 2024
