Problem exchanging margo addresses in SLURM PMI2 #154

Open
adammoody opened this issue Aug 13, 2021 · 1 comment

Comments

@adammoody

I was curious whether you have seen this.

LLNL/UnifyFS#626 (comment)

I think SLURM PMI2 may be internally using the ; character to split a string into a list of key/value pairs. Margo addresses contain the ; character, so a put like:

PMI2_KVS_Put("unifyfs.margo-svr", "ofi+tcp;ofi_rxm://123.123.123.123:55555")

to exchange margo addresses via SLURM PMI2 produces an error message like:

slurmstepd: error: mpi/pmi2: no value for key ;ofi_rxm://123.123.123.123:55555; in req

I worked around this with a hack to replace ; with ! before submitting the key/values to PMI2, and then converting back after pulling the value back out of PMI2.
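For anyone who wants to try the same workaround, here is a minimal sketch of the character swap in C. The helper names (`swap_char`, `pmi2_encode_addr`, `pmi2_decode_addr`) are hypothetical and are not part of margo, Mercury, or PMI2. It assumes the address never legitimately contains `!`, which would otherwise be turned into `;` on decode.

```c
#include <stddef.h>

/* Hypothetical helper: replace every `from` with `to`, in place. */
void swap_char(char *s, char from, char to)
{
    if (s == NULL) {
        return;
    }
    for (; *s != '\0'; s++) {
        if (*s == from) {
            *s = to;
        }
    }
}

/* Call on a writable copy of the address before handing it to PMI2_KVS_Put.
 * Assumption: '!' does not otherwise appear in the address string. */
void pmi2_encode_addr(char *addr)
{
    swap_char(addr, ';', '!');
}

/* Call on the value read back from PMI2 before passing it to margo. */
void pmi2_decode_addr(char *addr)
{
    swap_char(addr, '!', ';');
}
```

The swap is done in place, so the caller needs a writable buffer (not a string literal). Encoding and then decoding returns the original address as long as the assumption about `!` holds.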

If that's what is going on, it's more a SLURM issue, but I just bring it up since other margo users might hit it.

@mdorier
Contributor

mdorier commented Aug 13, 2021

You should be able to replace the whole ofi+tcp;ofi_rxm part with just tcp and Mercury is going to figure out the rest.
