[TPU] Bug: Metadata path does not exist when using gs: #1652

Open
bjenik opened this issue Mar 4, 2025 · 0 comments
Labels
checkpoint type:bug Something isn't working


bjenik commented Mar 4, 2025

I am getting a number of errors from the folder-existence checks in _src/metadata/checkpoint.py when trying to create a gs: checkpoint on TPU (v6e, v2-alpha-tpuv6e), e.g.:

_src/metadata/checkpoint.py", line 45, in _sanitize_metadata_path
    raise FileNotFoundError(f'Path does not exist: {path}')

For some reason the error does not happen elsewhere (e.g. locally on a Mac).

Sample code:

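import jax.numpy as jnp
import numpy as np
import orbax.checkpoint as ocp

# Placeholder bucket and prefix; substitute a real GCS location.
path = "somebucket/somepath"
checkpointer = ocp.AsyncCheckpointer(ocp.PyTreeCheckpointHandler())

# Save a one-element tuple PyTree to GCS, overwriting any existing checkpoint.
checkpointer.save(f"gs://{path}", (jnp.ones(10),), force=True)

# Restore it as a NumPy array and print the result.
print(checkpointer.restore(f"gs://{path}", (ocp.RestoreArgs(restore_type=np.ndarray),)))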

Disabling those checks (patch attached) seems to resolve the issue, but that is obviously more of a band-aid than a fix.

orbax.patch
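
A less blunt alternative to removing the checks might be to retry them for a short time before failing. The sketch below only illustrates that idea and is not Orbax code: `wait_until_exists`, `exists_fn`, `retries`, and `delay_s` are hypothetical names, and it assumes (unverified in this report) that the path becomes visible shortly after the first check. The existence check is injected as a callable so the example runs without GCS access.

```python
import time
from typing import Callable


def wait_until_exists(
    path: str,
    exists_fn: Callable[[str], bool],
    retries: int = 5,
    delay_s: float = 0.0,
) -> bool:
    """Polls exists_fn(path) up to `retries` times; True as soon as it succeeds."""
    for attempt in range(retries):
        if exists_fn(path):
            return True
        # Sleep only between attempts, not after the final one.
        if attempt < retries - 1:
            time.sleep(delay_s)
    return False


# Hypothetical stand-in for a storage check: reports the path on the third call.
calls = {"n": 0}


def fake_exists(path: str) -> bool:
    calls["n"] += 1
    return calls["n"] >= 3


found = wait_until_exists("gs://somebucket/somepath", fake_exists)
print(found, calls["n"])
```

If the failure turns out not to be timing-related (for example, if the directory is never created on the TPU host), retrying would not help and the root cause would need to be found in the check itself.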

@rajasekharporeddy rajasekharporeddy added type:bug Something isn't working checkpoint labels Mar 5, 2025