null cache #768

Open
weidezhang opened this issue Aug 16, 2022 · 4 comments

weidezhang commented Aug 16, 2022

Hi,
I wonder whether I can use Alluxio to replace the null cache as the Petastorm cache. Do you see a performance gain from using Alluxio?

weidezhang commented Aug 16, 2022

Also, I wonder whether the cache is shared across all trainer processes and across all epochs. Does the cache need to be reset after each training epoch?

selitvin (Collaborator) commented

I am not familiar with Alluxio. Implementing a new cache class and plugging it in should be pretty easy. The cache is not automatically shared across processes. However, if you use shareable storage (e.g. a local disk for processes running on the same machine), the cache will be shared, as long as the storage/cache-class implementation properly handles race conditions.
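To make the race-condition point concrete, here is a rough sketch of a cache backed by a shared directory. It assumes the cache interface is a single `get(key, fill_cache_func)` method, as in `petastorm.cache.CacheBase` (worth checking against the Petastorm version in use). The class name `SharedDirCache`, the pickle serialization, and the hashed file layout are all hypothetical choices made for illustration; nothing here is Alluxio-specific, and wiring the class into a reader is not shown.

```python
import hashlib
import os
import pickle
import tempfile


class SharedDirCache:
    """Hypothetical cache exposing get(key, fill_cache_func).

    Values are stored as pickle files in a directory that several
    processes may share. This is an illustrative sketch, not part of
    Petastorm.
    """

    def __init__(self, cache_dir):
        self._dir = cache_dir
        os.makedirs(cache_dir, exist_ok=True)

    def _path(self, key):
        # Hash the key so that any string maps to a safe file name.
        digest = hashlib.sha256(str(key).encode("utf-8")).hexdigest()
        return os.path.join(self._dir, digest + ".pkl")

    def get(self, key, fill_cache_func):
        path = self._path(key)
        try:
            with open(path, "rb") as f:
                return pickle.load(f)  # cache hit
        except FileNotFoundError:
            pass  # cache miss: compute the value below

        value = fill_cache_func()

        # Write to a temporary file in the same directory, then rename it
        # into place. Readers see either no file or a complete file,
        # never a partially written one.
        fd, tmp_path = tempfile.mkstemp(dir=self._dir, suffix=".tmp")
        try:
            with os.fdopen(fd, "wb") as f:
                pickle.dump(value, f)
            os.replace(tmp_path, path)
        except BaseException:
            if os.path.exists(tmp_path):
                os.remove(tmp_path)
            raise
        return value
```

Two processes that miss on the same key may both call `fill_cache_func` and both write the entry; the last rename wins, which is harmless as long as the function is deterministic. The atomic rename relies on the temporary file and the target living on the same filesystem, and whether a given shared storage layer preserves that guarantee needs to be verified separately.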

selitvin (Collaborator) commented

If you'd like to take a shot at implementing an Alluxio-based cache, I would be happy to help with a review.

weidezhang (Author) commented

Thanks for the info, it's helpful. We will do some investigation and contribute back to the community if possible.
