Why is multiprocessing forced to spawn processes rather than forking #779

Open
mbsabath opened this issue Jan 9, 2025 · 2 comments
Labels
type/question An issue that's a question

Comments

@mbsabath
Contributor

mbsabath commented Jan 9, 2025

❓ The question

Hi all, I'm working on a change to OLMo to support additional PyTorch Dataset classes in our fork of OLMo, and I'm getting some OOM errors due to the use of process spawning rather than forking. I'm considering making the process start method configurable, but wanted to understand more about the reasons for forcing all multiprocessing to use spawn before I went ahead with the change.

mbsabath added the type/question label on Jan 9, 2025
@aman-17
Member

aman-17 commented Jan 28, 2025

The OOM errors you’re encountering might stem from the increased memory usage associated with spawn, due to resources being duplicated when each process is initialized (not 100% sure). We use a memmap implementation to minimize memory usage. You could try reducing the number of workers or the batch size to ease the memory pressure.
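To illustrate the memory-mapping idea mentioned above, here is a minimal sketch using only Python's standard `mmap` module. It is not OLMo's dataset code: the file layout (little-endian unsigned 16-bit token IDs) and the helper names `write_tokens` and `read_tokens` are assumptions made for illustration. The point is that a reader maps the file and copies out only the slice it asks for, rather than loading the whole file into each worker.

```python
import mmap
import struct

# Assumed layout for this sketch: a flat file of little-endian uint16 token IDs.
ITEM_FORMAT = "<H"
ITEM_SIZE = struct.calcsize(ITEM_FORMAT)


def write_tokens(path, tokens):
    """Write token IDs to `path` as a flat binary array (hypothetical helper)."""
    with open(path, "wb") as f:
        f.write(struct.pack(f"<{len(tokens)}H", *tokens))


def read_tokens(path, start, count):
    """Return `count` token IDs beginning at index `start`.

    The file is memory-mapped read-only, so only the requested byte range is
    copied into this process; the rest stays in the OS page cache.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            chunk = mm[start * ITEM_SIZE:(start + count) * ITEM_SIZE]
    return list(struct.unpack(f"<{count}H", chunk))
```

Whether this pattern helps with a particular OOM depends on what else each worker holds in memory, so treat it as a starting point for investigation rather than a fix.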

@dirkgr
Member

dirkgr commented Jan 31, 2025

The main reason is that torch isn't safe when you fork the process. It should be, but it is not.
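For anyone who still wants the start method to be configurable in a fork, here is a rough sketch of one way to do it with the standard `multiprocessing` module while keeping spawn as the default. The helper names `resolve_start_method` and `run_squares` are hypothetical and not part of OLMo. The sketch uses `multiprocessing.get_context`, which scopes the choice to one pool instead of changing the process-wide default. PyTorch's `DataLoader` also accepts a `multiprocessing_context` argument, which may be a more natural hook for dataset workers.

```python
import multiprocessing as mp


def resolve_start_method(requested="spawn"):
    """Validate a configured start method (hypothetical helper, not OLMo code).

    Defaults to "spawn", the method the maintainers consider safe with torch.
    """
    available = mp.get_all_start_methods()
    if requested not in available:
        raise ValueError(
            f"start method {requested!r} is not available; choose from {available}"
        )
    return requested


def square(x):
    # Must be a module-level function so spawned workers can import it.
    return x * x


def run_squares(values, start_method="spawn"):
    """Map `square` over `values` in a pool that uses the chosen start method."""
    # get_context() returns a context bound to one start method, leaving the
    # global default untouched for the rest of the program.
    ctx = mp.get_context(resolve_start_method(start_method))
    with ctx.Pool(processes=2) as pool:
        return pool.map(square, values)
```

With spawn, `run_squares` has to be called from under an `if __name__ == "__main__":` guard, because each worker re-imports the main module. Passing `"fork"` on platforms that offer it is at your own risk, given the safety concern described above.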
