
Publish the Llama2 sparsified models #30

Open

egeor opened this issue Dec 5, 2023 · 4 comments

Comments
egeor commented Dec 5, 2023

Hi,

I was wondering if you plan to release the sparsified Llama2 models publicly. In particular, I am interested in Llama2-70B with 50% unstructured sparsity.

Thanks!

Eric-mingjie (Collaborator) commented

Llama2-70B is large, but running our code repo on the Llama2 model released on Hugging Face should complete within minutes. Is there a reason for requesting a pruned model from our side?
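
For reference, a pruning run is a single command along these lines (I'm writing the script name and flags from memory, so treat them as approximate and check the repo README for the exact interface):

```sh
python main.py \
    --model meta-llama/Llama-2-70b-hf \
    --prune_method wanda \
    --sparsity_ratio 0.5 \
    --sparsity_type unstructured \
    --save out/llama2_70b/unstructured/wanda/
```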

egeor commented Dec 5, 2023

The main reason is the resources required to actually prune the largest Llama2-70B model... does it take a modern GPU with large memory? Or a DGX box? Either way, such resources are scarce these days...

Eric-mingjie (Collaborator) commented Dec 5, 2023

Okay, I see. For LLaMA-2-70B, we used 5 or 6 A6000 GPUs (I don't recall the exact number) to load the model in fp16. If you only have one GPU with limited memory, there is a workaround: load the model on the CPU in fp16 and move each layer/block onto the GPU only while it is being pruned. I think this is what SparseGPT did.
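
Roughly, that offloading loop looks like this (a minimal sketch, not our actual code; `prune_block_` here is a stand-in toy magnitude pruner, whereas the real per-block step would also gather calibration activations):

```python
import torch
from transformers import AutoModelForCausalLM

def prune_block_(block: torch.nn.Module, sparsity: float = 0.5) -> None:
    """Toy stand-in for the real per-block pruning step: zero out the
    smallest-magnitude `sparsity` fraction of each linear layer's weights."""
    for module in block.modules():
        if isinstance(module, torch.nn.Linear):
            w = module.weight.data
            k = int(w.numel() * sparsity)
            if k > 0:
                # The k-th smallest absolute value is the pruning threshold.
                thresh = w.abs().flatten().kthvalue(k).values
                w[w.abs() <= thresh] = 0.0

# Load the whole model on the CPU in fp16, so GPU memory stays free.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-70b-hf",
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
)

for block in model.model.layers:
    block.to("cuda")          # move one transformer block onto the GPU
    prune_block_(block)       # prune it in place while it is on the GPU
    block.to("cpu")           # send the pruned block back to the CPU
    torch.cuda.empty_cache()  # free its GPU memory before the next block
```

Peak GPU memory is then a single block (plus pruning buffers) instead of the full 70B-parameter model.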

I can look into whether it is possible to release the pruned LLaMA-2-70B models; I am not sure whether there would be licensing issues. Stay tuned.

egeor commented Dec 5, 2023

Thanks a lot; please let me know if/when you are able to release the LLaMA-2-70B models.
