Memory and time requirements for Mistral-7B #68
Update --- The error comes from initializing torch.zeros(); the stack trace is below.
Upon debugging further, here are the values for
So essentially Mistral's sequence length is so large that the tensor cannot be allocated. Are there any suggestions for overcoming this error? P.S.: I think this issue is similar to #51, i.e. support for Mistral models.
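To make the scale of the problem concrete, here is a back-of-the-envelope sketch (my own illustration, not code from the repo): if the calibration buffer passed to torch.zeros() is shaped (nsamples, seqlen, hidden_size), and seqlen is taken from Mistral's config.max_position_embeddings = 32768 rather than the 2048 typical for LLaMA-style calibration, the allocation grows 16x.

```python
# Hypothetical illustration of why the torch.zeros() call fails for Mistral:
# the calibration buffer is assumed to be (nsamples, seqlen, hidden_size)
# in float16 (2 bytes per element). Mistral-7B reports a 32k context window.
nsamples, hidden_size = 128, 4096

for seqlen in (2048, 32768):
    n_elements = nsamples * seqlen * hidden_size
    gib = n_elements * 2 / 1024**3  # float16 = 2 bytes per element
    print(f"seqlen={seqlen}: {gib:.1f} GiB")
# seqlen=2048: 2.0 GiB
# seqlen=32768: 32.0 GiB
```

A single 32 GiB zero tensor (before any other activations) would plausibly exceed available memory, which is consistent with the error arising at the torch.zeros() call.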
Hi,
I am trying to prune Mistral 7B (https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2). While I was able to successfully run the commands for magnitude pruning, I am facing issues with SparseGPT and Wanda.
Commands used:
python main.py --model 'mistralai/Mistral-7B-Instruct-v0.2' --prune_method sparsegpt --sparsity_ratio 0.1 --sparsity_type unstructured --save out/mistral_7b/unstructured/sparsegpt/0.1/ --save_model out/mistral_7b/unstructured/sparsegpt/0.1/
python main.py --model 'mistralai/Mistral-7B-Instruct-v0.2' --prune_method wanda --sparsity_ratio 0.1 --sparsity_type unstructured --save out/mistral_7b/unstructured/wanda/0.1/ --save_model out/mistral_7b/unstructured/wanda/0.1/
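One possible workaround, sketched under the assumption that the script reads the calibration sequence length from config.max_position_embeddings after loading the model: cap it at a LLaMA-scale value instead of using Mistral's full 32k context. The function name below is my own, not from the repo.

```python
# Sketch of a workaround (assumption: the pruning script sets
# model.seqlen = model.config.max_position_embeddings after loading).
# Capping the value keeps the calibration buffers allocatable.

def capped_seqlen(max_position_embeddings: int, cap: int = 2048) -> int:
    """Use at most `cap` tokens per calibration sample instead of the
    model's full context window."""
    return min(max_position_embeddings, cap)

# Mistral-7B-Instruct-v0.2 reports a 32768-token context window.
print(capped_seqlen(32768))  # → 2048
print(capped_seqlen(2048))   # → 2048
```

Calibration with 2048-token samples is the common setting for LLaMA-family models, so this should not hurt pruning quality, though I have not verified that against this codebase.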
Any help here would be greatly appreciated :) Tagging the authors: @liuzhuang13, @Eric-mingjie and @eltociear