[BUG]: RuntimeError: Error while trying to find names to remove to save state dict #340

Open
hoore opened this issue Nov 8, 2024 · 9 comments
Labels
bug Something isn't working

Comments

hoore commented Nov 8, 2024

File "/Users/me/Documents/Amphion/models/tts/maskgct/gradio_demo.py", line 298, in load_models
safetensors.torch.load_model(codec_decoder, codec_decoder_ckpt)
File "/Users/me/mambaforge/envs/mg/lib/python3.10/site-packages/safetensors/torch.py", line 204, in load_model
to_removes = _remove_duplicate_names(model_state_dict, preferred_names=state_dict.keys())
File "/Users/me/mambaforge/envs/mg/lib/python3.10/site-packages/safetensors/torch.py", line 102, in _remove_duplicate_names
raise RuntimeError(
RuntimeError: Error while trying to find names to remove to save state dict, but found no suitable name to keep for saving amongst: {'model.head.istft.window'}. None is covering the entire storage.Refusing to save/load the model since you could be storing much more memory than needed. Please refer to https://huggingface.co/docs/safetensors/torch_shared_tensors for more information. Or open an issue.

safetensors.torch.load_model(codec_decoder, codec_decoder_ckpt)
If I comment out this line, there is no error. I have repeatedly confirmed that the downloaded model file is correct and complete. The same error occurs on both Windows and macOS.

@hoore hoore added the bug Something isn't working label Nov 8, 2024

hoore commented Nov 8, 2024

Could you check whether there is a problem with the model file? Thanks.

@hoore hoore changed the title [BUG]: [BUG]: RuntimeError: Error while trying to find names to remove to save state dict Nov 8, 2024

TKsavy commented Nov 8, 2024

@hoore, can you try downloading the model files from Hugging Face directly instead of using the hf_hub_download function? I think the file was not downloaded properly. Try this approach and see whether it can solve your issue.

@keepingitneil

I'm getting the same issue with a direct download. The encoder worked; only the decoder has this issue.


hoore commented Nov 9, 2024

@hoore, can you try downloading the model files from Hugging Face directly instead of using the hf_hub_download function? I think the file was not downloaded properly. Try this approach and see whether it can solve your issue.

Yes, I also tried downloading and overwriting the file manually, and the SHA-256 checksum is also correct.
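For anyone else verifying their download, the digest can be compared against the checksum shown on the model's Hugging Face file page (the path below is a placeholder, not the actual checkpoint location):

```shell
# Placeholder path: point this at the downloaded checkpoint and compare the
# printed digest with the SHA-256 listed on the Hugging Face file page.
sha256sum /path/to/model.safetensors
```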

@yuantuo666
Collaborator

Hi, MaskGCT is built in a Linux environment; for a smoother experience, it is recommended to reproduce on Linux.

Besides, according to the page linked in the error message, you could try:

from safetensors.torch import load_model

load_model(model, "model.safetensors")
# Instead of model.load_state_dict(load_file("model.safetensors"))

@GalenMarek14

@TKsavy and @yuantuo666, I can run the project on Windows 11, but I couldn't reproduce the demo-page examples. What could be the problem? For example, with the whisper voice example on the demo page: I downloaded the sample from there and generated the same text, but the output is always something between a whisper and a low voice, whereas the demo-page examples are successful clones. My generations are generally of lower quality regardless of the settings; I've tried up to 100 iterations.

I've also tried every version, including this one, the Windows fork, and Google Colab (to try it on a Linux environment), but all of them produce inferior results compared to your examples. Are the shared models from a previous training point, by any chance? Are you able to reproduce those results with the current shared models?

This was my issue for this matter with detailed logs and outputs: #334


hoore commented Nov 12, 2024

Hi, MaskGCT is built in a Linux environment; for a smoother experience, it is recommended to reproduce on Linux.

Besides, according to the page linked in the error message, you could try:

from safetensors.torch import load_model

load_model(model, "model.safetensors")
# Instead of model.load_state_dict(load_file("model.safetensors"))

I have the same problem on both Mac and Ubuntu, but my chip is ARM. Could it be related to that?


hoore commented Nov 12, 2024

I modified the model-loading code, and now it runs on a MacBook Pro M3.

Replace the model-loading calls with:

from accelerate import load_checkpoint_and_dispatch

load_checkpoint_and_dispatch(semantic_codec, semantic_code_ckpt, device_map={"": "cpu"})
load_checkpoint_and_dispatch(codec_encoder, codec_encoder_ckpt, device_map={"": "cpu"})
load_checkpoint_and_dispatch(codec_decoder, codec_decoder_ckpt, device_map={"": "cpu"})

load_checkpoint_and_dispatch(t2s_model, t2s_model_ckpt, device_map={"": "cpu"})
load_checkpoint_and_dispatch(s2a_model_1layer, s2a_1layer_ckpt, device_map={"": "cpu"})
load_checkpoint_and_dispatch(s2a_model_full, s2a_full_ckpt, device_map={"": "cpu"})

@iqrairfan100

@hoore Thanks, modifying the load_models function in gradio_demo.py with that code works on my M2 MacBook Air.


6 participants