[BUG]: RuntimeError: Error while trying to find names to remove to save state dict #340

Open
hoore opened this issue Nov 8, 2024 · 9 comments
Labels
bug Something isn't working

Comments

hoore commented Nov 8, 2024

File "/Users/me/Documents/Amphion/models/tts/maskgct/gradio_demo.py", line 298, in load_models
safetensors.torch.load_model(codec_decoder, codec_decoder_ckpt)
File "/Users/me/mambaforge/envs/mg/lib/python3.10/site-packages/safetensors/torch.py", line 204, in load_model
to_removes = _remove_duplicate_names(model_state_dict, preferred_names=state_dict.keys())
File "/Users/me/mambaforge/envs/mg/lib/python3.10/site-packages/safetensors/torch.py", line 102, in _remove_duplicate_names
raise RuntimeError(
RuntimeError: Error while trying to find names to remove to save state dict, but found no suitable name to keep for saving amongst: {'model.head.istft.window'}. None is covering the entire storage.Refusing to save/load the model since you could be storing much more memory than needed. Please refer to https://huggingface.co/docs/safetensors/torch_shared_tensors for more information. Or open an issue.

safetensors.torch.load_model(codec_decoder, codec_decoder_ckpt)
If I comment out this line, there is no error. I have repeatedly confirmed that the downloaded model file is correct and complete. The same error occurs on both Windows and macOS.

@hoore hoore added the bug Something isn't working label Nov 8, 2024

hoore commented Nov 8, 2024

Could you check whether there is a problem with the model file? Thanks.

@hoore hoore changed the title [BUG]: [BUG]: RuntimeError: Error while trying to find names to remove to save state dict Nov 8, 2024

TKsavy commented Nov 8, 2024

@hoore, can you try downloading the model files from Hugging Face directly instead of using the hf_hub_download function? I think the file was not downloaded properly. Try this approach and see whether it can solve your issue.

@keepingitneil

I'm getting the same issue with a direct download. The encoder worked; only the decoder has this issue.


hoore commented Nov 9, 2024

@hoore, can you try downloading the model files from Hugging Face directly instead of using the hf_hub_download function? I think the file was not downloaded properly. Try this approach and see whether it can solve your issue.

Yes, I also tried downloading and overwriting the file manually, and the SHA-256 checksum is also correct.
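For anyone else verifying their download, the digest can be compared against the checksum shown on the model's Hugging Face file page (the path below is a placeholder, not the actual checkpoint location):

```shell
# Placeholder path: point this at the downloaded checkpoint and compare the
# printed digest with the SHA-256 listed on the Hugging Face file page.
sha256sum /path/to/model.safetensors
```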

@yuantuo666
Collaborator

Hi, MaskGCT is built in a Linux environment; for a smoother experience, it is recommended to reproduce on Linux.

Besides, according to the page linked in the error message, you could try:

from safetensors.torch import load_model

load_model(model, "model.safetensors")
# Instead of model.load_state_dict(load_file("model.safetensors"))

@GalenMarek14

@TKsavy and @yuantuo666, I can run the project on Windows 11, but I couldn't reproduce the demo-page examples. What could be the problem? For example, with the whisper voice example on the demo page: I downloaded the sample from there and generated the same text, but the output is always something between a whisper and a low voice, whereas the demo-page examples are successful clones. My generations are generally of lower quality regardless of the settings; I've tried up to 100 iterations.

I've also tried every version, including this one, the Windows fork, and Google Colab (to try it on a Linux environment), but all of them produce inferior results compared to your examples. Are the shared models from a previous training point, by any chance? Are you able to reproduce those results with the current shared models?

This was my issue for this matter with detailed logs and outputs: #334


hoore commented Nov 12, 2024

Hi, MaskGCT is built in a Linux environment; for a smoother experience, it is recommended to reproduce on Linux.

Besides, according to the page linked in the error message, you could try:

from safetensors.torch import load_model

load_model(model, "model.safetensors")
# Instead of model.load_state_dict(load_file("model.safetensors"))

I have the same problem on both Mac and Ubuntu, but my chip is ARM. Could it be related to that?


hoore commented Nov 12, 2024

I modified the model-loading code, and now it runs on a MacBook Pro M3.

Replace the model-loading calls with:

from accelerate import load_checkpoint_and_dispatch

load_checkpoint_and_dispatch(semantic_codec, semantic_code_ckpt, device_map={"": "cpu"})
load_checkpoint_and_dispatch(codec_encoder, codec_encoder_ckpt, device_map={"": "cpu"})
load_checkpoint_and_dispatch(codec_decoder, codec_decoder_ckpt, device_map={"": "cpu"})

load_checkpoint_and_dispatch(t2s_model, t2s_model_ckpt, device_map={"": "cpu"})
load_checkpoint_and_dispatch(s2a_model_1layer, s2a_1layer_ckpt, device_map={"": "cpu"})
load_checkpoint_and_dispatch(s2a_model_full, s2a_full_ckpt, device_map={"": "cpu"})

@iqrairfan100

@hoore Thanks, modifying the load_models function in gradio_demo.py with that code works on my M2 MacBook Air.


6 participants