The loss suddenly becomes NaN and the model score becomes 0 during the training process #87

ZZX10082 opened this issue Oct 25, 2024 · 1 comment

ZZX10082 commented Oct 25, 2024

Thank you very much for your work. While reproducing it, I first trained on the KITTI dataset for 25 epochs on a single 4090 GPU to make sure my training environment runs correctly. I then switched to the NuScenes dataset; performance dropped after 25 epochs of training, but it was still within an acceptable range. However, when I retrained, the a1 metric in WandB dropped to 0 at evaluation between the fourth and fifth epochs, and the same thing happened when I retrained on the KITTI dataset. Have you encountered anything like this, and could you give me some advice? @shariqfarooq123
[screenshot attached]
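
(For reference: when the loss suddenly turns NaN mid-training, a common first step is to guard the optimizer step so that a single bad batch cannot corrupt the weights, and to log when the non-finite value first appears. Below is a minimal sketch assuming a standard PyTorch loop; `model`, `criterion`, and the batch keys are placeholders, not the repository's actual trainer.)

```python
import torch

def training_step(model, batch, criterion, optimizer):
    # Illustrative guard, not the repository's actual training loop.
    optimizer.zero_grad()
    pred = model(batch["image"])            # "image"/"depth" keys are placeholders
    loss = criterion(pred, batch["depth"])

    # If the loss is already non-finite, skip the update so one bad batch
    # does not poison the weights, and log it for debugging.
    if not torch.isfinite(loss):
        print("non-finite loss, skipping batch")
        return None

    loss.backward()
    # Gradient clipping as an extra safeguard against exploding updates.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()
    return loss.item()
```

Running a few iterations with `torch.autograd.set_detect_anomaly(True)` can also point to the first operation that produces a NaN in the backward pass, at the cost of slower training.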

ZZX10082 (Author) commented

I followed the approach from the issue you replied to and modified the loss so that it cannot be 0, but it still doesn't work. Can you give me some advice? @shariqfarooq123
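
(For reference: in SILog-style depth losses, the usual NaN sources are log() of zero or negative depth, an empty valid-pixel mask, and the gradient of sqrt() at 0. Below is a minimal sketch of a loss guarded against all three; the class name, default values, and exact formula are assumptions here, not necessarily the repository's implementation.)

```python
import torch
import torch.nn as nn

class SafeSILogLoss(nn.Module):
    # Hypothetical SILog-style loss: sqrt(var(g) + beta * mean(g)^2),
    # with g = log(pred) - log(target), guarded against log(0),
    # empty masks, and the infinite gradient of sqrt at 0.
    def __init__(self, beta=0.15, eps=1e-6):
        super().__init__()
        self.beta = beta
        self.eps = eps

    def forward(self, pred, target):
        # Keep only pixels with strictly positive ground-truth depth.
        mask = target > self.eps
        if mask.sum() == 0:
            # No valid pixels: return a zero that still participates in autograd.
            return pred.sum() * 0.0

        pred = torch.clamp(pred[mask], min=self.eps)  # keep log() away from 0
        target = target[mask]

        g = torch.log(pred) - torch.log(target)
        d = torch.var(g, unbiased=False) + self.beta * torch.mean(g) ** 2
        # eps under the sqrt avoids an infinite gradient when d == 0.
        return torch.sqrt(d + self.eps)
```

If the a1 score only collapses at evaluation, it is also worth checking whether the weights themselves have already become NaN, e.g. `any(torch.isnan(p).any() for p in model.parameters())`.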
