Hi!
I've been using this repo on my own dataset and I've run into the problem of the loss suddenly hitting nan, even though it was converging nicely before (as in #198).
After printing some things in the TensorFlow graph I'm quite sure the error comes from weird values in the box width and height, but I haven't managed to pinpoint it.
To check it I thought I'd try running the program eagerly with tf.compat.v1.enable_eager_execution(), but that fails with the error: 'get_session' is not available when TensorFlow is executing eagerly.
Is it possible to run it eagerly in some way, or has anyone figured out the reason for the sudden nan loss?
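If eager execution is off the table, one graph-mode way to localize this kind of nan (just a sketch of the general technique, not code from this repo; the tensor name box_wh is only a stand-in for whatever the real graph uses) is to wrap the suspect tensors in tf.debugging.check_numerics, which makes the session raise an error naming the first op that produces nan/inf:

```python
import numpy as np
import tensorflow as tf

tf.compat.v1.disable_eager_execution()

# Stand-in for the predicted box width/height tensor in the real graph.
box_wh = tf.compat.v1.placeholder(tf.float32, shape=(None, 2), name="box_wh")

# check_numerics raises an InvalidArgumentError with this message as soon
# as the tensor contains nan or inf, so the offending node is identified
# without having to run eagerly.
checked = tf.debugging.check_numerics(box_wh, "box_wh contains nan/inf")
loss = tf.reduce_sum(tf.sqrt(checked))

with tf.compat.v1.Session() as sess:
    # Feeding a nan triggers the check and reports the message above.
    sess.run(loss, feed_dict={box_wh: np.array([[1.0, np.nan]], np.float32)})
```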
If someone else runs into this issue: I found that the nan loss comes from the gradient of tf.sqrt diverging close to zero (see this post). I worked around it by adding a small epsilon value (1e-7) inside dummy_loss in yolo.py.
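For illustration, here's a minimal standalone sketch of that failure mode and the epsilon workaround (not the repo's actual dummy_loss, just the pattern):

```python
import tensorflow as tf

EPSILON = 1e-7  # small constant keeping the argument of tf.sqrt away from zero

# d/dx sqrt(x) = 1 / (2 * sqrt(x)) is inf at x = 0; once that inf meets a
# zero elsewhere in the loss it becomes nan and poisons the backward pass.
x = tf.constant([0.0, 1e-8, 1.0])

with tf.GradientTape(persistent=True) as tape:
    tape.watch(x)
    unsafe = tf.sqrt(x)
    safe = tf.sqrt(x + EPSILON)

print(tape.gradient(unsafe, x))  # [inf, 5.0e+03, 5.0e-01] -> nan downstream
print(tape.gradient(safe, x))    # all values finite
```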