visdial-rl/visdial/models/decoders/gen.py, line 243 (commit 1fb7e88)
Following the paper, the line referenced above should be replaced by:

loss += -1 * log_prob * (reward.detach() * (self.mask[:, t].float()))
Without a .detach() on the reward, this term provides a second source of gradients to the feature regression module, in addition to the feature loss. The only difference is that these extra gradients are scaled by the log-probabilities, which has no clear interpretation.
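
For reference, here is a minimal sketch of what the corrected policy-gradient update would look like. Function and argument names (reinforce_loss, log_probs, reward, mask) are hypothetical and chosen for illustration; they are not the exact variables in gen.py.

```python
import torch

def reinforce_loss(log_probs, reward, mask):
    """REINFORCE loss over a decoded sequence (illustrative sketch).

    log_probs: (batch, T) log-probabilities of the sampled tokens
    reward:    (batch,)  reward from the feature regression module
    mask:      (batch, T) 1 for valid timesteps, 0 for padding
    """
    loss = 0.0
    T = log_probs.size(1)
    for t in range(T):
        # reward.detach() treats the reward as a constant weight on the
        # log-probs, so no gradients flow back into the feature
        # regression module through this term; that module is trained
        # only by its own feature loss.
        loss += -1 * log_probs[:, t] * (reward.detach() * mask[:, t].float())
    return loss.mean()
```

Detaching the reward matches standard REINFORCE, where the return is not differentiated through: the policy gradient scales the log-prob gradients by the reward value, while the reward-producing network receives gradients only from its own objective.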