Implementation of hinge loss does not match the paper #7
Thanks for your comment. This is indeed a (small) difference and it should
not affect performance in any significant way. I expect that one could show
that both formulations are equivalent for a system where the probability of
visiting any state in the MDP is the same (in expectation), which should be
the case for the randomly initialized block pushing tasks and the physics
simulation. For the Atari games, however, this assumption is violated, and
there could be (minor) differences in performance between the two
implementations, but I would expect them to be smaller than the variance
you get from the parameter initialization. If you happen to test this
experimentally, please share your results here :)
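To make the comparison concrete, here is a minimal sketch of the two negative-term variants (my own illustration with made-up names, omitting the repo's energy scaling and per-object averaging):

import torch

# Sketch of the two negative-sample hinge terms under discussion.
# z_t and z_next are latent states at timesteps t and t+1, z_neg is a
# randomly drawn negative batch, and hinge is the margin gamma.

def sq_dist(a, b):
    # Squared Euclidean distance per batch element.
    return ((a - b) ** 2).sum(dim=-1)

def neg_loss_paper(z_next, z_neg, hinge=1.0):
    # Eq. (5): compare the negative sample against the state at t+1.
    return torch.relu(hinge - sq_dist(z_neg, z_next)).mean()

def neg_loss_code(z_t, z_neg, hinge=1.0):
    # Repo variant: compare the negative sample against the state at t.
    return torch.relu(hinge - sq_dist(z_neg, z_t)).mean()

# When negatives and visited states are drawn from the same (uniform)
# distribution, both terms have the same expectation:
z_t, z_next, z_neg = torch.randn(32, 4), torch.randn(32, 4), torch.randn(32, 4)
print(neg_loss_paper(z_next, z_neg), neg_loss_code(z_t, z_neg))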
On Sun, Feb 7, 2021 at 11:17 AM Alex wrote:
Hello Kipf, I found a discrepancy between the loss implemented in the
code and the loss described in the paper.
According to Eq. (5) in the paper, for negative samples you calculate the
Euclidean distance between the negative state sample at timestep t and the
state at timestep t+1.
However, in the code below, state and neg_state are both at timestep t.
self.neg_loss = torch.max(
    zeros, self.hinge - self.energy(
        state, action, neg_state, no_trans=True)).mean()
I noticed that the same question was also asked in #3.
I want to know: is this a bug? Does the discrepancy affect the final
performance?
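For reference, my reading of the loss in question (a reconstruction, not a quote from the paper), with d the squared Euclidean energy, T the learned transition model, and \tilde{z} the negative sample:

L = d\big(z_t + T(z_t, a_t),\, z_{t+1}\big) + \max\big(0,\ \gamma - d(\tilde{z},\, z_{t+1})\big)

The code's negative term uses d(\tilde{z}, z_t) in place of d(\tilde{z}, z_{t+1}).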
Yes, I believe the frame_skip between individual actions (i.e., for how
many frames an action is applied) is not fully deterministic in the gym
environment that our codebase uses. If you happen to have more insight into
this, please feel free to share it here.
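For anyone who wants to probe this, a small determinism check (a sketch assuming the pre-0.26 gym API; PongDeterministic-v4 is only an example id, not necessarily the one this codebase uses):

import gym
import numpy as np

def rollout(env_id, seed, actions):
    # Roll out a fixed action sequence from a fixed seed and record frames.
    env = gym.make(env_id)
    env.seed(seed)
    obs = env.reset()
    frames = [obs]
    for a in actions:
        obs, _, done, _ = env.step(a)
        frames.append(obs)
        if done:
            break
    env.close()
    return frames

# Two identical rollouts should match frame-for-frame if the env is
# deterministic. Note that plain "-v0" Atari variants sample the frame
# skip stochastically (and use sticky actions), so they can diverge even
# with a fixed seed; the "Deterministic-v4" variants use a fixed frame skip.
acts = [0, 1, 2, 3] * 5
a = rollout('PongDeterministic-v4', 42, acts)
b = rollout('PongDeterministic-v4', 42, acts)
print(all(np.array_equal(x, y) for x, y in zip(a, b)))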
On Wed, Feb 24, 2021 at 4:34 AM Alex wrote:
Hi, thanks for your answer.
I now understand that both formulations are equivalent.
But I have another question:
For the Atari games, even if I set the same random seed, multiple
executions of the program behave differently.
My understanding is that, with the same seed, multiple runs of the program
should produce the same result.
Are there other nondeterministic operations in the code that cause this
randomness?
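For later readers, the usual seeding checklist for a PyTorch setup (a generic sketch, not a claim about what this repo's code already does):

import random
import numpy as np
import torch

def set_global_seed(seed):
    # Seed every RNG that typically feeds a training run.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Nondeterministic cuDNN kernels and autotuning are another common
    # source of run-to-run divergence on GPU:
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

set_global_seed(42)

Even with all of the above, some CUDA ops have no deterministic implementation, so exact reproducibility on GPU is not always achievable.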