Different strategies during training and inference #8

King4819 · 2024-05-15T02:28:17Z

It seems that the code uses F.gumbel_softmax during training but torch.argmin during inference.

Why does it use argmin instead of argmax? I would expect a True mask entry to correspond to the larger probability, so argmax seems like the right choice.
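To make the question concrete, here is a minimal sketch of the three selection rules as I understand them. The logits and the column layout (index 0 = "drop", index 1 = "keep") are my own assumptions for illustration and may not match how the repository orders the two classes.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical two-class logits per token: column 0 = "drop", column 1 = "keep".
# This layout is an assumption; the repository may order the columns differently.
logits = torch.tensor([[0.2, 3.0],
                       [2.5, -1.0],
                       [-0.5, 1.5]])
probs = logits.softmax(dim=-1)

# Training-style selection: hard Gumbel-softmax draws a one-hot sample per row
# and keeps gradients through the straight-through estimator.
sample = F.gumbel_softmax(logits, tau=1.0, hard=True, dim=-1)
train_mask = sample[:, 1].bool()

# Deterministic inference alternatives.
mask_argmax = probs.argmax(dim=-1).bool()  # True where "keep" is more likely
mask_argmin = probs.argmin(dim=-1).bool()  # True where "keep" is less likely

print("gumbel sample mask:", train_mask.tolist())
print("argmax mask:", mask_argmax.tolist())
print("argmin mask:", mask_argmin.tolist())
```

With only two classes and no ties, the argmin mask is the exact complement of the argmax mask, so whether argmin is correct depends entirely on which column is meant to represent "keep".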

I also found that the code actually uses Gumbel softmax at both the training and inference stages, because the "self.inference" argument is false in both cases.

In addition, when I modified the code to use argmin at the inference stage, performance dropped drastically.

I hope to get your response. Thanks!
