A possible classifier-free guidance inconsistency - missing log_softmax before interpolation #494

Open · avihu111 opened this issue Oct 9, 2024 · 0 comments
avihu111 commented Oct 9, 2024

Hi, thanks for a great repo!

In the AudioGen paper, the linear interpolation is done on the log-probabilities:

[image: equation from the AudioGen paper, interpolating the conditional and unconditional log-probabilities]

In the code, however, it is done on the logits:

```python
logits = uncond_logits + (cond_logits - uncond_logits) * self.cfg_coef
```

If I understand correctly, logits and log-probs are not equivalent: log-probs must satisfy `torch.exp(log_probs).sum(dim=-1) == 1`, which is equivalent to `torch.logsumexp(log_probs, dim=-1) == 0`. To get log-probabilities from logits we need to apply `log_softmax`, which enforces this property.
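
A quick self-contained check of that property (the tensor shapes here are arbitrary, just for illustration):

```python
import torch

torch.manual_seed(0)
logits = torch.randn(2, 8)  # arbitrary unnormalized scores
log_probs = torch.log_softmax(logits, dim=-1)

# Raw logits are generally not normalized: logsumexp is arbitrary.
print(torch.logsumexp(logits, dim=-1))
# After log_softmax, logsumexp == 0 (up to float error), i.e. exp sums to 1.
print(torch.logsumexp(log_probs, dim=-1))
assert torch.allclose(torch.logsumexp(log_probs, dim=-1),
                      torch.zeros(2), atol=1e-6)
```

Applying that normalization first, the interpolation would become: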

```python
# Normalize both branches to log-probabilities before interpolating.
uncond_log_probs = torch.log_softmax(uncond_logits, dim=-1)
cond_log_probs = torch.log_softmax(cond_logits, dim=-1)
logits = uncond_log_probs + (cond_log_probs - uncond_log_probs) * self.cfg_coef
```
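
For context, here is a minimal standalone sketch of a guided sampling step built around this normalization (the helper name, shapes, and `cfg_coef` value are hypothetical, not Audiocraft's actual API):

```python
import torch

def guided_sample(cond_logits: torch.Tensor,
                  uncond_logits: torch.Tensor,
                  cfg_coef: float,
                  temperature: float = 1.0) -> torch.Tensor:
    """Sample one token after interpolating log-probabilities (hypothetical helper)."""
    uncond_log_probs = torch.log_softmax(uncond_logits, dim=-1)
    cond_log_probs = torch.log_softmax(cond_logits, dim=-1)
    guided = uncond_log_probs + (cond_log_probs - uncond_log_probs) * cfg_coef
    probs = torch.softmax(guided / temperature, dim=-1)
    return torch.multinomial(probs, num_samples=1)

# Toy usage with random logits over a 2048-entry codebook vocabulary.
torch.manual_seed(0)
token = guided_sample(torch.randn(1, 2048), torch.randn(1, 2048), cfg_coef=3.0)
```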

Using either logits or log-probs works quite well, but in my tests applying log_softmax before the interpolation showed some benefits.
Avihu
