Add KL loss #11

borisdayma · 2022-07-25T16:56:02Z

KL loss can avoid using a codebook (with associated quantization loss).

borisdayma · 2022-07-31T23:29:44Z

Advantages of using a codebook:

- can be used for AR models
- smaller pre-encoding (only a short sequence of ints)
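To make the "short sequence of ints" point concrete, here is a minimal sketch of the codebook lookup, assuming a squared-L2 nearest-neighbour rule. The names `quantize` and `quantization_error` and the toy values are hypothetical and not taken from this repo. A real VQ loss would also add stop-gradient codebook and commitment terms, which are omitted here.

```python
def quantize(vectors, codebook):
    """Return, for each vector, the index of its nearest codebook entry (squared L2)."""
    indices = []
    for v in vectors:
        dists = [sum((a - b) ** 2 for a, b in zip(v, code)) for code in codebook]
        indices.append(dists.index(min(dists)))
    return indices


def quantization_error(vectors, codebook, indices):
    """Mean squared distance between each vector and its selected code."""
    total = 0.0
    for v, i in zip(vectors, indices):
        total += sum((a - b) ** 2 for a, b in zip(v, codebook[i]))
    return total / len(vectors)


# Toy values, purely illustrative.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
encoder_out = [[0.9, 1.2], [0.1, -0.2], [-0.8, 0.7]]

tokens = quantize(encoder_out, codebook)  # one int per latent position
error = quantization_error(encoder_out, codebook, tokens)
```

The `tokens` list is what an AR model would consume, and `error` is the information lost by snapping encoder outputs to the codebook.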

Advantages of KL loss:

- no quantization loss from the codebook
- no hyper-parameter for codebook size
- no issue with unused codes
- if we have a small codebook embedding, pre-encoding may also be feasible
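For reference, a minimal sketch of the KL term itself, assuming the usual setup where the encoder outputs a diagonal Gaussian (mean and log-variance per latent dimension) that is regularized toward a standard normal. The function name `kl_to_standard_normal` is hypothetical, and how the term would be weighted against the reconstruction loss is not decided in this thread.

```python
import math


def kl_to_standard_normal(mean, logvar):
    """KL( N(mean, exp(logvar)) || N(0, 1) ), summed over latent dimensions."""
    return 0.5 * sum(
        m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mean, logvar)
    )


# A posterior that already matches the prior has zero KL.
kl_zero = kl_to_standard_normal([0.0, 0.0], [0.0, 0.0])

# Shifting the mean by 1 with unit variance costs 0.5 nats in that dimension.
kl_shifted = kl_to_standard_normal([1.0], [0.0])
```

No codebook is involved here: the latent stays continuous, which is why there is no quantization loss and no codebook-size hyper-parameter.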

What do you think, @patil-suraj?

patil-suraj · 2022-08-01T12:37:52Z

We can add KL loss for sure, but it depends on how we want to use this model. If we are going to do diffusion, then we can train the KL model. For an auto-regressive model like dalle-mini, though, we will need the VQ part.
