FID scores spike up at the beginning of training #61

Open
ghost opened this issue Apr 5, 2021 · 9 comments

Comments

@ghost

ghost commented Apr 5, 2021

This is a plot of FID scores over time in ticks:

image

I'm wondering what is causing the initial spike upward.

I've tried a bunch of things to avoid it and the only thing that seems to work is setting the learning rate ridiculously low (1e-7).

The FID increase appears to coincide with black patches in the images and a massive loss of filters, consistent with the behavior described in https://arxiv.org/abs/1908.03265.

The authors of the linked paper suggest a warmup schedule to avoid this, but no warmup schedule I have tried prevents the FID increase.

Please advise.
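
For reference, a linear learning-rate warmup of the kind mentioned above might look roughly like the sketch below. The function name `warmup_lr`, the base learning rate, and the warmup length are placeholders chosen for illustration, not values from this thread or from the linked paper.

```python
def warmup_lr(step, base_lr=2e-3, warmup_steps=1000):
    """Return the learning rate for a given optimizer step.

    The rate ramps linearly from base_lr / warmup_steps up to base_lr
    over the first warmup_steps steps, then stays constant.
    base_lr and warmup_steps are illustrative assumptions.
    """
    if warmup_steps <= 0:
        return base_lr
    scale = min(1.0, (step + 1) / warmup_steps)
    return base_lr * scale
```

The returned value would be fed to whatever optimizer the training loop uses at each step; how that is done depends on the framework.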

@zsyzzsoft
Collaborator

I think this is a normal phase of training, in which the generator's initial progress is not reflected in the FID score.

@ghost
Author

ghost commented Apr 5, 2021

I don't think the black patches associated with the FID increase are normal. Also, initially, FID scores go down (baseline FID for this is 375):

network-snapshot-000000        time 1m 29s       fid5k-train 329.7226
network-snapshot-000001        time 1m 28s       fid5k-train 407.1180

@zsyzzsoft
Collaborator

zsyzzsoft commented Apr 5, 2021

What do the black patches look like? It could also be that the discriminator does not learn well at the beginning of training, so you can try training only the discriminator for a few iterations.
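
A minimal sketch of what "training only the discriminator for a few iterations" could mean in a training loop is shown below. The names `run_training`, `d_step`, `g_step`, and `d_warmup_iters` are hypothetical and do not come from this repository; the two step callables stand in for whatever performs one discriminator or generator update.

```python
def run_training(num_iters, d_step, g_step, d_warmup_iters=100):
    """Alternate D and G updates, skipping G during an initial warmup.

    d_step and g_step are hypothetical callables that each perform one
    optimizer update and receive the current iteration index.
    """
    for it in range(num_iters):
        d_step(it)                # the discriminator is updated every iteration
        if it >= d_warmup_iters:  # the generator starts only after the warmup
            g_step(it)
```

The right warmup length is not stated in the thread; 100 is just a placeholder to tune.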

@ghost
Author

ghost commented Apr 5, 2021

initially:
image
black patches:
image
and afterwards what looks like early mode collapse to me:
image

@zsyzzsoft
Collaborator

It is hard to pin down the exact reason... But I don't think the short spike will affect final performance; it is understandable that training can be pretty random at the beginning.

@ghost
Author

ghost commented Apr 5, 2021

My final FID scores are too high and my final images are too blurry on this dataset. On a different dataset I did not experience this issue at all and the images came out great. So I really do think I'm losing most of my filters.
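
One way to check the "losing filters" suspicion would be to measure how many channels of a layer output nothing at all for a whole batch. The sketch below assumes that a "lost" filter means a channel whose activations are all (near) zero, and that the activations have been exported as a NumPy array in NHWC layout; both are assumptions, as are the name `dead_filter_fraction` and the threshold `eps`.

```python
import numpy as np

def dead_filter_fraction(activations, eps=1e-6):
    """Fraction of channels that are effectively zero everywhere.

    activations: array of shape (N, H, W, C) holding one layer's output
    for a batch. A channel counts as dead when its largest absolute
    value over the batch and all spatial positions is <= eps.
    """
    peak = np.abs(activations).max(axis=(0, 1, 2))  # shape (C,)
    return float((peak <= eps).mean())
```

Tracking this number per layer across the early snapshots would show whether it jumps at the same tick as the FID spike.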

@zsyzzsoft
Collaborator

What does your dataset look like? This looks to me like a more severe discriminator overfitting issue.

@ghost
Author

ghost commented Apr 5, 2021

Each image looks something like this except the heights of the models are all the same:

image

And yes, I agree, it does seem like a severe discriminator overfitting issue -- how do I prevent it from overfitting?

@zsyzzsoft
Collaborator

Looks like a challenging dataset... I don't think the model will learn well if there are only hundreds of such images. DiffAugment can reduce discriminator overfitting to some degree, and you may be able to reduce the problem further by making the augmentations stronger and adding other augmentations (e.g. resize).
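
As a rough illustration of the resize idea, the sketch below shrinks each image by a random factor and zero-pads it back to the original size. It is not part of DiffAugment: the name `rand_resize`, the scale range, the nearest-neighbour sampling, and the NHWC layout are all assumptions. NumPy is used only to show the geometry; in actual training the operation would have to be written with the framework's differentiable ops and applied to both real and generated images before they reach the discriminator.

```python
import numpy as np

def rand_resize(images, rng, min_scale=0.75, max_scale=1.0):
    """Randomly shrink each image and zero-pad it back to its original size.

    images: float array of shape (N, H, W, C).
    rng: a numpy.random.Generator.
    Scales are assumed to lie in (0, 1]; larger values are not handled.
    """
    n, h, w, c = images.shape
    out = np.zeros_like(images)
    for i in range(n):
        s = rng.uniform(min_scale, max_scale)
        new_h = max(1, int(round(h * s)))
        new_w = max(1, int(round(w * s)))
        # Nearest-neighbour source indices for the shrunken image.
        rows = np.minimum((np.arange(new_h) * h / new_h).astype(int), h - 1)
        cols = np.minimum((np.arange(new_w) * w / new_w).astype(int), w - 1)
        small = images[i][rows][:, cols]
        # Paste the shrunken image into the centre of a zero canvas.
        top = (h - new_h) // 2
        left = (w - new_w) // 2
        out[i, top:top + new_h, left:left + new_w] = small
    return out
```

Lowering `min_scale` makes the augmentation stronger; how strong it can get before it hurts image quality would need to be checked on the dataset in question.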
