Can u finish the TODO: add checkpoints? #63

Open
tingxingdong opened this issue Aug 31, 2020 · 4 comments

Comments

@tingxingdong

https://github.com/krasserm/super-resolution/blob/master/train.py#L132

GAN training takes a long time and dies easily. Without checkpoints, every run has to restart from scratch. Thanks.

@Fmstrat

Fmstrat commented Apr 13, 2021

This is probably going to be a must-have for #82

@krasserm
Owner

@Fmstrat you can start with saving model weights as described here. EDSR+SRGAN training should be stable enough, i.e. I cannot confirm that it "easily dies".

I have other high priorities at the moment and will implement it when I have more bandwidth. Hope that helps for the moment.
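As a stopgap along those lines, here is a minimal, framework-agnostic sketch of saving weights every N steps and resuming from the last saved step. It is not code from train.py: `PeriodicCheckpointer`, `save_fn`, the file name pattern, and the `last_step.txt` marker file are all hypothetical names chosen for illustration.

```python
import os
import tempfile


class PeriodicCheckpointer:
    """Call a user-supplied save function every `every` steps (hypothetical helper)."""

    def __init__(self, directory, save_fn, every=1000, pattern="step_{step}.weights.h5"):
        self.directory = directory
        self.save_fn = save_fn  # any callable that takes a file path
        self.every = every
        self.pattern = pattern
        os.makedirs(directory, exist_ok=True)

    def _marker(self):
        return os.path.join(self.directory, "last_step.txt")

    def last_step(self):
        """Return the last checkpointed step, or 0 if nothing was saved yet."""
        try:
            with open(self._marker()) as f:
                return int(f.read().strip())
        except FileNotFoundError:
            return 0

    def maybe_save(self, step):
        """Save at multiples of `every`; return the path written, or None."""
        if step % self.every != 0:
            return None
        path = os.path.join(self.directory, self.pattern.format(step=step))
        self.save_fn(path)
        # Record the step only after the weights were written successfully.
        with open(self._marker(), "w") as f:
            f.write(str(step))
        return path


# Demo with a stand-in save function; no real model is involved.
def fake_save(path):
    with open(path, "w") as f:
        f.write("weights placeholder")


demo_dir = tempfile.mkdtemp()
ckpt = PeriodicCheckpointer(demo_dir, fake_save, every=2)
for step in range(ckpt.last_step() + 1, 6):
    # ... one training step would run here ...
    ckpt.maybe_save(step)

# A fresh instance (e.g. after a crash) finds the last saved step.
resumed = PeriodicCheckpointer(demo_dir, fake_save, every=2)
print(resumed.last_step())  # prints 4
```

With a Keras model, `save_fn` could be the model's bound `save_weights` method, with one checkpointer (and directory) per network. Note that this only stores weights; restoring optimizer state as well would need something like `tf.train.Checkpoint`.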

@dflateau

dflateau commented Jul 27, 2021

If we were to hack through it ourselves, would we be trying to minimize discriminator loss or perceptual loss as the criterion for saving a checkpoint?

As it stands now, when training is complete, are the weights that reside in the model just the result of the last step taken?
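Whichever quantity ends up being monitored, the "save only when it improves" part can be sketched independently of that choice. The block below is a hypothetical helper (`BestTracker` is not part of the repository), and the validation PSNR in the demo is an assumed example metric with made-up values, since the thread leaves open which quantity to use.

```python
class BestTracker:
    """Remember the best value of a monitored metric (hypothetical helper)."""

    def __init__(self, mode="min"):
        if mode not in ("min", "max"):
            raise ValueError("mode must be 'min' or 'max'")
        self.mode = mode
        self.best = None
        self.best_step = None

    def update(self, step, value):
        """Return True when `value` beats the best seen so far."""
        if self.best is None:
            improved = True
        elif self.mode == "min":
            improved = value < self.best
        else:
            improved = value > self.best
        if improved:
            self.best = value
            self.best_step = step
        return improved


# Demo: higher is better for the assumed validation metric (made-up numbers).
tracker = BestTracker(mode="max")
decisions = [tracker.update(step, value)
             for step, value in [(1, 28.1), (2, 27.9), (3, 28.4)]]
print(decisions, tracker.best_step)  # prints [True, False, True] 3
```

In a training loop, a `True` return would be the signal to write the weights, so the file on disk holds the best step seen so far instead of whatever the final step produced.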

@tvelk

tvelk commented Jul 5, 2023

@krasserm I am trying to piece together the SrganTrainer checkpoint criteria and would greatly appreciate your feedback.

  1. Perceptual loss: This is the sum of the content loss and the adversarial loss.
    • Content loss: Uses VGG for perceptual similarity instead of pixel-wise losses. Result always positive. It is the generator's goal to minimize.
    • Adversarial loss: (10^-3)(SUM(-log([Probability image is natural HR image]))). Result always positive. The higher the probability the images are HR, the smaller the adversarial loss. It is the generator's goal to minimize.
  2. Discriminator loss: Result always positive. It is the discriminator's goal to minimize.
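
To make the quantities in the list above concrete, here is a small pure-Python sketch. The adversarial term follows the formula as written above (a sum, scaled by 10^-3). The discriminator loss is assumed to be the standard binary cross-entropy form, which is not spelled out in this thread, and the content loss is passed in as a plain number because computing it needs VGG features.

```python
import math

ADVERSARIAL_WEIGHT = 1e-3  # the 10^-3 factor from the list above


def adversarial_loss(sr_probs):
    """10^-3 * SUM(-log(p)), where p is the discriminator's probability
    that a super-resolved image is a natural HR image."""
    return ADVERSARIAL_WEIGHT * sum(-math.log(p) for p in sr_probs)


def perceptual_loss(content_loss, sr_probs):
    """Content loss plus weighted adversarial loss (the generator minimizes this)."""
    return content_loss + adversarial_loss(sr_probs)


def discriminator_loss(hr_probs, sr_probs):
    """Assumed binary cross-entropy: real HR images should score near 1,
    super-resolved images near 0 (the discriminator minimizes this)."""
    real_term = sum(-math.log(p) for p in hr_probs)
    fake_term = sum(-math.log(1.0 - p) for p in sr_probs)
    return real_term + fake_term


# The more convincing the generated images, the smaller the adversarial loss.
print(adversarial_loss([0.9, 0.9]) < adversarial_loss([0.1, 0.1]))  # prints True
```

Under this formulation the same `sr_probs` values push the two losses in opposite directions: probabilities near 1 shrink the adversarial term and grow the discriminator's `fake_term`.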

In the current implementation of train.py these two values never interact; each is used solely to compute the gradients for its own network.

With the goal of setting a criterion for creating a checkpoint, it would seem we want a low perceptual loss and, simultaneously, a high discriminator loss, which is not quite straightforward. Other posts I've seen state that the goal is equilibrium. In that case, a decrease of x in the deviation over the last y points might be a route to go?
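
One possible reading of that "deviation over the last y points" idea, sketched in plain Python: compare the standard deviation of the most recent y loss values with that of the y values before them, and flag a checkpoint when it has dropped by at least x. `DeviationDropDetector`, `window` (y) and `min_drop` (x) are hypothetical names, and whether this is a sound equilibrium criterion for SRGAN is exactly the open question here.

```python
from collections import deque
from statistics import pstdev


class DeviationDropDetector:
    """Flag when the spread of a loss curve shrinks between two adjacent windows."""

    def __init__(self, window, min_drop):
        self.window = window      # y: number of points per window
        self.min_drop = min_drop  # x: required decrease in standard deviation
        self.values = deque(maxlen=2 * window)

    def update(self, value):
        """Add one loss value; return True if the deviation dropped enough."""
        self.values.append(value)
        if len(self.values) < 2 * self.window:
            return False  # not enough history yet
        vals = list(self.values)
        previous = pstdev(vals[:self.window])
        recent = pstdev(vals[self.window:])
        return previous - recent >= self.min_drop


# Demo with made-up loss values: noisy at first, then settling down.
detector = DeviationDropDetector(window=3, min_drop=0.1)
flags = [detector.update(v) for v in [1.0, 2.0, 0.5, 1.0, 1.01, 0.99]]
print(flags[-1])  # prints True
```

The same detector could be fed the perceptual loss, the discriminator loss, or both via two instances; that choice is left open.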

Any help or insight would be greatly appreciated.
@dflateau Did you have any luck on this?

5 participants