Is there a reason why the default for eps in the adam optimizer is so high? Currently, it is 1e-3 [line 104 in shared_optim.py]. Usually, it's around 1e-08. Just wanted to see if this was done intentionally (e.g., it works better than when it is lower) or not.
An epsilon of 1e-3 is actually my usual default for the Adam optimizer, and I find it helps with stability. Although 1e-08 is often listed as the default for Adam, it is not a strongly recommended choice. It is commonly known not to be the best choice in many cases, and in my experience it has never been the best choice in my various use cases.
The default value of 1e-8 for epsilon might not be a good default in general. For example, when training an Inception network on ImageNet a current good choice is 1.0 or 0.1.
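To make the stability point concrete, here is a minimal sketch of the first step of the textbook Adam update for a single scalar parameter. It assumes the standard formulation in which eps is added to the square root of the bias-corrected second moment; the function name `first_adam_step` and the values used are illustrative only and are not taken from shared_optim.py, whose implementation may differ in detail.

```python
import math


def first_adam_step(grad, eps, lr=1e-3, beta1=0.9, beta2=0.999):
    """Size of the first Adam update for a scalar gradient (moments start at zero)."""
    m = (1 - beta1) * grad          # first-moment estimate after one step
    v = (1 - beta2) * grad * grad   # second-moment estimate after one step
    m_hat = m / (1 - beta1)         # bias correction at t = 1
    v_hat = v / (1 - beta2)
    # eps sits in the denominator, so it caps how much a tiny gradient is amplified
    return lr * abs(m_hat) / (math.sqrt(v_hat) + eps)


if __name__ == "__main__":
    for grad in (1.0, 1e-6):
        for eps in (1e-8, 1e-3):
            step = first_adam_step(grad, eps)
            print(f"grad={grad:g} eps={eps:g} step={step:.3e}")
```

With eps=1e-8, a gradient of 1e-6 still produces a step close to the full learning rate, because the gradient is normalized by its own magnitude. With eps=1e-3, the same gradient produces a step roughly a thousand times smaller, while a gradient of 1.0 is almost unaffected. That damping of updates driven by very small (often noisy) gradients is one plausible reason a larger eps can feel more stable.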