ClashLuke opened this issue on May 17, 2022 · 0 comments
Labels: core (Improves core model while keeping core idea intact), ML (Requires machine-learning knowledge (can be built up on the fly)), research (Creative project that might fail but could give high returns)
Prior work has suggested that adding noise to the gradients helps deep models converge and generalise. Other works, such as DDPG, showed that the same holds even for shallow networks in a different domain. It could therefore be worth exploring gradient noise as a way to improve generalisation, and with it convergence, by reducing overfitting and helping training escape poor local minima.
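For concreteness, below is a minimal sketch of what injecting gradient noise could look like in a PyTorch-style training loop. The annealed schedule std² = eta / (1 + step)^gamma and the default values for `eta` and `gamma` are illustrative assumptions, not settings from this repository.

```python
import torch


def add_gradient_noise(parameters, step, eta=0.01, gamma=0.55):
    """Add annealed zero-mean Gaussian noise to each parameter's gradient in place.

    The noise variance eta / (1 + step) ** gamma shrinks as training progresses,
    so early steps are perturbed more strongly than late ones (assumed schedule).
    """
    std = (eta / (1 + step) ** gamma) ** 0.5
    for p in parameters:
        if p.grad is not None:
            p.grad.add_(torch.randn_like(p.grad) * std)


# Usage in a training loop, after backward() and before the optimiser step:
#     loss.backward()
#     add_gradient_noise(model.parameters(), step)
#     optimizer.step()
```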
One option to take gradient noise further would be to combine it with #35 by adding different noise for each optimiser. This would allow combinations like Adam#Adam, where each optimiser sees slightly different noise at every step; a sketch of that idea follows below.
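The following is a hypothetical illustration of that combination. It assumes #35 runs several optimiser instances over the same parameters (the chaining mechanism itself is not shown); the only point demonstrated is that each optimiser can be given its own noise sample on top of a shared base gradient. All names and defaults here are placeholders.

```python
import torch


def step_with_per_optimizer_noise(optimizers, parameters, step, eta=0.01, gamma=0.55):
    """Step each optimiser with the same base gradient but independent noise."""
    std = (eta / (1 + step) ** gamma) ** 0.5
    # Snapshot the unperturbed gradients once, so every optimiser starts from them.
    base_grads = [p.grad.clone() if p.grad is not None else None for p in parameters]
    for opt in optimizers:
        for p, g in zip(parameters, base_grads):
            if g is None:
                continue
            # Fresh noise per optimiser; a per-optimiser torch.Generator could be
            # used instead for reproducible, distinct noise streams.
            p.grad.copy_(g + torch.randn_like(g) * std)
        opt.step()


# Usage with an Adam#Adam-style pair (parameters must be a list, not a generator):
#     params = list(model.parameters())
#     opts = [torch.optim.Adam(params), torch.optim.Adam(params)]
#     loss.backward()
#     step_with_per_optimizer_noise(opts, params, step)
```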
This issue tracks the progress of such a scheme.
ClashLuke added the research, ML, and core labels on May 17, 2022.