Initial state in RNNs should not be learnable by default #807
Comments
Have there been any updates on this? When I'm using RNNs with |
@jeremiedb has this been fixed? |
Treating the initial state as learnable parameters is still the default behavior for RNN; nothing was changed in the latest PR. My position on the subject, however, is that the initial state should continue to be treated as learnable parameters. It's debatable whether one case is more prevalent than the other; on my end, for NLP or time series, learnable has been the desired case. I could perhaps add a quick section in the docs about initial state handling, given that I skipped discussing that question explicitly. |
Actually, an option to make learnable parameters a more first-class citizen could be to use the same approach taken with the |
Unless there are some cases in which one needs a non-learnable, non-zero initial state, it seems a good solution.
# Non-trainable state0 for all RNN cells
trainable(m::RNNCell) = (m.Wi, m.Wh, m.b)
# Alternatively, exclude state0 from params only for a specific cell
ps = Flux.params(m)
delete!(ps, m.state0)
should solve the problem |
Currently, the initial states of the RNN cells are initialised as param's (e.g. here and here). This causes the initial state to be modified during backprop, which can in turn affect the model when reset! is called.
The default behaviour should be for the initial cell state to stay constant and not be affected by backprop. Having it learned, as it is now, is still useful in some contexts, so this should stay as an option.
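The request above can be illustrated with a minimal, Flux-free sketch. Everything in it is hypothetical and made up for illustration: `ToyRNNCell`, `trainable_params`, `fake_update!`, `reset_state`, and the `learn_state0` flag are not part of Flux's API, and `fake_update!` merely stands in for an optimiser step. It only shows the idea that a state excluded from the trainable set survives an update unchanged, while an opted-in state does not.

```julia
# Hypothetical toy cell; names and fields are assumptions, not Flux's types.
struct ToyRNNCell
    Wi::Matrix{Float64}      # input-to-hidden weights
    Wh::Matrix{Float64}      # hidden-to-hidden weights
    b::Vector{Float64}       # bias
    state0::Vector{Float64}  # initial hidden state
    learn_state0::Bool       # opt-in flag: treat state0 as learnable?
end

# Convenience constructor: the initial state is constant unless requested otherwise.
ToyRNNCell(nin::Int, nout::Int; learn_state0::Bool=false) =
    ToyRNNCell(0.1 .* randn(nout, nin), 0.1 .* randn(nout, nout),
               zeros(nout), zeros(nout), learn_state0)

# Arrays an optimiser would be allowed to modify.
trainable_params(c::ToyRNNCell) =
    c.learn_state0 ? (c.Wi, c.Wh, c.b, c.state0) : (c.Wi, c.Wh, c.b)

# One recurrent step from hidden state h and input x.
rnn_step(c::ToyRNNCell, h, x) = tanh.(c.Wi * x .+ c.Wh * h .+ c.b)

# Stand-in for an optimiser update: nudge every trainable array in place.
function fake_update!(c::ToyRNNCell)
    for p in trainable_params(c)
        p .+= 0.5
    end
    return c
end

# What a reset would hand back as the starting hidden state.
reset_state(c::ToyRNNCell) = copy(c.state0)

fixed   = fake_update!(ToyRNNCell(3, 2))                     # state0 left alone
learned = fake_update!(ToyRNNCell(3, 2; learn_state0=true))  # state0 was nudged
```

After the update, `reset_state(fixed)` is still all zeros, whereas `reset_state(learned)` reflects the modified state. Whether a flag like this or a per-type `trainable` override is the better design for Flux is exactly what the thread debates.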