Switch to a standard Random class #51
Btw, shouldn't the weights be initialized with random numbers close to 0, in the range [-1, 1]? When I run it, I could fix it by initializing the fully connected layer with the following snippet. But how is it supposed to be initialized?
They are actually (edit: within the [-1, 1] range):

```haskell
randomFullyConnected = do
  s1 <- getRandom
  s2 <- getRandom
  let wB = randomVector s1 Uniform * 2 - 1
      wN = uniformSample s2 (-1) 1
      bm = konst 0
      mm = konst 0
  return $ FullyConnected (FullyConnected' wB wN) (FullyConnected' bm mm)
```
But yes, they could be scaled down a bit to not be uniform, and there could also be some more normalisation done. This is one of the reasons I am interested in being a bit smarter about the initialisation.
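For context, a minimal sketch (not code from this thread) of one common way to scale the initial weights down, Xavier/Glorot uniform, assuming the same hmatrix static API used in the snippet above; the function name is hypothetical:

```haskell
{-# LANGUAGE ScopedTypeVariables #-}

import Control.Monad.Random (MonadRandom, getRandom)
import Data.Proxy (Proxy (..))
import GHC.TypeLits (KnownNat, natVal)
import Numeric.LinearAlgebra.Static (L, uniformSample)

-- Xavier/Glorot uniform: draw an o x i weight matrix from [-limit, limit],
-- where limit = sqrt (6 / (fanIn + fanOut)), so the spread of the initial
-- weights shrinks as the layer gets wider instead of staying fixed at [-1, 1].
xavierWeights :: forall m i o. (MonadRandom m, KnownNat i, KnownNat o) => m (L o i)
xavierWeights = do
  s <- getRandom
  let fanIn  = fromIntegral (natVal (Proxy :: Proxy i))
      fanOut = fromIntegral (natVal (Proxy :: Proxy o))
      limit  = sqrt (6 / (fanIn + fanOut))
  return $ uniformSample s (-limit) limit
```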
True, you're right. I investigated the observed problem a little more. It happens to me when I use multiple
So, I'm quite busy for another 2-3 weeks. After that I could do the ticket if that's ok for everyone. But I will have to look into the literature on how to properly initialize the weights first.
I think that should be fine.
Hi @HuwCampbell

You can find the current version here: https://github.com/schnecki/grenade

Before going into initialization: I implemented a

Coming back to weight initialization, see Grenade.Core.WeightInitialization and Grenade.Core.Layer. All layers now ask the Grenade.Core.WeightInitialization module for a random vector/matrix when initializing. The actual generation of random data is therefore encapsulated in that module, so adding a new weight initialization method only requires changes in that module.

Btw, a simple experiment showed that weight initialization makes a huge difference; see the feedforwardweightinit.hs example and test different settings.

My goal so far was to keep backward compatibility, which worked out quite nicely. I moved the

The class

P.S.: Btw, what training algorithm is implemented, Adam?
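To make the design concrete, here is a rough sketch of the pattern described above, where every layer asks one central module for its random matrices; the names (WeightInit, getRandomMatrix, the constructors) are assumptions for illustration, not the actual Grenade.Core.WeightInitialization API:

```haskell
import Control.Monad.Random (MonadRandom, getRandom)
import GHC.TypeLits (KnownNat)
import Numeric.LinearAlgebra.Static (L, uniformSample)

-- The initialisation scheme lives in one place; layers stay agnostic of it.
data WeightInit
  = UniformInit           -- uniform in [-1, 1], as in the snippet above
  | ScaledUniform Double  -- uniform in [-r, r] for a caller-chosen r

-- Layers call this instead of sampling on their own, so adding a new
-- scheme only means adding a constructor and a case here.
getRandomMatrix :: (MonadRandom m, KnownNat i, KnownNat o)
                => WeightInit -> m (L o i)
getRandomMatrix initScheme = do
  s <- getRandom
  return $ case initScheme of
    UniformInit     -> uniformSample s (-1) 1
    ScaledUniform r -> uniformSample s (-r) r
```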
At the moment the random initialisation is baked into the UpdateLayer class. We could replace this with either Variate from System.Random.MWC, or Random.
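For illustration, a hedged sketch of what pulling the randomness out of the layer class and handing it an explicit System.Random.MWC generator could look like; the class, method, and example layer here are illustrative, not grenade's actual UpdateLayer API:

```haskell
import Control.Monad.Primitive (PrimMonad, PrimState)
import System.Random.MWC (Gen, createSystemRandom, uniformR)

-- Hypothetical replacement for a baked-in createRandom: the caller supplies
-- the generator, so the layer no longer chooses its own source of randomness.
class RandomLayer x where
  createRandomWith :: PrimMonad m => Gen (PrimState m) -> m x

-- Toy layer holding a single weight drawn uniformly from [-1, 1].
newtype Bias = Bias Double

instance RandomLayer Bias where
  createRandomWith gen = Bias <$> uniformR (-1, 1) gen

-- Usage: create a system-seeded generator once and thread it through.
main :: IO ()
main = do
  gen    <- createSystemRandom
  Bias w <- createRandomWith gen
  print w
```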