Why initialize the first row of the embedding matrix as 0? `t.zeros(1, self.embedding_size)` #6
Uh, I wanted to set that first line as the padding token, so its vector stays zero.
@theeluwin Oh, I just don't understand your initialization. I suppose there must be some reason that I'm not aware of, so I wanted to ask about it. Also, I rewrote a copy of your code but it doesn't work; would you mind looking at my code and helping me find the error? Thank you for your reply.
Hi @theeluwin, the padding word you mean here is, of course, the "unknown word", right? And is it normal to initialize it with zeros? BTW, thanks for sharing your repo. I'm a newbie in language stuff and it really helps me a lot! :)
@GyeongJoHwang Normally the "unknown word", or UNK token, denotes a rare word rather than padding, so making it 0 might be one option, but usually we use a random vector or the mean vector over the whole vocabulary. Padding tokens are used when you want to make sentences have equal length (for batch training and so on): for example, to turn a 13-token sentence into a 20-token one, you "pad" it with 7 extra tokens, usually index 0.
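For anyone landing here later, here is a minimal sketch of both ideas, assuming PyTorch's `nn.Embedding` (the names `PAD_IDX`, `sentences`, and the sizes are illustrative, not from this repo). Passing `padding_idx=0` keeps row 0 as a zero vector, which corresponds to the `t.zeros(1, self.embedding_size)` initialization in the question, and an UNK vector can be built afterwards as the mean of the learned word vectors:

```python
import torch
import torch.nn as nn

vocab_size = 5       # hypothetical vocabulary: indices 1..4 are real words
embedding_size = 3
PAD_IDX = 0          # index 0 is reserved for padding

# padding_idx keeps that row initialized to zeros and zeroes its gradient,
# matching the t.zeros(1, self.embedding_size) initialization above.
embedding = nn.Embedding(vocab_size, embedding_size, padding_idx=PAD_IDX)

# Two sentences of different lengths; the shorter one is padded with 0s
# so both fit into a single batch tensor.
sentences = [[3, 1, 4], [2, 2]]
max_len = max(len(s) for s in sentences)
batch = torch.tensor([s + [PAD_IDX] * (max_len - len(s)) for s in sentences])

vectors = embedding(batch)   # shape: (2, max_len, embedding_size)
print(vectors[1, -1])        # padded position -> zero vector

# For an UNK token, a common choice (instead of zeros) is a random vector
# or the mean of all real word vectors:
with torch.no_grad():
    unk_vector = embedding.weight[1:].mean(dim=0)
```

Note that `padding_idx` also freezes that row during training (its gradient is always zero), so padded positions never contribute to the learned embeddings.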