Hi,
Thanks for sharing your implementation. It helps me a lot.
I'm wondering about the way you initialize the hidden state from the second sentence onward. Specifically, in the "def train_data(mini_batch, targets, word_attn_model, sent_attn_model, word_optimizer, sent_optimizer, criterion):" function (in the "attention_model_validation_experiments" notebook), you loop over the sentences with "_s, state_word, _ = word_attn_model(mini_batch[i,:,:].transpose(0,1), state_word)". That means both the forward and the backward states of the last word in sentence i are used to initialize the forward and backward states of sentence i+1. I can understand this for the forward state, since the two sentences are consecutive, but initializing the backward state this way does not seem very reasonable.
Can you please explain this in more detail? Thanks.
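For concreteness, here is a minimal sketch of how I read that loop, with a plain bidirectional nn.GRU standing in for word_attn_model; the shapes, variable names, and the pre-embedded input are my assumptions, not the notebook's exact code:

```python
import torch
import torch.nn as nn

# Toy stand-in for the notebook's word_attn_model: a 1-layer bidirectional GRU
# (the real model also computes word-level attention, omitted here).
batch_size, seq_len, num_sent, embed_dim, hidden_dim = 4, 7, 3, 16, 8
word_rnn = nn.GRU(embed_dim, hidden_dim, bidirectional=True)

# mini_batch: (num_sent, batch, seq_len, embed_dim) of already-embedded words
# (the notebook passes word indices; embeddings are assumed here for brevity).
mini_batch = torch.randn(num_sent, batch_size, seq_len, embed_dim)

# Initial hidden state for sentence 0: (num_directions, batch, hidden_dim), zeros.
state_word = torch.zeros(2, batch_size, hidden_dim)

sent_vectors = []
for i in range(num_sent):
    words = mini_batch[i].transpose(0, 1)          # (seq_len, batch, embed_dim)
    out, state_word = word_rnn(words, state_word)  # out: (seq_len, batch, 2*hidden_dim)
    sent_vectors.append(out[-1])
    # state_word now holds BOTH the forward and backward final states of
    # sentence i, and is fed unchanged as the initial state of sentence i+1.
```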
I'm sorry, but I don't understand what you are asking. You will have to initialise both the forward and the backward states to start the training process.
Please get back to me with a bit more clarity so that I can help you out.
Let's use last_h_S = (last_h_forward, last_h_backward) to denote the hidden state of the last word in sentence number S, and init_h_[S+1] to denote the initial hidden state of sentence number S+1.
From the code, I understand that you assign init_h_[S+1] = last_h_S = (last_h_forward, last_h_backward) (am I right?). Would it be more reasonable to set init_h_[S+1] = (last_h_forward, 0), as in the sketch below?
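In code, the change I have in mind would look something like this; the hidden-state layout and the helper name are my assumptions, and the real word_attn_model may organise its directions differently:

```python
import torch

def forward_only_init(last_state):
    # last_state: (num_directions=2, batch, hidden_dim); forward direction assumed
    # at index 0 and backward at index 1, as for a 1-layer bidirectional GRU in PyTorch.
    next_init = last_state.clone()
    next_init[1].zero_()  # drop the backward state of the previous sentence -> (last_h_forward, 0)
    return next_init

# Usage inside the sentence loop, replacing the plain carry-over:
#   _s, state_word, _ = word_attn_model(mini_batch[i,:,:].transpose(0,1), state_word)
#   state_word = forward_only_init(state_word)
```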