Where is the `model.fit` function? #15
HTH, Udi

You are welcome to make the changes and send a PR.
@udibr I am very new to Keras and neural networks. Why do we need to use np_utils.to_categorical? I have already converted all the vocab words to indices, so I am training on indices, not words. My issue is very similar to https://github.com/fchollet/keras/issues/2654, but from that thread I didn't understand the input and output shapes of the data. You also suggested adding an extra dimension of size 1 to yTrain; can you elaborate on why? The shape of yTrain is currently (17853, 25); what will the shape be after adding the extra dimension?
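For context, `np_utils.to_categorical` converts integer class indices into one-hot vectors, which is the target format a `categorical_crossentropy` output layer expects. The sketch below uses plain NumPy with a hypothetical `to_one_hot` helper standing in for the Keras utility, just to show what the conversion does:

```python
import numpy as np

def to_one_hot(indices, num_classes):
    """Plain-NumPy sketch of what np_utils.to_categorical does:
    turn a list of integer class indices into rows of one-hot vectors."""
    indices = np.asarray(indices)
    out = np.zeros((len(indices), num_classes), dtype=np.float32)
    out[np.arange(len(indices)), indices] = 1.0
    return out

print(to_one_hot([2, 0, 1], 4))
# [[0. 0. 1. 0.]
#  [1. 0. 0. 0.]
#  [0. 1. 0. 0.]]
```

So if you keep the targets as raw indices instead, the loss has to be a sparse variant that accepts integer labels directly.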
I want to use an encoder-decoder model for some other data, and I am trying to understand this code, but I couldn't find the fit method in train.ipynb. After padding the descriptions and headings, how do I use these vectors to train the model? What are the dimensions of X and Y in `model.fit`? The dimension of X may be #descriptions x 50 and the dimension of Y may be #headings x 50, where #descriptions equals #headings.
Below is the command I used to fit the model:
```python
model_fit = model.fit(nxTrain, nyTrain, nb_epoch=1, batch_size=64, verbose=2)
```
The dimensions of X and Y passed to `model.fit`:

```python
xTrain.shape  # (17853, 50)
yTrain.shape  # (17853, 25)
```
But I got the error below:

```
Exception: Error when checking model target: expected activation_1 to have 3 dimensions, but got array with shape (17853, 25)
```
Please check the model summary (`print(model.summary())`):

```
Layer (type)                     Output Shape       Param #    Connected to
embedding_1 (Embedding)          (None, 50, 100)    4000000    embedding_input_1[0][0]
lstm_1 (LSTM)                    (None, 50, 512)    1255424    embedding_1[0][0]
dropout_1 (Dropout)              (None, 50, 512)    0          lstm_1[0][0]
lstm_2 (LSTM)                    (None, 50, 512)    2099200    dropout_1[0][0]
dropout_2 (Dropout)              (None, 50, 512)    0          lstm_2[0][0]
lstm_3 (LSTM)                    (None, 50, 512)    2099200    dropout_2[0][0]
dropout_3 (Dropout)              (None, 50, 512)    0          lstm_3[0][0]
simplecontext_1 (SimpleContext)  (None, 25, 944)    0          dropout_3[0][0]
timedistributed_1 (TimeDistribut (None, 25, 40000)  37800000   simplecontext_1[0][0]
activation_1 (Activation)        (None, 25, 40000)  0          timedistributed_1[0][0]
Total params: 47253824
None
```
I used the same model as explained in train.ipynb, and I don't understand what's wrong here.
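One likely explanation, assuming the model is compiled with a sparse categorical loss: the final `activation_1` layer emits 3-D output, (batch, 25, 40000), so Keras expects a 3-D target as well, while yTrain is the 2-D array (17853, 25). With integer-index labels and a sparse loss, appending a trailing axis of size 1 is enough. A minimal NumPy sketch (the zero-filled `yTrain` is a hypothetical stand-in for the real label array):

```python
import numpy as np

# Hypothetical stand-in for yTrain: (n_samples, seq_len) integer word indices.
n_samples, seq_len = 17853, 25
yTrain = np.zeros((n_samples, seq_len), dtype=np.int64)

# The model's last layer outputs (batch, 25, 40000), so the target must be
# 3-D too. Each timestep's label can stay an integer index; it just needs
# its own trailing axis of size 1.
yTrain3d = np.expand_dims(yTrain, axis=-1)
print(yTrain3d.shape)  # (17853, 25, 1)
```

The alternative is to one-hot the labels to shape (17853, 25, 40000) for a dense `categorical_crossentropy` loss, but at a 40000-word vocabulary that array is far larger than the sparse-index form.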