Invalid outputs #30
Comments
Just because something is in the training set doesn't mean the network will learn to output it. The problem could be one of several things: a small dataset, an overly complex network architecture, improperly tuned hyperparameters, not enough variety in the dataset, etc. It's hard to pinpoint which one. One useful exercise is to start with a very small dataset (a couple of input-output pairs) and a very small network, and see whether the network can at least learn those mappings. Once it can, slowly increase the size of the dataset as well as the complexity of the network.
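The memorization sanity check above can be sketched in miniature. The snippet below is not the repository's seq2seq model; it is a toy stand-in (a single softmax layer trained with gradient descent on two hand-picked one-hot pairs) that illustrates the "can a tiny network memorize a couple of pairs?" test. The vocabulary size and pair indices are arbitrary placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = 4                        # toy vocabulary size (assumption)
X = np.eye(vocab)[[0, 1]]        # two one-hot "inputs" (e.g. "hi", "how are you")
Y = np.eye(vocab)[[1, 0]]        # their expected "outputs" (e.g. "hello", "fine")

W = rng.normal(scale=0.1, size=(vocab, vocab))

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

for step in range(500):
    P = softmax(X @ W)           # forward pass
    W -= 0.5 * X.T @ (P - Y)     # cross-entropy gradient step

# After training, the network should reproduce the two training pairs exactly.
pred = softmax(X @ W).argmax(axis=1)
print(pred.tolist())             # expect [1, 0]
```

If even this fails, the training loop itself is suspect; if it succeeds, the dataset and model can be scaled up step by step as described above.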
@adeshpande3: Thanks for the quick response. I am new to this, so I don't understand how to increase or decrease the complexity of the network. I have not changed the code; it is the same as in this repository.

Word2Vec: numTrainingExamples = 919210, vocabSize = 5850
Seq2Seq: batchSize = 24
By decreasing the complexity of the network, I mean decreasing the number of LSTM units or the number of LSTM layers.
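To see why the unit count matters so much, it helps to count parameters. Using the standard LSTM parameter formula (four gates, each with input weights, recurrent weights, and a bias; the `input_dim=100` below is an arbitrary placeholder, not a value from this repository):

```python
def lstm_params(units, input_dim):
    """Trainable parameters in one LSTM layer:
    4 gates x (input weights + recurrent weights + bias)."""
    return 4 * (units * (input_dim + units) + units)

# The recurrent term is quadratic in the unit count, so halving the
# units shrinks the layer by nearly 4x once units dominate input_dim.
for units in (512, 256, 128):
    print(units, lstm_params(units, input_dim=100))
```

With less data, fewer units (or fewer stacked layers) means fewer parameters to fit, which makes memorizing noise harder and learning the actual mapping easier.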
@adeshpande3: Can you please help me understand on what basis we should choose these parameters?
There isn't really an easy answer to that question. It's highly dependent on what task you're trying to solve (question/answering in our case), the type of model you're trying to create, and the amount of data/compute power you have. All these things will affect the parameter values you choose. I'd recommend watching CS 224 to get a better understanding. |
I have followed all the steps as described, and everything runs, but the outputs are not correct, maybe because I used a very small dataset. Still, for an exact conversation from the training data it should give the correct result, like "hello" in response to "hi" or "fine" in response to "how are you", which I gave as input while training.