Invalid outputs #30

Open
nishant260190 opened this issue Jul 5, 2018 · 5 comments

Comments

@nishant260190

I followed all the steps as you described them, and everything runs without errors, but the outputs are not correct, maybe because I used a very small dataset. Still, for exact conversations from the training data, it should give the correct result, such as "hello" in response to "hi" or "fine" in response to "how are you".

@adeshpande3
Owner

Just because something is in the training set doesn't mean that the network will learn to output it. The problem could be any of several things: a small dataset, an overly complex network architecture, improperly tuned hyperparameters, not enough variety in the dataset, etc. It's hard to pinpoint which one it is. One exercise that may be useful is to take a very small dataset (a couple of input-output pairs) and a very small network, and see whether the network can at least learn those mappings. Once it can, slowly increase the size of the dataset as well as the complexity of the network.
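
To make the "can it memorize a few pairs" exercise concrete, here is a minimal sketch in plain Python. It is not the seq2seq model from this repository: it assumes each phrase is treated as a single token and uses a tiny softmax lookup table as a stand-in, and the pairs and the names `train` and `predict` are hypothetical. The point is only the workflow: train on a handful of pairs and check that training accuracy reaches 100% before scaling anything up.

```python
import math
import random

# Hypothetical toy pairs; each whole phrase is treated as one token for brevity.
pairs = [("hi", "hello"), ("how are you", "fine"), ("bye", "goodbye")]
inputs = [p[0] for p in pairs]
outputs = sorted({p[1] for p in pairs})

def softmax(scores):
    # Numerically stable softmax over a list of scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def train(pairs, steps=200, lr=0.5, seed=0):
    rng = random.Random(seed)
    # weights[i][j] is the score of output j given input i.
    # With one-hot inputs, a linear layer reduces to this lookup table.
    weights = [[rng.uniform(-0.1, 0.1) for _ in outputs] for _ in inputs]
    for _ in range(steps):
        for i, (_, target) in enumerate(pairs):
            probs = softmax(weights[i])
            t = outputs.index(target)
            for j in range(len(outputs)):
                # Gradient of cross-entropy loss with respect to score j.
                grad = probs[j] - (1.0 if j == t else 0.0)
                weights[i][j] -= lr * grad
    return weights

def predict(weights, text):
    row = weights[inputs.index(text)]
    return outputs[row.index(max(row))]

weights = train(pairs)
accuracy = sum(predict(weights, x) == y for x, y in pairs) / len(pairs)
print("training accuracy:", accuracy)
```

If even a tiny model like this could not fit a handful of pairs, the problem would be in the data pipeline or training loop rather than in the amount of data. The same check applies to the real model: shrink it, train on a few pairs, and confirm it reproduces them.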

@nishant260190
Author

@adeshpande3: Thanks for the quick response. I am new to this, so I don't understand how to increase or decrease the complexity of the network. I have not changed the code; it is the same as in this repository.
One more thing: on what basis should I set the hyperparameters?

Word2Vec :
wordVecDimensions = 100
batchSize = 128
numNegativeSample = 64
windowSize = 5
numIterations = 100000

numTrainingExamples : 919210 vocabSize : 5850

Seq2Seq :

batchSize = 24
maxEncoderLength = 15
maxDecoderLength = maxEncoderLength
lstmUnits = 112
embeddingDim = lstmUnits
numLayersLSTM = 3
numIterations = 70000

@adeshpande3
Owner

By decreasing the complexity of the network, I mean decreasing the number of LSTM units or the number of LSTM layers (lstmUnits and numLayersLSTM in the settings you posted).
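
As a rough illustration of how much those two settings matter, the sketch below counts the trainable weights in a stack of standard LSTM layers (four gates, no peephole connections). It assumes one stack only and ignores embeddings and the output projection, so the numbers are indicative rather than an exact count for this repository's model. The helper names and the smaller configuration (1 layer of 32 units) are hypothetical.

```python
def lstm_layer_params(input_dim, units):
    # Four gates, each with a weight matrix over [input, hidden] plus a bias vector.
    return 4 * ((input_dim + units) * units + units)

def lstm_stack_params(input_dim, units, num_layers):
    total = 0
    for layer in range(num_layers):
        # The first layer sees the embeddings; later layers see the previous layer's output.
        layer_input = input_dim if layer == 0 else units
        total += lstm_layer_params(layer_input, units)
    return total

# Settings from the post above: embeddingDim = lstmUnits = 112, numLayersLSTM = 3.
current = lstm_stack_params(input_dim=112, units=112, num_layers=3)
# A hypothetical smaller configuration for the memorization exercise.
smaller = lstm_stack_params(input_dim=32, units=32, num_layers=1)
print(current, smaller)  # 302400 8320
```

Under these assumptions, going from 3 layers of 112 units to 1 layer of 32 units cuts the LSTM weights by a factor of about 36, which is the kind of reduction that makes a small dataset easier to fit.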

@nishant260190
Author

@adeshpande3: Could you please help me understand on what basis these parameters should be chosen?

@adeshpande3
Owner

There isn't really an easy answer to that question. It's highly dependent on what task you're trying to solve (question/answering in our case), the type of model you're trying to create, and the amount of data/compute power you have. All these things will affect the parameter values you choose. I'd recommend watching CS 224 to get a better understanding.
