Invalid outputs #30
Comments
Just because something is in the training set doesn't mean the network will learn to output it. The problem could be one of several things: a small dataset, an overly complex network architecture, improperly tuned hyperparameters, not enough variety in the dataset, etc. It's hard to pinpoint which one. One useful exercise is to start with a very small dataset (a couple of input-output pairs) and a very small network, and see whether the network can at least learn those mappings. Once it can, slowly increase the size of the dataset as well as the complexity of the network.
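The memorization sanity check above can be sketched in miniature. The snippet below is not the repository's seq2seq model; it is a toy stand-in (a single softmax layer trained with gradient descent on two hand-picked one-hot pairs) that illustrates the "can a tiny network memorize a couple of pairs?" test. The vocabulary size and pair indices are arbitrary placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = 4                        # toy vocabulary size (assumption)
X = np.eye(vocab)[[0, 1]]        # two one-hot "inputs" (e.g. "hi", "how are you")
Y = np.eye(vocab)[[1, 0]]        # their expected "outputs" (e.g. "hello", "fine")

W = rng.normal(scale=0.1, size=(vocab, vocab))

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

for step in range(500):
    P = softmax(X @ W)           # forward pass
    W -= 0.5 * X.T @ (P - Y)     # cross-entropy gradient step

# After training, the network should reproduce the two training pairs exactly.
pred = softmax(X @ W).argmax(axis=1)
print(pred.tolist())             # expect [1, 0]
```

If even this fails, the training loop itself is suspect; if it succeeds, the dataset and model can be scaled up step by step as described above.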
@adeshpande3: Thanks for the quick response. I am new to this, so I don't understand how to increase or decrease the complexity of the network. I have not changed the code; it is the same as in this repository.

Word2Vec: numTrainingExamples = 919210, vocabSize = 5850
Seq2Seq: batchSize = 24
By decreasing the complexity of the network, I mean decreasing the number of LSTM units or the number of LSTM layers.
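To see why the unit count matters so much, it helps to count parameters. Using the standard LSTM parameter formula (four gates, each with input weights, recurrent weights, and a bias; the `input_dim=100` below is an arbitrary placeholder, not a value from this repository):

```python
def lstm_params(units, input_dim):
    """Trainable parameters in one LSTM layer:
    4 gates x (input weights + recurrent weights + bias)."""
    return 4 * (units * (input_dim + units) + units)

# The recurrent term is quadratic in the unit count, so halving the
# units shrinks the layer by nearly 4x once units dominate input_dim.
for units in (512, 256, 128):
    print(units, lstm_params(units, input_dim=100))
```

With less data, fewer units (or fewer stacked layers) means fewer parameters to fit, which makes memorizing noise harder and learning the actual mapping easier.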
@adeshpande3: Can you please help me understand on what basis we should choose these parameters?
There isn't really an easy answer to that question. It's highly dependent on what task you're trying to solve (question/answering in our case), the type of model you're trying to create, and the amount of data/compute power you have. All these things will affect the parameter values you choose. I'd recommend watching CS 224 to get a better understanding. |
I have followed all the steps as described, and everything runs, but the outputs are not correct, maybe because I used a very small dataset. Still, for an exact conversation from the training data it should give the correct result, like "hello" in response to "hi" or "fine" in response to "how are you", which I gave as input while training.