Chinese-QASystem

A Chinese question answering system based on a bidirectional LSTM (BLSTM) and a conditional random field (CRF).
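To make the sequence-labeling idea concrete, here is a minimal sketch of how an answer could be read off the predicted tags. It assumes a B/I/O tagging scheme over word-segmented evidence tokens; this README does not state the exact scheme, and the function name `extract_answer_spans` and the sample sentence are hypothetical, not taken from the repository.

```python
from typing import List


def extract_answer_spans(tokens: List[str], tags: List[str]) -> List[str]:
    """Collect token spans marked as answers under an assumed B/I/O scheme."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    spans: List[str] = []
    current: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            # A new answer span starts; close any span that is still open.
            if current:
                spans.append("".join(current))
            current = [token]
        elif tag == "I" and current:
            # Continue the span that is currently open.
            current.append(token)
        else:
            # "O" (or an "I" with no open span) closes any open span.
            if current:
                spans.append("".join(current))
            current = []
    if current:
        spans.append("".join(current))
    return spans


# Hypothetical evidence sentence, already word-segmented.
tokens = ["李白", "出生", "于", "公元", "701", "年"]
tags = ["O", "O", "O", "B", "I", "I"]
print(extract_answer_spans(tokens, tags))
```

Tokens are joined without spaces because Chinese text is not space-delimited.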

Requirements

tensorflow 1.5
numpy
thulac
scikit-learn
matplotlib

DataSet

Baidu's Chinese question answering dataset WebQA. Many thanks to the author of this link for organizing the data.

How to get started?

    1. Download the raw data and extract it to the folder where the source code is located.
    2. Run python3 make_tfrecords.py to process the raw data and generate the tfrecord files for training and validation.
       In this experiment, I trained on 200,000 samples and validated the model's accuracy on 5,000 samples.
    3. Run python3 train.py to train the model. In my run, the model eventually achieved an accuracy of 0.6050
       on the validation set.

Note

In the future, I will write a blog post introducing this work. It will show how to use TensorFlow's tf.while_loop interface to implement conditional random field training and Viterbi decoding.
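Until then, the following is a minimal sketch of the two recurrences involved, written as plain Python loops rather than with tf.while_loop. It is not the code in modules.py: the function names, the toy scores, and the omission of start/stop transitions are all assumptions made for illustration. Each `for` loop over time steps is the part that a tf.while_loop body would express in a TensorFlow graph.

```python
import math
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def log_sum_exp(values: Sequence[float]) -> float:
    """Numerically stable log(sum(exp(v))) over a non-empty sequence."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def path_score(emissions: Matrix, transitions: Matrix, path: Sequence[int]) -> float:
    """Unnormalized score of one tag path: emission terms plus transition terms."""
    emit = sum(emissions[t][tag] for t, tag in enumerate(path))
    trans = sum(transitions[a][b] for a, b in zip(path, path[1:]))
    return emit + trans


def crf_log_partition(emissions: Matrix, transitions: Matrix) -> float:
    """Forward algorithm: log of the summed exp-scores of all tag paths."""
    alpha = list(emissions[0])
    for scores in emissions[1:]:  # one loop iteration per time step
        alpha = [
            scores[j] + log_sum_exp([alpha[i] + transitions[i][j] for i in range(len(alpha))])
            for j in range(len(scores))
        ]
    return log_sum_exp(alpha)


def crf_log_likelihood(emissions: Matrix, transitions: Matrix, gold: Sequence[int]) -> float:
    """Training objective: log p(gold | x) = score(gold) - log Z."""
    return path_score(emissions, transitions, gold) - crf_log_partition(emissions, transitions)


def viterbi_decode(emissions: Matrix, transitions: Matrix) -> Tuple[List[int], float]:
    """Return the highest-scoring tag path and its score."""
    num_tags = len(emissions[0])
    score = list(emissions[0])
    backpointers: List[List[int]] = []
    for scores in emissions[1:]:  # same time-step loop, with max instead of log-sum-exp
        new_score, pointers = [], []
        for j in range(num_tags):
            best_prev = max(range(num_tags), key=lambda i: score[i] + transitions[i][j])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + transitions[best_prev][j] + scores[j])
        score = new_score
        backpointers.append(pointers)
    best_last = max(range(num_tags), key=lambda i: score[i])
    path = [best_last]
    for pointers in reversed(backpointers):  # follow backpointers to recover the path
        path.append(pointers[path[-1]])
    path.reverse()
    return path, score[best_last]


# Toy example: 3 time steps, 2 tags (values are made up).
emissions = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
transitions = [[0.5, -0.5], [-0.5, 0.5]]
best_path, best_score = viterbi_decode(emissions, transitions)
print(best_path, best_score)
print(crf_log_likelihood(emissions, transitions, best_path))
```

The forward pass and Viterbi share the same loop structure and differ only in the reduction (log-sum-exp versus max), which is why a single loop-based implementation pattern can cover both training and decoding.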

References

Li P, Li W, He Z, et al. Dataset and Neural Recurrent Sequence Labeling Model for Open-Domain Factoid Question Answering[J]. 2016.
