A question answering system for questions collected from the Canadian Broadcasting Corporation webpage for kids.
The program takes the input file path as a command-line argument (the input.txt file is in the testing folder). After you run the instructions below, the output file is written to the testing folder.
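As a rough illustration of the command-line handling described above, here is a minimal sketch. The helper name `resolve_paths` is hypothetical and not taken from qa.py, and it assumes the output folder is simply the folder that contains the input file.

```python
from pathlib import Path


def resolve_paths(argv):
    """Return (input_path, output_dir) for a call like `python qa.py testing/input.txt`.

    Hypothetical helper, not the actual qa.py code.
    """
    if len(argv) != 2:
        raise SystemExit("usage: python qa.py <input file path>")
    input_path = Path(argv[1])
    # Assumption: the output file is written next to the input file.
    return input_path, input_path.parent


# Example with an explicit argument list instead of sys.argv
input_path, output_dir = resolve_paths(["qa.py", "testing/input.txt"])
print(input_path, output_dir)
```

In a real script the list passed in would be `sys.argv`.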
- spaCy – https://spacy.io/
- scikit-learn – http://scikit-learn.org/stable/
- At most five minutes
(NOTE: If the script does not work properly, follow the directions below to set everything up without it.)
./qa.sh
pip install --user virtualenv
mkdir env
python -m virtualenv env -p /home/u1141153/python/bin/python3.5
source env/bin/activate.csh
pip install spacy
python -m spacy download en
pip install -U scikit-learn
python qa.py testing/input.txt
- CADE Lab Machine: LAB2–29
- Problems: the shell script that installs spaCy and scikit-learn does not always work correctly. If it fails, follow the manual setup instructions above.
FINAL RESULTS
AVERAGE RECALL = 0.4853 (244.60 / 504)
AVERAGE PRECISION = 0.3229 (140.14 / 434)
AVERAGE F-MEASURE = 0.3878
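
The reported F-measure is consistent with the harmonic mean of the average recall and average precision above. The short sketch below recomputes it from the printed totals. The function name `f_measure` is only illustrative, and the sketch assumes the scorer combines the two averages this way rather than averaging per-question F-measures.

```python
def f_measure(recall, precision):
    """Balanced F-measure: the harmonic mean of recall and precision."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


# Totals copied from the results above
recall = 244.60 / 504
precision = 140.14 / 434

print(f"{recall:.4f} {precision:.4f} {f_measure(recall, precision):.4f}")
# prints: 0.4853 0.3229 0.3878
```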
- Tarun Sunkaraneni – load the document files, format the response file, find the list of best sentences, and extract the answer from that list
- Mia Ngo – write the README, write the shell script, set up the virtual machine environment for external applications, and extract the answer from the list of best sentences