JointLK: Joint Reasoning with Language Models and Knowledge Graphs for Commonsense Question Answering
This repo provides the source code & data of our paper: JointLK: Joint Reasoning with Language Models and Knowledge Graphs for Commonsense Question Answering (NAACL 2022).
For convenience, all data, checkpoints, and code can be downloaded from my Baidu Netdisk.
Run the following commands to create a conda environment (assuming CUDA 11):
conda create -n jointlk python=3.7
source activate jointlk
pip install torch==1.7.1+cu110 -f https://download.pytorch.org/whl/torch_stable.html
pip install transformers==3.2.0
pip install nltk spacy==2.1.6
python -m spacy download en
# for torch-geometric
pip install torch-cluster==1.5.9 -f https://pytorch-geometric.com/whl/torch-1.7.1+cu110.html
pip install torch-spline-conv==1.2.1 -f https://pytorch-geometric.com/whl/torch-1.7.1+cu110.html
pip install torch-scatter==2.0.6 -f https://pytorch-geometric.com/whl/torch-1.7.1+cu110.html
pip install torch-sparse==0.6.9 -f https://pytorch-geometric.com/whl/torch-1.7.1+cu110.html
pip install torch-geometric==1.6.3 -f https://pytorch-geometric.com/whl/torch-1.7.1+cu110.html
See the file env.yaml for all environment dependencies.
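As a quick post-install sanity check, the short Python snippet below (our own sketch, not part of the repo) confirms that PyTorch was built against CUDA 11 and that the torch-geometric stack imports at the expected versions:

# Sanity check: verify the PyTorch / torch-geometric install.
import torch
import torch_geometric

print(torch.__version__)            # expected: 1.7.1+cu110
print(torch.cuda.is_available())    # expected: True on a CUDA 11 machine
print(torch_geometric.__version__)  # expected: 1.6.3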
We use preprocessed data from the QA-GNN repository, which can also be downloaded from my Baidu Netdisk.
The data file structure will look like:

.
├── data/
│   ├── cpnet/ (preprocessed ConceptNet)
│   ├── csqa/
│   │   ├── train_rand_split.jsonl
│   │   ├── dev_rand_split.jsonl
│   │   ├── test_rand_split_no_answers.jsonl
│   │   ├── statement/ (converted statements)
│   │   ├── grounded/ (grounded entities)
│   │   ├── graphs/ (extracted subgraphs)
│   │   └── ...
│   ├── obqa/
│   ├── medqa_usmle/
│   └── ddb/
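After extracting the download, a small check like the following (a sketch based on the tree above; adjust the paths if you unpacked the archive elsewhere) confirms everything landed in the expected place:

# Check that the preprocessed data matches the expected layout.
import os

expected = [
    "data/cpnet",
    "data/csqa/train_rand_split.jsonl",
    "data/csqa/dev_rand_split.jsonl",
    "data/csqa/test_rand_split_no_answers.jsonl",
    "data/csqa/statement",
    "data/csqa/grounded",
    "data/csqa/graphs",
    "data/obqa",
    "data/medqa_usmle",
    "data/ddb",
]
for path in expected:
    status = "ok" if os.path.exists(path) else "MISSING"
    print(f"{status:8s} {path}")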
(Assuming a Slurm job scheduling system. Without Slurm, the scripts can typically be run directly, e.g. bash sbatch_run_jointlk__csqa.sh.)
For CommonsenseQA, run
sbatch sbatch_run_jointlk__csqa.sh
For OpenBookQA, run
sbatch sbatch_run_jointlk__obqa.sh
We provide trained model checkpoints.

CommonsenseQA
Trained model | In-house Dev acc. | In-house Test acc.
---|---|---
RoBERTa-large + JointLK [link] | 77.6 | 75.3
RoBERTa-large + JointLK [link] | 78.4 | 74.2
OpenBookQA
Trained model | Dev acc. | Test acc.
---|---|---
RoBERTa-large + JointLK [link] | 68.8 | 70.4
AristoRoBERTa-large + JointLK [link] | 79.2 | 85.6
To evaluate a trained model on CommonsenseQA, run
sbatch sbatch_run_jointlk__csqa_test.sh
For OpenBookQA, run
sbatch sbatch_run_jointlk__obqa_test.sh
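Before running evaluation, it can help to confirm that a downloaded checkpoint deserializes at all. The snippet below is a hypothetical sketch: the checkpoint path is a placeholder, and it should be run from the repo root so any classes pickled into the file can be resolved:

# Sketch: verify a downloaded checkpoint loads (path is a placeholder).
import torch

state = torch.load("path/to/downloaded_checkpoint.pt", map_location="cpu")
print(type(state))
if isinstance(state, dict):
    # Peek at a few top-level keys to see what the file contains.
    print(list(state)[:5])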
This repo is built upon the following work:
QA-GNN: Question Answering using Language Models and Knowledge Graphs
https://github.com/michiyasunaga/qagnn
Many thanks to the authors and developers!
We noticed that the QA-GNN repository added test results on the MedQA dataset. To make it easier for future researchers to compare different models, we also test the performance of JointLK on MedQA.
To train on MedQA, run
sbatch sbatch_run_jointlk__medqa_usmle.sh
To test on MedQA, run
sbatch sbatch_run_jointlk__medqa_usmle_test.sh
A pretrained model checkpoint:

Trained model | Dev acc. | Test acc.
---|---|---
SapBERT-base + JointLK [link] | 38.0 | 39.8