This repository contains the training and evaluation sources for training in-context few-shot learners to utilize concepts in their predictions.
Before reproducing the training, note that we make the CoAT-trained models publicly available. If you simply want to reproduce our results, proceed to the Evaluation section below and pick the model of your interest.
The training of the concept-aware model can be reproduced by running the following scripts.
git clone {this_repo}
cd {this_repo}
export PYTHONPATH="${PYTHONPATH}:$(pwd)"
pip install -r training/requirements.txt
pip install -r evaluation/requirements.txt
cd training
chmod 777 download_teaberac_data.sh
./download_teaberac_data.sh
cd ..
CUDA_VISIBLE_DEVICES=0 python training/train_mt5_teabreac+qa_coat.py
The script intentionally keeps all parameters fixed, but if you need to change something,
e.g. due to environment restrictions, do not hesitate to adjust AdaptationArguments
or the evaluations within the code; a sketch of such an adjustment is shown below.
The training scripts include evaluations on SuperGLUE and various TeaBReaC concepts.
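For instance, to fit the training onto a smaller GPU, you can shrink the per-device batch size and compensate with gradient accumulation. A minimal sketch of such an adjustment, assuming that AdaptationArguments exposes the standard HuggingFace TrainingArguments fields (the import path and field names are assumptions; check the training script for the arguments it actually sets):

# Hypothetical adjustment inside training/train_mt5_teabreac+qa_coat.py; keep
# the arguments the script already passes and only override what your
# environment requires.
from adaptor.utils import AdaptationArguments, StoppingStrategy

training_arguments = AdaptationArguments(
    output_dir="train_dir",
    stopping_strategy=StoppingStrategy.ALL_OBJECTIVES_CONVERGED,
    learning_rate=5e-5,
    per_device_train_batch_size=4,   # reduced to fit a smaller GPU
    gradient_accumulation_steps=8,   # keeps the effective batch size unchanged
    evaluation_strategy="steps",
    eval_steps=1000,
    save_steps=1000,
    logging_steps=100,
)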
To train the Tk-Random baseline, replace the python script path in the sequence above with train_mt5_teabreac+qa_random.py:
CUDA_VISIBLE_DEVICES=0 python training/train_mt5_teabreac+qa_random.py
We make the following pre-trained models from the paper publicly available:
Tk-CoAT-1B corresponds to authoranonymous321/mt5_large-teabreac-AQA_CoAT
Tk-CoAT-3B corresponds to authoranonymous321/mt5_3B-teabreac-AQA_CoAT
Tk-Random-1B corresponds to authoranonymous321/mt5_large-teabreac-AQA_random
Tk-Random-3B corresponds to authoranonymous321/mt5_3B-teabreac-AQA_random
Tk-Info-3B corresponds to authoranonymous321/mt5_3B-teabreac-AQA_informative
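All of the above checkpoints can be loaded directly with HuggingFace Transformers. A minimal sketch of loading one of them and running a few-shot prediction (the demonstration format below is only illustrative; see the evaluation scripts for the templates actually used):

# Minimal usage sketch (not part of the repository): load a released checkpoint
# and generate a prediction for a prompt containing a few demonstrations.
# The prompt template is an illustrative assumption.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "authoranonymous321/mt5_large-teabreac-AQA_CoAT"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

prompt = (
    "Question: Is the sky blue on a clear day? Answer: yes "
    "Question: Is fire cold? Answer: no "
    "Question: Do penguins live at the North Pole? Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))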
To extract the concepts from explanations as proposed in the paper and run the Concept-learning evaluation on a selected model, run the sensitivity_evaluator.py script:
cd {this_repo}
export PYTHONPATH="${PYTHONPATH}:$(pwd)"
pip install -r evaluation/requirements.txt
spacy download en_core_web_sm # for OpenBookQA concept extraction
CUDA_VISIBLE_DEVICES=0 python evaluation/sensitivity_evaluator.py \
--model_names_or_paths authoranonymous321/mt5_large-teabreac-AQA_CoAT \
--bootstrap True \
--metric ROUGE \
--tasks glue/mnli,openbookqa/additional,hotpot_qa/fullwiki,worldtree
All resources and concept extractions should be resolved automatically.
If you evaluate using --bootstrap True, collect the stdout into a file and analyse the results using this notebook.
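If you only need a quick sanity check without the notebook, the per-prediction scores collected from stdout can be aggregated with a generic percentile bootstrap; the sketch below is illustrative and independent of the evaluator's exact output format (parsing of the collected file is left out):

# Generic percentile-bootstrap sketch (not the repository's notebook): given
# per-example scores parsed from the collected stdout, estimate a 95%
# confidence interval of their mean.
import numpy as np

def bootstrap_ci(scores, n_resamples=1000, alpha=0.05, seed=0):
    rng = np.random.default_rng(seed)
    scores = np.asarray(scores)
    means = [rng.choice(scores, size=scores.size, replace=True).mean()
             for _ in range(n_resamples)]
    return np.quantile(means, [alpha / 2, 1 - alpha / 2])

print(bootstrap_ci([0.42, 0.55, 0.61, 0.38, 0.50]))  # e.g. scores for one task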
To evaluate models' reliance on their semantic representation of labels, run the semantic_priors_evaluator.py script:
cd {this_repo}
export PYTHONPATH="${PYTHONPATH}:$(pwd)"
pip install -r evaluation/requirements.txt
CUDA_VISIBLE_DEVICES=0 python evaluation/semantic_priors_evaluator.py \
--model_names_or_paths authoranonymous321/mt5_large-teabreac-AQA_CoAT \
--bootstrap True \
--aggregate_results True \
--metric ROUGE \
--tasks axb,boolq,cb,wsc,multirc,rte,wic,axg \
--firstn 100
With --bootstrap True and --aggregate_results False, the results can be visualized using this notebook.
To assess the results directly, use --aggregate_results True instead. To evaluate on full datasets, set --firstn 0.
To reproduce our evaluation on SuperGLUE, run the following:
cd {this_repo}
export PYTHONPATH="${PYTHONPATH}:$(pwd)"
CUDA_VISIBLE_DEVICES=0 python evaluation/superglue_evaluator.py \
--model_names_or_paths authoranonymous321/mt5_large-teabreac-AQA_CoAT,allenai/tk-instruct-large-def-pos \
--metric ROUGE \
--tasks axb,boolq,cb,wsc,copa,multirc,rte,wic,record,axg
All resources should be resolved automatically.
If you use Concept-learning Evaluation in scientific work, please cite this work as follows:
@inproceedings{stefanik2023incontext,
author = {{{\v{S}}tef{\'a}nik}, Michal and {Kadl{\v{c}}{\'\i}k}, Marek},
title={Can In-context Learners Learn a Reasoning Concept from Demonstrations?},
booktitle = {Proceedings of ACL 2023: Natural Language Reasoning and Structured Explanations (NLRSE)},
publisher = {ACL},
numpages = {6},
year={2023},
url = {https://arxiv.org/abs/2212.01692},
}
If you'd like to reference Concept-Aware Training, please cite the paper that introduces it:
@article{stefanik2023conceptaware,
title={Concept-aware Training Improves In-context Learning Ability of Language Models},
author={{{\v{S}}tef{\'a}nik}, Michal and {Kadl{\v{c}}{\'\i}k}, Marek},
year={2023},
eprint={2305.13775},
archivePrefix={arXiv},
primaryClass={cs.CL},
url = {https://arxiv.org/abs/2305.13775},
}