In summary:
- WikiFactDiff is a factual knowledge update dataset for LLMs.
- It describes the evolution of factual knowledge between two dates $T_{old}$ and $T_{new}$ in the form of semantic triples (sample below).
- The triples are verbalized with the help of templates (examples below).
- This repository can be used for two purposes:
  - build an instance of WikiFactDiff given two dates $T_{old}$ and $T_{new}$;
  - evaluate knowledge update algorithms (ROME, MEMIT, MEND, ...) on WikiFactDiff.
- The build process was designed to be easy to use. All you have to do is provide $T_{old}$, $T_{new}$, and a folder where to store intermediate files (more details in "1. How to build WikiFactDiff?").
- All resources required to perform knowledge update and its evaluation are provided in WikiFactDiff, including neighbors for each fact to account for bleedover.
- More details can be found on Hugging Face and in our paper.
(Figure: a WikiFactDiff sample (triples only), shown alongside the templates used for verbalization.)
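To give a rough idea of the data, here is a purely illustrative sketch of one factual update and its template; the field names and values are hypothetical and do not reflect the exact WikiFactDiff schema (see the Hugging Face dataset card for the real format):

# Hypothetical illustration only; refer to the Hugging Face dataset card
# for the actual WikiFactDiff fields.
update = {
    "subject": "Q142",                  # Wikidata ID of the subject entity (here: France)
    "relation": "P6",                   # Wikidata property (here: head of government)
    "object_old": "<object at T_old>",  # value of the fact in the old dump
    "object_new": "<object at T_new>",  # value of the fact in the new dump
    "template": "The head of government of [subject] is [object].",
    "neighbors": ["<nearby triples used to measure bleedover>"],
}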
We release the WikiFactDiff dataset on Hugging Face.
Build process
Prerequisites:
Software
- OS: Ubuntu 22.04 (not tested on Windows)
- conda (version used: 23.10.0)
- MongoDB (version used: 7.0.3)
Setup environment
Create and activate the conda environment wfd_build
bash setup_env/wfd_build.sh
conda activate wfd_build
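Once the environment is active, you can optionally check that MongoDB (listed in the prerequisites) is reachable. A minimal sketch, assuming pymongo is installed in the environment and MongoDB listens on the default local port:

from pymongo import MongoClient

# Connectivity check; the URL must match the one you configure in build/config.py.
client = MongoClient("mongodb://localhost:27017", serverSelectionTimeoutMS=5000)
print("MongoDB version:", client.server_info()["version"])  # raises if the server is unreachable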
Configure
- Specify the folder where all intermediate files will be stored in build/config.py by setting the variable STORAGE_FOLDER.
- List the available dates and choose two distinct dates from the output to be $T_{old}$ and $T_{new}$: python build/wikidata_scripts/build_wikidata_dumps_index.py
- Specify these two dates in build/config.py (using OLD_WIKIDATA_DATE and NEW_WIKIDATA_DATE) as well as the MongoDB URL (see the illustrative sketch below).
NOTE: Make sure you have the necessary read/write permissions for the storage folder.
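For illustration, the relevant settings in build/config.py could look like the sketch below; the values are placeholders, and the variable name for the MongoDB URL as well as the exact date format are assumptions, so follow the comments in build/config.py itself:

# build/config.py -- illustrative values only

# Folder where all intermediate files are stored (needs read/write access and ~210GB of space)
STORAGE_FOLDER = "/data/wikifactdiff_storage"

# Two distinct dates taken from the output of build_wikidata_dumps_index.py
# (the expected date format may differ)
OLD_WIKIDATA_DATE = "20210104"   # T_old
NEW_WIKIDATA_DATE = "20230227"   # T_new

# MongoDB connection URL (variable name assumed here)
MONGODB_URL = "mongodb://localhost:27017"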
Build WikiFactDiff
Execute this single command to build WikiFactDiff:
python build/wikifactdiff_builder.py
It is recommended to run this command in tmux or screen as it is a very long process.
Assuming the necessary files have already been downloaded, expect 18 hours for this whole process to finish using a machine with 32 CPU cores, 128GB of RAM, and SSD storage. You need 210GB of disk space for the storage folder and 200GB for MongoDB.
The dataset will be stored in the specified storage folder under the name wikifactdiff.jsonl.
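Once the build finishes, a quick sanity check is to read a few records from the output file. A minimal sketch, assuming the storage folder used above (the exact fields depend on the dataset schema):

import json
import os

# Path depends on the STORAGE_FOLDER set in build/config.py
path = os.path.join("/data/wikifactdiff_storage", "wikifactdiff.jsonl")

with open(path, encoding="utf-8") as f:
    for i, line in enumerate(f):
        print(json.loads(line))  # one dataset record per line
        if i >= 2:
            break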
Details of the build process (step-by-step)
This part breaks down, step by step, the internal process of the command python build/wikifactdiff_builder.py.
- Download Wikidata dumps:
  python build/wikidata_scripts/download_dump.py --version old
  python build/wikidata_scripts/download_dump.py --version new
  Expected download speed: ~1MB/s from the Internet Archive (old dumps) and ~4MB/s from Wikidata dumps (recent dumps).
  Dump size: 50-80GB
  RAM: negligible
- Collect Wikipedia view statistics (these statistics are pushed to MongoDB):
  python build/wikidata_scripts/create_database_wikipedia_statistics.py --version new
  python build/wikidata_scripts/create_database_wikipedia_statistics.py --version old
- Push Wikidata to MongoDB:
  python build/wikidata_scripts/process_json_dump.py --version old
  python build/wikidata_scripts/process_json_dump.py --version new
- Preprocess Wikidata dumps:
  python build/wikidata_scripts/preprocess_dump.py --version old
  python build/wikidata_scripts/preprocess_dump.py --version new
- Compute the difference between the two Wikidata versions:
  python build/wikidata_scripts/compute_diff.py
- Compute the popularity of each entity:
  python build/wikidata_scripts/compute_importance.py --version old
  python build/wikidata_scripts/compute_importance.py --version new
- Create WikiFactDiff (triples only):
  python build/wikidata_scripts/create_wikifactdiff_triples.py
- Set up KNearestTriples:
  python build/wikidata_scripts/setup_knn.py
- Incorporate verbalizations and KNearestTriples into WikiFactDiff:
  python build/verbalize_wikifactdiff/verbalize_wikifactdiff.py --ann_method sparse
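Conceptually, build/wikifactdiff_builder.py chains these scripts in order. Purely as an illustration of that order (not the actual implementation of the builder), the pipeline could be driven like this:

import subprocess

# Hypothetical orchestration of the steps listed above; build/wikifactdiff_builder.py
# is the real entry point and may handle errors, resumption, and configuration differently.
STEPS = [
    ["python", "build/wikidata_scripts/download_dump.py", "--version", "old"],
    ["python", "build/wikidata_scripts/download_dump.py", "--version", "new"],
    ["python", "build/wikidata_scripts/create_database_wikipedia_statistics.py", "--version", "new"],
    ["python", "build/wikidata_scripts/create_database_wikipedia_statistics.py", "--version", "old"],
    ["python", "build/wikidata_scripts/process_json_dump.py", "--version", "old"],
    ["python", "build/wikidata_scripts/process_json_dump.py", "--version", "new"],
    ["python", "build/wikidata_scripts/preprocess_dump.py", "--version", "old"],
    ["python", "build/wikidata_scripts/preprocess_dump.py", "--version", "new"],
    ["python", "build/wikidata_scripts/compute_diff.py"],
    ["python", "build/wikidata_scripts/compute_importance.py", "--version", "old"],
    ["python", "build/wikidata_scripts/compute_importance.py", "--version", "new"],
    ["python", "build/wikidata_scripts/create_wikifactdiff_triples.py"],
    ["python", "build/wikidata_scripts/setup_knn.py"],
    ["python", "build/verbalize_wikifactdiff/verbalize_wikifactdiff.py", "--ann_method", "sparse"],
]

for cmd in STEPS:
    subprocess.run(cmd, check=True)  # stop at the first failing step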
The evaluation source code (located in evaluate/) is based on the MEMIT GitHub repository published alongside the MEMIT research paper. Its corresponding MIT license is located in evaluate/LICENCE.
A GPU with 24GB of VRAM (e.g. an RTX 3090) is required to run experiments on GPT-J. Create and activate the conda environment wfd_eval:
bash setup_env/wfd_eval.sh
conda activate wfd_eval
For instance, to evaluate ROME on WikiFactDiff using the GPT-J model, run the following command:
cd evaluate
PYTHONPATH="./" python experiments/evaluate_wfd.py \
    --alg_name ROME \
    --model_name EleutherAI/gpt-j-6B \
    --hparams_fname EleutherAI_gpt-j-6B.json \
    --dataset_path WIKIFACTDIFF_PATH \
    --results_dir RESULT_PATH
Specify the path to the WikiFactDiff dataset (WIKIFACTDIFF_PATH) and the desired result folder (RESULT_PATH). If the --dataset_path argument is not provided, the default Hugging Face version of WikiFactDiff is loaded.
NOTE: Only replacement updates are evaluated since existing algorithms can only handle this update scenario (no oblivion, entity insertion, etc.).
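For reference, loading the released dataset from the Hugging Face Hub could look like the sketch below; the Hub identifier is a placeholder, so substitute the repository name shown on the WikiFactDiff dataset page:

from datasets import load_dataset

# Placeholder identifier; use the repository name listed on the Hugging Face page.
wfd = load_dataset("<WIKIFACTDIFF_HF_ID>")
print(wfd)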
@inproceedings{ammar-khodja-etal-2024-wikifactdiff-large,
title = "{W}iki{F}act{D}iff: A Large, Realistic, and Temporally Adaptable Dataset for Atomic Factual Knowledge Update in Causal Language Models",
author = "Ammar Khodja, Hichem and
Bechet, Frederic and
Brabant, Quentin and
Nasr, Alexis and
Lecorv{\'e}, Gw{\'e}nol{\'e}",
editor = "Calzolari, Nicoletta and
Kan, Min-Yen and
Hoste, Veronique and
Lenci, Alessandro and
Sakti, Sakriani and
Xue, Nianwen",
booktitle = "Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)",
month = may,
year = "2024",
address = "Torino, Italia",
publisher = "ELRA and ICCL",
url = "https://aclanthology.org/2024.lrec-main.1532",
pages = "17614--17624",
}
Please let us know by opening an issue ;)