The Codeflaws benchmark is a collection of C programs with 3902 defects. The dataset is crawled from Codeforces.
Codeflaws is available for download as a tar archive: http://www.comp.nus.edu.sg/~release/codeflaws/codeflaws.tar.gz
- Install the time command:
sudo apt-get install time
- Download codeflaws:
wget http://www.comp.nus.edu.sg/~release/codeflaws/codeflaws.tar.gz
tar xf codeflaws.tar.gz
- Download the individual repair tool you want to evaluate
- Select the defects to run (there are 3902 defects in total) by creating a file, referred to below as filename, that is a copy of the codeflaws/all-script/codeflaws-defect-detail-info.txt file.
- Modify the run script for your repair tool (run-version-angelix.sh, run-version-genprog.sh, run-version-spr.sh, or run-version-prophet.sh) by setting the appropriate variables. For example, in the run-version-genprog.sh file, you need to modify the following variables:
rundir="$rootdir/genprog-run" # directory from which genprog is called; a temporary output directory to which everything is copied during the repair
versiondir="$rootdir/codeflaws" # directory where codeflaws.tar.gz is extracted
filename="$rootdir/run1" # should be a copy of codeflaws-defect-detail-info.txt
genprog="/home/ubuntu/genprog-source-v3.0/src/repair" # path to the GenProg repair executable
- Run the script:
./run-version-genprog.sh
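The defect-selection step above can be sketched as a small shell helper. This is only a minimal sketch, assuming the defect list holds one defect per line (an assumption worth checking against the actual file); the function name select_defects and its count argument are hypothetical and not part of Codeflaws.

```shell
# Hypothetical helper (not part of Codeflaws): copy the first N lines of the
# defect list into a new file that the "filename" variable of a run script
# can point to. Assumes one defect per line in the source list.
select_defects() {
    local src="$1" dest="$2" count="$3"
    if [ ! -f "$src" ]; then
        echo "defect list not found: $src" >&2
        return 1
    fi
    head -n "$count" "$src" > "$dest"
}

# Example, using the paths described in the steps above:
# select_defects codeflaws/all-script/codeflaws-defect-detail-info.txt run1 10
```

Starting with a handful of defects is a cheap way to confirm that the variables in the run script are set correctly before launching a full run.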
All the subject programs are in the benchmark directory. Each subject folder is named using the following convention:
<contestid>-<problem>-bug-<buggy-submissionid>-<accepted-submissionid>
Each folder contains:
- Buggy submission with name <contestid>-<problem>-<buggy-submissionid>.c
- Accepted submission with name <contestid>-<problem>-<accepted-submissionid>.c
- Two sets of test scripts:
  - (i) Repair test scripts (the test suite given to repair tools for generating repairs): test-genprog.sh is for search-based repair tools (GenProg, SPR, Prophet); test-angelix.sh is for Angelix, as it requires inserting special instrumentation.
  - (ii) Test script for patch validation (the held-out test suite): test-valid.sh is for validating the correctness of patches.
- Test input files: input[0-9]+ files are used by test suite (i), and heldout-input[0-9]+ files are used by test suite (ii)
- Test output files: output[0-9]+ files are used by test suite (i), and heldout-output[0-9]+ files are used by test suite (ii)
- Makefile for compiling the buggy submission. It contains the CFLAGS options recommended by Codeforces. To compile the accepted submission instead, set FILENAME to its name, for example:
make FILENAME=10-A-13543524
- Makefile.genprog for compiling the buggy submission using cilly. This is for GenProg experiments, as GenProg works on the CIL representation.
- Test configuration for SPR that specifies the names of the passing and failing tests: <contestid>-<problem>-<buggy-submissionid>.c.revlog
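The naming convention above can be illustrated with a short shell sketch that splits a subject folder name into its parts and prints the two source file names expected inside it. The function name subject_sources is hypothetical, the submission IDs in the usage comment are made up for illustration, and the sketch assumes that none of the fields contains a hyphen.

```shell
# Hypothetical helper (not part of Codeflaws): split a folder name of the form
#   <contestid>-<problem>-bug-<buggy-submissionid>-<accepted-submissionid>
# and print the buggy and accepted source file names, one per line.
# Assumes that no field contains a hyphen.
subject_sources() {
    local name="$1" contest problem marker buggy accepted extra
    IFS=- read -r contest problem marker buggy accepted extra <<EOF
$name
EOF
    if [ "$marker" != "bug" ] || [ -z "$accepted" ] || [ -n "$extra" ]; then
        echo "unexpected folder name: $name" >&2
        return 1
    fi
    echo "${contest}-${problem}-${buggy}.c"
    echo "${contest}-${problem}-${accepted}.c"
}

# Example with made-up submission IDs:
# subject_sources 71-A-bug-100-200
```

The second name printed, without its .c extension, has the same shape as the FILENAME value in the make example above.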
All the files mentioned below are stored in the all-script directory.
Use the following files for running Angelix:
- File for running Angelix: run-version-angelix.sh
Use the following files for running GenProg:
- File for GenProg general configuration: configuration-default
- File for compilation configuration: compile.pl
- File for running GenProg: run-version-genprog.sh
- File for validating patches generated by GenProg (This script is called from run-version-genprog.sh): validate-fix-genprog.sh
Use the following files for running SPR:
- File for compilation configuration: code-build.py
- File for test configuration: run-test.py
- File for running SPR: run-version-spr.sh
- File for validating patches generated by SPR (This script is called from run-version-spr.sh): validate-fix-spr.sh
Use the following files for running Prophet:
- File for compilation configuration: code-build.py
- File for test configuration: run-test.py
- File for running Prophet: run-version-prophet.sh
- File for validating patches generated by Prophet (This script is called from run-version-prophet.sh): validate-fix-prophet.sh
- Parameter file with the learned model (from the original Prophet experiment): para-rext-all.out
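Before starting a run, it can help to confirm that the files listed above for a tool are actually present. The following is a minimal sketch: the function names tool_files and check_tool_files are hypothetical, and the file lists are copied from the lists above rather than from any Codeflaws script.

```shell
# Hypothetical helper (not part of Codeflaws): print the all-script files
# listed above for the given tool, separated by spaces.
tool_files() {
    case "$1" in
        angelix) echo "run-version-angelix.sh" ;;
        genprog) echo "configuration-default compile.pl run-version-genprog.sh validate-fix-genprog.sh" ;;
        spr)     echo "code-build.py run-test.py run-version-spr.sh validate-fix-spr.sh" ;;
        prophet) echo "code-build.py run-test.py run-version-prophet.sh validate-fix-prophet.sh para-rext-all.out" ;;
        *)       echo "unknown tool: $1" >&2; return 1 ;;
    esac
}

# Report every listed file that is missing from the given directory.
# Returns 0 only if all files for the tool are present.
check_tool_files() {
    local tool="$1" dir="$2" files f missing=0
    files=$(tool_files "$tool") || return 1
    for f in $files; do
        if [ ! -e "$dir/$f" ]; then
            echo "missing: $dir/$f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example, assuming the archive was extracted into ./codeflaws:
# check_tool_files genprog codeflaws/all-script
```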
If you use Codeflaws in academic work, we would be really glad if you cited our paper using the following BibTeX entry:
@inproceedings{Tancodeflaws,
author = {Tan, Shin Hwei and Yi, Jooyong and Yulis and Mechtaev, Sergey and Roychoudhury, Abhik},
title = {Codeflaws: A Programming Competition Benchmark for Evaluating Automated Program Repair Tools},
booktitle = {2017 IEEE/ACM 39th International Conference on Software Engineering Companion (ICSE-C)},
year = {2017},
pages = {180-182}
}
For more information/questions about the benchmark, refer to the following website: https://codeflaws.github.io/