Note. Although closely related, this repository is not to be confused with the trainbenchmark
repository, which is a benchmark framework for comparing various database management tools, including triplestores, relational databases and graph databases.
The paper/trainbenchmark-ttc.pdf file contains the case description; the latest compiled version is also available.
- 64-bit operating system (Ubuntu-based Linux systems are recommended)
- Oracle JDK 7+
- Maven 3.0+
- Install Maven 3 and make sure it is on your path (check with mvn --version).
- Make sure you have Python 3 installed.
The scripts directory contains the run.py script, which is used for the following purposes:
- run.py -b -- builds the projects
- run.py -b -s -- builds the projects without testing
- run.py -g -- generates the instance models
- run.py -m -- runs the benchmark
- run.py -v -- visualizes the results of the latest benchmark
The config directory contains the configuration for the scripts:
- config.json -- configuration for the model generation and the benchmark
- reporting.json -- configuration for the visualization
Set the maxSize variable to the desired value and run run.py -g. With enough memory (-Xmx2G or more), the models from size 1 to size 512 are generated in about 5 minutes.
The script runs the benchmark for the given number of runs, for the specified tools, queries and sizes.
The benchmark results are stored in a TSV (tab-separated values) file. The header for the TSV file is stored in the output/header.tsv file.
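As a rough illustration of how a header line and a result line can be combined, the sketch below pairs the column names with the values of one row. The class name TsvRowSketch and the column names used in the usage note are placeholders, not the actual contents of output/header.tsv.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the framework: pairs the columns of a
// TSV header line with the values of one TSV data line.
final class TsvRowSketch {

    static Map<String, String> parse(String headerLine, String dataLine) {
        // The limit -1 keeps trailing empty columns instead of dropping them.
        String[] names = headerLine.split("\t", -1);
        String[] values = dataLine.split("\t", -1);
        if (names.length != values.length) {
            throw new IllegalArgumentException(
                "Expected " + names.length + " columns but found " + values.length);
        }
        // LinkedHashMap preserves the column order of the header.
        Map<String, String> row = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            row.put(names[i], values[i]);
        }
        return row;
    }
}
```

For example, with an assumed header line "tool\tquery\tsize", calling TsvRowSketch.parse(header, line).get("query") returns the value in the second column of the line.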
Make sure you read the README.md file in the reporting directory and install all the requirements for R.
It is recommended to start with an Eclipse distribution tailored to developing EMF-based applications, e.g. Eclipse Modeling.
If you'd like to try the EMF-IncQuery implementation, it is recommended to use Eclipse Luna. There are two ways to resolve the dependencies:
- Maven dependencies (pom.xml files). This requires the m2e Eclipse plugin, which is included in Eclipse for Java developers but not in the Modeling distribution. The m2e plugin can be installed from the update site of your release (Kepler/Luna).
- Plug-in dependencies (MANIFEST.MF files).
  - Use the Orbit update site for your release (http://download.eclipse.org/tools/orbit/downloads/) to install the Apache Commons CLI and the Guava: Google Core Libraries for Java Source plug-ins.
  - If you wish to run the EMF-IncQuery implementation, install EMF-IncQuery from http://download.eclipse.org/incquery/updates/release.
In general, we recommend sticking to your proven build solution; otherwise you may spend a lot of time tinkering with the build. In theory, you can build Eclipse plug-ins with the Tycho Maven plug-in; however, it has a steep learning curve and is tricky to debug. For reference, see https://github.com/FTSRG/cheat-sheets/wiki/Maven-and-Eclipse.
To implement a tool, it is recommended to start from an existing implementation. Please implement your own benchmark logic and a benchmark case factory that instantiates the classes for each query defined in the benchmark.
To make as few assumptions as possible about the specific implementations, the pattern matches are stored in the variable matches, declared as a Collection<Object> (see the AbstractBenchmarkCase class). The framework requires the elements of the matches collection to be unique.
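One way to meet the uniqueness requirement is to back the collection with a set. The sketch below is only an illustration: the class UniqueMatchesSketch and its methods are hypothetical and do not reproduce AbstractBenchmarkCase. It also assumes that the match objects implement equals and hashCode consistently.

```java
import java.util.Collection;
import java.util.LinkedHashSet;

// Hypothetical holder, not the framework's AbstractBenchmarkCase.
final class UniqueMatchesSketch {

    // Declared as Collection<Object>, as described in the text, but backed
    // by a set so that duplicate matches are silently dropped.
    private final Collection<Object> matches = new LinkedHashSet<>();

    // Returns true if the match was new, false if it was already stored.
    boolean addMatch(Object match) {
        return matches.add(match);
    }

    Collection<Object> getMatches() {
        return matches;
    }
}
```

A LinkedHashSet is used here so that iteration follows insertion order; any Set implementation would give the same uniqueness guarantee.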
To enable a consistent ordering of the matches, the framework requires a comparator class. Section 2.4.2 ("Ordering of the Match Set") in the case description defines the rules of the ordering.
For implementing a match comparator, we recommend two approaches:
- If the matches are represented in a tuple-like collection, they can be compared by iterating through the collection and comparing each element in the tuple. Example: the EMFIncQueryBenchmarkComparator class.
- Use the ComparisonChain class in Google Guava to compare the model elements in the matches. Example: the JavaRouteSensorMatchComparator class.
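The first approach can be sketched with plain Java as follows. This is a minimal illustration, not the EMFIncQueryBenchmarkComparator class: the name TupleMatchComparatorSketch is hypothetical, and it assumes that each match is a list of numeric element identifiers compared position by position. The actual ordering rules are those of Section 2.4.2 in the case description.

```java
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: compares two tuple-like matches element by element.
// Each match is assumed to be a list of numeric element identifiers.
final class TupleMatchComparatorSketch implements Comparator<List<Long>> {

    @Override
    public int compare(List<Long> left, List<Long> right) {
        int common = Math.min(left.size(), right.size());
        for (int i = 0; i < common; i++) {
            int result = Long.compare(left.get(i), right.get(i));
            if (result != 0) {
                return result; // the first differing position decides the order
            }
        }
        // All shared positions are equal: the shorter tuple sorts first.
        return Integer.compare(left.size(), right.size());
    }
}
```

A list of such matches could then be ordered with Collections.sort(matchList, new TupleMatchComparatorSketch()).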
To avoid confusion between the different implementations, we decided to use the Smurf Naming convention (see #21). This way, the classes in the Java implementation are named JavaBenchmarkCase, JavaPosLength, JavaPosLengthMatch, JavaPosLengthTransformation, while the classes in the EMF-IncQuery implementation are named EMFIncQueryBenchmarkCase, EMFIncQueryPosLength, etc. We found that relying on the package names to differentiate class names like hu.bme.mit.trainbenchmark.ttc.benchmark.java.BenchmarkCase and hu.bme.mit.trainbenchmark.ttc.benchmark.emfincquery.BenchmarkCase is error-prone and should be avoided.
- Problem: when not running with Oracle JDK 7, both the generation and the benchmarking sometimes freeze.
- Solution: see this issue for details.