The official TPC-DS tools can be found at tpc.org.
This version is based on v2.10.0 and has been modified to:
- Allow compilation under macOS (commit 2ec45c5)
- Address obvious query template bugs like
  - query22a: #31
  - query77a
- Add more query variants
  - query2a
  - query12a
  - query16a
  - query20a
  - query21a
  - query23a
  - query32a
  - query37a
  - query40a
  - query49a
  - query54a
  - query77b
  - query82a
  - query92a
  - query94a
  - query95a
  - query98a
To see all modifications, diff the files in the master branch against the version branch, e.g. master vs v2.10.0.
On Linux, make sure the required development tools are installed:
Ubuntu:
sudo apt-get install gcc make flex bison byacc git
CentOS/RHEL:
sudo yum install gcc make flex bison byacc git
Then run the following commands to clone the repo and build the tools:
git clone https://github.com/gregrahn/tpcds-kit.git
cd tpcds-kit/tools
make OS=LINUX
On macOS, make sure the required development tools are installed:
xcode-select --install
Then run the following commands to clone the repo and build the tools:
git clone https://github.com/gregrahn/tpcds-kit.git
cd tpcds-kit/tools
make OS=MACOS
Data generation is done via dsdgen. See dsdgen --help for all options. If you do not run dsdgen from the tools/ directory, you will need to use the option -DISTRIBUTIONS /.../tpcds-kit/tools/tpcds.idx.
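As a rough illustration of running dsdgen from outside tools/, the sketch below builds a command line with an absolute -DISTRIBUTIONS path. The checkout location, the output directory, and the wrapper name run_dsdgen are hypothetical. The -SCALE and -DIR options are assumed from upstream dsdgen, so confirm them with dsdgen --help before relying on this.

```shell
#!/bin/sh
# Hypothetical locations; adjust to your checkout and output directory.
TPCDS_TOOLS="${TPCDS_TOOLS:-$HOME/tpcds-kit/tools}"
DATA_DIR="${DATA_DIR:-/tmp/tpcds-data}"

# run_dsdgen SCALE_GB: print the dsdgen command line, then execute it
# only if the dsdgen binary has actually been built.
# -SCALE and -DIR are assumed options; verify with `dsdgen --help`.
run_dsdgen() {
  set -- "$TPCDS_TOOLS/dsdgen" -SCALE "$1" -DIR "$DATA_DIR" \
    -DISTRIBUTIONS "$TPCDS_TOOLS/tpcds.idx"
  echo "$*"
  if [ -x "$1" ]; then
    mkdir -p "$DATA_DIR" && "$@"
  fi
}

# Scale factor 1 (roughly 1GB of raw data).
run_dsdgen 1
```

Because the wrapper passes tpcds.idx by absolute path, it can be called from any working directory.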
Query generation is done via dsqgen. See dsqgen --help for all options.
The following command can be used to generate all 99 queries in numerical order (-QUALIFY) for the 10TB scale factor (-SCALE) using the Netezza dialect template (-DIALECT), with the output going to /tmp/query_0.sql (-OUTPUT_DIR).
dsqgen \
-DIRECTORY ../query_templates \
-INPUT ../query_templates/templates.lst \
-VERBOSE Y \
-QUALIFY Y \
-SCALE 10000 \
-DIALECT netezza \
-OUTPUT_DIR /tmp
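Since all queries land in a single file, it can be handy to split them into one file per template. The sketch below is one possible way to do that, not part of the kit. It assumes each query in the output is wrapped in comment markers of the form "-- start query N in stream S using template queryN.tpl" and "-- end query ...", so check your generated file before using it. The function name split_queries and the /tmp/queries directory are hypothetical.

```shell
#!/bin/sh
# split_queries IN_FILE OUT_DIR: write each query to OUT_DIR/<template>.sql.
# Assumes "-- start query ... using template <name>.tpl" and
# "-- end query ..." marker lines around every query in IN_FILE.
split_queries() {
  mkdir -p "$2"
  awk -v out="$2" '
    /^-- start query .* using template / {
      name = $NF                 # last field, e.g. query1.tpl
      sub(/\.tpl$/, "", name)    # strip the template extension
      file = out "/" name ".sql"
      next
    }
    /^-- end query / { close(file); file = ""; next }
    file != "" { print > file }  # copy lines between the markers
  ' "$1"
}

# Run only when the generated file is present.
if [ -f /tmp/query_0.sql ]; then
  split_queries /tmp/query_0.sql /tmp/queries
fi
```

Lines outside a start/end pair are dropped, so the marker comments themselves do not appear in the per-query files.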