Collection of Machine learning frameworks for DNNs, Regressions, Adversaries, CNNs and Others (DRACO)
For readability there is a README for each step (this here is only for an overview!):
When working with the combination of
ROOT.__version__ = 6.14/04
keras.__version__ = 2.2.4
pandas.__version__ = 0.23.4
segmentation violations appear when opening hdf-files withpandas.read_hdf()
orpandas.HDFStore()
. To circumvent this, first import everything related toROOT
, only then start importing things fromkeras
.
CMSSW_9_4_9
or newer- uproot
pip install --user uproot==3.2.7
pip install --user uproot-methods==0.2.6
pip install --user awkward==0.4.2
- TensorFlow backend for KERAS
export KERAS_BACKEND=tensorflow
- Preprocess input data:
- to create input features you need data in the ntuple format, these get converted into hdf5 dataframes
- create a list of input features in
variable_sets/
(look at the README for structural advice) - adjust settings in
preprocessing/root2pandas/preprocessing.py
, check README - execute it on the NAF (tested with
CMSSW_9_4_9
) - this creates one dataset for each event category you specified in
preprocessing.py
- Setup training:
- move to directory
train_scripts/
and create a config script for your training (for simple DNN training adjusttrain_scripts/train_template.py
) - look at the README for further instructions
- move to directory
- Execute training
- execute the trainig script with your jet-tag category as option (further parser options described in README)
- after completion of the script you can look at results in the specified output directory
collection of scripts for preprocessing of ntuples files to hdf5 files as input for DRACOs
root2pandas
: generate hdf5 files from ntuples + MEM files
collection of top-level scripts for training
collection of scripts for plotting
plot_configs
: collection of scripts for configuring plots
lists of input variables used for dnn trainings
collections of frameworks for training and evaluation of different machine learning frameworks