From 9dc0a9e751bca543798d3013922311f05a54aea8 Mon Sep 17 00:00:00 2001
From: knc6
Date: Sun, 11 Jul 2021 12:16:45 -0400
Subject: [PATCH] Minor text update.

---
 README.md                      | 11 +++++------
 alignn/scripts/train_folder.py |  6 +++---
 2 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/README.md b/README.md
index 0a813fc8..912c3ecd 100644
--- a/README.md
+++ b/README.md
@@ -33,23 +33,22 @@ Examples
 Users can keep their structure files in POSCAR, .cif, or .xyz files in a directory. In the examples below we will use POSCAR format files. In the same directory, there should be id_prop.csv file. In this id_prop.csv, the filenames, and correponding target values are kept in comma separated values (csv) format. Here is an example of training OptB88vdw bandgaps of 50 materials from JARVIS-DFT. The example is created using the script provided in the script folder.
-Users can modify the script more than 50 data, or make their own dataset in this format. The dataset in split in 80:10:10 as training-validation-test set.
-With the configuration parameters given in config_example_regrssion.json, the model is trained.
+Users can modify the script more than 50 data, or make their own dataset in this format. The dataset in split in 80:10:10 as training-validation-test set. To change the split proportion and other parameters, change the config_example.json file. Sometimes, we want to train on certain sets and val/test on another dataset. For such cases, set n_train, n_val, n_test manually in the config_example.json and also set keep_data_order as True there. With the configuration parameters given in config_example.json, the model is trained.
 ```
-python alignn/scripts/train_folder.py --root_dir "alignn/examples/sample_data" --config "alignn/examples/sample_data/config_example_regrssion.json"
+python alignn/scripts/train_folder.py --root_dir "alignn/examples/sample_data" --config "alignn/examples/sample_data/config_example.json"
 ```
 While the above example is for regression, the follwoing example shows a classification task for metal/non-metal based on the above bandgap values. We transform the dataset into 1 or 0 based on a threshold of 0.01 eV (controlled by the parameter, 'classification_threshold') and train a similar classification model.
 ```
-python alignn/scripts/train_folder.py --root_dir "alignn/examples/sample_data" --config "alignn/examples/sample_data/config_example_classification.json"
+python alignn/scripts/train_folder.py --root_dir "alignn/examples/sample_data" --config "alignn/examples/sample_data/config_example.json"
 ```
 While the above example regression was for single-output values, we can train multi-output regression models as well. An example is given below for training formation energy per atom, bandgap and total energy per atom simulataneously. The script to generate the example data is provided in the script folder of the sample_data_multi_prop. Another example of training electron and phonon density of states is provided also.
 ```
-python alignn/scripts/train_folder.py --root_dir "alignn/examples/sample_data_multi_prop" --config "alignn/examples/sample_data/config_example_regrssion.json"
+python alignn/scripts/train_folder.py --root_dir "alignn/examples/sample_data_multi_prop" --config "alignn/examples/sample_data/config_example.json"
 ```
-You can also try multiple example scripts to run multiple dataset training. Look into the 'scripts' folder.
+Users can also try multiple example scripts to run multiple dataset training. Look into the 'alignn/scripts' folder.
 These scripts automatically download datasets from jarvis.db.fighshare module in JARVIS-Tools and train several models. Make sure you specify your specific queuing system details in the scripts.
diff --git a/alignn/scripts/train_folder.py b/alignn/scripts/train_folder.py
index 20fe9a08..c382b4eb 100644
--- a/alignn/scripts/train_folder.py
+++ b/alignn/scripts/train_folder.py
@@ -18,7 +18,7 @@
 parser.add_argument(
     "--root_dir",
     default="./",
-    help="Folder with id_props.csv, poscars and config*.json",
+    help="Folder with id_props.csv, poscars",
 )
 parser.add_argument(
     "--config_name",
@@ -29,13 +29,13 @@
 parser.add_argument(
     "--keep_data_order",
     default=False,
-    help="Whether to randomly shuffle samples",
+    help="Whether to randomly shuffle samples, True/False",
 )
 parser.add_argument(
     "--classification_threshold",
     default=None,
-    help="Threshold for converting into 0/1 class"
+    help="Floating point threshold for converting into 0/1 class"
     + ", use only for classification tasks",
 )
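
Note on the two flags touched in the last hunk: the `add_argument` calls shown there pass no `type=`, and argparse hands command-line values over as strings when no type is given, so `--keep_data_order False` would arrive as the truthy string `"False"`. Whether `train_folder.py` converts these values later is not visible in this diff. The following is a minimal sketch, not the repository's code, of how the "True/False" and "Floating point" wording in the new help texts could be enforced at parse time. `str_to_bool`, `build_parser`, and `to_class_label` are hypothetical names, and the "greater than the threshold maps to 1" rule is an assumption about how the 0/1 conversion works.

```python
import argparse


def str_to_bool(value: str) -> bool:
    """Hypothetical helper (not in train_folder.py): parse 'True'/'False' text."""
    lowered = value.strip().lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected True or False, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the two flags from the patch, plus explicit types."""
    parser = argparse.ArgumentParser(description="Typed-flag sketch")
    parser.add_argument(
        "--keep_data_order",
        type=str_to_bool,  # without type=, "False" would stay a truthy string
        default=False,
        help="Whether to randomly shuffle samples, True/False",
    )
    parser.add_argument(
        "--classification_threshold",
        type=float,  # converts "0.01" to 0.01; the default None is left as-is
        default=None,
        help="Floating point threshold for converting into 0/1 class"
        + ", use only for classification tasks",
    )
    return parser


def to_class_label(value: float, threshold: float) -> int:
    """Assumed convention: values above the threshold become 1, others 0."""
    return 1 if value > threshold else 0


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["--keep_data_order", "False", "--classification_threshold", "0.01"]
    )
    print(args.keep_data_order, args.classification_threshold)
    # Binarize a few illustrative bandgap values (in eV) with the parsed threshold.
    print([to_class_label(v, args.classification_threshold) for v in (0.0, 0.01, 1.2)])
```

With explicit types, an invalid value such as `--keep_data_order maybe` fails at parse time with a usage error instead of being silently treated as true.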