Commit
Add more notebooks assessing training effect
Showing 10 changed files with 788 additions and 483 deletions.
467 changes: 0 additions & 467 deletions
scripts/evaluate-generated-mofs/0_effect-of-training.ipynb
This file was deleted.
467 changes: 467 additions & 0 deletions
scripts/evaluate-generated-mofs/0_summarize-model-outcomes.ipynb
Large diffs are not rendered by default.
105 changes: 89 additions & 16 deletions
scripts/evaluate-generated-mofs/1_effect-of-scale.ipynb
Large diffs are not rendered by default.
232 changes: 232 additions & 0 deletions
scripts/evaluate-generated-mofs/2_effect-of-training.ipynb
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,232 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"id": "5211ca75-c0ed-4e81-ae37-2a21533656f1", | ||
"metadata": {}, | ||
"source": [ | ||
"# Evaluate the Effect of Training \n", | ||
"We can assess whether retraining DiffLinker leads to improved performance in two ways:\n", | ||
"1. Evaluate how much the success rate improves with retraining\n", | ||
"2. Compare the total number of stable MOFs found w/ and w/o a closed loop" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 9, | ||
"id": "e0c564c2-ec6d-4d55-bc8e-944a80d35598", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from itertools import chain\n", | ||
"from scipy.interpolate import interp1d\n", | ||
"from pathlib import Path\n", | ||
"import pandas as pd" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "c56bd56d-a264-4e11-a7ec-e3bb62268e42", | ||
"metadata": {}, | ||
"source": [ | ||
"## Route 1: Measure Success Rate by Model Generation" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "c8bb1426-af24-4159-b1c4-55e9c203bfde", | ||
"metadata": {}, | ||
"source": [ | ||
"## Route 2: Assess workflow outcomes w/ and w/o retraining\n", | ||
"Show that the workflow finds more stable MOFs when DiffLinker is retrained" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "10478466-dfd8-41c0-bf53-20dcb394fb91", | ||
"metadata": {}, | ||
"source": [ | ||
"### Get the \"Stable Found\" at 90 minutes\n", | ||
"Loop over all runs and store the scale, whether the model was retrained, and the number of stable MOFs found after 90 minutes. \n", | ||
"The 450-node run changes how it trains DiffLinker at around 90 minutes, and we don't want to study that effect yet." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 33, | ||
"id": "aa46b598-7a7f-4f13-8a7c-c991ef4e2013", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"hours = 1.5" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 34, | ||
"id": "98400659-f017-4b68-8ac7-1e9643c10c65", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"success_data = []\n", | ||
"for path in chain(Path('summaries').glob('*-nodes.csv.gz'), Path('summaries').glob('*-nodes_repeat-*.csv.gz'), Path('summaries').glob('*no-retrain*.csv.gz')):\n", | ||
" # Get metadata\n", | ||
" count = int(path.name.split(\"-\")[0])\n", | ||
" retrain = 'no-retrain' not in path.name\n", | ||
"\n", | ||
" # Pull the cumulative number of stable MOFs found by the cutoff time\n", | ||
" mofs = pd.read_csv(path)\n", | ||
" num_found = interp1d(mofs['walltime'], mofs['cumulative_found'], kind='previous')(hours * 3600).item()\n", | ||
"\n", | ||
" success_data.append({\n", | ||
" 'nodes': count,\n", | ||
" 'retrain': retrain,\n", | ||
" 'found': num_found,\n", | ||
" 'found_node-hr': num_found / (count * hours)\n", | ||
" })\n", | ||
"success_data = pd.DataFrame(success_data)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 35, | ||
"id": "ac0c533f-f071-4e74-95d1-77fb0a9e17d9", | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"data": { | ||
"text/html": [ | ||
"<div>\n", | ||
"<style scoped>\n", | ||
" .dataframe tbody tr th:only-of-type {\n", | ||
" vertical-align: middle;\n", | ||
" }\n", | ||
"\n", | ||
" .dataframe tbody tr th {\n", | ||
" vertical-align: top;\n", | ||
" }\n", | ||
"\n", | ||
" .dataframe thead th {\n", | ||
" text-align: right;\n", | ||
" }\n", | ||
"</style>\n", | ||
"<table border=\"1\" class=\"dataframe\">\n", | ||
" <thead>\n", | ||
" <tr style=\"text-align: right;\">\n", | ||
" <th></th>\n", | ||
" <th></th>\n", | ||
" <th>found</th>\n", | ||
" <th>found_node-hr</th>\n", | ||
" </tr>\n", | ||
" <tr>\n", | ||
" <th>nodes</th>\n", | ||
" <th>retrain</th>\n", | ||
" <th></th>\n", | ||
" <th></th>\n", | ||
" </tr>\n", | ||
" </thead>\n", | ||
" <tbody>\n", | ||
" <tr>\n", | ||
" <th rowspan=\"2\" valign=\"top\">32</th>\n", | ||
" <th>False</th>\n", | ||
" <td>133.0</td>\n", | ||
" <td>2.770833</td>\n", | ||
" </tr>\n", | ||
" <tr>\n", | ||
" <th>True</th>\n", | ||
" <td>313.0</td>\n", | ||
" <td>6.520833</td>\n", | ||
" </tr>\n", | ||
" <tr>\n", | ||
" <th rowspan=\"2\" valign=\"top\">64</th>\n", | ||
" <th>False</th>\n", | ||
" <td>426.5</td>\n", | ||
" <td>4.442708</td>\n", | ||
" </tr>\n", | ||
" <tr>\n", | ||
" <th>True</th>\n", | ||
" <td>641.0</td>\n", | ||
" <td>6.677083</td>\n", | ||
" </tr>\n", | ||
" <tr>\n", | ||
" <th>128</th>\n", | ||
" <th>True</th>\n", | ||
" <td>1622.0</td>\n", | ||
" <td>8.447917</td>\n", | ||
" </tr>\n", | ||
" <tr>\n", | ||
" <th>256</th>\n", | ||
" <th>True</th>\n", | ||
" <td>3633.0</td>\n", | ||
" <td>9.460938</td>\n", | ||
" </tr>\n", | ||
" <tr>\n", | ||
" <th>450</th>\n", | ||
" <th>True</th>\n", | ||
" <td>6554.0</td>\n", | ||
" <td>9.709630</td>\n", | ||
" </tr>\n", | ||
" </tbody>\n", | ||
"</table>\n", | ||
"</div>" | ||
], | ||
"text/plain": [ | ||
" found found_node-hr\n", | ||
"nodes retrain \n", | ||
"32 False 133.0 2.770833\n", | ||
" True 313.0 6.520833\n", | ||
"64 False 426.5 4.442708\n", | ||
" True 641.0 6.677083\n", | ||
"128 True 1622.0 8.447917\n", | ||
"256 True 3633.0 9.460938\n", | ||
"450 True 6554.0 9.709630" | ||
] | ||
}, | ||
"execution_count": 35, | ||
"metadata": {}, | ||
"output_type": "execute_result" | ||
} | ||
], | ||
"source": [ | ||
"success_data.groupby(['nodes', 'retrain']).mean()" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "a973a7cb-e161-429e-9707-4f6b6a608bd2", | ||
"metadata": {}, | ||
"source": [ | ||
"TBD: Make a plot" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "269787d5-5319-47f7-83ce-1a807bf14583", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "Python 3 (ipykernel)", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.10.8" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 5 | ||
} |
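The notebook above reads the cumulative count at the 90-minute cutoff with `interp1d(..., kind='previous')`, then normalizes by node-hours. The following is a dependency-free sketch of that same lookup, not the notebook's code: it swaps `scipy` for the standard-library `bisect` module, the helper names `cumulative_at` and `found_per_node_hour` are hypothetical, and the walltime records are made up for illustration. Like `interp1d` with its default bounds checking, it raises when the cutoff falls outside the recorded range, which is what would happen for a run shorter than the cutoff.

```python
from bisect import bisect_right


def cumulative_at(walltimes, cumulative, t):
    """Return the last cumulative count recorded at or before time t.

    Assumes `walltimes` is sorted in ascending order (seconds).
    Raises ValueError if t lies outside the recorded range.
    """
    if t < walltimes[0] or t > walltimes[-1]:
        raise ValueError("cutoff time is outside the recorded walltime range")
    # Index of the last record whose walltime is <= t
    idx = bisect_right(walltimes, t) - 1
    return cumulative[idx]


def found_per_node_hour(found, nodes, hours):
    """Normalize a count of stable MOFs by the node-hours spent."""
    return found / (nodes * hours)


# Hypothetical records: (walltime in seconds, cumulative stable MOFs found)
walltimes = [600.0, 1800.0, 4200.0, 5400.0, 6000.0]
cumulative = [1, 5, 12, 20, 23]

hours = 1.5
found = cumulative_at(walltimes, cumulative, hours * 3600)
rate = found_per_node_hour(found, nodes=32, hours=hours)
print(found, rate)
```

As a check against the table in the notebook output, 313 stable MOFs on 32 nodes over 1.5 hours gives 313 / 48, which matches the reported 6.520833 per node-hour.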
Binary file modified
BIN
-10.8 KB
(85%)
scripts/evaluate-generated-mofs/figures/stability-over-time-step.png
Binary file modified
BIN
-94.5 KB
(45%)
scripts/evaluate-generated-mofs/figures/stability-over-time.png
Binary file not shown.
Binary file modified
BIN
+2.88 KB
(100%)
scripts/evaluate-generated-mofs/figures/stable-found-per-node-hour.pdf
Binary file not shown.
Binary file modified
BIN
+17.2 KB
(120%)
scripts/evaluate-generated-mofs/figures/stable-found-per-node-hour.png