adding new notebook for using fairchem models with NEBs without CatTSunami enumeration #764
Merged
Conversation
Codecov Report: All modified and coverable lines are covered by tests ✅
mshuaibii approved these changes on Jul 16, 2024:
LGTM - this should add in nicely!
@mshuaibii Note that the notebook does not actually build (https://github.com/FAIR-Chem/fairchem/actions/runs/9961322115/job/27522563768) @brookwander can you fix the notebook in a new PR?
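For readers arriving here for the notebook itself: below is a minimal sketch of what running an NEB with a fairchem model looks like when you supply your own endpoint structures rather than using CatTSunami's enumeration. The checkpoint path and trajectory file names are placeholders, and the `OCPCalculator` import path is assumed from the fairchem layout at the time of this PR; the merged notebook remains the authoritative reference.

```python
# Hedged sketch: ASE NEB driven by a fairchem ML calculator, with user-supplied
# endpoints (no CatTSunami enumeration). File names and checkpoint are placeholders.
from ase.io import read
from ase.neb import NEB                 # `ase.mep.NEB` on newer ASE releases
from ase.optimize import BFGS
from fairchem.core.common.relaxation.ase_utils import OCPCalculator

initial = read("initial_relaxed.traj")  # relaxed initial state (user-provided)
final = read("final_relaxed.traj")      # relaxed final state (user-provided)

# Build a band of linearly interpolated images between the two endpoints.
images = [initial] + [initial.copy() for _ in range(5)] + [final]
neb = NEB(images)
neb.interpolate()

# Attach a fairchem calculator instance to each interior image.
for image in images[1:-1]:
    image.calc = OCPCalculator(checkpoint_path="path/to/checkpoint.pt", cpu=False)

# Relax the band until the maximum force on any image drops below fmax.
opt = BFGS(neb, trajectory="neb.traj")
opt.run(fmax=0.05)
```

Only the interior images are relaxed; ASE's NEB keeps the two endpoint images fixed, so they should already be relaxed before building the band.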
github-merge-queue bot pushed a commit that referenced this pull request on Jul 19, 2024:
* adding new notebook for using fairchem models with NEBs without CatTSunami enumeration (#764)
* adding new notebook for using fairchem models with NEBs
* adding md tutorials
* blocking code cells that arent needed or take too long
* updating approach to path to work with ipython
* adding seed to NRR example which randomly had not configs on last push
* Update docs/tutorials/NRR/NRR_example.md
* adding file :(|)
* skip neb execution
* status check on merge queue
---------
Co-authored-by: zulissimeta <[email protected]>
Co-authored-by: Muhammed Shuaibi <[email protected]>
github-merge-queue bot pushed a commit that referenced this pull request on Jul 19, 2024:
* adding new notebook for using fairchem models with NEBs without CatTSunami enumeration (#764)
* adding new notebook for using fairchem models with NEBs
* adding md tutorials
* blocking code cells that arent needed or take too long
* trigger docs build only on PR review submit
---------
Co-authored-by: Brook Wander <[email protected]>
Co-authored-by: Muhammed Shuaibi <[email protected]>
github-merge-queue bot pushed a commit that referenced this pull request on Jul 19, 2024:
* adding new notebook for using fairchem models with NEBs without CatTSunami enumeration (#764)
* adding new notebook for using fairchem models with NEBs
* adding md tutorials
* blocking code cells that arent needed or take too long
* fix dataset config logic
* add empty val/test if not defined
* add empty dicts for all missing datasets
---------
Co-authored-by: Brook Wander <[email protected]>
Co-authored-by: Muhammed Shuaibi <[email protected]>
Co-authored-by: zulissimeta <[email protected]>
github-merge-queue bot pushed a commit that referenced this pull request on Jul 19, 2024:
* adding new notebook for using fairchem models with NEBs without CatTSunami enumeration (#764)
* adding new notebook for using fairchem models with NEBs
* adding md tutorials
* blocking code cells that arent needed or take too long
* OCP->FAIRChem + paper list
---------
Co-authored-by: Brook Wander <[email protected]>
Co-authored-by: Muhammed Shuaibi <[email protected]>
github-merge-queue bot pushed a commit that referenced this pull request on Jul 22, 2024:
* adding new notebook for using fairchem models with NEBs without CatTSunami enumeration (#764)
* adding new notebook for using fairchem models with NEBs
* adding md tutorials
* blocking code cells that arent needed or take too long
* add expandable segments var
* add note
---------
Co-authored-by: Brook Wander <[email protected]>
Co-authored-by: Muhammed Shuaibi <[email protected]>
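The "add expandable segments var" item in that commit presumably refers to PyTorch's expandable-segments CUDA allocator option; this is an assumption, not stated in the log. Setting it looks roughly like the sketch below, and it must happen before the first CUDA allocation in the process.

```python
# Assumed illustration only: PyTorch's CUDA caching allocator can be switched to
# expandable segments via this environment variable, which reduces fragmentation
# for workloads with varying batch/graph sizes. Set it before CUDA is initialized.
import os

os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"
```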
github-merge-queue bot pushed a commit that referenced this pull request on Aug 2, 2024:
* Update BalancedBatchSampler to use datasets' `data_sizes` method Replace BalancedBatchSampler's `force_balancing` and `throw_on_error` parameters with `on_error`
* Remove python 3.10 syntax
* Documentation
* Added set_epoch method
* Format
* Changed "resolved dataset" message to be a debug log to reduce log spam
* clean up batchsampler and tests
* base dataset class
* move lin_ref to base dataset
* inherit basedataset for ase dataset
* filter indices prop
* added create_dataset fn
* yaml load fix
* create dataset function instead of filtering in base
* remove filtered_indices
* make create_dataset and LMDBDatabase importable from datasets
* create_dataset cleanup
* test create_dataset
* use metadata.natoms directly and add it to subset
* use self.indices to handle shard
* rename _data_sizes
* fix Subset of metadata
* minor change to metadata, added full path option
* import updates
* implement get_metadata for datasets; add tests for max_atoms and balanced partitioning
* a[:len(a)+1] does not throw error, change to check for this
* off by one fix
* fixing tests
* plug create_dataset into trainer
* remove datasetwithsizes; fix base dataset integration; replace close_db with __del__
* lint
* add/fix test;
* adding new notebook for using fairchem models with NEBs without CatTSunami enumeration (#764)
* adding new notebook for using fairchem models with NEBs
* adding md tutorials
* blocking code cells that arent needed or take too long
* Add extra test case for local batch size = 1
* fix example
* fix test case
* reorg changes
* remove metadata_has_sizes in favor of basedataset function metadata_hasattr
* fix data_parallel typo
* fix up some tests
* rename get_metadata to sample_property_metadata
* add slow get_metadata for ase; add tests for get_metadata (ase+lmdb); add test for make lmdb metadata sizes
* add support for different backends and ddp in pytest
* fix tests and balanced batch sampler
* make default dataset lmdb
* lint
* fix tests
* test with world_size=0 by default
* fix tests
* fix tests..
* remove subsample from oc22 dataset
* remove old datasets; add test for noddp
* remove load balancing from docs
* fix docs; add train_split_settings and test for this
---------
Co-authored-by: Nima Shoghi <[email protected]>
Co-authored-by: Nima Shoghi <[email protected]>
Co-authored-by: lbluque <[email protected]>
Co-authored-by: Brandon <[email protected]>
Co-authored-by: Brook Wander <[email protected]>
Co-authored-by: Muhammed Shuaibi <[email protected]>
Co-authored-by: Muhammed Shuaibi <[email protected]>
github-merge-queue bot pushed a commit that referenced this pull request on Aug 5, 2024:
* denorm targets in _forward only
* linear reference class
* atomref in normalizer
* raise input error
* clean up normalizer interface
* add element refs
* add element refs correctly
* ruff
* fix save_checkpoint
* reference and dereference
* 2xnorm linref trainer add
* clean-up
* otf linear reference fit
* fix tensor device
* otf element references and normalizers
* use only present elements when fitting
* lint
* _forward norm and derefd values
* fix list of paths in src
* total mean and std
* fitted flag to avoid refitting normalizers/references on rerun
* allow passing lstsq driver
* element ref unit tests
* remove superfluous type
* lint fix
* allow setting batch_size explicitly
* test applying element refs
* normalizer tests
* increase distributed timeout
* save normalizers and linear refs in otf_fit
* remove debug code
* fix removing refs
* swap otf_fit for fit, and save all normalizers in one file
* log loading and saving normalizers
* fit references and normalizer scripts
* lint fixes
* allow absent optim key in config
* lin-ref description
* read files based on extension
* pass seed
* rename dataset fixture
* check if file is none
* pass generator correctly
* separate method for norms and refs
* add normalizer code back
* fix Generator construction
* import order
* log warnings if multiple inputs are passed
* raise Error if duplicate references or norms are set
* use len batch
* assert element reference targets are scalar
* fix name and rename method
* load and save norms and refs using same logic
* fix creating normalizer
* remove print statements
* adding new notebook for using fairchem models with NEBs without CatTSunami enumeration (#764)
* adding new notebook for using fairchem models with NEBs
* adding md tutorials
* blocking code cells that arent needed or take too long
* warn instead of error when duplicate norm/ref target names
* allow timeout to be read from config
* test seed noseed ref fits
* lotsa refactoring
* lotsa fixing
* more fixing...
* num_workers zero to prevent mp issues
* add otf norms smoke test and fixes
* allow overriding normalization fit values
* update tests
* fix normalizer loading
* use rmsd instead of only stdev
* fix tests
* correct rmsd calc and fix loading
* clean up norm loading and log values
* logg linear reference metrics
* load element references state dict
* fix loading and tests
* fix imports in scripts
* fix test?
* fix test
* use numpy as default to fit references
* minor fixes
* rm torch_tempdir fixture
---------
Co-authored-by: Brook Wander <[email protected]>
Co-authored-by: Muhammed Shuaibi <[email protected]>
lbluque pushed a commit that referenced this pull request on Aug 7, 2024:
* Update BalancedBatchSampler to use datasets' `data_sizes` method Replace BalancedBatchSampler's `force_balancing` and `throw_on_error` parameters with `on_error`
* Remove python 3.10 syntax
* Documentation
* Added set_epoch method
* Format
* Changed "resolved dataset" message to be a debug log to reduce log spam
* clean up batchsampler and tests
* base dataset class
* move lin_ref to base dataset
* inherit basedataset for ase dataset
* filter indices prop
* added create_dataset fn
* yaml load fix
* create dataset function instead of filtering in base
* remove filtered_indices
* make create_dataset and LMDBDatabase importable from datasets
* create_dataset cleanup
* test create_dataset
* use metadata.natoms directly and add it to subset
* use self.indices to handle shard
* rename _data_sizes
* fix Subset of metadata
* minor change to metadata, added full path option
* import updates
* implement get_metadata for datasets; add tests for max_atoms and balanced partitioning
* a[:len(a)+1] does not throw error, change to check for this
* off by one fix
* fixing tests
* plug create_dataset into trainer
* remove datasetwithsizes; fix base dataset integration; replace close_db with __del__
* lint
* add/fix test;
* adding new notebook for using fairchem models with NEBs without CatTSunami enumeration (#764)
* adding new notebook for using fairchem models with NEBs
* adding md tutorials
* blocking code cells that arent needed or take too long
* Add extra test case for local batch size = 1
* fix example
* fix test case
* reorg changes
* remove metadata_has_sizes in favor of basedataset function metadata_hasattr
* fix data_parallel typo
* fix up some tests
* rename get_metadata to sample_property_metadata
* add slow get_metadata for ase; add tests for get_metadata (ase+lmdb); add test for make lmdb metadata sizes
* add support for different backends and ddp in pytest
* fix tests and balanced batch sampler
* make default dataset lmdb
* lint
* fix tests
* test with world_size=0 by default
* fix tests
* fix tests..
* remove subsample from oc22 dataset
* remove old datasets; add test for noddp
* remove load balancing from docs
* fix docs; add train_split_settings and test for this
---------
Co-authored-by: Nima Shoghi <[email protected]>
Co-authored-by: Nima Shoghi <[email protected]>
Co-authored-by: lbluque <[email protected]>
Co-authored-by: Brandon <[email protected]>
Co-authored-by: Brook Wander <[email protected]>
Co-authored-by: Muhammed Shuaibi <[email protected]>
Co-authored-by: Muhammed Shuaibi <[email protected]>
(cherry picked from commit 04a69b0)
github-merge-queue bot pushed a commit that referenced this pull request on Aug 9, 2024:
* Add solvent interface placement code
* circular import
* adding new notebook for using fairchem models with NEBs without CatTSunami enumeration (#764)
* adding new notebook for using fairchem models with NEBs
* adding md tutorials
* blocking code cells that arent needed or take too long
* packmol install fix
* add packmol to github actions path
* speed up random slab generation
* support for pymatgen update
* save ion id when randomly sampled
* vasp 6.3 ml flags are slightly different
* add vdw to bulks
* add lasph
* ncore=4
* typing, docstring, cleanup geometry
* typing
---------
Co-authored-by: Brook Wander <[email protected]>
github-merge-queue bot pushed a commit that referenced this pull request on Aug 21, 2024:
* Update BalancedBatchSampler to use datasets' `data_sizes` method Replace BalancedBatchSampler's `force_balancing` and `throw_on_error` parameters with `on_error`
* Remove python 3.10 syntax
* Documentation
* Added set_epoch method
* Format
* Changed "resolved dataset" message to be a debug log to reduce log spam
* Minor changes to support multitask
* add in pickle data set; add in stat functions for combining mean and variance
* checksums for equiformer
* detach compute metrics and add checksum function for linear layer
* change name to dataset_configs
* add seed option
* remove pickle dataset
* remove pickle dataset
* add experimental datatransform to ase_dataset
* clean up batchsampler and tests
* base dataset class
* move lin_ref to base dataset
* inherit basedataset for ase dataset
* filter indices prop
* updated import for ase dataset
* added create_dataset fn
* yaml load fix
* create dataset function instead of filtering in base
* remove filtered_indices
* make create_dataset and LMDBDatabase importable from datasets
* create_dataset cleanup
* test create_dataset
* use metadata.natoms directly and add it to subset
* use self.indices to handle shard
* rename _data_sizes
* fix Subset of metadata
* fix up to be mergeable
* merge in monorepo
* small fix for import and keyerror
* minor change to metadata, added full path option
* import updates
* minor fix to base dataset
* skip force_balance and seed
* adding get_metadata to base_dataset
* implement get_metadata for datasets; add tests for max_atoms and balanced partitioning
* a[:len(a)+1] does not throw error, change to check for this
* bug fix for base_dataset
* max atoms branch
* fix typo
* do pbc per system
* add option to use single system pbc
* add multiple mapping
* lint and github workflow fixes
* track parent checkpoint for logger grouping
* add generator to basedataset
* check path relative to yaml file
* add load and exit flag to base_trainer
* add in merge mean and std code to utils
* add log when passing through mean or computing; check other paths for includes
* add qos flag
* use slurm_qos instead of qos
* fix includes
* fix set init
* adding new notebook for using fairchem models with NEBs without CatTSunami enumeration (#764)
* adding new notebook for using fairchem models with NEBs
* adding md tutorials
* blocking code cells that arent needed or take too long
* remove files with diff whitespace
* add resolution flag to escn
* try to revert oxides
* revert typing
* remove white space
* extra line never reached
* move out of fmv4 into dev
* move avg num nodes
* optional import from experimental
* fix lint
* add comments, refactor common trainer args in a single dictionary
* add comments, refactor common trainer args in a single dictionary
* remove parent
---------
Co-authored-by: Nima Shoghi <[email protected]>
Co-authored-by: Nima Shoghi <[email protected]>
Co-authored-by: Abhishek Das <[email protected]>
Co-authored-by: lbluque <[email protected]>
Co-authored-by: Brandon <[email protected]>
Co-authored-by: Muhammed Shuaibi <[email protected]>
Co-authored-by: Ray Gao <[email protected]>
Co-authored-by: Brook Wander <[email protected]>
Co-authored-by: Muhammed Shuaibi <[email protected]>
No description was provided for this pull request.