adding new notebook for using fairchem models with NEBs without CatTSunami enumeration #764

Merged
merged 3 commits into main from adding-workbook-cattsunami
Jul 16, 2024

Conversation

brookwander
Collaborator

No description provided.

brookwander requested a review from mshuaibii July 15, 2024 14:53

codecov bot commented Jul 15, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Collaborator

mshuaibii left a comment

LGTM - this should add in nicely!

mshuaibii added this pull request to the merge queue Jul 16, 2024
Merged via the queue into main with commit 5743a59 Jul 16, 2024
6 of 7 checks passed
mshuaibii deleted the adding-workbook-cattsunami branch July 16, 2024 17:21
zulissimeta
Collaborator

@mshuaibii Note that the notebook does not actually build (https://github.com/FAIR-Chem/fairchem/actions/runs/9961322115/job/27522563768)

@brookwander can you fix the notebook in a new PR?

zulissimeta mentioned this pull request Jul 18, 2024
mshuaibii restored the adding-workbook-cattsunami branch July 18, 2024 23:29
github-merge-queue bot pushed a commit that referenced this pull request Jul 19, 2024
* adding new notebook for using fairchem models with NEBs without CatTSunami enumeration (#764)

* adding new notebook for using fairchem models with NEBs

* adding md tutorials

* blocking code cells that arent needed or take too long

* updating approach to path to work with ipython

* adding seed to NRR example which randomly had not configs on last push

* Update docs/tutorials/NRR/NRR_example.md

* adding file :(|)

* skip neb execution

* status check on merge queue

---------

Co-authored-by: zulissimeta <[email protected]>
Co-authored-by: Muhammed Shuaibi <[email protected]>
github-merge-queue bot pushed a commit that referenced this pull request Jul 19, 2024
* adding new notebook for using fairchem models with NEBs without CatTSunami enumeration (#764)

* adding new notebook for using fairchem models with NEBs

* adding md tutorials

* blocking code cells that arent needed or take too long

* trigger docs build only on PR review submit

---------

Co-authored-by: Brook Wander <[email protected]>
Co-authored-by: Muhammed Shuaibi <[email protected]>
misko added a commit that referenced this pull request Jul 19, 2024
…ut CatTSunami enumeration (#764)"

This reverts commit 5743a59.
github-merge-queue bot pushed a commit that referenced this pull request Jul 19, 2024
* adding new notebook for using fairchem models with NEBs without CatTSunami enumeration (#764)

* adding new notebook for using fairchem models with NEBs

* adding md tutorials

* blocking code cells that arent needed or take too long

* fix dataset config logic

* add empty val/test if not defined

* add empty dicts for all missing datasets

---------

Co-authored-by: Brook Wander <[email protected]>
Co-authored-by: Muhammed Shuaibi <[email protected]>
Co-authored-by: zulissimeta <[email protected]>
github-merge-queue bot pushed a commit that referenced this pull request Jul 19, 2024
* adding new notebook for using fairchem models with NEBs without CatTSunami enumeration (#764)

* adding new notebook for using fairchem models with NEBs

* adding md tutorials

* blocking code cells that arent needed or take too long

* OCP->FAIRChem + paper list

---------

Co-authored-by: Brook Wander <[email protected]>
Co-authored-by: Muhammed Shuaibi <[email protected]>
github-merge-queue bot pushed a commit that referenced this pull request Jul 22, 2024
* adding new notebook for using fairchem models with NEBs without CatTSunami enumeration (#764)

* adding new notebook for using fairchem models with NEBs

* adding md tutorials

* blocking code cells that arent needed or take too long

* add expandable segments var

* add note

---------

Co-authored-by: Brook Wander <[email protected]>
Co-authored-by: Muhammed Shuaibi <[email protected]>
github-merge-queue bot pushed a commit that referenced this pull request Aug 2, 2024
* Update BalancedBatchSampler to use datasets' `data_sizes` method
Replace BalancedBatchSampler's `force_balancing` and `throw_on_error` parameters with `on_error`

* Remove python 3.10 syntax

* Documentation

* Added set_epoch method

* Format

* Changed "resolved dataset" message to be a debug log to reduce log spam

* clean up batchsampler and tests

* base dataset class

* move lin_ref to base dataset

* inherit basedataset for ase dataset

* filter indices prop

* added create_dataset fn

* yaml load fix

* create dataset function instead of filtering in base

* remove filtered_indices

* make create_dataset and LMDBDatabase importable from datasets

* create_dataset cleanup

* test create_dataset

* use metadata.natoms directly and add it to subset

* use self.indices to handle shard

* rename _data_sizes

* fix Subset of metadata

* minor change to metadata, added full path option

* import updates

* implement get_metadata for datasets; add tests for max_atoms and balanced partitioning

* a[:len(a)+1] does not throw error, change to check for this

* off by one fix

* fixing tests

* plug create_dataset into trainer

* remove datasetwithsizes; fix base dataset integration; replace close_db with __del__

* lint

* add/fix test;

* adding new notebook for using fairchem models with NEBs without CatTSunami enumeration (#764)

* adding new notebook for using fairchem models with NEBs

* adding md tutorials

* blocking code cells that arent needed or take too long

* Add extra test case for local batch size = 1

* fix example

* fix test case

* reorg changes

* remove metadata_has_sizes in favor of basedataset function metadata_hasattr

* fix data_parallel typo

* fix up some tests

* rename get_metadata to sample_property_metadata

* add slow get_metadata for ase; add tests for get_metadata (ase+lmdb); add test for make lmdb metadata sizes

* add support for different backends and ddp in pytest

* fix tests and balanced batch sampler

* make default dataset lmdb

* lint

* fix tests

* test with world_size=0 by default

* fix tests

* fix tests..

* remove subsample from oc22 dataset

* remove old datasets; add test for noddp

* remove load balancing from docs

* fix docs; add train_split_settings and test for this

---------

Co-authored-by: Nima Shoghi <[email protected]>
Co-authored-by: Nima Shoghi <[email protected]>
Co-authored-by: lbluque <[email protected]>
Co-authored-by: Brandon <[email protected]>
Co-authored-by: Brook Wander <[email protected]>
Co-authored-by: Muhammed Shuaibi <[email protected]>
Co-authored-by: Muhammed Shuaibi <[email protected]>
github-merge-queue bot pushed a commit that referenced this pull request Aug 5, 2024
* denorm targets in _forward only

* linear reference class

* atomref in normalizer

* raise input error

* clean up normalizer interface

* add element refs

* add element refs correctly

* ruff

* fix save_checkpoint

* reference and dereference

* 2xnorm linref trainer add

* clean-up

* otf linear reference fit

* fix tensor device

* otf element references and normalizers

* use only present elements when fitting

* lint

* _forward norm and derefd values

* fix list of paths in src

* total mean and std

* fitted flag to avoid refitting normalizers/references on rerun

* allow passing lstsq driver

* element ref unit tests

* remove superfluous type

* lint fix

* allow setting batch_size explicitly

* test applying element refs

* normalizer tests

* increase distributed timeout

* save normalizers and linear refs in otf_fit

* remove debug code

* fix removing refs

* swap otf_fit for fit, and save all normalizers in one file

* log loading and saving normalizers

* fit references and normalizer scripts

* lint fixes

* allow absent optim key in config

* lin-ref description

* read files based on extension

* pass seed

* rename dataset fixture

* check if file is none

* pass generator correctly

* separate method for norms and refs

* add normalizer code back

* fix Generator construction

* import order

* log warnings if multiple inputs are passed

* raise Error if duplicate references or norms are set

* use len batch

* assert element reference targets are scalar

* fix name and rename method

* load and save norms and refs using same logic

* fix creating normalizer

* remove print statements

* adding new notebook for using fairchem models with NEBs without CatTSunami enumeration (#764)

* adding new notebook for using fairchem models with NEBs

* adding md tutorials

* blocking code cells that arent needed or take too long

* warn instead of error when duplicate norm/ref target names

* allow timeout to be read from config

* test seed noseed ref fits

* lotsa refactoring

* lotsa fixing

* more fixing...

* num_workers zero to prevent mp issues

* add otf norms smoke test and fixes

* allow overriding normalization fit values

* update tests

* fix normalizer loading

* use rmsd instead of only stdev

* fix tests

* correct rmsd calc and fix loading

* clean up norm loading and log values

* logg linear reference metrics

* load element references state dict

* fix loading and tests

* fix imports in scripts

* fix test?

* fix test

* use numpy as default to fit references

* minor fixes

* rm torch_tempdir fixture

---------

Co-authored-by: Brook Wander <[email protected]>
Co-authored-by: Muhammed Shuaibi <[email protected]>
lbluque pushed a commit that referenced this pull request Aug 7, 2024
* Update BalancedBatchSampler to use datasets' `data_sizes` method
Replace BalancedBatchSampler's `force_balancing` and `throw_on_error` parameters with `on_error`

* Remove python 3.10 syntax

* Documentation

* Added set_epoch method

* Format

* Changed "resolved dataset" message to be a debug log to reduce log spam

* clean up batchsampler and tests

* base dataset class

* move lin_ref to base dataset

* inherit basedataset for ase dataset

* filter indices prop

* added create_dataset fn

* yaml load fix

* create dataset function instead of filtering in base

* remove filtered_indices

* make create_dataset and LMDBDatabase importable from datasets

* create_dataset cleanup

* test create_dataset

* use metadata.natoms directly and add it to subset

* use self.indices to handle shard

* rename _data_sizes

* fix Subset of metadata

* minor change to metadata, added full path option

* import updates

* implement get_metadata for datasets; add tests for max_atoms and balanced partitioning

* a[:len(a)+1] does not throw error, change to check for this

* off by one fix

* fixing tests

* plug create_dataset into trainer

* remove datasetwithsizes; fix base dataset integration; replace close_db with __del__

* lint

* add/fix test;

* adding new notebook for using fairchem models with NEBs without CatTSunami enumeration (#764)

* adding new notebook for using fairchem models with NEBs

* adding md tutorials

* blocking code cells that arent needed or take too long

* Add extra test case for local batch size = 1

* fix example

* fix test case

* reorg changes

* remove metadata_has_sizes in favor of basedataset function metadata_hasattr

* fix data_parallel typo

* fix up some tests

* rename get_metadata to sample_property_metadata

* add slow get_metadata for ase; add tests for get_metadata (ase+lmdb); add test for make lmdb metadata sizes

* add support for different backends and ddp in pytest

* fix tests and balanced batch sampler

* make default dataset lmdb

* lint

* fix tests

* test with world_size=0 by default

* fix tests

* fix tests..

* remove subsample from oc22 dataset

* remove old datasets; add test for noddp

* remove load balancing from docs

* fix docs; add train_split_settings and test for this

---------

Co-authored-by: Nima Shoghi <[email protected]>
Co-authored-by: Nima Shoghi <[email protected]>
Co-authored-by: lbluque <[email protected]>
Co-authored-by: Brandon <[email protected]>
Co-authored-by: Brook Wander <[email protected]>
Co-authored-by: Muhammed Shuaibi <[email protected]>
Co-authored-by: Muhammed Shuaibi <[email protected]>

(cherry picked from commit 04a69b0)
github-merge-queue bot pushed a commit that referenced this pull request Aug 9, 2024
* Add solvent interface placement code

* circular import

* adding new notebook for using fairchem models with NEBs without CatTSunami enumeration (#764)

* adding new notebook for using fairchem models with NEBs

* adding md tutorials

* blocking code cells that arent needed or take too long

* packmol install fix

* add packmol to github actions path

* speed up random slab generation

* support for pymatgen update

* save ion id when randomly sampled

* vasp 6.3 ml flags are slightly different

* add vdw to bulks

* add lasph

* ncore=4

* typing, docstring, cleanup geometry

* typing

---------

Co-authored-by: Brook Wander <[email protected]>
github-merge-queue bot pushed a commit that referenced this pull request Aug 21, 2024
* Update BalancedBatchSampler to use datasets' `data_sizes` method
Replace BalancedBatchSampler's `force_balancing` and `throw_on_error` parameters with `on_error`

* Remove python 3.10 syntax

* Documentation

* Added set_epoch method

* Format

* Changed "resolved dataset" message to be a debug log to reduce log spam

* Minor changes to support multitask

* add in pickle data set; add in stat functions for combining mean and variance

* checksums for equiformer

* detach compute metrics and add checksum function for linear layer

* change name to dataset_configs

* add seed option

* remove pickle dataset

* remove pickle dataset

* add experimental datatransform to ase_dataset

* clean up batchsampler and tests

* base dataset class

* move lin_ref to base dataset

* inherit basedataset for ase dataset

* filter indices prop

* updated import for ase dataset

* added create_dataset fn

* yaml load fix

* create dataset function instead of filtering in base

* remove filtered_indices

* make create_dataset and LMDBDatabase importable from datasets

* create_dataset cleanup

* test create_dataset

* use metadata.natoms directly and add it to subset

* use self.indices to handle shard

* rename _data_sizes

* fix Subset of metadata

* fix up to be mergeable

* merge in monorepo

* small fix for import and keyerror

* minor change to metadata, added full path option

* import updates

* minor fix to base dataset

* skip force_balance and seed

* adding get_metadata to base_dataset

* implement get_metadata for datasets; add tests for max_atoms and balanced partitioning

* a[:len(a)+1] does not throw error, change to check for this

* bug fix for base_dataset

* max atoms branch

* fix typo

* do pbc per system

* add option to use single system pbc

* add multiple mapping

* lint and github workflow fixes

* track parent checkpoint for logger grouping

* add generator to basedataset

* check path relative to yaml file

* add load and exit flag to base_trainer

* add in merge mean and std code to utils

* add log when passing through mean or computing; check other paths for includes

* add qos flag

* use slurm_qos instead of qos

* fix includes

* fix set init

* adding new notebook for using fairchem models with NEBs without CatTSunami enumeration (#764)

* adding new notebook for using fairchem models with NEBs

* adding md tutorials

* blocking code cells that arent needed or take too long

* remove files with diff whitespace

* add resolution flag to escn

* try to revert oxides

* revert typing

* remove white space

* extra line never reached

* move out of fmv4 into dev

* move avg num nodes

* optional import from experimental

* fix lint

* add comments, refactor common trainer args in a single dictionary

* add comments, refactor common trainer args in a single dictionary

* remove parent

---------

Co-authored-by: Nima Shoghi <[email protected]>
Co-authored-by: Nima Shoghi <[email protected]>
Co-authored-by: Abhishek Das <[email protected]>
Co-authored-by: lbluque <[email protected]>
Co-authored-by: Brandon <[email protected]>
Co-authored-by: Muhammed Shuaibi <[email protected]>
Co-authored-by: Ray Gao <[email protected]>
Co-authored-by: Brook Wander <[email protected]>
Co-authored-by: Muhammed Shuaibi <[email protected]>
3 participants