
use xtal2png with imagen-pytorch and matbench-genmetrics #204

Open
sgbaird opened this issue Aug 20, 2022 · 6 comments

Comments

@sgbaird
Member

sgbaird commented Aug 20, 2022

matbench-genmetrics is in a usable state now #12 (comment)

I think imagen-pytorch can be used with a TPU, but I'm not sure how much custom configuration is required: https://github.com/sparks-baird/xtal2png/blob/main/notebooks/3.1-imagen-pytorch.ipynb

I might just need to try it on Colab, switch to TPU, and see what happens. I think the latest versions use the 🤗 Accelerate library, which should make it easier to switch over. I'm unsure whether I should focus more on hyperparameter tuning or just pick some reasonable defaults and train for as long as seems reasonable (a week or two, for example). Whether I go with my university HPC or with TPU time, I can still do checkpointing.

@sgbaird
Member Author

sgbaird commented Oct 22, 2022

@ ~2000 epochs (4x4 tile)
image

@kjappelbaum
Contributor

> @ ~2000 epochs (4x4 tile)

do they decode to some reasonable materials? :D

@sgbaird
Member Author

sgbaird commented Oct 28, 2022

> do they decode to some reasonable materials? :D

I'm going to go with a pretty confident "no" 😬

[image: xtal2png-imagen-pytorch-epoch=1999-6x5]

I think I'm also going to say the metrics need some work (note this is for 1000 generated structures):

{0: {'validity': 0.4092998941577499, 'coverage': 0.0, 'novelty': 1.0, 'uniqueness': 1.0}}
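For anyone reading these numbers for the first time, here is a toy sketch of how set-style coverage, novelty, and uniqueness can be defined. This is not the matbench-genmetrics implementation: the function names are hypothetical, structures are stand-in strings compared by exact equality rather than by a structure matcher, and validity is left out of the sketch. It only illustrates why coverage of 0.0 can sit next to novelty and uniqueness of 1.0 when nothing generated matches either the training or the held-out data.

```python
# Toy illustration only; names and definitions here are assumptions,
# not the matbench-genmetrics API.

def coverage(generated, test):
    """Fraction of held-out test items that appear among the generated items."""
    gen = set(generated)
    return sum(t in gen for t in test) / len(test)


def novelty(generated, train):
    """Fraction of generated items that do not appear in the training data."""
    train_set = set(train)
    return sum(g not in train_set for g in generated) / len(generated)


def uniqueness(generated):
    """Fraction of generated items that are distinct from one another."""
    return len(set(generated)) / len(generated)


# Stand-in "structures" (plain strings, compared by equality).
train = ["NaCl", "MgO", "SiO2"]
test = ["GaN", "ZnS"]
generated = ["X1", "X2", "X3", "X4"]  # matches nothing in train or test

metrics = {
    "coverage": coverage(generated, test),
    "novelty": novelty(generated, train),
    "uniqueness": uniqueness(generated),
}
print(metrics)  # {'coverage': 0.0, 'novelty': 1.0, 'uniqueness': 1.0}
```

In this toy setting, a generator that emits unrelated outputs scores perfectly on novelty and uniqueness while rediscovering nothing, so those two numbers say little on their own.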

@sgbaird
Member Author

sgbaird commented Oct 28, 2022

While I'm sure there's a lot to be done with the hyperparameters, I think I'll take another shot at running CDVAE for comparison.

@HarshaSatyavardhan

I am seeing coverage as 0. Is that not concerning? I myself have tried and got a coverage of 0 with different variations of DDPM.

@sgbaird
Member Author

sgbaird commented Oct 13, 2023

@HarshaSatyavardhan thanks for the great question. Concerning - yes, though the idea of rediscovery is quite difficult. To make the point, see what the authors of PGCGM needed to do before "moving the needle" past 0 in https://www.nature.com/articles/s41524-023-01059-8:

image

Notice how the first bar along the horizontal axis starts at 50*10000 = 500,000. Also, the coverage benchmark from matbench-genmetrics is even more difficult to succeed at than what PGCGM did because it uses time-based splits (i.e., not just can we discover something that was held out, but can we discover something "in the future" based only on training data before some calendar year).
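To make the distinction concrete, here is a minimal sketch of a time-based split next to a random one. The records, years, cutoff, and function names are all made up for illustration; this is not how matbench-genmetrics builds its folds.

```python
import random

# Hypothetical records: (material id, year first reported).
records = [("A", 2005), ("B", 2010), ("C", 2014),
           ("D", 2017), ("E", 2019), ("F", 2021)]


def time_split(records, cutoff_year):
    """Train on everything before the cutoff year, test on the rest."""
    train = [r for r in records if r[1] < cutoff_year]
    test = [r for r in records if r[1] >= cutoff_year]
    return train, test


def random_split(records, test_fraction, seed=0):
    """Hold out a random subset, ignoring the year entirely."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


train, test = time_split(records, cutoff_year=2018)
print(train)  # [('A', 2005), ('B', 2010), ('C', 2014), ('D', 2017)]
print(test)   # [('E', 2019), ('F', 2021)]

# A random split may put older materials in the test set and newer ones in train.
r_train, r_test = random_split(records, test_fraction=0.33)
```

With the time-based split, every test item is newer than every training item, so a match means the model produced something it could not have seen in any form during training.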

> I myself have tried and got a coverage of 0 with different variations of DDPM.

Thank you for sharing this.

Open to thoughts or suggestions you have. I think both xtal2png and the benchmarks themselves could be improved. Aside: matbench-genmetrics is under review at openjournals/joss-reviews#5618.

cc @hasan-sayeed @sp8rks @michaeldalverson
