implementing a leaderboard #63

Open
sgbaird opened this issue Aug 24, 2022 · 5 comments

Comments

@sgbaird
Member

sgbaird commented Aug 24, 2022

Would probably be good to have a somewhat temporary leaderboard (either in the README or as a separate markdown file that gets displayed prominently on the documentation page) and then in the long-term add it to Matbench. materialsproject/matbench#150 (comment)

For the short-term implementation, maybe just tables with bolding applied to the best values within some tolerance (perhaps 5%). It might be nice to see composition-only, structure-only, and composition + structure metrics, with composition + structure being the default or most prominently displayed. In other words, display the combined metric, where both the structure and composition conditions need to be met. The other two (where the only requirement is meeting the composition condition, or only the structure condition) are instructive and help us understand where certain algorithms are lacking. Right now, I don't think the API tracks the composition and structure conditions independently of each other.
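A minimal sketch of the bolding idea, assuming results arrive as a plain `{model: {metric: value}}` dict and that higher values are better. The function name, the model names, and the numbers in the usage note are placeholders for illustration, not actual matbench-genmetrics output or API.

```python
def format_leaderboard(results, tol=0.05, higher_is_better=True):
    """Render {model: {metric: value}} as a Markdown table, bolding near-best values."""
    metrics = sorted({m for scores in results.values() for m in scores})

    # Best value per metric across all models that report it.
    best = {}
    for metric in metrics:
        values = [s[metric] for s in results.values() if metric in s]
        best[metric] = max(values) if higher_is_better else min(values)

    def is_near_best(value, top):
        # Relative tolerance: within tol * |best| of the best value.
        return abs(value - top) <= tol * abs(top)

    lines = [
        "| model | " + " | ".join(metrics) + " |",
        "|" + " --- |" * (len(metrics) + 1),
    ]
    for model, scores in results.items():
        cells = []
        for metric in metrics:
            if metric not in scores:
                cells.append("n/a")
                continue
            text = f"{scores[metric]:.3f}"
            bold = is_near_best(scores[metric], best[metric])
            cells.append(f"**{text}**" if bold else text)
        lines.append(f"| {model} | " + " | ".join(cells) + " |")
    return "\n".join(lines)
```

For example, `format_leaderboard({"model_a": {"combined": 0.50}, "model_b": {"combined": 0.49}})` bolds both entries, since 0.49 is within 5% of 0.50. Separate `composition`, `structure`, and `combined` columns would fall out naturally if each is reported as its own metric key.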

@kjappelbaum

Happy to factor out code from https://github.com/kjappelbaum/mofdscribe/blob/main/dev_scripts/update_bench.py if it helps. The idea I use in mofdscribe is that there is one JSON file that is produced.

These JSON files are compiled to RST, which a Sphinx extension can then use to create the tables. (You can also embed interactive plots as HTML.)
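As a rough illustration of the JSON-to-RST step (not the actual mofdscribe code), the sketch below turns a JSON list of results into a reStructuredText `list-table` directive. The JSON schema (`model` and `metrics` keys) and the function name are assumptions made for this example.

```python
import json


def results_to_rst(json_text, title="Leaderboard"):
    """Convert a JSON list of {"model": ..., "metrics": {...}} entries to an RST list-table."""
    entries = json.loads(json_text)
    metrics = sorted({m for entry in entries for m in entry["metrics"]})

    # First row is the header; missing metrics are shown as "n/a".
    rows = [["model"] + metrics]
    for entry in entries:
        values = [str(entry["metrics"].get(m, "n/a")) for m in metrics]
        rows.append([entry["model"]] + values)

    lines = [f".. list-table:: {title}", "   :header-rows: 1", ""]
    for row in rows:
        lines.append(f"   * - {row[0]}")
        lines.extend(f"     - {cell}" for cell in row[1:])
    return "\n".join(lines) + "\n"
```

The resulting text can be written to an `.rst` file that the documentation includes, so Sphinx renders the table without any custom directive.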

@sgbaird
Member Author

sgbaird commented Oct 21, 2022

> Happy to factor out code from kjappelbaum/mofdscribe@main/dev_scripts/update_bench.py if it helps. The idea I use in mofdscribe is that there is one json that is produced.

@kjappelbaum ooh, interesting! I'd like to dig into this some more. mofdscribe really is a batteries-included package 😄 @ardunn + co. and I have been discussing integrating matbench-genmetrics into the Matbench leaderboard in a Matbench 2.0. @ardunn mentioned that it should be pretty straightforward. Right now I'm working on getting some leaderboard results before deciding on the final architecture of the leaderboard, since it might make sense to change some of the metrics depending on how the results turn out. At minimum, I'd like to have a couple of models with xtal2png, one with FTCP, and one with CDVAE. So far I have one model trained using imagen-pytorch (ElucidatedImagen) + xtal2png with the default hyperparameters. It ran on an A100 for a few days, so it's quite compute-heavy.

@sgbaird
Member Author

sgbaird commented Jun 17, 2023

Hi @kjappelbaum, I'm circling back to this and planning to submit another JOSS manuscript. I think implementing a simple leaderboard within the repository would be better than trying to incorporate it elsewhere. Would you still be willing to factor out the leaderboard code from mofdscribe like you mentioned?

@kjappelbaum

kjappelbaum commented Jun 30, 2023

Ah, somehow I didn't get the notification from this issue.

From our email thread

I think we might have a simple config file for such a benchmarking package in which you can set:

- dir in which results are stored
- dir into which the leaderboard pages will be written
- path to conf.py for sphinx
- metrics to be logged

For the abstractions, I think I’d implement it as a workflow in which we have the option for users to add various callbacks, e.g. if they want to customize plotting.
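One way the config and callback ideas above might look, as a sketch only: the class, field, and function names here are hypothetical and not taken from mofdscribe or any existing package.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable


@dataclass
class LeaderboardConfig:
    """Hypothetical config mirroring the four settings listed above."""
    results_dir: Path   # dir in which results are stored
    output_dir: Path    # dir into which the leaderboard pages are written
    sphinx_conf: Path   # path to conf.py for Sphinx
    metrics: list[str] = field(default_factory=list)  # metrics to be logged


# A callback receives the config and the filtered results, e.g. to draw custom plots.
Callback = Callable[[LeaderboardConfig, list[dict]], None]


def build_leaderboard(config, results, callbacks=()):
    """Keep only the configured metrics, then hand the results to each callback."""
    filtered = [
        {
            "model": r["model"],
            "metrics": {m: r["metrics"][m] for m in config.metrics if m in r["metrics"]},
        }
        for r in results
    ]
    for callback in callbacks:
        callback(config, filtered)
    return filtered
```

A user who wants custom plotting would pass their own function in `callbacks`; the default workflow stays unchanged for everyone else.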

Besides, there are some reusable utils, such as the watermark

https://github.com/kjappelbaum/mofdscribe/blob/main/src/mofdscribe/bench/watermark.py

that we could also ship in such a package.

There would perhaps also be a “start” command/CLI that adds the required configuration to the sphinx configuration file.
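A possible shape for such a "start" command, sketched with `argparse` from the standard library. It assumes `conf.py` already defines an `extensions` list; the extension name `my_leaderboard_ext` and the program name are placeholders, since no such package exists yet.

```python
import argparse
from pathlib import Path


def register_extension(conf_path, extension):
    """Append `extension` to the Sphinx extensions list in conf.py; return True if changed."""
    conf_path = Path(conf_path)
    text = conf_path.read_text(encoding="utf-8")
    line = f'extensions.append("{extension}")'
    if line in text:
        return False  # already configured, so running the command twice is harmless
    marker = "# added by the leaderboard start command (hypothetical)"
    conf_path.write_text(text.rstrip("\n") + f"\n\n{marker}\n{line}\n", encoding="utf-8")
    return True


def main(argv=None):
    parser = argparse.ArgumentParser(prog="leaderboard-start")
    parser.add_argument("conf", help="path to the Sphinx conf.py")
    parser.add_argument("--extension", default="my_leaderboard_ext")  # placeholder name
    args = parser.parse_args(argv)
    changed = register_extension(args.conf, args.extension)
    print("updated" if changed else "already configured")
    return 0
```

Calling `main(["docs/conf.py"])` (path chosen for illustration) would append one line to that file. Appending rather than rewriting the existing `extensions = [...]` assignment avoids having to parse the user's configuration.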

If we both agree on this setup, I'll make some time end of next week to do it.

@sgbaird
Member Author

sgbaird commented Jul 1, 2023

@kjappelbaum no worries, I think I sent this message concurrently with the email thread. This sounds great to me. Thank you!
