Benchmarking Lumol #62

Luthaf · 2016-11-30T15:58:14Z

g-bauer · 2016-11-30T18:20:31Z

For MC, to go to non-rigid molecules or molecules with more than 3 beads, we need a way to sample intramolecular configurations. See #35.

odarbelaeze · 2016-11-30T23:43:51Z

I noticed a few long running tests this morning on the tests folder of the project, I think that would discourage a nice test-on-save policy. It would be beneficial to reevaluate those and see if smaller sizes or move counts would do for those integration tests, and move the ones that require longer periods of time into the benches folder. (I apologise if I'm a little bit off topic here)

Luthaf · 2016-12-01T13:00:05Z

I'd prefer to keep these tests as test, but we can discuss it.

I usually only run the unit tests when developing (cd src/core && cargo test), and only run the integration tests before pushing. I also run the integration tests (the ones in the tests folder) in release mode cargo --test release which cuts a lot of run time here.

My setup is a bit clumsy at the moment, but it will become easier when cargo gets a --all flag (there is a PR for that in cargo repo). At this point, running unit tests will be as simple as cargo test --all --lib from the root directory, and running integration tests will still be cargo test --release.

g-bauer · 2016-12-01T13:39:33Z

You can assign me to the Lennard-Jones cases for NVT and NPT. Before that, I will implement energy cashing for resize_cell_cost. Maybe I manage to add that to the open PR #58 in time.

A practical question: in #58, I added an example case. Somehow, I couldn't manage to make this a #[test]. I compiles, but is not able to find the configuration file. I placed the config in lumol/tests/data/ and the test in lumol/tests. Using cargo test --test my_test --release fails.

I get the same error for existing tests, like mc-helium.rs:

$ cargo test --test mc-helium
# compiling ...
Finished debug [unoptimized + debuginfo] target(s) in 73.96 secs
Running /usr/.../lumol/target/debug/mc_helium-2217974ed8ac85bc

failures:
---- perfect_gaz stdout ----
        thread 'perfect_gaz' panicked at 'called `Result::unwrap()` on an `Err` value: TrajectoryError(Error { kind: NullPtr, message: "Could not open the file data/helium.xyz" })', ../src/libcore/result.rs:799

Any idea what is going wrong or what I am doing wrong here? Do I have to copy the data/ to the target directory?

Luthaf · 2016-12-01T13:43:22Z

Any idea what is going wrong or what I am doing wrong here?

Lets keep this issue on topic, I'll answer you in #58.

Luthaf · 2016-12-09T12:38:30Z

We can use bencher to run the benchmarks using a stable rust compiler, instead of a nightly one.

antoinewdg · 2017-02-09T10:05:15Z

This is a must have if we are to do any kind of optimization on Lumol. I would really like to help on this one, but have no real idea what a reasonable configuration for molecules looks like :( .

g-bauer · 2017-02-09T13:39:16Z

I'd be happy to help you with that. I can create configuration files for all systems mentioned in the OP. To compare Lumol to other codes, we have to make sure that the same force fields are used.

For Lennard-Jones (Argon), we can already cover all cases but the grand-canonical ensemble. Maybe we should also include an atomic mixture (Na, Cl) as intermediate step towards water?

For water and butane it's more difficult. For SPC/E water we can do NVT and NPT comparisons for MC simulations right now. For butane, we'd need #35 for MC simulations, MD should work for force fields that use non-fixed bond lengths.

I'd say we start with comparison benchmarks of Argon.

g-bauer · 2017-02-09T13:42:00Z

I can also perform the simulations in Gromacs, Gromos (maybe DL_Poly) and Cassandra to compare performance.

antoinewdg · 2017-02-09T15:51:13Z

@g-bauer thanks ! I'm OK for starting with argon.
How are we doing this ? You provide me with the input files and I try to wire it all together ?

g-bauer · 2017-02-09T16:20:48Z

How are we doing this ? You provide me with the input files and I try to wire it all together ?

For argon, we already have inputs in the examples/data folder. I'd use the example configuration argon.xyz which has 500 atoms. We also have an example for MD (argon.rs) and for NPT MC (mc_npt_argon.rs).

I think it is more convenient to just use input files instead of rust bins? We can set up an argon force field file (argon.toml) only specifying the potential and then several simulation inputs, like mc_nvt_argon.toml, md_nvt_argon.toml, ... that all make use of the force field and configuration file.

We have to use the same cutoff radius for all simulations (say rc = 2.7 * sigma = 9.1935 Angstrom). Also, we should use the same frequencies to write data to files.

Hopefully, I'll get #94 finished tonight, so that we can use it for the MC part.

Does that make sense? If anything is unclear, feel free to ping me (here or gitter).

Luthaf · 2017-02-09T16:26:59Z

only specifying the potential and then several simulation inputs,

This is the exact use case of this feature 😃 !

Also, we should use the same frequencies to write data to files.

I think it is better not to write anything during the benchmark run. We are not benchmarking how the filesystem behave, and it can have a lot of latency and variations.

antoinewdg · 2017-02-10T09:47:55Z

File organisation is wonky for the benchmarks right now. We need a clear separation between regression benchmarks and comparison benchmarks (right now comparisons are in a other subdirectory inside the regression benchmarks). I agree with @g-bauer that comparison benchmarks should use input files, in my opinion they should use the lumol binary and could even live in a separate repository (they don't have to).

The current way regression benchmarks are done using cargo bench in the benches directory is fine with me. How i would ideally see the file structure for the comparison benchmarks is :

each folder corresponds to a case listed in the OP (ex: mc-butane-nve)
each folder contains one lumol.toml and optionally one input file for each ohter engine (LAMMPS.in, ...) that each corresponds to the same simulation
at the root we have a script (please no bash script) that iterates through all the folders and runs the computations with the different engines
I could even reuse this to profile the code, in dreamland I would even have this run regularly and uploading the results on a website without me having to do anything

Luthaf · 2017-02-10T10:05:17Z

I am OK with this organisation. The current code is pretty old and comes from the very first times of this repository.

please no bash script

What would you use ? I'm all in for Python, or even a rust "script".

I could even reuse this to profile the code, in dreamland I would even have this run regularly and uploading the results on a website without me having to do anything

This is #49. The main documentation is hosted in github pages right now, we could use it for benchmarks too. I was thinking we could write benchmark results to a JSON file, and then load them and plot them using some JS plotting library.

antoinewdg · 2017-02-10T10:20:00Z

@Luthaf yes I'm 100 % in favor of a Python script (I just find bash scripts unreadable).

Plotting the benchmarks would super nice, but in case you guys don't know: the world of JS graph plotting is HELL.

g-bauer · 2017-02-10T10:50:13Z

Sounds good to me.

each folder contains one lumol.toml and optionally one input file for each ohter engine (LAMMPS.in, ...) that each corresponds to the same simulation

Other codes may need plenty of different files (special configuration format, multiple inputs for simulation setup and force fields). Just dropping the files inside a directory will be messy. Maybe another subfolder for every engine?

at the root we have a script (please no bash script) that iterates through all the folders and runs the computations with the different engines

You would also rerun the simulations using other engines? I imagine that is very tedious to set up. To start, could we go with a single run (maybe on different systems), store benchmarks and compare against those?

Plotting the benchmarks would super nice, ...

If we go with python, why not use matplotlib (or jupyter notebooks)? It is easy to use and set up.

antoinewdg · 2017-02-10T11:01:36Z

Other codes may need plenty of different files (special configuration format, multiple inputs for simulation setup and force fields). Just dropping the files inside a directory will be messy. Maybe another subfolder for every engine?

Can the other engines have a single file as input, and have links to other files inside this file ? (as we do with lumol), or do they need everything to be in the same directory ? In the second case, I agree with having a subdirectory for each engine. I don't mean that in the first case we do not have subdirectories, I just mean that we force on each case having the input file, and on a case per case basis we can choose the directory structure.

You would also rerun the simulations using other engines? I imagine that is very tedious to set up. To start, could we go with a single run (maybe on different systems), store benchmarks and compare against those?

What do you mean by tedious ? If you mean long, yeah probably, but that's not really an issue, we expect them to be super long anyways right ? And I don't really know how consistent is performance on Travis CI across builds, so I can see a lot of issues coming from not running them each time (admittedly, Travis CI performance may not be consistent in the same build, that would be an issue).

If we go with python, why not use matplotlib (or jupyter notebooks)? It is easy to use and set up.

How would you integrate them in a web page ?

g-bauer · 2017-02-10T15:05:37Z

Can the other engines have a single file as input, and have links to other files inside this file ?

As far as I know (at least for Gromacs and Cassandra) that is not possible.

What do you mean by tedious ?

I might not understand the whole procedure for the benchmarks but we'd need installations of all codes, right? They often depend on a bunch of additional libraries and are compiled to fit the architecture.

antoinewdg · 2017-02-10T16:25:53Z

I might not understand the whole procedure for the benchmarks but we'd need installations of all codes, right? They often depend on a bunch of additional libraries and are compiled to fit the architecture.

OK, but to benchmark we would have to install all of this on the same machine we want to benchmark Lumol on, so running it each time may not be much more painful than running it once. However installing the other engines on Travis may be a huge pain. Is there something I don't get ?

Luthaf · 2017-02-15T10:13:19Z

If we want to run the benchmarks using a stable compiler, I found two options:

bencher is the standard library benchmark code extracted and adapted;
criterion is a separated library with more setup/measurement options.

g-bauer · 2017-02-15T10:29:13Z

However installing the other engines on Travis may be a huge pain.

That's something I'm not experienced in (Travis). Are there limits of what we can run using Travis? Simulation times, resources, libraries?

Luthaf · 2017-02-15T10:36:54Z

Yes, there are limitations (Travis is intended as a testing service, not a benchmarking service):

Simulation times: I believe the job is killed after 1h
Resources: see here, we get 4GB of memory and 2 cores
Libraries: we can use most of the libraries available in Ubuntu, or build our own (that will take build time, but it can be cached)

I think that we should run specific benchmarks (energy computation for different system, and one complex simulation) on Travis, and run benchmarks comparing with other codes from time to time on our machines, and upload the results.

There might be other providers more suitable for benchmarks too.

antoinewdg · 2017-02-17T09:31:34Z

I didn't know about the 1 hour limit, this can be a problem.
I tried to search for Travis like providers oriented towards performance testing, I didn't find much :(

antoinewdg · 2017-03-22T17:35:24Z

Should we run the current benchmarks on Travis ? From my point of view it would be extremely valuable to be able to run them on a remote machine for each commit: when I run them on my local machine I can basically do nothing else, so my productivity is kinda ruined.

Luthaf added A-Performance A-Tests labels Nov 30, 2016

Luthaf mentioned this issue Nov 30, 2016

Regularly run the benchmarks #49

Open

antoinewdg mentioned this issue Jan 20, 2017

Profiling performance #86

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Benchmarking Lumol #62

Benchmarking Lumol #62

Luthaf commented Nov 30, 2016 •

edited

Loading

g-bauer commented Nov 30, 2016

odarbelaeze commented Nov 30, 2016 •

edited

Loading

Luthaf commented Dec 1, 2016

g-bauer commented Dec 1, 2016 •

edited

Loading

Luthaf commented Dec 1, 2016

Luthaf commented Dec 9, 2016

antoinewdg commented Feb 9, 2017

g-bauer commented Feb 9, 2017 •

edited

Loading

g-bauer commented Feb 9, 2017

antoinewdg commented Feb 9, 2017

g-bauer commented Feb 9, 2017 •

edited

Loading

Luthaf commented Feb 9, 2017

antoinewdg commented Feb 10, 2017

Luthaf commented Feb 10, 2017

antoinewdg commented Feb 10, 2017

g-bauer commented Feb 10, 2017

antoinewdg commented Feb 10, 2017

g-bauer commented Feb 10, 2017

antoinewdg commented Feb 10, 2017 •

edited

Loading

Luthaf commented Feb 15, 2017

g-bauer commented Feb 15, 2017

Luthaf commented Feb 15, 2017

antoinewdg commented Feb 17, 2017

antoinewdg commented Mar 22, 2017

Benchmarking Lumol #62

Benchmarking Lumol #62

Comments

Luthaf commented Nov 30, 2016 • edited Loading

Regression benchmarks

Simulation benchmarks

g-bauer commented Nov 30, 2016

odarbelaeze commented Nov 30, 2016 • edited Loading

Luthaf commented Dec 1, 2016

g-bauer commented Dec 1, 2016 • edited Loading

Luthaf commented Dec 1, 2016

Luthaf commented Dec 9, 2016

antoinewdg commented Feb 9, 2017

g-bauer commented Feb 9, 2017 • edited Loading

g-bauer commented Feb 9, 2017

antoinewdg commented Feb 9, 2017

g-bauer commented Feb 9, 2017 • edited Loading

Luthaf commented Feb 9, 2017

antoinewdg commented Feb 10, 2017

Luthaf commented Feb 10, 2017

antoinewdg commented Feb 10, 2017

g-bauer commented Feb 10, 2017

antoinewdg commented Feb 10, 2017

g-bauer commented Feb 10, 2017

antoinewdg commented Feb 10, 2017 • edited Loading

Luthaf commented Feb 15, 2017

g-bauer commented Feb 15, 2017

Luthaf commented Feb 15, 2017

antoinewdg commented Feb 17, 2017

antoinewdg commented Mar 22, 2017

Luthaf commented Nov 30, 2016 •

edited

Loading

odarbelaeze commented Nov 30, 2016 •

edited

Loading

g-bauer commented Dec 1, 2016 •

edited

Loading

g-bauer commented Feb 9, 2017 •

edited

Loading

g-bauer commented Feb 9, 2017 •

edited

Loading

antoinewdg commented Feb 10, 2017 •

edited

Loading