Add automated benchmarks #256

Merged · 41 commits into main · Sep 4, 2024
Conversation

GeorgeR227 (Collaborator)

This PR adds scripts for running automated Decapodes benchmarks using SLURM. Closes #255.

@GeorgeR227 (Collaborator, Author)

Example table produced by this pipeline:

| float_type | code_target | resolution | Setup (s) | Mesh (s) | Simulate (s) | Solve (s) |
|------------|-------------|------------|-----------|----------|--------------|-----------|
| Float32 | CPUTarget | 1 | 0.00442026 | 2.06637 | 0.0048096 | 0.407688 |
| Float32 | CPUTarget | 1 | 0.00606146 | 1.25604 | 0.00592313 | 0.539786 |
| Float32 | CPUTarget | 2 | 0.00604858 | 0.0633645 | 0.00154441 | 0.129122 |
| Float32 | CPUTarget | 2 | 0.00476156 | 0.0485201 | 0.00120713 | 0.0989334 |
| Float32 | CPUTarget | 5 | 0.00601547 | 0.00268157 | 0.00026683 | 0.0207069 |
| Float32 | CPUTarget | 5 | 0.00491691 | 0.00213588 | 0.000213745 | 0.0166049 |
| Float64 | CPUTarget | 1 | 0.00578402 | 2.60628 | 0.00817521 | 0.582806 |
| Float64 | CPUTarget | 1 | 0.00421326 | 1.29076 | 0.00487088 | 0.423234 |
| Float64 | CPUTarget | 2 | 0.00604492 | 0.0653029 | 0.00152405 | 0.120193 |
| Float64 | CPUTarget | 2 | 0.00606543 | 0.0722557 | 0.0016161 | 0.130485 |
| Float64 | CPUTarget | 5 | 0.00578138 | 0.00266238 | 0.000281898 | 0.0200672 |
| Float64 | CPUTarget | 5 | 0.0065762 | 0.00273011 | 0.000276936 | 0.022008 |

@GeorgeR227 (Collaborator, Author)

Tables now include more info; I also fixed a bug where the times were being put in the wrong columns:

| Task ID | float_type | code_target | resolution | Setup (s) | Mesh (s) | Simulate (s) | Solve (s) | Steps | Steps/Second |
|---------|------------|-------------|------------|-----------|----------|--------------|-----------|-------|--------------|
| 3 | Float32 | CPUTarget | 1 | 0.00425298 | 0.400341 | 0.00475522 | 0.808238 | 1554 | 1922 |
| 6 | Float32 | CPUTarget | 1 | 0.00433072 | 0.410321 | 0.00477151 | 0.840614 | 1554 | 1848 |
| 2 | Float32 | CPUTarget | 2 | 0.00409975 | 0.097131 | 0.00121753 | 0.0544297 | 401 | 7367 |
| 5 | Float32 | CPUTarget | 2 | 0.00429304 | 0.0966397 | 0.0012154 | 0.0554229 | 401 | 7235 |
| 1 | Float32 | CPUTarget | 5 | 0.00414652 | 0.0156887 | 0.000215182 | 0.00250875 | 78 | 31091 |
| 4 | Float32 | CPUTarget | 5 | 0.00415102 | 0.0157135 | 0.000207889 | 0.00235587 | 78 | 33108 |
| 12 | Float64 | CPUTarget | 1 | 0.00469358 | 0.463062 | 0.00565343 | 1.32324 | 1549 | 1170 |
| 9 | Float64 | CPUTarget | 1 | 0.00574905 | 0.488389 | 0.00629864 | 1.24648 | 1549 | 1242 |
| 11 | Float64 | CPUTarget | 2 | 0.00573913 | 0.118075 | 0.00158401 | 0.0829361 | 395 | 4762 |
| 8 | Float64 | CPUTarget | 2 | 0.00537613 | 0.117056 | 0.00159936 | 0.0816467 | 395 | 4837 |
| 10 | Float64 | CPUTarget | 5 | 0.00558345 | 0.0196298 | 0.000277577 | 0.00262849 | 72 | 27392 |
| 7 | Float64 | CPUTarget | 5 | 0.00542299 | 0.0189401 | 0.000276971 | 0.00265752 | 72 | 27092 |

@GeorgeR227 marked this pull request as ready for review July 12, 2024 18:08
@GeorgeR227 requested a review from lukem12345 July 12, 2024 18:20
@GeorgeR227 (Collaborator, Author) commented Jul 18, 2024

I've just integrated DrWatson.jl into the benchmarking suite. Its ability to handle different input parameters and keep track of file directories is awesome, but it seems to have trouble reading the data back in from the saved benchmark table correctly. DrWatson really seems to prefer working with JLD2, but BenchmarkTools saves its results as JSON, which also includes a lot of extra information.
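For example, something along these lines could bridge the two formats (a rough sketch; the file names and the `median_times` summary are illustrative, not code from this PR):

```julia
using DrWatson
@quickactivate :benchmarks
using BenchmarkTools

# BenchmarkTools.load returns a vector of the values stored in the JSON file;
# here we assume the first entry is a BenchmarkGroup. File names are illustrative.
group = BenchmarkTools.load(datadir("sims", "heat_benchmarks.json"))[1]

# Re-save just the median times as JLD2 so DrWatson can read them back cleanly.
median_times = Dict(string(k) => time(median(t)) for (k, t) in leaves(group))
wsave(datadir("sims", "heat_benchmarks.jld2"), Dict("median_times" => median_times))
```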

@GeorgeR227 requested a review from jpfairbanks July 31, 2024 17:46
```
using DrWatson
@quickactivate :benchmarks
```
which auto-activate the project, enable local path handling from DrWatson, and provide several helper functions.
Member

I followed these instructions and got this error message:

```
(Decapodes) pkg> instantiate

julia> @quickactivate :benchmarks
ERROR: ArgumentError: Package benchmarks not found in current path.
- Run `import Pkg; Pkg.add("benchmarks")` to install the benchmarks package.
Stacktrace:
 [1] macro expansion
   @ Base ./loading.jl:1766 [inlined]
 [2] macro expansion
   @ Base ./lock.jl:267 [inlined]
 [3] __require(into::Module, mod::Symbol)
   @ Base ./loading.jl:1747
 [4] #invoke_in_world#3
   @ Base ./essentials.jl:921 [inlined]
 [5] invoke_in_world
   @ Base ./essentials.jl:918 [inlined]
 [6] require(into::Module, mod::Symbol)
   @ Base ./loading.jl:1740
 [7] macro expansion
   @ ~/.julia/packages/DrWatson/qmLuV/src/project_setup.jl:213 [inlined]
 [8] top-level scope
   @ REPL[6]:1
```
Member

`(Decapodes) pkg> dev ./benchmarks` fixed it, although I got a circular dependency warning, so now I can't precompile Decapodes and benchmarks.

Member

It seems like the instructions should be:

  1. Clone the Decapodes repo
  2. `cd` to the `Decapodes/benchmarks` directory
  3. Run `julia> ]dev ..`
  4. Run `julia> include("main.jl")`

and this `main.jl` should run all the benchmarks that are configured in the `main_config.toml` file. This way it is more like a docs build than an interactive tool.

Member

Maybe we want `main.jl` to define a function `benchmark(name::String, config="main_config.toml")` that runs the simulation with that name, and `benchmark(config="main_config.toml")` that runs all the configured benchmarks.
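A minimal sketch of that interface, assuming a hypothetical `run_benchmark(name, params)` helper that does the actual work (keyword arguments avoid the dispatch ambiguity a positional `config` would cause):

```julia
using TOML

# Run the single simulation with the given name from the config file.
function benchmark(name::String; config = "main_config.toml")
    sims = TOML.parsefile(config)
    haskey(sims, name) || error("No configuration information found for $name")
    run_benchmark(name, sims[name])  # hypothetical helper
end

# Run every benchmark listed in the config file.
function benchmark(; config = "main_config.toml")
    for (name, params) in TOML.parsefile(config)
        run_benchmark(name, params)  # hypothetical helper
    end
end
```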

GeorgeR227 (Collaborator, Author)

Yeah, I can add that to `main.jl`. Actually, a lot of the code there can probably be refactored.

I understand the confusion with the DrWatson instructions. `@quickactivate :benchmarks` works by searching upward for a project TOML named `benchmarks`, so running it from the Decapodes root won't work, as you've seen. However, running it from within the benchmarks dir should work. I'll add an instruction asking the user to cd into `benchmarks`.


Please view `main_config.toml` as a guiding example of how to craft your own TOML.

**Warning**: `config_generate.jl` is not called automatically, so it is up to you to run the script before launching benchmarks.
Member

If you don't call this, the error message is quite cryptic. Maybe the `main.jl` script should check that the configs exist and, if not, automatically call `config_generate.jl` for you?
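For instance, `main.jl` could start with a check like this (a sketch; the `configs` directory and the script's location are assumptions about the layout):

```julia
# Regenerate the per-simulation configs when none have been generated yet.
config_dir = joinpath(@__DIR__, "configs")
if !isdir(config_dir) || isempty(readdir(config_dir))
    @info "No generated configs found, running config_generate.jl"
    include(joinpath(@__DIR__, "config_generate.jl"))  # adjust to the script's actual path
end
```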

GeorgeR227 (Collaborator, Author)

If we do this, we should just always call `config_generate.jl` to avoid stale configs. I'd like to update that cryptic error; was the error you referenced "No configuration information found for $sim_name"?

Member

Yeah, that was the cryptic error. I feel like a makefile approach of conditional compilation is such a good fit for what you are doing: if the individual configs don't exist, then you need to regenerate them. But I think we can get rid of that step entirely per a later comment.

1. `setup_benchmark`, which will create the Decapode and run `eval(gensim(...))` on it. Returns the evaluated function.
2. `create_mesh`, which will create the mesh upon which the simulation will run and also initialize the initial conditions and any constants/parameters. Returns the mesh, initial conditions, and constants/parameters, in that order.
3. `create_simulate`, which will take the generated mesh and evaluated function and run the `simulate` function. Returns the resulting function.
4. `run_simulation`, which will take the resulting simulation function, initial conditions, and constants/parameters and run the solve. Returns the result of the solve. (A sketch of this interface follows below.)
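For concreteness, here is a sketch of a physics file implementing these four functions for the heat equation, following the publicly documented Decapodes workflow; the mesh construction, initial condition, time span, and the `config.resolution` field are illustrative choices, not code from this PR:

```julia
using Decapodes, DiagrammaticEquations, CombinatorialSpaces
using ComponentArrays, OrdinaryDiffEq
using GeometryBasics: Point3
const Point3D = Point3{Float64}

# 1. Build the Decapode and evaluate its generated simulation code.
function setup_benchmark(config)
    Heat = @decapode begin
        C::Form0
        ∂ₜ(C) == Δ(C)
    end
    eval(gensim(Heat))
end

# 2. Build the mesh and the initial conditions/constants (values illustrative).
function create_mesh(config)
    s = triangulated_grid(100, 100, config.resolution, config.resolution, Point3D)
    sd = EmbeddedDeltaDualComplex2D{Bool, Float64, Point3D}(s)
    subdivide_duals!(sd, Circumcenter())
    u0 = ComponentArray(C = map(p -> exp(-((p[1] - 50)^2 + (p[2] - 50)^2) / 100),
                                sd[:point]))
    (sd, u0, ())
end

# 3. Instantiate the generated simulation on this mesh.
create_simulate(config, sd, sim) = sim(sd, nothing)

# 4. Run the solve and return the solution.
function run_simulation(config, fm, u0, constants)
    prob = ODEProblem(fm, u0, (0.0, 10.0), constants)
    solve(prob, Tsit5(), save_everystep = false)
end
```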
Member

Instead of relying on functions with particular names, we could make a struct in Decapodes, say `SimulationConfig`, that has these 4 functions as fields; constructors can then provide default implementations of these fields where a generic implementation makes sense.

That would give users a way of encapsulating a simulation into a reusable chunk for their benchmarks, which would live outside this repo.
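A minimal sketch of that struct; the field names follow the four-function interface above, and the convenience constructor with generic defaults is an assumption about where a generic implementation makes sense:

```julia
using OrdinaryDiffEq

# Package one simulation's four stages into a reusable value that can live
# outside this repo.
struct SimulationConfig
    setup_benchmark::Function
    create_mesh::Function
    create_simulate::Function
    run_simulation::Function
end

# A convenience constructor can fill in generic defaults for the last two
# stages, since instantiating and solving look the same for most Decapodes.
SimulationConfig(setup::Function, mesh::Function) = SimulationConfig(
    setup,
    mesh,
    (config, sd, sim) -> sim(sd, nothing),
    (config, fm, u0, constants) ->
        solve(ODEProblem(fm, u0, (0.0, 10.0), constants), Tsit5()),
)
```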

GeorgeR227 (Collaborator, Author)

Yeah, that sounds good. This struct can live in the src and be brought into scope with `@quickactivate :benchmarks`. I'll also add a function field that takes in the file config information and organizes it into something the simulation can use throughout.


**Warning**: Note that not all information from the benchmarking run is saved to the result files, and any files in `data/sims/"sim_name"` will be deleted upon the next benchmark run. On the other hand, result files in the `autogen` directory mentioned before will never be deleted by the benchmarking suite.

An example Markdown file, called `final.md`, is output in `data/exp_pro/"sim_name"/"slurm_job_id"` for user inspection.
Member

What does `exp_pro` stand for? A more informative name would be helpful here.

GeorgeR227 (Collaborator, Author)

`exp_pro` is a default folder provided by DrWatson, meant for "Data from processing experiments" per their docs. But it's fair enough to consider that we don't actually need to use these folders.

```toml
[heat.cuda]
code_target = "CUDATarget"
float_type = ["Float32", "Float64"]
resolution = [5, 2, 1]
```
jpfairbanks (Member) commented Aug 2, 2024

Now that I understand that:

  1. each simulation still needs code for its setup, which goes in its own folder
  2. the simulations could in theory have different dependencies (because they are in their own folder)
  3. there is a `config_generate.jl` script that copies this data into individual config TOMLs for each simulation

it makes sense to me that these config TOMLs actually live one per folder and specify all the configurations of the same simulation that you want to run. The global `main_config.toml` is then just a list of all the configurations that you want to run, much like `docs/make.jl` has a big list of all the pages in the docs to tell you which markdown files to include.

There is also the possibility of doing a makefile with a rule like:

> for every folder `f` in `benchmarks/src/$f` there is a target `benchmarks/results/$f/run.out`, and in order to make that target you need the dependencies `benchmarks/src/$f/main.jl` and `benchmarks/src/$f/config.toml`

Then you could run `make all` and have all the benchmarks run, or run a specific benchmark with `make benchmarks/results/f/run.out`. A make rule that depends on all the `run.out` targets could then be used to make the final summary tables in HTML format.
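Sketched as an actual makefile, run from `benchmarks/` (the `results/` layout and the `scripts/summarize.jl` helper are hypothetical; recipe lines are tab-indented):

```make
# One run.out target per physics folder under benchmarks/src/.
SIMS := $(notdir $(wildcard src/*))
OUTS := $(SIMS:%=results/%/run.out)

all: $(OUTS)

# Rebuild a benchmark's output when its code or config changes.
results/%/run.out: src/%/main.jl src/%/config.toml
	mkdir -p $(@D)
	julia --project=. src/$*/main.jl > $@

# Summary table depends on every run.out (summarize.jl is hypothetical).
summary.html: $(OUTS)
	julia --project=. scripts/summarize.jl $(OUTS) > $@
```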

GeorgeR227 (Collaborator, Author)

The idea with having different configs per architecture was that a user might want different configurations for each. For example, GPU sims could include larger mesh sizes or longer run times. I think this flexibility is nice to have.

I like your idea for the role of the `main_config.toml`: since the data processing is the same, we just change where we read the data from. Plus, if we keep separate TOMLs for each architecture, then the user can choose to run just CPU benchmarks or just GPU benchmarks.

Just thinking through the pipeline: the user provides their `main_config.toml`, which lists all sims/architectures to run. `main.jl` can then read the main config and auto-run the config generation, which looks for the user-provided TOML for each of these sims and generates a list of configs for tasks. It'll then basically just run the benchmarks as it does now.

We can implement the `make all` feature by scanning the src dir for valid physics folders and then running through the same pipeline as above.

GeorgeR227 (Collaborator, Author)

As for post-processing, the idea now is to collect all the data from the benchmarks into `.jld2` files that can be gathered into DataFrames using DrWatson. I've provided a basic post-processing step in `scripts/post_processing/default_out.jl`, but we can add more scripts for different post-processing as we find the need for it. This could include deciding to output to Markdown or HTML or whatever else.
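The DrWatson side of that collection step is roughly the following (a sketch; the `heat` subdirectory and the `:Solve` column name are assumptions):

```julia
using DrWatson
@quickactivate :benchmarks
using DataFrames

# Gather every saved .jld2 result for one simulation into a single DataFrame.
df = collect_results(datadir("sims", "heat"))

# From here any post-processing script can reshape or sort the table and emit
# Markdown, HTML, etc. Column names depend on what the benchmarks saved.
sort!(df, :Solve)
```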

Member

I don't see that post-processing script. Did you commit it?

GeorgeR227 (Collaborator, Author)

Yeah that's my bad, I just pushed those.

Member

> The idea with having different configs per architecture was that a user might want different configurations for each. For example, GPU sims could include larger mesh sizes or longer run times. I think this flexibility is nice to have.

Yeah, I think you should be able to have multiple configurations in the same folder, and then the `main_config.toml` should list out the ones you want to run.

So you could have `heat.cpu`, `heat.cuda`, and `heat.cuda-massive` all in `heat/config.toml`, and then in `benchmarks/config.toml` only list out the ones you want to run.
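For illustration, a hypothetical `heat/config.toml` along those lines (`heat.cuda-massive` and its parameter values are made up for this example):

```toml
# heat/config.toml: every configuration this physics supports.
[heat.cpu]
code_target = "CPUTarget"
float_type = ["Float32", "Float64"]
resolution = [5, 2, 1]

[heat.cuda]
code_target = "CUDATarget"
float_type = ["Float32", "Float64"]
resolution = [5, 2, 1]

[heat.cuda-massive]
code_target = "CUDATarget"
float_type = ["Float32"]
resolution = [1, 0.5]
```

The top-level config would then just name the entries to run, e.g. `heat.cpu` and `heat.cuda`.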

@jpfairbanks merged commit c19742c into main on Sep 4, 2024 (7 of 8 checks passed).
@jpfairbanks deleted the gr/bench-suite branch September 4, 2024 18:05.