Automatically set heap size hint on workers #270
Benchmark Results
Benchmark Plots: A plot of the benchmark results has been uploaded as an artifact to the workflow run for this PR.
This reverts commit 973e3e6.
ffe7bcc to d732afd
d732afd to 9a5ad57
a7281ae to bda097e
4461adc to 6c5075b
e160159 to 0becbf4
[Diff since v0.22.5](v0.22.5...v0.23.0)

**Merged pull requests:**
- Automatically set heap size hint on workers (#270) (@MilesCranmer)

**Closed issues:**
- How do I set up a basis function consisting of three different inputs x, y, z? (#268)
Thanks Miles! I am running the new version on 4 rusty icelake nodes, and find that the memory usage of each node is only 56GB/1TB. This is surprising because I have increased the
Is it slower at all? It doesn’t actually need the memory. Letting it use more memory can make garbage collection more efficient, since it can collect in batches. But if it really doesn’t need all of the RAM, it’s not a big issue.
Previously: 1 node,

BTW, I have a simple script to plot the log-log Pareto front and its lower convex hull. hall_of_fame_2024-01-03_102321.228.pdf

The convex hull shows the tradeoffs in power law forms:

I think it'd be cool to have tensorboard tracking these figures over time.
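A minimal sketch of how a lower convex hull of a log-log Pareto front could be computed. This is not the script mentioned above. It assumes the hall of fame is available as `(complexity, loss)` pairs with strictly positive values, and all function names here are hypothetical. Each hull segment is a straight line in log-log space, so it corresponds to a power law `loss ≈ prefactor * complexity^exponent`.

```python
import math


def pareto_front(points):
    """Keep only points whose loss strictly improves as complexity grows."""
    front, best_loss = [], math.inf
    for complexity, loss in sorted(points):
        if loss < best_loss:
            front.append((complexity, loss))
            best_loss = loss
    return front


def _cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive means a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def lower_convex_hull_loglog(points):
    """Lower convex hull of the points in log10-log10 space (monotone chain)."""
    logged = sorted((math.log10(c), math.log10(l)) for c, l in points)
    hull = []
    for p in logged:
        # Drop the last hull point while it lies on or above the new edge.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull


def segment_power_laws(hull):
    """Return (exponent, prefactor) for each hull segment."""
    laws = []
    for (x0, y0), (x1, y1) in zip(hull, hull[1:]):
        exponent = (y1 - y0) / (x1 - x0)
        prefactor = 10 ** (y0 - exponent * x0)
        laws.append((exponent, prefactor))
    return laws
```

Chaining the three functions, `segment_power_laws(lower_convex_hull_loglog(pareto_front(points)))`, gives one exponent per hull segment, which is the kind of quantity that could be logged over time.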
Forgot to ask: I guess that means Julia garbage collection starts to work quite hard even with a little nudge?
From the settings you described, this scaling sounds fine – I don't think there's any slowdown from the change in this PR. I'm surprised you are only getting 5e4 expr/sec though. Typically on 4 rusty nodes I can get 5e6+ expr/sec for maxsize 50 and 100 datapoints. How many datapoints are you running? Is the CPU load reasonable on all nodes? Do you want to open an issue here or a discussion thread on the PySR forums to debug this? https://github.com/MilesCranmer/PySR/discussions
Opened a discussion: MilesCranmer/PySR#518
And tensorboard can actually log text: https://www.tensorflow.org/tensorboard/text_summaries
That’s a great idea! I see there’s a Julia plugin as well: https://github.com/JuliaLogging/TensorBoardLogger.jl
Ah right, this has to be done in the Julia loop. I see that matplotlib can be called there as well: https://github.com/JuliaPy/PyPlot.jl
Actually it's okay because they are already talking to each other, so you can totally call Python stuff from the Julia loop.
Changes:
- Adds a `heap_size_hint_in_bytes` parameter for setting up distributed workers.
- Switches from `print` to using `@info` when printing out updates to the search process.

Hopefully should fix issues like this:
MilesCranmer/PySR#490 (@eelregit and @paulomontero)
which is due to poor garbage collection in Julia when using distributed processing:
JuliaLang/julia#50673
The workaround I have added here is basically to hint to every process what it should choose as its memory limit before aggressive garbage collection. I automatically select that hint based on total memory divided by number of processes. But the user can pass a per-worker hint of their own (such as for multi-node, where the single-node memory is much less than total memory across nodes).
This seems to work well for preventing OOM errors when I tried it.
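A rough sketch of the arithmetic described above, written in Python for illustration only. It is not the code in this PR. The function names are hypothetical, the memory query assumes a Linux-like system exposing `os.sysconf`, and the flag string assumes Julia's `--heap-size-hint` command-line option (Julia 1.9+) given as a plain number of bytes.

```python
import os


def total_system_memory_bytes() -> int:
    """Physical memory of this machine (assumes POSIX sysconf is available)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def per_worker_heap_hint(total_memory_bytes: int, num_workers: int) -> int:
    """Split the available memory evenly across worker processes."""
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    return total_memory_bytes // num_workers


def julia_heap_flag(hint_bytes: int) -> str:
    """Format the hint as a Julia command-line flag (value in bytes)."""
    return f"--heap-size-hint={hint_bytes}"
```

For a multi-node run, the first argument would be the memory of a single node rather than the total across nodes, and the second the number of workers placed on that node. For example, `per_worker_heap_hint(64 * 1024**3, 8)` returns `8589934592` (8 GiB per worker).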
TODO:
This also removes the explicit precompilation script as I think it didn't help much.