Doing some benchmarks, I've noticed that the I/O parts are done in the main loop, which causes an increase in the total runtime.
As an example, we have:
time python -m data_morph --seed 42 --start-shape panda --target-shape star --iterations 10000
13.81s user 0.79s system 109% cpu 13.310 total
but if we remove this block (which is in the loop), we achieve a major speed-up:
time python -m data_morph --seed 42 --start-shape panda --target-shape star --iterations 10000
5.41s user 0.28s system 137% cpu 4.149 total
I'd propose that instead of doing the I/O in the main loop, the frames that would be written to disk are simply stored in some internal list, and then I/O is done after all the computations. This has several benefits:
if keep_frames=False, we don't output anything other than the final GIF, so we spare the disk from unnecessary writes
if keep_frames=True, since the task is probably I/O bound, we can take advantage of the concurrent.futures module to do it concurrently, so the speed-up is probably still significant
it is less error-prone, since we don't need to find the files on disk afterwards. There's also no guarantee that the files won't change on disk while the simulation is running
the most obvious one, speed! If using keep_frames=False, which is the default, we just output one file instead of potentially hundreds.
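To make the proposal more concrete, here is a minimal sketch of the idea, not the actual data_morph code. The names `compute_frames`, `write_frame`, and `flush_frames` are hypothetical, and the frame payload is a placeholder for whatever would really be stored (for example, serialized CSV or image data). The loop only collects frames in memory, and the writes happen afterwards in a thread pool from `concurrent.futures`, and only if `keep_frames` is set.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def compute_frames(iterations, frame_every):
    """Stand-in for the main loop: collects (filename, bytes) pairs, no disk I/O."""
    frames = []
    for i in range(iterations):
        # ... the real computation step would go here ...
        if i % frame_every == 0:
            # Placeholder payload (assumption): real code would serialize the frame.
            payload = f"iteration,{i}\n".encode()
            frames.append((f"frame-{i:05d}.csv", payload))
    return frames


def write_frame(output_dir, name, payload):
    """Write a single frame to disk and return its path."""
    path = Path(output_dir) / name
    path.write_bytes(payload)
    return path


def flush_frames(frames, output_dir, keep_frames, max_workers=8):
    """Write all collected frames after the loop; skip entirely if not kept."""
    if not keep_frames:
        return []
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(write_frame, output_dir, name, payload)
            for name, payload in frames
        ]
        # result() re-raises any exception that occurred during a write.
        return [future.result() for future in futures]


if __name__ == "__main__":
    frames = compute_frames(iterations=1000, frame_every=100)
    with tempfile.TemporaryDirectory() as tmp:
        written = flush_frames(frames, tmp, keep_frames=True)
        print(f"wrote {len(written)} frames")
```

The trade-off is memory: every frame is held until the loop finishes, so this only makes sense if the stored representation of a frame is small compared to the available RAM.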
I don't fully understand what you mean by having just one file because even if we move this logic outside of the main loop, the animation logic currently relies on stitching together the images, meaning they still need to be created. I also don't want to remove the ability to retrieve frames (images or CSVs).
What do you plan to store in order to be able to create the animation later? Have you thought about the specifics?
I'm not opposed to this, but I think this needs to be fleshed out more before we proceed.
The block referenced in the original post is data-morph/src/data_morph/morpher.py, lines 486 to 490 in c16e3dc.