-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix indenting to prevent repeated calculation in check_positive_jacobian() #1053
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #1053 +/- ##
=======================================
Coverage 94.96% 94.96%
=======================================
Files 87 87
Lines 21776 21776
=======================================
Hits 20679 20679
Misses 1097 1097
|
| benchmark_name | dt(%) | dt(s) | t_new(s) | t_old(s) |
| -------------------------------------- | ---------------------- | ---------------------- | ---------------------- | ---------------------- |
test_build_transform_fft_lowres | -1.34 +/- 6.72 | -7.83e-03 +/- 3.94e-02 | 5.78e-01 +/- 3.2e-02 | 5.86e-01 +/- 2.3e-02 |
test_build_transform_fft_midres | -2.90 +/- 1.69 | -1.94e-02 +/- 1.13e-02 | 6.51e-01 +/- 7.7e-03 | 6.71e-01 +/- 8.3e-03 |
test_build_transform_fft_highres | -0.69 +/- 2.61 | -7.30e-03 +/- 2.75e-02 | 1.05e+00 +/- 1.1e-02 | 1.05e+00 +/- 2.5e-02 |
test_equilibrium_init_lowres | -5.27 +/- 6.09 | -2.19e-01 +/- 2.53e-01 | 3.94e+00 +/- 2.1e-01 | 4.16e+00 +/- 1.5e-01 |
test_equilibrium_init_medres | -1.54 +/- 5.36 | -7.06e-02 +/- 2.46e-01 | 4.51e+00 +/- 1.1e-01 | 4.59e+00 +/- 2.2e-01 |
test_equilibrium_init_highres | -3.68 +/- 3.19 | -2.36e-01 +/- 2.04e-01 | 6.17e+00 +/- 1.5e-01 | 6.41e+00 +/- 1.4e-01 |
test_objective_compile_dshape_current | +1.19 +/- 3.77 | +4.81e-02 +/- 1.52e-01 | 4.08e+00 +/- 4.4e-02 | 4.03e+00 +/- 1.5e-01 |
test_objective_compile_atf | -1.04 +/- 5.00 | -9.09e-02 +/- 4.36e-01 | 8.62e+00 +/- 2.6e-01 | 8.71e+00 +/- 3.5e-01 |
test_objective_compute_dshape_current | +1.76 +/- 5.34 | +2.10e-05 +/- 6.38e-05 | 1.21e-03 +/- 4.6e-05 | 1.19e-03 +/- 4.4e-05 |
test_objective_compute_atf | -1.35 +/- 5.66 | -5.79e-05 +/- 2.42e-04 | 4.22e-03 +/- 2.1e-04 | 4.27e-03 +/- 1.2e-04 |
test_objective_jac_dshape_current | -5.59 +/- 11.91 | -2.27e-03 +/- 4.83e-03 | 3.82e-02 +/- 1.6e-03 | 4.05e-02 +/- 4.5e-03 |
test_objective_jac_atf | +1.61 +/- 3.15 | +3.09e-02 +/- 6.03e-02 | 1.94e+00 +/- 4.4e-02 | 1.91e+00 +/- 4.1e-02 |
test_perturb_1 | -2.68 +/- 3.45 | -3.77e-01 +/- 4.85e-01 | 1.37e+01 +/- 4.1e-01 | 1.41e+01 +/- 2.6e-01 |
test_perturb_2 | -0.91 +/- 4.56 | -1.72e-01 +/- 8.66e-01 | 1.88e+01 +/- 5.9e-01 | 1.90e+01 +/- 6.3e-01 |
test_proximal_jac_atf | +0.36 +/- 1.34 | +2.69e-02 +/- 9.98e-02 | 7.46e+00 +/- 5.9e-02 | 7.43e+00 +/- 8.0e-02 |
test_proximal_freeb_compute | +1.55 +/- 1.18 | +2.76e-03 +/- 2.10e-03 | 1.80e-01 +/- 1.5e-03 | 1.78e-01 +/- 1.5e-03 |
test_proximal_freeb_jac | -2.19 +/- 1.36 | -1.67e-01 +/- 1.04e-01 | 7.47e+00 +/- 9.2e-02 | 7.63e+00 +/- 4.7e-02 | |
I merged this branch (since it was clearly a bug, independent of the performance improvement), but the benchmarks caused a question in my mind about their validity. For equilibrium initialization benchmarks, it says that there is no change basically and even the highest equilibrium init takes 6 seconds. I think this is due to the benchmarks creating some very basic surface (the default one when nothing is given to FourierRZToroidalSurface) and this doesn't represent most of the actual cases. For comparison, I was doing GPU profiling and resolution 15 equilibrium takes consistently 25 seconds with this surface surface_2D = FourierRZToroidalSurface(
R_lmn=np.array([10, -1]), # boundary coefficients
Z_lmn=np.array([1]),
modes_R=np.array([[0, 0], [1, 0]]), # [M, N] boundary Fourier modes
modes_Z=np.array([[-1, 0]]),
NFP=5, # number of (toroidal) field periods (does not matter for 2D, but will for 3D solution) This is also very basic actually. Is this again some CPU/GPU difference issue? I was measuring at least 20% improvement on LMN=15 from 25 seconds to 20 seconds. Any thoughts @f0uriest @dpanici ? |
My hunch would be cpu/gpu difference, but im surprised its that much. You can try the regular line profiler on cpu and see where the differences are |
Yes, I can do that tomorrow. Also, I run the test script on my small laptop (the one we use for GPU), it takes only 13 seconds and 9-10 seconds when functions are jitted. This is probably again some for loop being too slow on GPU issue. |
certainly a possibility. If that's the case we should see if there's some small fixes we can do to make it faster on gpu |
Some of the initialization function use a lot of for loop that are extremely slow on GPUs. If we move the execution to CPU, we get at least x5 performance for equilibrium and geometry classes. Since the Github actions are run on CPU, we won't see any speed up, but on GPU interactive sessions, this should speed up plotting etc. Add the ```python with jax.default_device(jax.devices("cpu")[0]): ``` for the parts you believe don't need to be run on GPUs @f0uriest @dpanici @rahulgaur104 @ddudt @kianorr @unalmis This was actually noticed on previous PR #1053 when the benchmark didn't result in speed up, but the GPU eq init got faster by 5 seconds. ![image (1)](https://github.com/PlasmaControl/DESC/assets/102380275/372c8f2a-07bf-4bc3-8238-1c1d91ea1ae1) ![image](https://github.com/PlasmaControl/DESC/assets/102380275/02badace-af44-4801-ae5d-3707f52aaa5d)
No description provided.