Segmentation fault when running benchmark after rustification #381
Comments
This is certainly related to #369, but in these cases it is better to give the exact commit to make the issue reproducible. Rust is quite aggressive in preventing memory issues, and Python code on its own cannot cause So, most likely the problem is generated in some
Line 12 in b7d4e8d
Line 77 in b7d4e8d
Line 132 in b7d4e8d
Line 166 in b7d4e8d
(it's the code used to bridge the Rust callables with the SciPy quadrature integration) |
@tgiani can you please check on master?
@niclaurenti @andreab1997 @giacomomagni can someone with a Mac please check this on b7d4e8d?
@alecandido we have some hints that this is actually related to this line (which is master)
eko/src/eko/evolution_operator/quad_ker.py Line 103 in 2348778
i.e. it seems related to the areas definition; the workflow is as follows:
|
@felixhekhorn I have the same problem on master |
Here there are a bunch of raw pointers around, due to the SciPy interface
In principle, though, every structure should have a C-compatible representation, and every function should respect the C calling convention. Still, I fear it could somehow be a Mac-related problem, or even a Conda-related one, so I'm trying to test on both Linux and macOS (but I'm using Nix, not Conda, so let's see what the outcome will be). |
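For what it's worth, the C-compatibility requirement above can be sketched in pure Python with `ctypes`. This is only a minimal sketch under assumptions, not the actual eko bridge: the `QuadArgs` struct, its fields and the `integrand` function are hypothetical. The callback uses the `double (double, void *)` signature, which is one of the signatures SciPy's `LowLevelCallable` accepts for `quad`, and it recovers its arguments from the raw `user_data` pointer:

```python
import ctypes


class QuadArgs(ctypes.Structure):
    """Hypothetical argument bundle with a C-compatible layout."""

    _fields_ = [
        ("order", ctypes.c_uint),
        ("scale", ctypes.c_double),
    ]


# C calling convention: double f(double x, void *user_data)
CALLBACK = ctypes.CFUNCTYPE(ctypes.c_double, ctypes.c_double, ctypes.c_void_p)


def integrand(x, user_data):
    # user_data arrives as a raw address; cast it back to the struct type.
    # Nothing checks that the address really points to a QuadArgs: a layout
    # mismatch here yields garbage values or a crash, not a Python exception.
    args = ctypes.cast(user_data, ctypes.POINTER(QuadArgs)).contents
    return args.scale * x**args.order


c_integrand = CALLBACK(integrand)

# The struct must stay alive for as long as the pointer is in use.
args = QuadArgs(order=2, scale=3.0)
ptr = ctypes.cast(ctypes.pointer(args), ctypes.c_void_p)

value = c_integrand(2.0, ptr)  # 3.0 * 2.0**2
```

The point of the sketch is that the only contract between the two sides is the struct layout and the function signature, so a mismatch on either side would show up as wrong numbers or a segfault rather than as a type error.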
@alecandido I'm not using conda here, just poetry, so that's probably not the problem |
On 1a43a7d (current #369), I'm actually passing on Linux, while on MacOS I'm getting the following error:
benchmarks/lha_paper_bench.py ────────────────────────────────────────
Theories: 1 OCards: 1 PDFs: 1 ext: LHA
────────────────────────────────────────
Computing for theory=88d687f, ocard=8beae75 and pdf=ToyLH ...
Exception ignored in: '<numba.core.cpu.CPUContext object at 0x1656c8cd0>'
Traceback (most recent call last):
File "/Users/alessandro/Projects/pdf/eko/.venv/lib/python3.11/site-packages/numba/cpython/numbers.py", line 1100, in complex_div
raise ZeroDivisionError("complex division by zero")
ZeroDivisionError: complex division by zero
Exception ignored in: '<numba.core.cpu.CPUContext object at 0x1656c8cd0>'
Traceback (most recent call last):
File "/Users/alessandro/Projects/pdf/eko/.venv/lib/python3.11/site-packages/numba/cpython/numbers.py", line 1100, in complex_div
raise ZeroDivisionError("complex division by zero")
ZeroDivisionError: complex division by zero
and it's getting stuck. But I've not been able to reproduce the segfault yet. |
if you have suggestions for improvements, you're welcome 🙃 however, communicating between languages using a third-party library will always be a hassle (by the way: the Rust
good! which makes me confident that the code is not completely crazy and ...
I'm afraid so as well - which makes it hard for me to debug 🙈 it is very strange that you get a different error and, moreover, a different error type, i.e. something that hints at a math problem ... (or maybe it is the same and you read some random numbers from somewhere?)
as I fell into that trap myself: remember to debug with a single core |
Well, consider that JSON is capable of representing all the types that you'd like to have (strings, integers, floats, together with composition as unbounded lists and maps), and it is perfectly represented in the Rust type system, without the need of escaping to a |
Yeah, when it becomes platform-dependent, it is immediately more complex, and not only because of the availability of devices (that's the main motivation behind my laptop, but it's not sufficient to solve problems in a snap...).
[*]: unless we come up with a clever intuition (if you know the solution, you can always just apply it)
I had this suspicion as well, but unfortunately I tried, changing the value in the
benchmarks/lha_paper_bench.py --- Python
          "mugrid": [100],
          "ev_op_iterations": 10,
          "interpolation_xgrid": lambertgrid(60).tolist(),
+         "n_integration_cores": 1,
      }
  ],
  ["ToyLH"], |
just to say that it is slightly more complicated 🙃
|
When running benchmarks after rustification, I get a segmentation fault error
To Reproduce
./rustify.sh
poe compile
poe lha -m "ffns and lo and not sv"
Expected behavior
On my machine the code crashes with
Fatal Python error: Segmentation fault
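A hedged suggestion for narrowing this down: the line above looks like the message printed by Python's standard `faulthandler` module, which also dumps the Python-level traceback of the crashing thread when it is enabled. A minimal sketch for running a snippet in a child interpreter with it switched on (the `run_with_faulthandler` helper is hypothetical and not part of eko):

```python
import subprocess
import sys


def run_with_faulthandler(code):
    """Run `code` in a child interpreter with faulthandler enabled.

    Returns (returncode, stdout, stderr); on POSIX a negative return code
    means the child was killed by that signal (e.g. -11 for SIGSEGV).
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout, proc.stderr


# Sanity check: the child interpreter reports that faulthandler is active.
returncode, stdout, stderr = run_with_faulthandler(
    "import faulthandler; print(faulthandler.is_enabled())"
)
```

Setting `PYTHONFAULTHANDLER=1` in the environment should have the same effect for any command, so the traceback on stderr can show which Python frame was active when the signal arrived.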