Hello, thanks for the great work!
I'm currently trying to wrap a PyTorch model into a Flux-based training setup.
Training runs fine for a few epochs, but then, seemingly at random, a segmentation fault occurs (see below).
I don't have a good MWE yet (I'll still try to make one), but perhaps we can already draw some conclusions from the stacktrace, which in this case occurred after about seven epochs:
[56770] signal (11.1): Segmentation fault
in expression starting at /home/romeo/Documents/Stanford/google_ood/DisentanglingVAE.jl/scripts/vae_CUB.jl:213
PyErr_Occurred at /usr/lib/libpython3.10.so.1.0 (unknown line)
pyerr_occurred at /home/romeo/.julia/packages/PyCall/twYvK/src/exception.jl:69 [inlined]
pyerr_check at /home/romeo/.julia/packages/PyCall/twYvK/src/exception.jl:75 [inlined]
############# LOOK HERE vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
share at /home/romeo/.julia/packages/DLPack/SUhao/src/pycall.jl:109
#13 at /home/romeo/.julia/packages/PyCallChainRules/YR5iR/src/pytorch.jl:59
#########################^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
unknown function (ip: 0x7ff3e1725d52)
map at ./tuple.jl:292
unknown function (ip: 0x7ff3e1723e23)
_jl_invoke at /cache/build/default-amdci4-7/julialang/julia-release-1-dot-9/src/gf.c:2681 [inlined]
ijl_apply_generic at /cache/build/default-amdci4-7/julialang/julia-release-1-dot-9/src/gf.c:2863
#rrule#12 at /home/romeo/.julia/packages/PyCallChainRules/YR5iR/src/pytorch.jl:59
rrule at /home/romeo/.julia/packages/PyCallChainRules/YR5iR/src/pytorch.jl:56 [inlined]
rrule at /home/romeo/.julia/packages/ChainRulesCore/a4mIA/src/rules.jl:134 [inlined]
chain_rrule at /home/romeo/.julia/packages/Zygote/xGkZ5/src/compiler/chainrules.jl:218 [inlined]
macro expansion at /home/romeo/.julia/packages/Zygote/xGkZ5/src/compiler/interface2.jl:0 [inlined]
_pullback at /home/romeo/.julia/packages/Zygote/xGkZ5/src/compiler/interface2.jl:9
unknown function (ip: 0x7ff3e1723a4d)
_jl_invoke at /cache/build/default-amdci4-7/julialang/julia-release-1-dot-9/src/gf.c:2681 [inlined]
ijl_apply_generic at /cache/build/default-amdci4-7/julialang/julia-release-1-dot-9/src/gf.c:2863
_pullback at /home/romeo/Documents/Stanford/google_ood/DisentanglingVAE.jl/scripts/vae_CUB.jl:166 [inlined]
Here are the referenced code snippets in the stacktrace:
PyCallChainRules.jl/src/pytorch.jl, lines 56 to 64 in 1723781
and
https://github.com/pabloferz/DLPack.jl/blob/61f48ee6b5e4f56d9b8525fa6ef9b613242160b8/src/pycall.jl#L98-L116