sub_chandra Segfault after time step 1 #477

Closed
andrewsilver1997 opened this issue Aug 30, 2024 · 10 comments

@andrewsilver1997

Hi,

I ran into this issue when running sub_chandra with the default settings and OpenMP enabled:

Timestep 1 ends with TIME = 0.001504879193 DT = 0.001504879193
Timing summary:
Advection :0.94596652 seconds
MAC Proj :0.294730107 seconds
Nodal Proj :1.105994235 seconds
Reactions :1.267243116 seconds
Misc :0.180863929 seconds
Base State :0.002849272 seconds
Time to advance time step: 3.795111876
Segfault
See Backtrace.0.0 file for details

The simulation stopped after time step 1. Can you help me with it? Thank you!

@zingale
Member

zingale commented Aug 30, 2024

can you share the Backtrace.0.0 file and also tell us how you compiled and how you ran the code?

@andrewsilver1997
Author

andrewsilver1997 commented Aug 30, 2024

This is Backtrace.0.0:

Host Name: node105
=== If no file names and line numbers are shown below, one can run
addr2line -Cpfie my_exefile my_line_address
to convert my_line_address (e.g., 0x4a6b) into file name and line number.
Or one can use amrex/Tools/Backtrace/parse_bt.py.

=== Please note that the line number reported by addr2line may not be accurate.
One can use
readelf -wl my_exefile | grep my_line_address'
to find out the offset for that line.

0: ./Maestro3d.gnu.OMP.ex() [0x72a1f0]
amrex::BLBackTrace::print_backtrace_info(_IO_FILE*)
/scratch/p310347/DVR-time-prediction/data/MAESTROeX/Exec/science/sub_chandra/../../../external/amrex/Src/Base/AMReX_BLBackTrace.cpp:200:36

1: ./Maestro3d.gnu.OMP.ex() [0x73007f]
amrex::BLBackTrace::handler(int)
/scratch/p310347/DVR-time-prediction/data/MAESTROeX/Exec/science/sub_chandra/../../../external/amrex/Src/Base/AMReX_BLBackTrace.cpp:100:15

2: /lib64/libc.so.6(+0x4e5b0) [0x7f91a912c5b0]

3: ./Maestro3d.gnu.OMP.ex() [0x4becd7]
std::vector<std::filesystem::__cxx11::path::_Cmpt, std::allocator<std::filesystem::__cxx11::path::_Cmpt> >::~vector()
/usr/include/c++/8/bits/stl_vector.h:567:15

4: ./Maestro3d.gnu.OMP.ex() [0x4bb879]
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_is_local() const inlined at /usr/include/c++/8/bits/basic_string.h:224:6 in Maestro::Evolve()
/usr/include/c++/8/bits/basic_string.h:215:26
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_dispose()
/usr/include/c++/8/bits/basic_string.h:224:6
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string()
/usr/include/c++/8/bits/basic_string.h:661:9
std::filesystem::__cxx11::path::~path()
/usr/include/c++/8/bits/fs_path.h:209:5
std::filesystem::__cxx11::path::_Cmpt::~_Cmpt()
/usr/include/c++/8/bits/fs_path.h:644:16
void std::_Destroy<std::filesystem::__cxx11::path::_Cmpt>(std::filesystem::__cxx11::path::_Cmpt*)
/usr/include/c++/8/bits/stl_construct.h:98:7
void std::_Destroy_aux<false>::__destroy<std::filesystem::__cxx11::path::_Cmpt*>(std::filesystem::__cxx11::path::_Cmpt*, std::filesystem::__cxx11::path::_Cmpt*)
/usr/include/c++/8/bits/stl_construct.h:108:19
void std::_Destroy<std::filesystem::__cxx11::path::_Cmpt*>(std::filesystem::__cxx11::path::_Cmpt*, std::filesystem::__cxx11::path::_Cmpt*)
/usr/include/c++/8/bits/stl_construct.h:137:11
void std::_Destroy<std::filesystem::__cxx11::path::_Cmpt*, std::filesystem::__cxx11::path::_Cmpt>(std::filesystem::__cxx11::path::_Cmpt*, std::filesystem::__cxx11::path::_Cmpt*, std::allocator<std::filesystem::__cxx11::path::_Cmpt>&)
/usr/include/c++/8/bits/stl_construct.h:206:15
std::vector<std::filesystem::__cxx11::path::_Cmpt, std::allocator<std::filesystem::__cxx11::path::_Cmpt> >::~vector()
/usr/include/c++/8/bits/stl_vector.h:567:15
std::filesystem::__cxx11::path::~path()
/usr/include/c++/8/bits/fs_path.h:209:5
Maestro::Evolve()
/scratch/p310347/DVR-time-prediction/data/MAESTROeX/Exec/science/sub_chandra/../../../Source/MaestroEvolve.cpp:124:36

5: ./Maestro3d.gnu.OMP.ex() [0x4250fd]
main
/scratch/p310347/DVR-time-prediction/data/MAESTROeX/Exec/science/sub_chandra/../../../Source/main.cpp:63:52

6: /lib64/libc.so.6(__libc_start_main+0xe5) [0x7f91a91187e5]
__libc_start_main
/usr/src/debug/glibc-2.28-251.el8.2.x86_64/csu/../csu/libc-start.c:336:3

7: ./Maestro3d.gnu.OMP.ex() [0x4354be]
_start at ??:?

For compilation, I set USE_OMP to TRUE in the GNUmakefile, which produced Maestro3d.gnu.OMP.ex.
I ran it with: ./Maestro3d.gnu.OMP.ex inputs_3d

@zingale
Member

zingale commented Aug 30, 2024

can you also share the output from the simulation (stdout and stderr)?

@andrewsilver1997
Author

Where can I find these files? I only have subch_0000000, subch_after_DivuIter, and subch_after_DivuIter as outputs.

@zingale
Member

zingale commented Aug 30, 2024

just copy and paste everything that was output to the screen.

It would also be useful for you to attach the file: subch_0000000/job_info

@andrewsilver1997
Author

This is what I have on screen:

Timestep 1 starts with TIME = 0 DT = 0.001504879193

Cell Count:
Level 0, 884736 cells
inner sponge: r_sp , r_tp : 491406250, 519531250
outer sponge: r_sp_outer, r_tp_outer: 519531250, 550781250
<<< STEP 1 : react state >>>
<<< STEP 2 : make w0 >>>
<<< STEP 3 : create MAC velocities >>>
MLMG: Initial rhs = 1157.025777
MLMG: Initial residual (resid0) = 1157.025777
MLMG: Final Iter. 13 resid, resid/bnorm = 2.565591875e-08, 2.21740252e-11
MLMG: Timers: Solve = 0.162768532 Iter = 0.159139672 Bottom = 0.03801936
<<< STEP 4 : advect base >>>
: density_advance >>>
: tracer_advance >>>
: enthalpy_advance >>>
<<< STEP 4a: thermal conduct >>>
<<< STEP 5 : react state >>>
<<< STEP 6 : make new S and new w0 >>>
<<< STEP 7 : create MAC velocities >>>
MLMG: Initial rhs = 1157.025697
MLMG: Initial residual (resid0) = 2.28587987
MLMG: Final Iter. 7 resid, resid/bnorm = 3.031198567e-08, 2.619819572e-11
MLMG: Timers: Solve = 0.090699183 Iter = 0.087052449 Bottom = 0.021309479
<<< STEP 8 : advect base >>>
: density_advance >>>
: tracer_advance >>>
: enthalpy_advance >>>
<<< STEP 8a: thermal conduct >>>
<<< STEP 9 : react state >>>
<<< STEP 10: make new S >>>
<<< STEP 11: update and project new velocity >>>
Calling nodal solver
MLMG: Initial rhs = 45.44727918
MLMG: Initial residual (resid0) = 45.44727918
MLMG: Final Iter. 29 resid, resid/bnorm = 4.458745595e-10, 9.810808645e-12
MLMG: Timers: Solve = 1.587708921 Iter = 1.583388446 Bottom = 0.101330992
Done calling nodal solver

Timestep 1 ends with TIME = 0.001504879193 DT = 0.001504879193
Timing summary:
Advection :0.919294072 seconds
MAC Proj :0.28522724 seconds
Nodal Proj :1.646425372 seconds
Reactions :1.237415097 seconds
Misc :0.186970583 seconds
Base State :0.002429556 seconds
Time to advance time step: 4.275600464
Segfault

This is job_info:

==============================================================================
MAESTROeX Job Information

job name:

inputs file:

number of MPI processes: 1
number of threads: 32
tile size: 1024000 8 8

CPU time used since start of simulation (CPU-hours): 0.166078

==============================================================================
Plotfile Information

output date / time: Fri Aug 30 15:08:10 2024
output dir: /scratch/p310347/DVR-time-prediction/data/MAESTROeX/Exec/science/sub_chandra

==============================================================================
Build Information

build date: 2024-05-15 17:47:43.006564
build machine: Linux node67 4.18.0-513.18.1.el8_9.x86_64 #1 SMP Wed Feb 21 21:34:36 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
build dir: /scratch/p310347/DVR-time-prediction/data/MAESTROeX/Exec/science/sub_chandra
AMReX dir: ../../../external/amrex

COMP: gnu
COMP version: 10.2.0

C++ compiler: g++
C++ flags: -Werror=return-type -g -O3 -finline-limit=43210 -std=c++17 -fopenmp -pthread -DBL_USE_OMP -DAMREX_USE_OMP -DBL_NO_FORT -DAMREX_GPU_MAX_THREADS=0 -DBL_SPACEDIM=3 -DAMREX_SPACEDIM=3 -DBL_FORT_USE_UNDERSCORE -DAMREX_FORT_USE_UNDERSCORE -DBL_Linux -DAMREX_Linux -DNDEBUG -DCRSEGRNDOMP -DREACTIONS -DSTRANG -DSCREEN_METHOD=SCREEN_METHOD_screen5 -DCONDUCTIVITY -DNETWORK_HAS_CXX_IMPLEMENTATION -DALLOW_JACOBIAN_CACHING -DRATES -DSCREENING -DNAUX_NET=0 -Itmp_build_dir/s/3d.gnu.OMP.EXE -I. -I. -I../../../Source -I../../../Util/model_parser -I../../../Util/simple_log -I../../../Util/utils -I../../../external/Microphysics/EOS -I../../../external/Microphysics/EOS/helmholtz -I../../../external/Microphysics/networks/triple_alpha_plus_cago -I../../../external/Microphysics/EOS -I../../../external/Microphysics/networks -I../../../external/Microphysics/interfaces -I../../../external/Microphysics/conductivity -I../../../external/Microphysics/conductivity/stellar -I../../../external/Microphysics/constants -I../../../external/amrex/Src/Base -I../../../external/amrex/Src/Base/Parser -I../../../external/amrex/Src/Boundary -I../../../external/amrex/Src/AmrCore -I../../../external/amrex/Src/LinearSolvers/MLMG -I../../../external/Microphysics/util -I../../../external/Microphysics/util/gcem/include -I../../../external/Microphysics/integration/VODE -I../../../external/Microphysics/integration/utils -I../../../external/Microphysics/integration -I../../../external/Microphysics/rates -I../../../external/Microphysics/screening -I. 
-I../../../Source -I../../../Util/model_parser -I../../../Util/simple_log -I../../../Util/utils -I../../../external/Microphysics/EOS -I../../../external/Microphysics/EOS/helmholtz -I../../../external/Microphysics/networks/triple_alpha_plus_cago -I../../../external/Microphysics/EOS -I../../../external/Microphysics/networks -I../../../external/Microphysics/interfaces -I../../../external/Microphysics/conductivity -I../../../external/Microphysics/conductivity/stellar -I../../../external/Microphysics/constants -Itmp_build_dir/maestroex_sources/3d.gnu.OMP.EXE -I../../../external/amrex/Tools/C_scripts

Fortran comp: gfortran
Fortran flags: -g -O3 -ffree-line-length-none -fno-range-check -fno-second-underscore -fimplicit-none

Link flags: -L.
Libraries:

EOS: ../../../external/Microphysics/EOS/helmholtz
NETWORK: ../../../external/Microphysics/networks/triple_alpha_plus_cago
CONDUCTIVITY: ../../../external/Microphysics/conductivity/stellar
MAESTROeX git describe: 24.05-dirty
AMReX git describe: 24.05
Microphysics git describe: 24.05

==============================================================================
Grid Information

level: 0
number of boxes = 8
maximum zones = 96 96 96

Boundary conditions
-x: symmetry
+x: outflow
-y: symmetry
+y: outflow
-z: symmetry
+z: outflow

==============================================================================
Species Information

index name A Z

 0                          He4              4              2
 1                          C12             12              6
 2                          O16             16              8
 3                         Fe56             56             26

==============================================================================
Inputs File Parameters

[*] problem.velpert_amplitude = 100000
[*] problem.velpert_scale = 5e+07
[*] problem.velpert_steep = 1e+07
problem.tag_rhomin = 10000
problem.tag_critxhe = 0.01
problem.tag_crittemp = 1.25e+08
eos.use_eos_coulomb = 1
eos.eos_input_is_constant = 1
eos.eos_ttol = 1e-08
eos.eos_dtol = 1e-08
eos.prad_limiter_rho_c = -1
eos.prad_limiter_delta_rho = -1
integrator.atol_spec = 1e-12
integrator.rtol_spec = 1e-12
integrator.atol_enuc = 1e-08
integrator.rtol_enuc = 1e-06
integrator.jacobian = 1
network.small_x = 1e-30
network.use_tables = 0
network.use_c12ag_deboer17 = 0
integrator.X_reject_buffer = 1
integrator.call_eos_in_rhs = 1
integrator.integrate_energy = 1
integrator.burner_verbose = 0
integrator.renormalize_abundances = 0
integrator.SMALL_X_SAFE = 1e-30
integrator.MAX_TEMP = 1e+11
integrator.react_boost = -1
integrator.ode_max_steps = 150000
integrator.ode_max_dt = 1e+30
integrator.use_jacobian_caching = 1
integrator.nonaka_i = 0
integrator.nonaka_j = 0
integrator.nonaka_k = 0
integrator.nonaka_level = 0
integrator.nonaka_file = nonaka_plot.dat
integrator.use_burn_retry = 0
integrator.retry_swap_jacobian = 1
integrator.retry_rtol_spec = 1e-12
integrator.retry_rtol_enuc = 1e-06
integrator.retry_atol_spec = 1e-08
integrator.retry_atol_enuc = 1e-06
integrator.do_species_clip = 1
integrator.use_number_densities = 0
integrator.subtract_internal_energy = 1
integrator.sdc_burn_tol_factor = 1
integrator.scale_system = 0
integrator.nse_iters = 3
integrator.nse_deriv_dt_factor = 0.05
integrator.nse_include_enu_weak = 1
integrator.linalg_do_pivoting = 1
screening.enable_chabrier1998_quantum_corr = 0
maestro.reflux_type = 1
maestro.maestro_verbose = 1
[*] maestro.model_file = sub_chandra.M_WD-1.00.M_He-0.05.hse.C.1280.hotcutoff
maestro.perturb_model = 0
maestro.print_init_hse_diag = 0
maestro.basestate_use_pres_model = 0
[*] maestro.stop_time = 300000
[*] maestro.max_step = 100
[*] maestro.cfl = 0.9
[*] maestro.init_shrink = 0.1
maestro.small_dt = 1e-10
maestro.max_dt_growth = 1.1
maestro.max_dt = 1e+33
maestro.fixed_dt = -1
maestro.nuclear_dt_fac = -1
[*] maestro.use_soundspeed_firstdt = 1
[*] maestro.use_divu_firstdt = 1
[*] maestro.spherical = 1
[*] maestro.octant = 1
maestro.do_2d_planar_octant = 0
[*] maestro.regrid_int = 2
maestro.amr_buf_width = -1
[*] maestro.drdxfac = 5
maestro.minwidth = 8
maestro.min_eff = 0.9
maestro.use_tpert_in_tagging = 0
[*] maestro.plot_int = -1
maestro.small_plot_int = 0
[*] maestro.plot_deltat = 1
[*] maestro.small_plot_deltat = 0.025
[*] maestro.chk_int = -1
maestro.chk_deltat = -1
maestro.plot_h_with_use_tfromp = 1
maestro.plot_spec = 1
maestro.plot_omegadot = 1
maestro.plot_aux = 0
maestro.plot_Hext = 0
maestro.plot_Hnuc = 1
maestro.plot_eta = 0
maestro.plot_trac = 0
maestro.plot_base_state = 1
maestro.plot_gpi = 1
maestro.plot_cs = 0
maestro.plot_grav = 0
[*] maestro.plot_base_name = subch_
[*] maestro.small_plot_base_name = smallsubch_
maestro.check_base_name = chk
maestro.diag_buf_size = 10
maestro.plot_ad_excess = 0
maestro.plot_processors = 0
maestro.plot_pidivu = 0
maestro.small_plot_vars = rho p0 magvel
[*] maestro.init_iter = 3
[*] maestro.init_divu_iter = 1
maestro.restart_file =
maestro.restart_into_finer = 0
maestro.do_initial_projection = 1
maestro.mg_verbose = 1
maestro.cg_verbose = 0
maestro.mg_cycle_type = 3
maestro.hg_cycle_type = 3
maestro.hg_bottom_solver = 4
maestro.mg_bottom_solver = 4
[*] maestro.max_mg_bottom_nlevels = 3
maestro.mg_bottom_nu = 10
maestro.mg_nu_1 = 2
maestro.mg_nu_2 = 2
maestro.hg_dense_stencil = 1
[*] maestro.do_sponge = 1
maestro.sponge_kappa = 10
[*] maestro.sponge_center_density = 60000
[*] maestro.sponge_start_factor = 2
maestro.plot_sponge_fdamp = 0
[*] maestro.anelastic_cutoff_density = 60000
[*] maestro.base_cutoff_density = 10000
maestro.burning_cutoff_density_lo = -1
maestro.burning_cutoff_density_hi = 1e+100
maestro.heating_cutoff_density_lo = -1
maestro.heating_cutoff_density_hi = 1e+100
[*] maestro.buoyancy_cutoff_factor = 2
maestro.dpdt_factor = 0
maestro.do_planar_invsq_grav = 0
maestro.planar_invsq_mass = 0
maestro.evolve_base_state = 1
maestro.use_exact_base_state = 0
maestro.fix_base_state = 0
maestro.average_base_state = 0
maestro.do_smallscale = 0
maestro.do_eos_h_above_cutoff = 1
maestro.enthalpy_pred_type = 1
[*] maestro.species_pred_type = 3
maestro.use_delta_gamma1_term = 1
maestro.use_etarho = 1
maestro.add_pb = 0
maestro.slope_order = 4
[*] maestro.grav_const = -2.45e+14
maestro.ppm_type = 1
maestro.bds_type = 0
maestro.ppm_trace_forces = 0
maestro.beta0_type = 1
maestro.use_linear_grav_in_beta0 = 0
maestro.rotational_frequency = 0
maestro.co_latitude = 0
maestro.rotation_radius = 1e+06
maestro.use_centrifugal = 1
maestro.mach_max_abort = -1
maestro.drive_initial_convection = 0
maestro.stop_initial_convection = -1
maestro.restart_with_vel_field = 0
maestro.use_alt_energy_fix = 1
maestro.use_omegadot_terms_in_S = 1
maestro.use_thermal_diffusion = 0
maestro.temp_diffusion_formulation = 2
maestro.thermal_diffusion_type = 1
[*] maestro.limit_conductivity = 1
maestro.do_burning = 1
maestro.burner_threshold_species =
maestro.burner_threshold_cutoff = 1e-10
maestro.do_subgrid_burning = 0
maestro.reaction_sum_tol = 1e-10
maestro.small_temp = 5e+06
maestro.small_dens = 1e-05
[*] maestro.use_tfromp = 1
maestro.use_eos_e_instead_of_h = 0
maestro.use_pprime_in_tfromp = 0
maestro.s0_interp_type = 3
maestro.w0_interp_type = 2
maestro.s0mac_interp_type = 1
maestro.w0mac_interp_type = 1
maestro.print_update_diagnostics = 0
maestro.track_grid_losses = 0
maestro.sum_interval = -1
maestro.sum_per = -1
maestro.show_center_of_mass = 0
maestro.hard_cfl_limit = 1
maestro.job_name =
maestro.output_at_completion = 1
maestro.reset_checkpoint_time = -1e+200
maestro.reset_checkpoint_step = -1
maestro.use_particles = 0
maestro.store_particle_vels = 0
maestro.do_heating = 0
maestro.deterministic_nodal_solve = 0
maestro.eps_init_proj_cart = 1e-12
maestro.eps_init_proj_sph = 1e-10
maestro.eps_divu_cart = 1e-12
maestro.eps_divu_sph = 1e-10
maestro.divu_iter_factor = 100
maestro.divu_level_factor = 10
maestro.eps_mac = 1e-10
maestro.eps_mac_max = 1e-08
maestro.mac_level_factor = 10
maestro.eps_mac_bottom = 0.001
[*] maestro.eps_hg = 1e-11
maestro.eps_hg_max = 1e-10
maestro.hg_level_factor = 10
maestro.eps_hg_bottom = 0.0001

@zingale
Member

zingale commented Aug 30, 2024

okay, you are using an older version of the code (24.05). We've fixed an OMP race condition since that version (see #455).

Can you try updating to the latest version of MAESTROeX? You'll probably just need to do

git pull --recurse-submodules

and then try running again.
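
The version diagnosis above came from the "git describe" lines in job_info. As a minimal sketch, the helper below pulls those lines out of a job_info file so the build versions can be checked at a glance. The function names `git_versions` and `release_tuple` are made up for this illustration and are not part of MAESTROeX or AMReX; the sample text is copied from the job_info posted earlier in this thread.

```python
import re


def git_versions(job_info_text):
    """Return {component: version} from the 'git describe' lines of job_info text."""
    pattern = re.compile(r"^[ \t]*(\S+) git describe:[ \t]*(\S+)[ \t]*$", re.MULTILINE)
    return {name: version for name, version in pattern.findall(job_info_text)}


def release_tuple(version):
    """Turn a YY.MM-style string such as '24.05-dirty' into (24, 5)."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


# Lines copied from the job_info in this issue.
sample = """\
MAESTROeX git describe: 24.05-dirty
AMReX git describe: 24.05
Microphysics git describe: 24.05
"""

versions = git_versions(sample)
for name, version in versions.items():
    # A '-dirty' suffix means the working tree had local modifications at build time.
    note = " (locally modified)" if version.endswith("-dirty") else ""
    print(f"{name}: {release_tuple(version)}{note}")
```

In practice the text would be read from subch_0000000/job_info instead of the inline sample. Which release first contains the fix from #455 is not stated in this thread, so the sketch only reports versions and does not compare against a threshold.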

@andrewsilver1997
Author

thanks! it works now.

Btw, I have two follow-up questions:
1. Is it possible to run it at 128x128x128 resolution?
2. Can I change some physical parameters so that the final result looks different?

@zingale
Member

zingale commented Aug 30, 2024

you can change the number of zones by changing the value of amr.n_cell in the inputs file.
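
For scripted runs, the amr.n_cell edit can also be done programmatically. The following is a rough sketch, assuming the usual `key = value  # comment` layout of an AMReX inputs file; `set_input_param` is a hypothetical helper written for this example, and the sample text is an illustrative excerpt (96 zones per direction matches the grid reported in job_info), not the full inputs_3d file.

```python
import re


def set_input_param(inputs_text, key, value):
    """Replace the value of `key` in inputs-file text, or append the key if absent.

    A trailing '# comment' on the matched line is preserved.
    """
    pattern = re.compile(
        rf"^([ \t]*{re.escape(key)}[ \t]*=[ \t]*)([^#\n]*?)([ \t]*(?:#.*)?)$",
        re.MULTILINE,
    )
    new_text, count = pattern.subn(
        lambda m: m.group(1) + value + m.group(3), inputs_text
    )
    if count == 0:
        # Key not present: append it on its own line.
        if new_text and not new_text.endswith("\n"):
            new_text += "\n"
        new_text += f"{key} = {value}\n"
    return new_text


# Illustrative excerpt only, not the real sub_chandra inputs file.
sample = """\
# illustrative excerpt
amr.n_cell = 96 96 96   # zones in each direction
maestro.max_step = 100
"""

updated = set_input_param(sample, "amr.n_cell", "128 128 128")
print(updated)
```

Editing the file by hand works just as well; the helper is only convenient when several resolutions are generated from one template.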

the best way to see different variations of the same mass model is to change the initial perturbation, by adjusting these values:

problem.velpert_amplitude = 1.e5
problem.velpert_scale = 5.e7
problem.velpert_steep = 1.e7

Note that 128 zones is quite coarse for this problem, so you probably want to run with more zones.
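
One way to explore the perturbation parameters mentioned above is to generate a small set of run commands, one per parameter combination. The sketch below assumes that AMReX-based executables accept `key=value` overrides after the inputs file on the command line. The alternative values (2.e5, 2.e7) are arbitrary placeholders chosen for illustration, not recommended settings, and `sweep_commands` is a hypothetical helper.

```python
import itertools
import shlex


def sweep_commands(executable, inputs_file, overrides):
    """Yield one shell command per combination of override values."""
    keys = list(overrides)
    for combo in itertools.product(*(overrides[k] for k in keys)):
        args = [executable, inputs_file]
        args += [f"{key}={val}" for key, val in zip(keys, combo)]
        yield shlex.join(args)


# The first value of each list is the default quoted in this thread;
# the other values are placeholders for illustration only.
overrides = {
    "problem.velpert_amplitude": ["1.e5", "2.e5"],
    "problem.velpert_scale": ["5.e7"],
    "problem.velpert_steep": ["1.e7", "2.e7"],
}

commands = list(sweep_commands("./Maestro3d.gnu.OMP.ex", "inputs_3d", overrides))
for cmd in commands:
    print(cmd)
```

Runs launched from the same directory would presumably write plotfiles with the same subch_ prefix, so each variation likely needs its own directory or a different maestro.plot_base_name.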

@andrewsilver1997
Author

thank you!

@zingale zingale closed this as completed Aug 30, 2024