SimpleMultiG sediment models will not run on GPU #143

Closed
jagoosw opened this issue Sep 18, 2023 · 33 comments

@jagoosw
Collaborator

jagoosw commented Sep 18, 2023

While the kernel for SimpleMultiG sediments will compile on GPU, it will not run: its complexity leads to a PTX error about kernel parameter size. This may be fixed in the future if LLVM updates its CUDA version support (JuliaGPU/CUDA.jl#2080). We may also be able to rework the model to use less parameter space, so please do get in contact if you would like to use the model on GPU.

An error similar to the following will be raised:

Failed to compile PTX code (ptxas exited with code 255)
Invocation arguments: --generate-line-info --verbose --gpu-name sm_75 --output-file /tmp/jl_bELIheLkzO.cubin /tmp/jl_VczMJifKRu.ptx
ptxas /tmp/jl_VczMJifKRu.ptx, line 3215; error   : Entry function '_Z26gpu__calculate_tendencies_16CompilerMetadataI10StaticSizeI6_3__3_E12DynamicCheckvv7NDRangeILi2ES0_I6_1__1_ES0_I8_16__16_EvvEE12SimpleMultiGI7Float6410NamedTupleI24__A___B___C___D___E___F_5TupleIS4_S4_S4_S4_S4_S4_EES5_I24__A___B___C___D___E___F_S6_IS4_S4_S4_S4_S4_S4_EES5_I24__A___B___C___D___E___F_S6_IS4_S4_S4_S4_S4_S4_EES5_I24__A___B___C___D___depth_S6_IS4_S4_5Int64S4_S4_EES5_I52__C_slow___C_fast___N_slow___N_fast___C_ref___N_ref_S6_I11OffsetArrayIS4_Li3E13CuDeviceArrayIS4_Li3ELi1EEES8_IS4_Li3ES9_IS4_Li3ELi1EEES8_IS4_Li3ES9_IS4_Li3ELi1EEES8_IS4_Li3ES9_IS4_Li3ELi1EEES8_IS4_Li3ES9_IS4_Li3ELi1EEES8_IS4_Li3ES9_IS4_Li3ELi1EEEEES5_I10__G____G__S6_IS5_I52__C_slow___C_fast___N_slow___N_fast___C_ref___N_ref_S6_IS8_IS4_Li3ES9_IS4_Li3ELi1EEES8_IS4_Li3ES9_IS4_Li3ELi1EEES8_IS4_Li3ES9_IS4_Li3ELi1EEES8_IS4_Li3ES9_IS4_Li3ELi1EEES8_IS4_Li3ES9_IS4_Li3ELi1EEES8_IS4_Li3ES9_IS4_Li3ELi1EEEEES5_I52__C_slow___C_fast___N_slow___N_fast___C_ref___N_ref_S6_IS8_IS4_Li3ES9_IS4_Li3ELi1EEES8_IS4_Li3ES9_IS4_Li3ELi1EEES8_IS4_Li3ES9_IS4_Li3ELi1EEES8_IS4_Li3ES9_IS4_Li3ELi1EEES8_IS4_Li3ES9_IS4_Li3ELi1EEES8_IS4_Li3ES9_IS4_Li3ELi1EEEEEEES9_IS7_Li2ELi1EEE7LOBSTERIS4_3ValI18_true__true__true_ES5_I14__sPOM___bPOM_S6_IS5_I12__u___v___w_S6_I13ConstantFieldIS7_Li3EES12_IS7_Li3EES8_IS4_Li3ES9_IS4_Li3ELi1EEEEES5_I12__u___v___w_S6_IS12_IS7_Li3EES12_IS7_Li3EES8_IS4_Li3ES9_IS4_Li3ELi1EEEEEEEE15RectilinearGridIS4_8PeriodicS14_7BoundedS4_S4_S4_S8_IS4_Li1...
ptxas /tmp/jl_VczMJifKRu.ptx, line 3215; error   : Feature 'Kernel parameter size larger than 4352 bytes' requires PTX ISA .version 8.1 or later
ptxas fatal   : Ptx assembly aborted due to errors
If you think this is a bug, please file an issue and attach /tmp/jl_VczMJifKRu.ptx

Also, note that InstantRemineralisation sediment does work on GPU.

Discussed in #138.

@jagoosw added the bug, awaiting upstream change, and GPU labels Sep 18, 2023
@glwagner
Collaborator

While the kernel for SimpleMultiG sediments

where is this kernel? Link to code?

@jagoosw
Collaborator Author

jagoosw commented Sep 18, 2023

This is as it currently is in main:

@kernel function _calculate_tendencies!(sediment::SimpleMultiG, bgc, grid, advection, tracers, timestepper)
    i, j = @index(Global, NTuple)
    k = bottom_index(i, j, sediment)
    Δz = zspacing(i, j, k, grid, Center(), Center(), Center())
    @inbounds begin
        carbon_deposition = carbon_flux(i, j, k, grid, advection, bgc, tracers) * Δz
        nitrogen_deposition = nitrogen_flux(i, j, k, grid, advection, bgc, tracers) * Δz
        # rates
        C_min_slow = sediment.fields.C_slow[i, j, 1] * sediment.slow_decay_rate
        C_min_fast = sediment.fields.C_fast[i, j, 1] * sediment.fast_decay_rate
        N_min_slow = sediment.fields.N_slow[i, j, 1] * sediment.slow_decay_rate
        N_min_fast = sediment.fields.N_fast[i, j, 1] * sediment.fast_decay_rate
        Cᵐⁱⁿ = C_min_slow + C_min_fast
        Nᵐⁱⁿ = N_min_slow + N_min_fast
        k = Cᵐⁱⁿ * day / (sediment.fields.C_slow[i, j, 1] + sediment.fields.C_fast[i, j, 1])
        # sediment evolution
        sediment.tendencies.Gⁿ.C_slow[i, j, 1] = (1 - sediment.refactory_fraction) * sediment.slow_fraction * carbon_deposition - C_min_slow
        sediment.tendencies.Gⁿ.C_fast[i, j, 1] = (1 - sediment.refactory_fraction) * sediment.fast_fraction * carbon_deposition - C_min_fast
        sediment.tendencies.Gⁿ.C_ref[i, j, 1] = sediment.refactory_fraction * carbon_deposition
        sediment.tendencies.Gⁿ.N_slow[i, j, 1] = (1 - sediment.refactory_fraction) * sediment.slow_fraction * nitrogen_deposition - N_min_slow
        sediment.tendencies.Gⁿ.N_fast[i, j, 1] = (1 - sediment.refactory_fraction) * sediment.fast_fraction * nitrogen_deposition - N_min_fast
        sediment.tendencies.Gⁿ.N_ref[i, j, 1] = sediment.refactory_fraction * nitrogen_deposition
        # efflux/influx
        O₂ = tracers.O₂[i, j, 1]
        NO₃ = tracers.NO₃[i, j, 1]
        NH₄ = tracers.NH₄[i, j, 1]
        pₙᵢₜ = exp(sediment.nitrate_oxidation_params.A +
                   sediment.nitrate_oxidation_params.B * log(Cᵐⁱⁿ * day) * log(O₂) +
                   sediment.nitrate_oxidation_params.C * log(Cᵐⁱⁿ * day) ^ 2 +
                   sediment.nitrate_oxidation_params.D * log(k) * log(NH₄) +
                   sediment.nitrate_oxidation_params.E * log(Cᵐⁱⁿ * day) +
                   sediment.nitrate_oxidation_params.F * log(Cᵐⁱⁿ * day) * log(NH₄)) / (Nᵐⁱⁿ * day)
        #=
        pᵈᵉⁿⁱᵗ = exp(sediment.denitrification_params.A +
                     sediment.denitrification_params.B * log(Cᵐⁱⁿ * day) +
                     sediment.denitrification_params.C * log(NO₃) ^ 2 +
                     sediment.denitrification_params.D * log(Cᵐⁱⁿ * day) ^ 2 +
                     sediment.denitrification_params.E * log(k) ^ 2 +
                     sediment.denitrification_params.F * log(O₂) * log(k)) / (Cᵐⁱⁿ * day)
        =#
        pₐₙₒₓ = exp(sediment.anoxic_params.A +
                    sediment.anoxic_params.B * log(Cᵐⁱⁿ * day) +
                    sediment.anoxic_params.C * log(Cᵐⁱⁿ * day) ^ 2 +
                    sediment.anoxic_params.D * log(k) +
                    sediment.anoxic_params.E * log(O₂) * log(k) +
                    sediment.anoxic_params.F * log(NO₃) ^ 2) / (Cᵐⁱⁿ * day)
        if isnan(pₐₙₒₓ)
            println("$(Cᵐⁱⁿ), $(k), $(O₂), $(NO₃)")
            error("Sediment anoxia has caused model failure")
        end
        pₛₒₗᵢ = sediment.solid_dep_params.A * (sediment.solid_dep_params.C * sediment.solid_dep_params.depth ^ sediment.solid_dep_params.D) ^ sediment.solid_dep_params.B
        Δz = grid.Δzᵃᵃᶜ[1]
        timestepper.Gⁿ.NH₄[i, j, 1] += Nᵐⁱⁿ * (1 - pₙᵢₜ) / Δz
        timestepper.Gⁿ.NO₃[i, j, 1] += Nᵐⁱⁿ * pₙᵢₜ / Δz
        timestepper.Gⁿ.DIC[i, j, 1] += Cᵐⁱⁿ / Δz
        timestepper.Gⁿ.O₂[i, j, 1] -= max(0, ((1 - pₐₙₒₓ * pₛₒₗᵢ) * Cᵐⁱⁿ + 2 * Nᵐⁱⁿ * pₙᵢₜ)/ Δz) # this seems dodge but this model doesn't cope with anoxia properly
    end
end

This has another bug in it, which is fixed in #138.

@glwagner
Collaborator

Interesting. I'm surprised this won't compile. Likely you can simplify the arguments to the kernel to get it to compile. Parameter space issues don't have to do with kernel complexity directly, but rather with the complexity of the input arguments (i.e. using Field rather than OffsetArray). Reducing the input argument complexity is why, for example, we can't support Field directly in GPU kernels and are forced to work with OffsetArrays instead.

One issue might be that the time-stepper is input directly. You only need the named tuple Gⁿ --- it could be better to use that as an input argument instead.

sediment may also have more info than you need on the GPU, which you can simplify with an appropriate adapt_structure method.

In general, it's good practice to define adapt_structure to return the minimal info expected to be useful within a GPU kernel so as much is compilable as possible.
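A minimal sketch of that advice, assuming Adapt.jl is installed. `ToySediment` and its properties are hypothetical stand-ins, not the real SimpleMultiG type, and `Array` is used as the adaptor only so the example runs without a GPU:

```julia
using Adapt

# Hypothetical stand-in for a sediment model: a parameter, the fields the
# kernel reads, and tendency storage the kernel does not need.
struct ToySediment{P, F, T}
    decay_rate :: P
    fields     :: F
    tendencies :: T
end

# Propagate `adapt` to every property the kernel needs and drop the rest
# by replacing it with `nothing`.
Adapt.adapt_structure(to, s::ToySediment) =
    ToySediment(adapt(to, s.decay_rate),
                adapt(to, s.fields),
                nothing)

sediment = ToySediment(0.1,
                       (C_slow = zeros(4, 4, 1), C_fast = zeros(4, 4, 1)),
                       (Gⁿ = zeros(4, 4, 1), G⁻ = zeros(4, 4, 1)))

# `Array` stands in for a GPU adaptor here (assumption for illustration).
on_device = adapt(Array, sediment)
```

The adapted object keeps the parameter and the fields but carries `nothing` where the tendencies were, so that storage no longer contributes to the kernel argument.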

@jagoosw
Collaborator Author

jagoosw commented Sep 18, 2023

This is the fixed version that does some of those things:

@kernel function _calculate_tendencies!(sediment::SimpleMultiG, bgc, grid, advection, tracers, tendencies)
    i, j = @index(Global, NTuple)
    k = bottom_index(i, j, sediment)
    Δz = zspacing(i, j, k, grid, Center(), Center(), Center())
    @inbounds begin
        carbon_deposition = carbon_flux(i, j, k, grid, advection, bgc, tracers) * Δz
        nitrogen_deposition = nitrogen_flux(i, j, k, grid, advection, bgc, tracers) * Δz
        # rates
        C_min_slow = sediment.fields.C_slow[i, j, 1] * sediment.slow_decay_rate
        C_min_fast = sediment.fields.C_fast[i, j, 1] * sediment.fast_decay_rate
        N_min_slow = sediment.fields.N_slow[i, j, 1] * sediment.slow_decay_rate
        N_min_fast = sediment.fields.N_fast[i, j, 1] * sediment.fast_decay_rate
        Cᵐⁱⁿ = C_min_slow + C_min_fast
        Nᵐⁱⁿ = N_min_slow + N_min_fast
        reactivity = Cᵐⁱⁿ * day / (sediment.fields.C_slow[i, j, 1] + sediment.fields.C_fast[i, j, 1])
        # sediment evolution
        sediment.tendencies.Gⁿ.C_slow[i, j, 1] = (1 - sediment.refactory_fraction) * sediment.slow_fraction * carbon_deposition - C_min_slow
        sediment.tendencies.Gⁿ.C_fast[i, j, 1] = (1 - sediment.refactory_fraction) * sediment.fast_fraction * carbon_deposition - C_min_fast
        sediment.tendencies.Gⁿ.C_ref[i, j, 1] = sediment.refactory_fraction * carbon_deposition
        sediment.tendencies.Gⁿ.N_slow[i, j, 1] = (1 - sediment.refactory_fraction) * sediment.slow_fraction * nitrogen_deposition - N_min_slow
        sediment.tendencies.Gⁿ.N_fast[i, j, 1] = (1 - sediment.refactory_fraction) * sediment.fast_fraction * nitrogen_deposition - N_min_fast
        sediment.tendencies.Gⁿ.N_ref[i, j, 1] = sediment.refactory_fraction * nitrogen_deposition
        # efflux/influx
        O₂ = tracers.O₂[i, j, k]
        NO₃ = tracers.NO₃[i, j, k]
        NH₄ = tracers.NH₄[i, j, k]
        pₙᵢₜ = exp(sediment.nitrate_oxidation_params.A +
                   sediment.nitrate_oxidation_params.B * log(Cᵐⁱⁿ * day) * log(O₂) +
                   sediment.nitrate_oxidation_params.C * log(Cᵐⁱⁿ * day) ^ 2 +
                   sediment.nitrate_oxidation_params.D * log(reactivity) * log(NH₄) +
                   sediment.nitrate_oxidation_params.E * log(Cᵐⁱⁿ * day) +
                   sediment.nitrate_oxidation_params.F * log(Cᵐⁱⁿ * day) * log(NH₄)) / (Nᵐⁱⁿ * day)
        #=
        pᵈᵉⁿⁱᵗ = exp(sediment.denitrification_params.A +
                     sediment.denitrification_params.B * log(Cᵐⁱⁿ * day) +
                     sediment.denitrification_params.C * log(NO₃) ^ 2 +
                     sediment.denitrification_params.D * log(Cᵐⁱⁿ * day) ^ 2 +
                     sediment.denitrification_params.E * log(reactivity) ^ 2 +
                     sediment.denitrification_params.F * log(O₂) * log(reactivity)) / (Cᵐⁱⁿ * day)
        =#
        pₐₙₒₓ = exp(sediment.anoxic_params.A +
                    sediment.anoxic_params.B * log(Cᵐⁱⁿ * day) +
                    sediment.anoxic_params.C * log(Cᵐⁱⁿ * day) ^ 2 +
                    sediment.anoxic_params.D * log(reactivity) +
                    sediment.anoxic_params.E * log(O₂) * log(reactivity) +
                    sediment.anoxic_params.F * log(NO₃) ^ 2) / (Cᵐⁱⁿ * day)
        if isnan(pₐₙₒₓ)
            error("Sediment anoxia has caused model failure")
        end
        pₛₒₗᵢ = sediment.solid_dep_params.A * (sediment.solid_dep_params.C * sediment.solid_dep_params.depth ^ sediment.solid_dep_params.D) ^ sediment.solid_dep_params.B
        tendencies.NH₄[i, j, k] += Nᵐⁱⁿ * (1 - pₙᵢₜ) / Δz
        tendencies.NO₃[i, j, k] += Nᵐⁱⁿ * pₙᵢₜ / Δz
        tendencies.DIC[i, j, k] += Cᵐⁱⁿ / Δz
        tendencies.O₂[i, j, k] -= max(0, ((1 - pₐₙₒₓ * pₛₒₗᵢ) * Cᵐⁱⁿ + 2 * Nᵐⁱⁿ * pₙᵢₜ)/ Δz) # this seems dodge but this model doesn't cope with anoxia properly (I think)
    end
end

I think the adapt_structure route may be the solution, e.g. at the moment we're unnecessarily passing the G⁻ tendency fields for the sediment fields, but otherwise this kernel does need the majority of the info in sediment.

@glwagner
Collaborator

Also, this line does not belong inside the kernel:

if isnan(pₐₙₒₓ)
    error("Sediment anoxia has caused model failure")
end

If you want to check for NaNs, you should do that outside the kernel. But note that the NaNChecker can also be configured to check for NaNs in any field. You may want to use that instead.
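A rough sketch of a check that lives outside the kernel and runs on the host once the kernel has finished. `assert_no_nans` is a hypothetical helper written for illustration; it is not Oceananigans' NaNChecker, whose configuration is not shown in this thread:

```julia
# Hypothetical host-side check: scan each named array for NaNs and raise an
# ordinary Julia error, instead of calling `error` inside the GPU kernel.
function assert_no_nans(fields::NamedTuple)
    for (name, data) in pairs(fields)
        any(isnan, data) && error("NaN found in $name: sediment anoxia may have caused model failure")
    end
    return nothing
end

# Plain arrays stand in for the tendency data here.
tendencies = (NH₄ = zeros(4, 4), O₂ = zeros(4, 4))
assert_no_nans(tendencies)   # passes silently when no NaNs are present
```

Keeping the check on the host removes the `error` branch from the compiled kernel and lets the failure message name the affected field.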

@jagoosw
Collaborator Author

jagoosw commented Sep 18, 2023

Oh yeah that's a good idea. Perhaps that is the better solution to #144 as well

@jagoosw
Collaborator Author

jagoosw commented Sep 18, 2023

This is the fixed version that does some of those things:


This worked!

@jagoosw
Collaborator Author

jagoosw commented Sep 18, 2023

Hmm, actually this reduced the parameter size enough for the tests with a rectilinear grid (both nonhydrostatic and hydrostatic), but it still fails on the lat/long grid.

@jagoosw
Collaborator Author

jagoosw commented Sep 18, 2023

We're also passing a load of repeated information about the advection schemes, maybe that could be reduced

@glwagner
Collaborator

You want to reduce parameter size to the minimum needed. If you are near the limit, your code could fail when CUDA updates for example. It will also be difficult to develop the code. You should inspect every input argument and ensure that only the minimum is loaded onto the GPU.

@jagoosw
Collaborator Author

jagoosw commented Sep 19, 2023

I think we do need all of the entries, but the question is whether we need to pass all of model.advection, because it is very large. I will try to work out a way to just take the ones we need.

@glwagner
Collaborator

You may need all the entries, but you may be missing adapt_structure methods for them.

You probably want to pass just the tracer advection right? For hydrostatic models, the tracer and momentum schemes are stored separately anyways.

@jagoosw
Collaborator Author

jagoosw commented Sep 19, 2023

For the advection schemes, I think we can reduce it by just passing the advection schemes for the tracers which sink.

I'm not sure where else to reduce the size though. Currently, we have:

sediment, underlying_biogeochemistry, grid, advection_schemes, tracers, timestepper.Gⁿ, tendencies.Gⁿ

I've written adapt structures for sediment so it just has the parameters and fields, underlying_biogeochemistry which always just has parameters and the sinking velocities, advection schemes which is either a scheme or NamedTuple, tracers and both tendency sets which are NamedTuples. All of these things are required because they're used or modified, or used to work out the flux of the sinking tracers.

@glwagner
Collaborator

Which properties have fields embedded, and can you link to the code for those adapt_structure so we can see them here?

@jagoosw
Collaborator Author

jagoosw commented Sep 19, 2023

The sediment models have fields (and tendencies) which are excluded in the adapt_structure

adapt_structure(to, sediment::SimpleMultiG) =
    SimpleMultiG(sediment.fast_decay_rate,
                 sediment.slow_decay_rate,
                 sediment.fast_redfield,
                 sediment.slow_redfield,
                 sediment.fast_fraction,
                 sediment.slow_fraction,
                 sediment.refactory_fraction,
                 sediment.nitrate_oxidation_params,
                 sediment.denitrification_params,
                 sediment.anoxic_params,
                 sediment.solid_dep_params,
                 adapt(to, sediment.fields),
                 nothing,
                 adapt(to, sediment.bottom_indices))

and

adapt_structure(to, sediment::InstantRemineralisation) =
    InstantRemineralisation(sediment.burial_efficiency_constant1,
                            sediment.burial_efficiency_constant2,
                            sediment.burial_efficiency_half_saturaiton,
                            adapt(to, sediment.fields),
                            nothing,
                            adapt(to, sediment.bottom_indices))

and the models have sinking_velocities which are named tuples of named tuples of fields:

adapt_structure(to, npzd::NPZD) =
    NutrientPhytoplanktonZooplanktonDetritus(npzd.initial_photosynthetic_slope,
                                             npzd.base_maximum_growth,
                                             npzd.nutrient_half_saturation,
                                             npzd.base_respiration_rate,
                                             npzd.phyto_base_mortality_rate,
                                             npzd.maximum_grazing_rate,
                                             npzd.grazing_half_saturation,
                                             npzd.assimulation_efficiency,
                                             npzd.base_excretion_rate,
                                             npzd.zoo_base_mortality_rate,
                                             npzd.remineralization_rate,
                                             adapt(to, npzd.sinking_velocities))

and

adapt_structure(to, lobster::LOBSTER) =
    LOBSTER(lobster.phytoplankton_preference,
            lobster.maximum_grazing_rate,
            lobster.grazing_half_saturation,
            lobster.light_half_saturation,
            lobster.nitrate_ammonia_inhibition,
            lobster.nitrate_half_saturation,
            lobster.ammonia_half_saturation,
            lobster.maximum_phytoplankton_growthrate,
            lobster.zooplankton_assimilation_fraction,
            lobster.zooplankton_mortality,
            lobster.zooplankton_excretion_rate,
            lobster.phytoplankton_mortality,
            lobster.small_detritus_remineralisation_rate,
            lobster.large_detritus_remineralisation_rate,
            lobster.phytoplankton_exudation_fraction,
            lobster.nitrifcaiton_rate,
            lobster.ammonia_fraction_of_exudate,
            lobster.ammonia_fraction_of_excriment,
            lobster.ammonia_fraction_of_detritus,
            lobster.phytoplankton_redfield,
            lobster.organic_redfield,
            lobster.phytoplankton_chlorophyll_ratio,
            lobster.organic_carbon_calcate_ratio,
            lobster.respiraiton_oxygen_nitrogen_ratio,
            lobster.nitrifcation_oxygen_nitrogen_ratio,
            lobster.slow_sinking_mortality_fraction,
            lobster.fast_sinking_mortality_fraction,
            lobster.disolved_organic_breakdown_rate,
            lobster.zooplankton_calcite_dissolution,
            lobster.optionals,
            adapt(to, lobster.sinking_velocities))

We also have this:

adapt_structure(to, velocities::NamedTuple{(:u, :v, :w), Tuple{AbstractField, AbstractField, AbstractField}}) = NamedTuple{(:u, :v, :w)}(adapt.(to, values(velocities)))

which I think might be redundant

@jagoosw
Collaborator Author

jagoosw commented Sep 19, 2023

I guess some of the complexity for SimpleMultiG is also that the _params properties are tuples. Do they need to be adapted?

@glwagner
Collaborator

I guess some of the complexity for SimpleMultiG is also that the _params properties are tuples. Do they need to be adapted?

perhaps! It's conservative to propagate the adapt to every property.

Note that you also want to write your code to be robust to change in the future. When you assume that you don't need to adapt some property, you implicitly prevent someone from improving / extending your model to properties that do need adaptation. By conservatively adapting everything, you grease the wheels for scientific advancement in the future.

@glwagner
Collaborator

If you call adapt on a tuple, it will call adapt on every one of its elements:

https://github.com/JuliaGPU/Adapt.jl/blob/df06bcb6936baa7352b8cc7bf5f08f98f2653f25/src/base.jl#L3

The basic structure of any adapt for a custom struct should follow the same logic. The extra thing that custom structs can do is to completely throw away unneeded properties (i.e. set them to nothing), or take other bespoke actions.
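The recursion over tuples and named tuples can be seen in a small sketch, assuming Adapt.jl is installed. `ToFloat32` is a hypothetical adaptor invented for this example (a GPU adaptor would convert array storage instead), and the parameter names are placeholders:

```julia
using Adapt

# Hypothetical adaptor: converts Float64 arrays to Float32 and leaves
# everything else untouched.
struct ToFloat32 end
Adapt.adapt_storage(::ToFloat32, x::Array{Float64}) = Float32.(x)

# A nested structure similar in shape to model parameters plus fields.
params = (A = 1.0, B = 2.0)
nested = (params = params,
          fields = (C_slow = zeros(2, 2), C_fast = ones(2, 2)))

# `adapt` walks the named tuples and applies the adaptor to every leaf.
adapted = adapt(ToFloat32(), nested)
```

The arrays inside `fields` come back as `Float32` arrays, while the scalar parameters pass through unchanged, so adapting a tuple of plain numbers costs nothing but keeps the door open for properties that do need conversion later.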

@jagoosw
Collaborator Author

jagoosw commented Sep 20, 2023

Okay this is a good point, I'll update all the adapts to make sure everything gets adapted.

Looking at the error message here:
[image: screenshot of the error message]

a lot of the information in lines 2 to 4 is about those named tuples so I'll have a go running it with them changed to tuples

@jagoosw
Collaborator Author

jagoosw commented Sep 20, 2023

That took 8 bytes off the parameter size so wasn't very successful

@jagoosw
Collaborator Author

jagoosw commented Sep 20, 2023

Vectors worked a bit better, taking 56 bytes off. For some reason, forcing it to only pass one of the advection schemes (rather than a NamedTuple of them) doesn't save any.

@jagoosw
Collaborator Author

jagoosw commented Sep 20, 2023

Our problems are solved: JuliaGPU/CUDA.jl#2080 !!

@glwagner
Collaborator

Wow, that's huge. Hopefully there isn't a catastrophic loss of performance...

Are you sure the function that failed is the one we are concerned about? It's not clear from the error.

@glwagner
Collaborator

and the models have sinking_velocities which are named tuples of named tuples of fields:

This nested structure is often the cause for issues. You can try to flatten it, perhaps.

@jagoosw
Collaborator Author

jagoosw commented Sep 20, 2023

Yeah hopefully, I can't find docs about the ISA change that allows this, but presumably someone down the line has tested the performance change.

And yeah, it's this

@kernel function _calculate_tendencies!(sediment::SimpleMultiG, bgc, grid, advection, tracers, tendencies, sediment_tendencies)

method (there was more detail somewhere else in the full error that confirmed it to me).

@jagoosw
Collaborator Author

jagoosw commented Sep 20, 2023

This nested structure is often the cause for issues. You can try to flatten it, perhaps.

Thinking about it, we never have u or v components of slip velocity, so we can reduce the complexity to just a named tuple. Will test.
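As a rough illustration of the flattening, the sketch below replaces a named tuple of `(u, v, w)` named tuples with a named tuple of the `w` components only. `MockDeviceArray` is a hypothetical stand-in for a device array, so the byte counts are not those of the real kernel arguments:

```julia
# Mock stand-in for a device array: a pointer plus a size tuple.
# (Hypothetical; real device arrays carry more metadata.)
struct MockDeviceArray{T, N}
    ptr  :: Ptr{T}
    dims :: NTuple{N, Int64}
end

mock() = MockDeviceArray(Ptr{Float64}(0), (16, 16, 8))

# Nested layout: one (u, v, w) named tuple per sinking tracer.
nested = (sPOM = (u = mock(), v = mock(), w = mock()),
          bPOM = (u = mock(), v = mock(), w = mock()))

# Flattened layout: keep only the vertical component of each entry.
flat = map(velocities -> velocities.w, nested)

println("nested: ", sizeof(typeof(nested)), " bytes")
println("flat:   ", sizeof(typeof(flat)), " bytes")
```

With three equally sized components per tracer, the flattened tuple is a third of the size of the nested one. In the real model the saving depends on what the u and v entries actually are.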

@jagoosw
Collaborator Author

jagoosw commented Sep 20, 2023

So near: with all of the optimisation above, and removing the u and v components from the sinking velocities, it's still 8 bytes too large.

@glwagner
Collaborator

Out of curiosity (wondering if something I said was wrong) --- does it matter what the body of the kernel is? For example you could comment everything out except perhaps something trivial.

@glwagner
Collaborator

As a fallback solution, you might try separating these calculations into multiple kernels.

It might be a good idea anyways to reorganize the code so that it's easy to toggle back and forth. For example, right now the tendency calculations for the different species are intertwined.
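A sketch of what splitting by species could look like, assuming KernelAbstractions.jl is installed. The kernels, their arguments and the one-line "physics" are hypothetical simplifications, not the SimpleMultiG equations; the point is only that each kernel receives just the arrays it touches:

```julia
using KernelAbstractions

# Simplified carbon update: takes only the carbon arrays and one parameter.
@kernel function _carbon_tendency!(G_C, C, deposition, decay_rate)
    i, j = @index(Global, NTuple)
    @inbounds G_C[i, j, 1] = deposition[i, j, 1] - decay_rate * C[i, j, 1]
end

# Simplified nitrogen update: same shape, but its own argument list.
@kernel function _nitrogen_tendency!(G_N, N, deposition, decay_rate)
    i, j = @index(Global, NTuple)
    @inbounds G_N[i, j, 1] = deposition[i, j, 1] - decay_rate * N[i, j, 1]
end

# Launch one kernel per species, each with a small argument list.
function calculate_tendencies!(G, fields, deposition, decay_rate)
    backend = get_backend(G.C)
    Nx, Ny, _ = size(G.C)
    _carbon_tendency!(backend, (16, 16))(G.C, fields.C, deposition.C, decay_rate; ndrange = (Nx, Ny))
    _nitrogen_tendency!(backend, (16, 16))(G.N, fields.N, deposition.N, decay_rate; ndrange = (Nx, Ny))
    KernelAbstractions.synchronize(backend)
    return nothing
end

# CPU arrays are used so the sketch runs without a GPU.
G          = (C = zeros(4, 4, 1), N = zeros(4, 4, 1))
fields     = (C = ones(4, 4, 1), N = 2 .* ones(4, 4, 1))
deposition = (C = ones(4, 4, 1), N = ones(4, 4, 1))
calculate_tendencies!(G, fields, deposition, 0.5)
```

Each launch then has its own, smaller parameter list, at the cost of extra kernel launches and of re-reading any quantities both species share.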

@jagoosw
Collaborator Author

jagoosw commented Sep 20, 2023

Out of curiosity (wondering if something I said was wrong) --- does it matter what the body of the kernel is? For example you could comment everything out except perhaps something

It does still fail

@jagoosw
Collaborator Author

jagoosw commented Sep 20, 2023

As a fallback solution, you might try separating these calculations into multiple kernels.

It might be a good idea anyways to reorganize the code so that it's easy to toggle back and forth. For example, right now the tendency calculations for the different species are intertwined.

I could see this being a better idea.

It is now working (by only giving it the tracers and tendencies it needs), but I assume the parameter size is close to the limit. Is there a way for me to check?

I think I would rather make restructuring how the sediment tendencies are calculated a different PR since this is getting quite long now?
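One rough way to estimate how close a set of arguments is to the limit is to add up `sizeof` over their isbits types. The sketch below uses a hypothetical `MockDeviceArray` and a hypothetical helper `report_parameter_size`; the 4352 byte figure is taken from the ptxas message in this issue. On a real GPU the same sum would need to be taken over the device-side versions of the arguments (for example after `CUDA.cudaconvert`, an assumption not tested here), and it ignores any extra metadata argument the kernel framework adds:

```julia
# Mock stand-in for a device array (hypothetical; real ones are larger).
struct MockDeviceArray{T, N}
    ptr  :: Ptr{T}
    dims :: NTuple{N, Int64}
end

const PARAMETER_LIMIT = 4352  # bytes, from the ptxas error quoted in this issue

# Print the size of each argument and return the total in bytes.
function report_parameter_size(args::NamedTuple; limit = PARAMETER_LIMIT)
    total = 0
    for (name, arg) in pairs(args)
        T = typeof(arg)
        isbitstype(T) || error("$name is not isbits, so it cannot be a kernel argument")
        bytes = sizeof(T)
        println(rpad(name, 12), bytes, " bytes")
        total += bytes
    end
    println(rpad("total", 12), total, " of ", limit, " bytes")
    return total
end

mock() = MockDeviceArray(Ptr{Float64}(0), (16, 16, 8))
args = (tracers = (O₂ = mock(), NO₃ = mock()), decay_rate = 0.1)
total = report_parameter_size(args)
```

Listing the arguments one by one makes it easier to see which of them dominates the total before deciding what to trim.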

@glwagner
Collaborator

Of course, no need to solve the world in one PR.

@jagoosw
Collaborator Author

jagoosw commented Oct 3, 2023

Initial issue closed by #138 and remainder superseded by #147

@jagoosw jagoosw closed this as completed Oct 3, 2023