`SimpleMultiG` sediment models will not run on GPU #143

jagoosw · 2023-09-18T16:18:26Z

The kernel for SimpleMultiG sediments compiles on GPU but will not run: its complexity leads to a PTX error about kernel parameter size being thrown. This may be fixed in the future if LLVM updates its CUDA version support (JuliaGPU/CUDA.jl#2080). We may also be able to rework the model to use less parameter space, so please do get in contact if you would like to use the model on GPU.

An error similar to the following will be raised:

Failed to compile PTX code (ptxas exited with code 255)
Invocation arguments: --generate-line-info --verbose --gpu-name sm_75 --output-file /tmp/jl_bELIheLkzO.cubin /tmp/jl_VczMJifKRu.ptx
ptxas /tmp/jl_VczMJifKRu.ptx, line 3215; error   : Entry function '_Z26gpu__calculate_tendencies_16CompilerMetadataI10StaticSizeI6_3__3_E12DynamicCheckvv7NDRangeILi2ES0_I6_1__1_ES0_I8_16__16_EvvEE12SimpleMultiGI7Float6410NamedTupleI24__A___B___C___D___E___F_5TupleIS4_S4_S4_S4_S4_S4_EES5_I24__A___B___C___D___E___F_S6_IS4_S4_S4_S4_S4_S4_EES5_I24__A___B___C___D___E___F_S6_IS4_S4_S4_S4_S4_S4_EES5_I24__A___B___C___D___depth_S6_IS4_S4_5Int64S4_S4_EES5_I52__C_slow___C_fast___N_slow___N_fast___C_ref___N_ref_S6_I11OffsetArrayIS4_Li3E13CuDeviceArrayIS4_Li3ELi1EEES8_IS4_Li3ES9_IS4_Li3ELi1EEES8_IS4_Li3ES9_IS4_Li3ELi1EEES8_IS4_Li3ES9_IS4_Li3ELi1EEES8_IS4_Li3ES9_IS4_Li3ELi1EEES8_IS4_Li3ES9_IS4_Li3ELi1EEEEES5_I10__G____G__S6_IS5_I52__C_slow___C_fast___N_slow___N_fast___C_ref___N_ref_S6_IS8_IS4_Li3ES9_IS4_Li3ELi1EEES8_IS4_Li3ES9_IS4_Li3ELi1EEES8_IS4_Li3ES9_IS4_Li3ELi1EEES8_IS4_Li3ES9_IS4_Li3ELi1EEES8_IS4_Li3ES9_IS4_Li3ELi1EEES8_IS4_Li3ES9_IS4_Li3ELi1EEEEES5_I52__C_slow___C_fast___N_slow___N_fast___C_ref___N_ref_S6_IS8_IS4_Li3ES9_IS4_Li3ELi1EEES8_IS4_Li3ES9_IS4_Li3ELi1EEES8_IS4_Li3ES9_IS4_Li3ELi1EEES8_IS4_Li3ES9_IS4_Li3ELi1EEES8_IS4_Li3ES9_IS4_Li3ELi1EEES8_IS4_Li3ES9_IS4_Li3ELi1EEEEEEES9_IS7_Li2ELi1EEE7LOBSTERIS4_3ValI18_true__true__true_ES5_I14__sPOM___bPOM_S6_IS5_I12__u___v___w_S6_I13ConstantFieldIS7_Li3EES12_IS7_Li3EES8_IS4_Li3ES9_IS4_Li3ELi1EEEEES5_I12__u___v___w_S6_IS12_IS7_Li3EES12_IS7_Li3EES8_IS4_Li3ES9_IS4_Li3ELi1EEEEEEEE15RectilinearGridIS4_8PeriodicS14_7BoundedS4_S4_S4_S8_IS4_Li1...
ptxas /tmp/jl_VczMJifKRu.ptx, line 3215; error   : Feature 'Kernel parameter size larger than 4352 bytes' requires PTX ISA .version 8.1 or later
ptxas fatal   : Ptx assembly aborted due to errors
If you think this is a bug, please file an issue and attach /tmp/jl_VczMJifKRu.ptx

Also, note that InstantRemineralisation sediment does work on GPU.

Discussed in #138.

glwagner · 2023-09-18T20:17:45Z

While the kernel for SimpleMultiG sediments

where is this kernel? Link to code?

jagoosw · 2023-09-18T20:20:29Z

This is as it currently is in main:

OceanBioME.jl/src/Boundaries/Sediments/simple_multi_G.jl

Lines 159 to 234 in ac8419a

    
    @kernel function _calculate_tendencies!(sediment::SimpleMultiG, bgc, grid, advection, tracers, timestepper)
        i, j = @index(Global, NTuple)

        k = bottom_index(i, j, sediment)

        Δz = zspacing(i, j, k, grid, Center(), Center(), Center())

        @inbounds begin
            carbon_deposition = carbon_flux(i, j, k, grid, advection, bgc, tracers) * Δz

            nitrogen_deposition = nitrogen_flux(i, j, k, grid, advection, bgc, tracers) * Δz

            # rates
            C_min_slow = sediment.fields.C_slow[i, j, 1] * sediment.slow_decay_rate
            C_min_fast = sediment.fields.C_fast[i, j, 1] * sediment.fast_decay_rate

            N_min_slow = sediment.fields.N_slow[i, j, 1] * sediment.slow_decay_rate
            N_min_fast = sediment.fields.N_fast[i, j, 1] * sediment.fast_decay_rate

            Cᵐⁱⁿ = C_min_slow + C_min_fast
            Nᵐⁱⁿ = N_min_slow + N_min_fast

            k = Cᵐⁱⁿ * day / (sediment.fields.C_slow[i, j, 1] + sediment.fields.C_fast[i, j, 1])

            # sediment evolution
            sediment.tendencies.Gⁿ.C_slow[i, j, 1] = (1 - sediment.refactory_fraction) * sediment.slow_fraction * carbon_deposition - C_min_slow
            sediment.tendencies.Gⁿ.C_fast[i, j, 1] = (1 - sediment.refactory_fraction) * sediment.fast_fraction * carbon_deposition - C_min_fast
            sediment.tendencies.Gⁿ.C_ref[i, j, 1] = sediment.refactory_fraction * carbon_deposition

            sediment.tendencies.Gⁿ.N_slow[i, j, 1] = (1 - sediment.refactory_fraction) * sediment.slow_fraction * nitrogen_deposition - N_min_slow
            sediment.tendencies.Gⁿ.N_fast[i, j, 1] = (1 - sediment.refactory_fraction) * sediment.fast_fraction * nitrogen_deposition - N_min_fast
            sediment.tendencies.Gⁿ.N_ref[i, j, 1] = sediment.refactory_fraction * nitrogen_deposition

            # efflux/influx
            O₂  = tracers.O₂[i, j, 1]
            NO₃ = tracers.NO₃[i, j, 1]
            NH₄ = tracers.NH₄[i, j, 1]

            pₙᵢₜ = exp(sediment.nitrate_oxidation_params.A +
                       sediment.nitrate_oxidation_params.B * log(Cᵐⁱⁿ * day) * log(O₂) +
                       sediment.nitrate_oxidation_params.C * log(Cᵐⁱⁿ * day) ^ 2 +
                       sediment.nitrate_oxidation_params.D * log(k) * log(NH₄) +
                       sediment.nitrate_oxidation_params.E * log(Cᵐⁱⁿ * day) +
                       sediment.nitrate_oxidation_params.F * log(Cᵐⁱⁿ * day) * log(NH₄)) / (Nᵐⁱⁿ * day)

            #=
            pᵈᵉⁿⁱᵗ = exp(sediment.denitrification_params.A +
                         sediment.denitrification_params.B * log(Cᵐⁱⁿ * day) +
                         sediment.denitrification_params.C * log(NO₃) ^ 2 +
                         sediment.denitrification_params.D * log(Cᵐⁱⁿ * day) ^ 2 +
                         sediment.denitrification_params.E * log(k) ^ 2 +
                         sediment.denitrification_params.F * log(O₂) * log(k)) / (Cᵐⁱⁿ * day)
            =#

            pₐₙₒₓ = exp(sediment.anoxic_params.A +
                        sediment.anoxic_params.B * log(Cᵐⁱⁿ * day) +
                        sediment.anoxic_params.C * log(Cᵐⁱⁿ * day) ^ 2 +
                        sediment.anoxic_params.D * log(k) +
                        sediment.anoxic_params.E * log(O₂) * log(k) +
                        sediment.anoxic_params.F * log(NO₃) ^ 2) / (Cᵐⁱⁿ * day)

            if isnan(pₐₙₒₓ)
                println("$(Cᵐⁱⁿ), $(k), $(O₂), $(NO₃)")
                error("Sediment anoxia has caused model failure")
            end

            pₛₒₗᵢ = sediment.solid_dep_params.A * (sediment.solid_dep_params.C * sediment.solid_dep_params.depth ^ sediment.solid_dep_params.D) ^ sediment.solid_dep_params.B

            Δz = grid.Δzᵃᵃᶜ[1]

            timestepper.Gⁿ.NH₄[i, j, 1] += Nᵐⁱⁿ * (1 - pₙᵢₜ) / Δz
            timestepper.Gⁿ.NO₃[i, j, 1] += Nᵐⁱⁿ * pₙᵢₜ / Δz
            timestepper.Gⁿ.DIC[i, j, 1] += Cᵐⁱⁿ / Δz
            timestepper.Gⁿ.O₂[i, j, 1]  -= max(0, ((1 - pₐₙₒₓ * pₛₒₗᵢ) * Cᵐⁱⁿ + 2 * Nᵐⁱⁿ * pₙᵢₜ)/ Δz) # this seems dodge but this model doesn't cope with anoxia properly
        end
    end

This has another bug in it, which is fixed in #138.

glwagner · 2023-09-18T20:25:50Z

Interesting. I'm surprised this won't compile. Likely you can simplify the arguments to the kernel to get it to compile. Parameter space issues don't have to do with kernel complexity directly, but rather with the complexity of the input arguments (i.e. using Field rather than OffsetArray). Reducing the input argument complexity is why, for example, we can't support Field directly in GPU kernels and are forced to work with OffsetArrays instead.

One issue might be that the time-stepper is input directly. You only need the named tuple Gⁿ --- it could be better to use that as an input argument instead.

sediment may also have more info than you need on the GPU, which you can simplify with an appropriate adapt_structure method.

In general, it's good practice to define adapt_structure to return the minimal info expected to be useful within a GPU kernel so as much is compilable as possible.
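A minimal sketch of that advice, assuming Adapt.jl is installed. `ToySediment` and `ToyAdaptor` are hypothetical stand-ins invented for illustration (they are not OceanBioME or CUDA.jl types); in a real run the `to` argument would be whatever adaptor the GPU backend supplies. The method adapts every property it keeps and replaces the one the kernel is assumed never to read with `nothing`:

```julia
using Adapt

# Hypothetical stand-in for a sediment model (not OceanBioME's actual struct).
struct ToySediment{FT, F, T, B}
    decay_rate     :: FT
    fields         :: F
    tendencies     :: T   # assumed to be needed on the host only
    bottom_indices :: B
end

# Hypothetical adaptor; with no adapt_storage methods it leaves leaf values unchanged.
struct ToyAdaptor end

# Adapt each property the kernel reads and drop the one it does not.
Adapt.adapt_structure(to, s::ToySediment) =
    ToySediment(adapt(to, s.decay_rate),
                adapt(to, s.fields),
                nothing,
                adapt(to, s.bottom_indices))

host_sediment = ToySediment(0.1,
                            (C_slow = zeros(4, 4), C_fast = zeros(4, 4)),
                            (Gⁿ = zeros(4, 4), G⁻ = zeros(4, 4)),
                            ones(Int, 4, 4))

kernel_sediment = adapt(ToyAdaptor(), host_sediment)
```

Because the dropped property becomes `nothing`, its type no longer contributes to the kernel signature at all.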

jagoosw · 2023-09-18T20:29:03Z

This is the fixed version that does some of those things:

OceanBioME.jl/src/Boundaries/Sediments/simple_multi_G.jl

Lines 175 to 246 in 8f133df

    
    @kernel function _calculate_tendencies!(sediment::SimpleMultiG, bgc, grid, advection, tracers, tendencies)
        i, j = @index(Global, NTuple)

        k = bottom_index(i, j, sediment)

        Δz = zspacing(i, j, k, grid, Center(), Center(), Center())

        @inbounds begin
            carbon_deposition = carbon_flux(i, j, k, grid, advection, bgc, tracers) * Δz

            nitrogen_deposition = nitrogen_flux(i, j, k, grid, advection, bgc, tracers) * Δz

            # rates
            C_min_slow = sediment.fields.C_slow[i, j, 1] * sediment.slow_decay_rate
            C_min_fast = sediment.fields.C_fast[i, j, 1] * sediment.fast_decay_rate

            N_min_slow = sediment.fields.N_slow[i, j, 1] * sediment.slow_decay_rate
            N_min_fast = sediment.fields.N_fast[i, j, 1] * sediment.fast_decay_rate

            Cᵐⁱⁿ = C_min_slow + C_min_fast
            Nᵐⁱⁿ = N_min_slow + N_min_fast

            reactivity = Cᵐⁱⁿ * day / (sediment.fields.C_slow[i, j, 1] + sediment.fields.C_fast[i, j, 1])

            # sediment evolution
            sediment.tendencies.Gⁿ.C_slow[i, j, 1] = (1 - sediment.refactory_fraction) * sediment.slow_fraction * carbon_deposition - C_min_slow
            sediment.tendencies.Gⁿ.C_fast[i, j, 1] = (1 - sediment.refactory_fraction) * sediment.fast_fraction * carbon_deposition - C_min_fast
            sediment.tendencies.Gⁿ.C_ref[i, j, 1] = sediment.refactory_fraction * carbon_deposition

            sediment.tendencies.Gⁿ.N_slow[i, j, 1] = (1 - sediment.refactory_fraction) * sediment.slow_fraction * nitrogen_deposition - N_min_slow
            sediment.tendencies.Gⁿ.N_fast[i, j, 1] = (1 - sediment.refactory_fraction) * sediment.fast_fraction * nitrogen_deposition - N_min_fast
            sediment.tendencies.Gⁿ.N_ref[i, j, 1] = sediment.refactory_fraction * nitrogen_deposition

            # efflux/influx
            O₂  = tracers.O₂[i, j, k]
            NO₃ = tracers.NO₃[i, j, k]
            NH₄ = tracers.NH₄[i, j, k]

            pₙᵢₜ = exp(sediment.nitrate_oxidation_params.A +
                       sediment.nitrate_oxidation_params.B * log(Cᵐⁱⁿ * day) * log(O₂) +
                       sediment.nitrate_oxidation_params.C * log(Cᵐⁱⁿ * day) ^ 2 +
                       sediment.nitrate_oxidation_params.D * log(reactivity) * log(NH₄) +
                       sediment.nitrate_oxidation_params.E * log(Cᵐⁱⁿ * day) +
                       sediment.nitrate_oxidation_params.F * log(Cᵐⁱⁿ * day) * log(NH₄)) / (Nᵐⁱⁿ * day)

            #=
            pᵈᵉⁿⁱᵗ = exp(sediment.denitrification_params.A +
                         sediment.denitrification_params.B * log(Cᵐⁱⁿ * day) +
                         sediment.denitrification_params.C * log(NO₃) ^ 2 +
                         sediment.denitrification_params.D * log(Cᵐⁱⁿ * day) ^ 2 +
                         sediment.denitrification_params.E * log(reactivity) ^ 2 +
                         sediment.denitrification_params.F * log(O₂) * log(reactivity)) / (Cᵐⁱⁿ * day)
            =#

            pₐₙₒₓ = exp(sediment.anoxic_params.A +
                        sediment.anoxic_params.B * log(Cᵐⁱⁿ * day) +
                        sediment.anoxic_params.C * log(Cᵐⁱⁿ * day) ^ 2 +
                        sediment.anoxic_params.D * log(reactivity) +
                        sediment.anoxic_params.E * log(O₂) * log(reactivity) +
                        sediment.anoxic_params.F * log(NO₃) ^ 2) / (Cᵐⁱⁿ * day)

            if isnan(pₐₙₒₓ)
                error("Sediment anoxia has caused model failure")
            end

            pₛₒₗᵢ = sediment.solid_dep_params.A * (sediment.solid_dep_params.C * sediment.solid_dep_params.depth ^ sediment.solid_dep_params.D) ^ sediment.solid_dep_params.B

            tendencies.NH₄[i, j, k] += Nᵐⁱⁿ * (1 - pₙᵢₜ) / Δz
            tendencies.NO₃[i, j, k] += Nᵐⁱⁿ * pₙᵢₜ / Δz
            tendencies.DIC[i, j, k] += Cᵐⁱⁿ / Δz
            tendencies.O₂[i, j, k]  -= max(0, ((1 - pₐₙₒₓ * pₛₒₗᵢ) * Cᵐⁱⁿ + 2 * Nᵐⁱⁿ * pₙᵢₜ)/ Δz) # this seems dodge but this model doesn't cope with anoxia properly (I think)
        end
    end

I think the adapt structure route may be the solution, e.g. at the moment we're unnecessarily passing the G⁻ tendency fields for the sediment fields, but otherwise this kernel does need the majority of the info in sediment.

glwagner · 2023-09-18T20:35:55Z

Also, this line does not belong inside the kernel:

     if isnan(pₐₙₒₓ) 
             error("Sediment anoxia has caused model failure") 
         end

If you want to check for NaNs, you should do that outside the kernel. But note that the NaNChecker can also be configured to check for NaNs in any field. You may want to use that instead.
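A minimal sketch of a host-side check, assuming the kernel has already finished and its outputs are available as a NamedTuple of arrays. `nan_locations` is a hypothetical helper written for illustration, not part of Oceananigans or OceanBioME, and the data below are made up:

```julia
# Hypothetical helper: list the names of the arrays in a NamedTuple that contain NaNs.
nan_locations(fields::NamedTuple) = [name for (name, data) in pairs(fields) if any(isnan, data)]

# Stand-in data; in practice these would be the arrays the kernel wrote to.
tendencies = (NH₄ = [0.1, 0.2], NO₃ = [0.3, NaN], DIC = [0.5, 0.6])

bad = nan_locations(tendencies)

# Report from the host, so the kernel itself needs no error or print calls.
isempty(bad) || @warn "NaN found after the sediment tendency calculation" bad
```

Keeping the check on the host means the kernel body contains no string handling or exceptions.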

jagoosw · 2023-09-18T20:41:09Z

Oh yeah that's a good idea. Perhaps that is the better solution to #144 as well

jagoosw · 2023-09-18T21:04:48Z

I think the adapt structure route may be the solution, e.g. at the moment we're unnecessarily passing the G⁻ tendency fields for the sediment fields, but otherwise this kernel does need the majority of the info in sediment.

This worked!

jagoosw · 2023-09-18T21:48:46Z

Hmm, actually this reduced the parameter size enough for the tests with a rectilinear grid (both nonhydrostatic and hydrostatic), but it still fails on a latitude-longitude grid.

jagoosw · 2023-09-18T21:53:34Z

We're also passing a load of repeated information about the advection schemes; maybe that could be reduced.

glwagner · 2023-09-19T15:26:23Z

You want to reduce parameter size to the minimum needed. If you are near the limit, your code could fail when CUDA updates for example. It will also be difficult to develop the code. You should inspect every input argument and ensure that only the minimum is loaded onto the GPU.

jagoosw · 2023-09-19T15:36:21Z

We do need all of the entries, I think, but the question is whether we need to pass all of model.advection, because it is very large. I will try to work out a way to take just the ones we need.

glwagner · 2023-09-19T15:40:18Z

You may need all the entries, but you may be missing adapt_structure methods for them.

You probably want to pass just the tracer advection right? For hydrostatic models, the tracer and momentum schemes are stored separately anyways.

jagoosw · 2023-09-19T15:54:26Z

For the advection schemes, I think we can reduce it by just passing the advection schemes for the tracers which sink.

I'm not sure where else to reduce the size though. Currently, we have:

sediment, underlying_biogeochemistry, grid, advection_schemes, tracers, timestepper.Gⁿ, tendencies.Gⁿ

I've written adapt structures for sediment so it just has the parameters and fields, underlying_biogeochemistry which always just has parameters and the sinking velocities, advection schemes which is either a scheme or NamedTuple, tracers and both tendency sets which are NamedTuples. All of these things are required because they're used or modified, or used to work out the flux of the sinking tracers.

glwagner · 2023-09-19T17:14:49Z

Which properties have fields embedded, and can you link to the code for those adapt_structure so we can see them here?

jagoosw · 2023-09-19T18:45:17Z

The sediment models have fields (and tendencies) which are excluded in the adapt_structure

OceanBioME.jl/src/Boundaries/Sediments/simple_multi_G.jl

Lines 149 to 163 in 5abb421

    
    adapt_structure(to, sediment::SimpleMultiG) =
        SimpleMultiG(sediment.fast_decay_rate,
                     sediment.slow_decay_rate,
                     sediment.fast_redfield,
                     sediment.slow_redfield,
                     sediment.fast_fraction,
                     sediment.slow_fraction,
                     sediment.refactory_fraction,
                     sediment.nitrate_oxidation_params,
                     sediment.denitrification_params,
                     sediment.anoxic_params,
                     sediment.solid_dep_params,
                     adapt(to, sediment.fields),
                     nothing,
                     adapt(to, sediment.bottom_indices))

and

OceanBioME.jl/src/Boundaries/Sediments/instant_remineralization.jl

Lines 76 to 82 in 5abb421

    
    adapt_structure(to, sediment::InstantRemineralisation) =
        InstantRemineralisation(sediment.burial_efficiency_constant1,
                                sediment.burial_efficiency_constant2,
                                sediment.burial_efficiency_half_saturaiton,
                                adapt(to, sediment.fields),
                                nothing,
                                adapt(to, sediment.bottom_indices))

and the models have sinking_velocities which are named tuples of named tuples of fields:

OceanBioME.jl/src/Models/AdvectedPopulations/NPZD.jl

Lines 306 to 321 in 5abb421

    
    adapt_structure(to, npzd::NPZD) =
        NutrientPhytoplanktonZooplanktonDetritus(npzd.initial_photosynthetic_slope,
                                                 npzd.base_maximum_growth,
                                                 npzd.nutrient_half_saturation,
                                                 npzd.base_respiration_rate,
                                                 npzd.phyto_base_mortality_rate,
                                                 npzd.maximum_grazing_rate,
                                                 npzd.grazing_half_saturation,
                                                 npzd.assimulation_efficiency,
                                                 npzd.base_excretion_rate,
                                                 npzd.zoo_base_mortality_rate,
                                                 npzd.remineralization_rate,
                                                 adapt(to, npzd.sinking_velocities))

and

OceanBioME.jl/src/Models/AdvectedPopulations/LOBSTER/LOBSTER.jl

Lines 402 to 433 in 5abb421

    
    adapt_structure(to, lobster::LOBSTER) =
        LOBSTER(lobster.phytoplankton_preference,
                lobster.maximum_grazing_rate,
                lobster.grazing_half_saturation,
                lobster.light_half_saturation,
                lobster.nitrate_ammonia_inhibition,
                lobster.nitrate_half_saturation,
                lobster.ammonia_half_saturation,
                lobster.maximum_phytoplankton_growthrate,
                lobster.zooplankton_assimilation_fraction,
                lobster.zooplankton_mortality,
                lobster.zooplankton_excretion_rate,
                lobster.phytoplankton_mortality,
                lobster.small_detritus_remineralisation_rate,
                lobster.large_detritus_remineralisation_rate,
                lobster.phytoplankton_exudation_fraction,
                lobster.nitrifcaiton_rate,
                lobster.ammonia_fraction_of_exudate,
                lobster.ammonia_fraction_of_excriment,
                lobster.ammonia_fraction_of_detritus,
                lobster.phytoplankton_redfield,
                lobster.organic_redfield,
                lobster.phytoplankton_chlorophyll_ratio,
                lobster.organic_carbon_calcate_ratio,
                lobster.respiraiton_oxygen_nitrogen_ratio,
                lobster.nitrifcation_oxygen_nitrogen_ratio,
                lobster.slow_sinking_mortality_fraction,
                lobster.fast_sinking_mortality_fraction,
                lobster.disolved_organic_breakdown_rate,
                lobster.zooplankton_calcite_dissolution,
                lobster.optionals,
                adapt(to, lobster.sinking_velocities))

We also have this:

OceanBioME.jl/src/Utils/sinking_velocity_fields.jl

Line 29 in 5abb421

    
           adapt_structure(to, velocities::NamedTuple{(:u, :v, :w), Tuple{AbstractField, AbstractField, AbstractField}}) = NamedTuple{(:u, :v, :w)}(adapt.(to, values(velocities)))

which I think might be redundant

jagoosw · 2023-09-19T18:46:05Z

I guess some of the complexity for SimpleMultiG is also that the _params properties are tuples. Do they need to be adapted?

glwagner · 2023-09-20T11:59:00Z

I guess some of the complexity for SimpleMultiG is also that the _params properties are tuples. Do they need to be adapted?

perhaps! It's conservative to propagate the adapt to every property.

Note that you also want to write your code to be robust to change in the future. When you assume that you don't need to adapt some property, you implicitly prevent someone from improving / extending your model to properties that do need adaptation. By conservatively adapting everything, you grease the wheels for scientific advancement in the future.

glwagner · 2023-09-20T12:00:49Z

If you call adapt on a tuple, it will call adapt on every one of its elements:

https://github.com/JuliaGPU/Adapt.jl/blob/df06bcb6936baa7352b8cc7bf5f08f98f2653f25/src/base.jl#L3

The basic structure of any adapt for a custom struct should follow the same logic. The extra thing that custom structs can do is to completely throw away unneeded properties (i.e. set them to nothing), or take other bespoke actions.
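A small sketch of that behaviour, assuming Adapt.jl is installed. `ToFloat32`, `Unadapted` and `Adapted` are hypothetical names made up for this example; the adaptor converts `Float64` arrays to `Float32` purely so that it is visible which values `adapt` actually reached:

```julia
using Adapt

# Hypothetical adaptor that converts Float64 arrays to Float32, so the effect is visible.
struct ToFloat32 end
Adapt.adapt_storage(::ToFloat32, x::Array{Float64}) = Float32.(x)

# Tuples and NamedTuples are adapted element by element, including nested ones.
velocities = (sPOM = (w = zeros(2),), bPOM = (w = zeros(2),))
adapted_velocities = adapt(ToFloat32(), velocities)

# A custom struct without an adapt_structure method is passed through untouched.
struct Unadapted{T}
    data :: T
end

# The same struct with a method that propagates adapt to its property.
struct Adapted{T}
    data :: T
end
Adapt.adapt_structure(to, a::Adapted) = Adapted(adapt(to, a.data))

untouched = adapt(ToFloat32(), Unadapted(zeros(2)))
converted = adapt(ToFloat32(), Adapted(zeros(2)))
```

Here `untouched.data` keeps its `Float64` elements while `converted.data` does not, which is the silent failure mode that conservatively adapting every property avoids.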

jagoosw · 2023-09-20T14:52:13Z

Okay this is a good point, I'll update all the adapts to make sure everything gets adapted.

Looking at the error message here:

a lot of the information in lines 2 to 4 is about those named tuples so I'll have a go running it with them changed to tuples

jagoosw · 2023-09-20T15:11:35Z

That took 8 bytes off the parameter size, so it wasn't very successful.

jagoosw · 2023-09-20T15:24:21Z

Vectors worked a bit better, taking 56 bytes off. For some reason, forcing it to only pass one of the advection schemes (rather than a NamedTuple of them) doesn't save any.

jagoosw · 2023-09-20T15:26:52Z

Our problems are solved: JuliaGPU/CUDA.jl#2080 !!

glwagner · 2023-09-20T20:19:56Z

Wow, that's huge. Hopefully there isn't a catastrophic loss of performance...

Are you sure the function that failed is the one we are concerned about? It's not clear from the error.

glwagner · 2023-09-20T20:21:11Z

and the models have sinking_velocities which are named tuples of named tuples of fields:

This nested structure is often the cause for issues. You can try to flatten it, perhaps.

jagoosw · 2023-09-20T20:44:20Z

Yeah hopefully, I can't find docs about the ISA change that allows this, but presumably someone down the line has tested the performance change.

And yeah, it's this

OceanBioME.jl/src/Boundaries/Sediments/simple_multi_G.jl

Line 175 in 5abb421

    
           @kernel function _calculate_tendencies!(sediment::SimpleMultiG, bgc, grid, advection, tracers, tendencies, sediment_tendencies)

method (there was more detail somewhere else in the full error that confirmed it to me).

jagoosw · 2023-09-20T20:44:35Z

This nested structure is often the cause for issues. You can try to flatten it, perhaps.

Thinking about it, we never have u or v components of slip velocity, so we can reduce the complexity to just a named tuple. Will test.
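A toy illustration of that flattening in plain Julia. The numbers are stand-ins (in the model the entries are fields, whose sizes differ), so the byte counts only show the direction of the change, not the actual saving:

```julia
# Stand-in values; in the model these would be fields rather than numbers.
nested_velocities = (sPOM = (u = 0.0, v = 0.0, w = -1.0),
                     bPOM = (u = 0.0, v = 0.0, w = -10.0))

# Keep only the vertical component of each sinking tracer.
flat_velocities = map(velocity -> velocity.w, nested_velocities)

# Byte sizes of the two layouts as they would be passed by value.
nested_bytes = sizeof(typeof(nested_velocities))
flat_bytes   = sizeof(typeof(flat_velocities))
```

`map` over a NamedTuple keeps the tracer names, so code that looks up `flat_velocities.sPOM` keeps working.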

jagoosw · 2023-09-20T21:07:43Z

So near: with all of the optimisation above and removing the u and v components from sinking velocities, it's still 8 bytes too large.

glwagner · 2023-09-20T21:22:27Z

Out of curiosity (wondering if something I said was wrong) --- does it matter what the body of the kernel is? For example you could comment everything out except perhaps something trivial.

glwagner · 2023-09-20T21:23:45Z

As a fallback solution, you might try separating these calculations into multiple kernels.

It might be a good idea anyways to reorganize the code so that it's easy to toggle back and forth. For example, right now the tendency calculations for the different species are intertwined.
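One way such a split could look, sketched as a plain CPU function rather than a real GPU kernel. `pool_tendencies!` is a hypothetical name, the parameter values are made up, and the formulas are only the sediment-evolution lines from the kernel quoted earlier in this thread. The point is that each call receives only the arrays for one species:

```julia
# CPU stand-in for a per-species kernel: it takes only the arrays it touches.
function pool_tendencies!(G_slow, G_fast, G_ref, slow, fast, deposition, p)
    for n in eachindex(deposition)
        reactive = (1 - p.refactory_fraction) * deposition[n]
        G_slow[n] = reactive * p.slow_fraction - slow[n] * p.slow_decay_rate
        G_fast[n] = reactive * p.fast_fraction - fast[n] * p.fast_decay_rate
        G_ref[n]  = p.refactory_fraction * deposition[n]
    end
    return nothing
end

# Made-up parameters, chosen to be exactly representable in binary.
params = (refactory_fraction = 0.5, slow_fraction = 0.5, fast_fraction = 0.5,
          slow_decay_rate = 0.25, fast_decay_rate = 0.5)

# Carbon and nitrogen are handled by separate calls with separate arguments.
G_C_slow, G_C_fast, G_C_ref = zeros(1), zeros(1), zeros(1)
pool_tendencies!(G_C_slow, G_C_fast, G_C_ref, [2.0], [4.0], [1.0], params)

G_N_slow, G_N_fast, G_N_ref = zeros(1), zeros(1), zeros(1)
pool_tendencies!(G_N_slow, G_N_fast, G_N_ref, [0.0], [0.0], [2.0], params)
```

Whether this shrinks the parameter size enough in practice would need testing, since the efflux terms still depend on both carbon and nitrogen.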

jagoosw · 2023-09-20T21:26:00Z

Out of curiosity (wondering if something I said was wrong) --- does it matter what the body of the kernel is? For example you could comment everything out except perhaps something

It does still fail

jagoosw · 2023-09-20T21:38:37Z

As a fallback solution, you might try separating these calculations into multiple kernels.

It might be a good idea anyways to reorganize the code so that it's easy to toggle back and forth. For example, right now the tendency calculations for the different species are intertwined.

I could see this being a better idea.

It is now working (by only giving it the tracers and tendencies it needs), but I assume the parameter size is close to the limit. Is there a way for me to check?

I think I would rather make restructuring how the sediment tendencies are calculated a different PR since this is getting quite long now?
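A rough way to estimate this, sketched under assumptions: kernel arguments are passed by value, so summing `sizeof` over the types of the device-side arguments should approximate the parameter size. The structs below are hypothetical isbits stand-ins rather than CUDA.jl's real array layout; with CUDA.jl the actual device-side form of each argument would, I believe, come from `cudaconvert`, which is worth verifying before relying on the numbers.

```julia
# Hypothetical isbits stand-ins for device-side arrays (not CUDA.jl's actual layout).
struct ToyDeviceArray
    ptr  :: UInt64
    dims :: NTuple{3, Int64}
end

struct ToyOffsetArray
    parent  :: ToyDeviceArray
    offsets :: NTuple{3, Int64}
end

# Rough estimate: sum the by-value size of every kernel argument.
parameter_bytes(args...) = sum(sizeof(typeof(arg)) for arg in args)

field() = ToyOffsetArray(ToyDeviceArray(UInt64(0), (1, 1, 1)), (0, 0, 0))

all_tendencies = (NO₃ = field(), NH₄ = field(), DIC = field(),
                  O₂ = field(), P = field(), Z = field())

# Only the tendencies that the sediment kernel modifies.
needed_tendencies = (NO₃ = all_tendencies.NO₃, NH₄ = all_tendencies.NH₄,
                     DIC = all_tendencies.DIC, O₂ = all_tendencies.O₂)

limit = 4352  # bytes, taken from the ptxas error message quoted in this issue

println("all tendencies:    ", parameter_bytes(all_tendencies), " of ", limit, " bytes")
println("needed tendencies: ", parameter_bytes(needed_tendencies), " of ", limit, " bytes")
```

Calling `parameter_bytes` with the full argument list of a kernel launch gives a single number to compare against the limit.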

glwagner · 2023-09-21T04:30:01Z

Of course, no need to solve the world in one PR.

jagoosw · 2023-10-03T14:53:15Z

Initial issue closed by #138 and remainder superseded by #147

jagoosw added the bug, awaiting upstream change, and GPU labels Sep 18, 2023

jagoosw mentioned this issue Sep 18, 2023

(0.7.0) Fix all the GPU bugs that have crept in #138

Merged

jagoosw mentioned this issue Oct 3, 2023

Sediment tendency calculations #147

Open

jagoosw closed this as completed Oct 3, 2023
