Mixed activity error for 2nd order diff of fused broadcast into preallocated array #1751

Closed
danielwe opened this issue Aug 25, 2024 · 3 comments

danielwe commented Aug 25, 2024

Second-order forward-over-reverse differentiation fails with a mixed activity error when the differentiated function fills a preallocated array using a non-trivial fused broadcast expression (by non-trivial I mean anything but y .= x).

The stacktrace points to broadcast_unalias, which suggests that the problem might stem from the code path that allocates intermediate storage whenever there's aliasing between the LHS and RHS of a broadcast assignment. In the MWE below, this code path is not hit and no intermediate storage is allocated, but the code path exists in the compiled f (for one, it makes AllocCheck.jl report allocations in f), so I guess Enzyme has to differentiate it.

The error does not occur if the broadcast expression is replaced with the equivalent map! call. Notably, map! does not check for aliasing.
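
To make the aliasing check concrete, here is a small sketch that pokes at the helpers involved. It is only an illustration: `Base.mightalias`, `Base.unalias` and `Base.Broadcast.broadcast_unalias` are unexported Base internals and may change between Julia versions, and the arrays here are made up for the example rather than taken from the MWE.

```julia
# Illustration only: unexported Base internals, not a stable API.
x = [1.0, 2.0, 3.0]
tmp = similar(x)

# Distinct arrays: no aliasing is detected, so the source is passed through uncopied.
@show Base.mightalias(tmp, x)
@show Base.Broadcast.broadcast_unalias(tmp, x) === x

# A view into the destination does alias it, so unaliasing makes a defensive copy.
v = view(x, 1:2)
@show Base.mightalias(x, v)
@show Base.unalias(x, v) !== v
```

In the first case the copying branch is never taken at runtime, but it is still present in the compiled code, which matches the situation described above.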

Possibly related: #1637, #1482, #1476

MWE:

using Enzyme
using LinearAlgebra

g(x) = x

function f(x, tmp)
    tmp .= g.(x)
    # map!(g, tmp, x)  # equivalent alternative that works
    return dot(tmp, x)
end

function f_gradient_deferred!(dx, x, tmp)
    dtmp = make_zero(tmp)
    autodiff_deferred(Reverse, f, Active, Duplicated(x, dx), Duplicated(tmp, dtmp))
    return nothing
end

function f_hvp!(hv, x, v, tmp)
    dx = make_zero(x)
    btmp = make_zero(tmp)
    autodiff(
        Forward,
        f_gradient_deferred!,
        Duplicated(dx, hv),
        Duplicated(x, v),
        Duplicated(tmp, btmp),
    )
    return nothing
end

x = [1.0]
v = [-1.0]
hv = make_zero(v)
tmp = similar(x)

f_hvp!(hv, x, v, tmp)
@show hv

Error:

julia> include("enzymebroadcast.jl");
ERROR: LoadError: Mismatched activity for:   %"'ip_phi3_cache.i.0.i" = phi double addrspace(13)* addrspace(10)* [ bitcast ({} addrspace(10)* @ejl_jl_nothing to double addrspace(13)* addrspace(10)*), %L33.i.i ], [ %phi.cast.i, %L44.i.i ], [ bitcast ({} addrspace(10)* @ejl_jl_nothing to double addrspace(13)* addrspace(10)*), %L36.i.i ] const val: double addrspace(13)* addrspace(10)* bitcast ({} addrspace(10)* @ejl_jl_nothing to double addrspace(13)* addrspace(10)*)
Type tree: {[-1]:Pointer, [-1,0]:Pointer, [-1,0,0]:Float@double, [-1,0,8]:Float@double, [-1,0,16]:Float@double, [-1,0,24]:Float@double, [-1,0,32]:Float@double, [-1,0,40]:Float@double, [-1,0,48]:Float@double, [-1,0,56]:Float@double, [-1,0,64]:Float@double, [-1,0,72]:Float@double, [-1,0,80]:Float@double, [-1,0,88]:Float@double, [-1,0,96]:Float@double, [-1,0,104]:Float@double, [-1,0,112]:Float@double, [-1,0,120]:Float@double}
 llvalue=double addrspace(13)* addrspace(10)* bitcast ({} addrspace(10)* @ejl_jl_nothing to double addrspace(13)* addrspace(10)*)
You may be using a constant variable as temporary storage for active memory (https://enzyme.mit.edu/julia/stable/faq/#Activity-of-temporary-storage). If not, please open an issue, and either rewrite this variable to not be conditionally active or use Enzyme.API.runtimeActivity!(true) as a workaround for now

Stacktrace:
  [1] size
    @ ./array.jl:191
  [2] axes
    @ ./abstractarray.jl:98
  [3] newindexer
    @ ./broadcast.jl:625
  [4] extrude
    @ ./broadcast.jl:676
  [5] preprocess
    @ ./broadcast.jl:984
  [6] preprocess_args
    @ ./broadcast.jl:987
  [7] preprocess
    @ ./broadcast.jl:983
  [8] copyto!
    @ ./broadcast.jl:1000
  [9] copyto!
    @ ./broadcast.jl:956
 [10] materialize!
    @ ./broadcast.jl:914
 [11] materialize!
    @ ./broadcast.jl:911
 [12] f
    @ ~/enzymebroadcast.jl:7
 [13] diffejulia_f_4008wrap
    @ ~/enzymebroadcast.jl:0
 [14] macro expansion
    @ ~/.julia/packages/Enzyme/XGb4o/src/compiler.jl:7049
 [15] enzyme_call
    @ ~/.julia/packages/Enzyme/XGb4o/src/compiler.jl:6658
 [16] CombinedAdjointThunk
    @ ~/.julia/packages/Enzyme/XGb4o/src/compiler.jl:6535
 [17] autodiff_deferred
    @ ~/.julia/packages/Enzyme/XGb4o/src/Enzyme.jl:478
 [18] autodiff_deferred
    @ ~/.julia/packages/Enzyme/XGb4o/src/Enzyme.jl:548
 [19] f_gradient_deferred!
    @ ~/enzymebroadcast.jl:14

Stacktrace:
  [1] broadcast_unalias
    @ ./broadcast.jl:977 [inlined]
  [2] preprocess
    @ ./broadcast.jl:984 [inlined]
  [3] preprocess_args
    @ ./broadcast.jl:987 [inlined]
  [4] preprocess
    @ ./broadcast.jl:983 [inlined]
  [5] copyto!
    @ ./broadcast.jl:1000 [inlined]
  [6] copyto!
    @ ./broadcast.jl:956 [inlined]
  [7] materialize!
    @ ./broadcast.jl:914 [inlined]
  [8] materialize!
    @ ./broadcast.jl:911 [inlined]
  [9] f
    @ ~/enzymebroadcast.jl:7 [inlined]
 [10] diffejulia_f_4008wrap
    @ ~/enzymebroadcast.jl:0 [inlined]
 [11] macro expansion
    @ ~/.julia/packages/Enzyme/XGb4o/src/compiler.jl:7049 [inlined]
 [12] enzyme_call
    @ ~/.julia/packages/Enzyme/XGb4o/src/compiler.jl:6658 [inlined]
 [13] CombinedAdjointThunk
    @ ~/.julia/packages/Enzyme/XGb4o/src/compiler.jl:6535 [inlined]
 [14] autodiff_deferred
    @ ~/.julia/packages/Enzyme/XGb4o/src/Enzyme.jl:478 [inlined]
 [15] autodiff_deferred
    @ ~/.julia/packages/Enzyme/XGb4o/src/Enzyme.jl:548 [inlined]
 [16] f_gradient_deferred!
    @ ~/enzymebroadcast.jl:14 [inlined]
 [17] fwddiffejulia_f_gradient_deferred__4005wrap
    @ ~/enzymebroadcast.jl:0
 [18] macro expansion
    @ ~/.julia/packages/Enzyme/XGb4o/src/compiler.jl:7049 [inlined]
 [19] enzyme_call
    @ ~/.julia/packages/Enzyme/XGb4o/src/compiler.jl:6658 [inlined]
 [20] ForwardModeThunk
    @ ~/.julia/packages/Enzyme/XGb4o/src/compiler.jl:6538 [inlined]
 [21] autodiff
    @ ~/.julia/packages/Enzyme/XGb4o/src/Enzyme.jl:437 [inlined]
 [22] autodiff
    @ ~/.julia/packages/Enzyme/XGb4o/src/Enzyme.jl:348 [inlined]
 [23] autodiff
    @ ~/.julia/packages/Enzyme/XGb4o/src/Enzyme.jl:329 [inlined]
 [24] f_hvp!(hv::Vector{Float64}, x::Vector{Float64}, v::Vector{Float64}, tmp::Vector{Float64})
    @ Main ~/enzymebroadcast.jl:21
 [25] top-level scope
    @ ~/enzymebroadcast.jl:36
 [26] include(fname::String)
    @ Base.MainInclude ./client.jl:489
 [27] top-level scope
    @ REPL[5]:1
in expression starting at /home/daniel/enzymebroadcast.jl:36

danielwe commented Aug 25, 2024

Forgot to mention: Enzyme.API.runtimeActivity!(true) does fix the problem, but leads to a warning about fallback BLAS replacements. This warning appears both for the broadcast and map! versions.

julia> using Enzyme

julia> Enzyme.API.runtimeActivity!(true)

julia> include("enzymebroadcast.jl");
┌ Warning: Using fallback BLAS replacements for (["cblas_daxpy64_", "cblas_dcopy64_"]), performance may be degraded
└ @ Enzyme.Compiler ~/.julia/packages/GPUCompiler/Y4hSX/src/utils.jl:59
hv = [-2.0]

Let me also mention that in a Pluto notebook this warning fails to show and triggers an error instead: ArgumentError: Base.TTY(RawFD(4294967295) invalid status, 0 bytes waiting) is not initialized. But that's probably a Pluto issue. (Edit: submitted fonsp/Pluto.jl#3012.)

@danielwe

The error goes away if I replace dot(tmp, x) with the ~equivalent sum(tmp .* x) (and thus remove the dependency on LinearAlgebra). So the trigger is not the broadcasted assignment alone, but some interaction between it and dot, when doing 2nd order forward-over-reverse.
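
For reference, a sketch of the variant described here. The name `f_sum` is hypothetical and only used for illustration; the body is `f` from the MWE with `dot(tmp, x)` swapped for `sum(tmp .* x)`. Note that `tmp .* x` allocates a temporary array, so the two are equivalent in value but not in allocation behavior.

```julia
using LinearAlgebra  # only needed for the `dot` comparison at the end

g(x) = x

# Hypothetical name; same as `f` from the MWE but without the call to `dot`.
function f_sum(x, tmp)
    tmp .= g.(x)
    return sum(tmp .* x)
end

x = [1.0, 2.0]
tmp = similar(x)

# With g(x) = x, both formulations compute the squared norm of x.
@show f_sum(x, tmp) ≈ dot(x, x)
```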

wsmoses commented Aug 25, 2024

should be fixed by #1752

wsmoses closed this as completed Aug 25, 2024