Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

findmax does not work on DimStacks #858

Open
asinghvi17 opened this issue Nov 12, 2024 · 5 comments
Open

findmax does not work on DimStacks #858

asinghvi17 opened this issue Nov 12, 2024 · 5 comments

Comments

@asinghvi17
Copy link
Collaborator

MWE:

using DimensionalData
A = [1.0 2.0 3.0;
     4.0 5.0 6.0]
x, y, z = X([:a, :b]), Y(10.0:10.0:30.0; metadata=Dict()), Z()
dimz = x, y
da1 = DimArray(A, (x, y); name=:one, metadata=Metadata())
da2 = DimArray(Float32.(2A), (x, y); name=:two)
da3 = DimArray(Int.(3A), (x, y); name=:three)
da4 = DimArray(cat(4A, 5A, 6A, 7A; dims=3), (x, y, z); name=:extradim)

s = DimStack((da1, da2, da3))

maximum(s) # fine
findmax(s) # errors

Error:

julia> maximum(s) # fine
(one = 6.0, two = 12.0f0, three = 18)

julia> findmax(s) # errors
ERROR: MethodError: no method matching isless(::DimMatrix{…}, ::DimMatrix{…})
The function `isless` exists, but no method is defined for this combination of argument types.

Closest candidates are:
  isless(::Missing, ::Any)
   @ Base missing.jl:87
  isless(::Any, ::Missing)
   @ Base missing.jl:88
  isless(::Base.CoreLogging.LogLevel, ::Base.CoreLogging.LogLevel)
   @ Base logging/logging.jl:131
  ...

Stacktrace:
  [1] _rf_findmax(::Tuple{DimMatrix{…}, Symbol}, ::Tuple{DimMatrix{…}, Symbol})
    @ Base ./reduce.jl:907
  [2] BottomRF
    @ ./reduce.jl:86 [inlined]
  [3] MappingRF (repeats 2 times)
    @ ./reduce.jl:100 [inlined]
  [4] _foldl_impl(op::Base.MappingRF{…}, init::Base._InitialValue, itr::Base.Iterators.Zip{…})
    @ Base ./reduce.jl:62
  [5] foldl_impl(op::Base.MappingRF{…}, nt::Base._InitialValue, itr::Base.Iterators.Zip{…})
    @ Base ./reduce.jl:48
  [6] mapfoldl_impl(f::Base.var"#353#354"{}, op::typeof(Base._rf_findmax), nt::Base._InitialValue, itr::Base.Generator{…})
    @ Base ./reduce.jl:44
  [7] mapfoldl(f::Function, op::Function, itr::Base.Generator{…}; init::Base._InitialValue)
    @ Base ./reduce.jl:175
  [8] mapfoldl
    @ ./reduce.jl:175 [inlined]
  [9] _findmax
    @ ./reduce.jl:906 [inlined]
 [10] findmax(f::Function, domain::DimStack{…})
    @ Base ./reduce.jl:905
 [11] _findmax(a::DimStack{…}, ::Colon)
    @ Base ./reduce.jl:935
 [12] findmax(itr::DimStack{…})
    @ Base ./reduce.jl:934
 [13] top-level scope
    @ REPL[265]:1
Some type information was truncated. Use `show(err)` to see complete types.

If this is intended, we should dispatch on findmax and throw a better error message. If not - how would this work? Returning a namedtuple of selectors for each layer seems a bit arbitrary...

@asinghvi17
Copy link
Collaborator Author

btw, I don't need this, just an edge case I stumbled into.

@rafaqz
Copy link
Owner

rafaqz commented Nov 12, 2024

Oh no now reductions are kind of broken if you expect the new iteration 😿

The NamedTuple/Array crossover is endless pain, every change breaks some other expectation

@rafaqz
Copy link
Owner

rafaqz commented Nov 12, 2024

But doing this over a NameTuple iterator is totally useless, it will just error return a weird result:

julia> maximum(s) # fine
(one = 6.0, two = 12.0f0, three = 18)

Its also generally useless over multi-value objects like arrays, because you just end up with a single object that was chosen as the "maximum" for some arcane reason or other (maximum first value in this case), rather than the maximum of each separate position in the vector.

TLDR we can't have findmax find a single Int and have maximum work like this too, unless findmax returns a NamedTuple of indices in each separate layer. Or, it just looks at the first layer, breaking with how the other reducing methods work.

@rafaqz
Copy link
Owner

rafaqz commented Nov 12, 2024

So this is the problem:

julia> maximum([(a=1, b=200), (a=2, b=1)])
(a = 2, b = 1)

We may need a slightly more formal definition of what a DimStack is and how it operates under iteration and reduction. Its just hard to know what definition is the most useful practically until you try them out for a year

@rafaqz
Copy link
Owner

rafaqz commented Nov 13, 2024

This is what DimStack is currently:

Array of NamedTuple

  • iteration and indexing: one single Array of NamedTuple

NamedTuple of Array

  • reduction: multiple separate arrays of different lengths reduced separately
  • broadcast_dims: separate broadcasts over each layer like reductions
  • merge and Symbol indexing/getproperty: separate layers like a NamedTuple of Array

Both

  • tables: Row-wise A of NT, Col-wise NT of A (because looped to be the same length)

Undefined

  • broadcast: ?

So, still pretty weird. I find sum(dimstack) to be an odd outcome, and different to summing the iterator unless all layers are the same size.

findmax just doesn't really fit in at all.

I always wondered if it was trying to do too much in one object, and should just be two that you can switch between.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants