Extending Base.stack for DimArrays #645
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #645      +/-   ##
==========================================
+ Coverage   83.83%   83.99%   +0.15%
==========================================
  Files          45       45
  Lines        4102     4136      +34
==========================================
+ Hits         3439     3474      +35
+ Misses        663      662       -1
src/array/methods.jl
Outdated
@@ -131,15 +131,15 @@ end
"""
function Base.eachslice(A::AbstractDimArray; dims)
    dimtuple = _astuple(dims)
    if !(dimtuple == ())
    if !(dimtuple == ())
Would be nice to remove all these whitespace changes so we can see the real changes
src/array/methods.jl
Outdated
end

newdims = first(origdims)
newdims = ntuple(d -> d == newdim ? AnonDim() : newdims[d-(d>newdim)], length(newdims) + 1)
Maybe make this a `do` block so it's easier to read.
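For illustration, here is a small sketch of the two equivalent spellings. Plain symbols stand in for the dimension objects: `olddims`, `newdim`, and `:Anon` are hypothetical stand-ins, not names from the PR.

```julia
# Hypothetical stand-ins: symbols instead of real Dimension objects.
olddims = (:X, :Y)   # dims of each stacked array
newdim = 2           # position at which the new dimension is inserted

# One-line anonymous function, as in the hunk under review.
a = ntuple(d -> d == newdim ? :Anon : olddims[d - (d > newdim)], length(olddims) + 1)

# The same computation written as a `do` block.
b = ntuple(length(olddims) + 1) do d
    if d == newdim
        :Anon                        # placeholder for the new dimension
    else
        olddims[d - (d > newdim)]    # shift indices that come after newdim
    end
end

println(a)
println(b)
```

Both forms pass the same function to `ntuple`; the `do` block only changes how the function is written.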
src/array/methods.jl
Outdated
To fix for `AbstractDimArray`, pass new lookup values as `cat(As...; dims=$D(newlookupvals))` keyword or `dims=$D()` for empty `NoLookup`.
"""

function Base._typed_stack(::Colon, ::Type{T}, ::Type{S}, A, Aax=_iterator_axes(A)) where {T,S<:AbstractDimArray}
    origdims = dims.(A)
- origdims = dims.(A)
+ origdims = map(dims, A)
src/array/methods.jl
Outdated
DimArray(_A, newdims)
end

function Base.stack(dim::Dimension, A::AbstractVector{<:AbstractDimArray}; dims=nothing, kwargs...)
What's the idea with `dim` as the first argument?
The tests also don't hit this code.
Thanks for the suggestions! Went through and cleaned up the PR. The [...] Let me know if you have more thoughts on this, happy to work on it further :)
src/array/methods.jl
Outdated
true
```
"""
function Base.stack(dim::Dimension, A::AbstractVector{<:AbstractDimArray}; dims=nothing, kwargs...)
Why not pass `dim` in `dims`?
This method signature is a bit strange; we usually try not to add variants on Base methods, we just allow `dims` to specify named dimensions.
We try not to change base method signatures besides allowing I would just put that new You will also need to
I see your point about keeping new signatures to a minimum, though. How about allowing the keyword argument
We could also, instead, have that the new dimension is always an
Using a Always
src/array/methods.jl
Outdated
end
newdims = format(newdims, B)

B = rebuild(B; dims=newdims)
- B = rebuild(B; dims=newdims)
+ B = rebuild(B; dims=format(newdims, B))
User specified dims are usually incomplete and possibly incorrect
a93d665 to 7a4cf26
src/array/methods.jl
Outdated
@@ -547,6 +547,100 @@ $message on dimension $D.
To fix for `AbstractDimArray`, pass new lookup values as `cat(As...; dims=$D(newlookupvals))` keyword or `dims=$D()` for empty `NoLookup`.
"""

function Base._typed_stack(::Colon, ::Type{T}, ::Type{S}, A, Aax=_iterator_axes(A)) where {T,S<:AbstractDimArray}
How stable do you think these methods are... could we add a method to `Base.stack` instead? What do we gain from touching these internals?
I know we use internals elsewhere, but we should stop:
#522
src/array/methods.jl
Outdated
end

newdims = first(origdims)
newdims = ntuple(length(newdims) + 1) do d
Is this type-stable?
@brendanjohnharris any updates? I can also finish this if you don't have time
Sorry, fell off my radar, my bad.
using DimensionalData
a = [1 2 3; 4 5 6]
da = DimArray(a, (X(4.0:5.0), Y(6.0:8.0)))
b = [7 8 9; 10 11 12]
db = DimArray(b, (X(4.0:5.0), Y(6.0:8.0)))
x = DimArray([da, db], (Z(4.0:5.0),))
stack(x; dims=2) # Has dims X, *Z*, Y
Perfect, simpler is better! But we could also allow Dimension
src/array/methods.jl
Outdated
    comparedims(Bool, dims.(iter)...; order=true, val=true, msg=Dimensions.Warn(" Can't `stack` AbstractDimArray, applying to `parent` object."))
    iter
end
Base.stack(iter::AbstractArray{<:AbstractDimArray}; dims=:) = Base._stack(dims, check_stack_dims(iter))
- Base.stack(iter::AbstractArray{<:AbstractDimArray}; dims=:) = Base._stack(dims, check_stack_dims(iter))
+ Base.stack(iter::AbstractArray{<:AbstractDimArray}; dims=:) = Base._stack(dimnum(first(iter), dims), check_stack_dims(iter))
This should allow passing Dimension/Symbol etc

What about:
Base._stack(dims::Dimension, iter) = Base._stack(typeof(dims), iter)
Base._stack(dims::Type{<:Dimension}, iter) = Base._stack(dimnum(first(iter), dims), iter)
This helps because the default `dims` in Base's `Base.stack` is (oddly) `dims=:`. This has the same effect as `dims=ndims(first(iter))+1`, as far as I can tell, but it doesn't play well with `dimnums` in this context.
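The equivalence mentioned here can be checked with plain arrays. The snippet below is a small sketch that assumes Julia 1.9 or later (where `stack` is in Base) and an outer container that is a vector; for a higher-dimensional container, `dims=:` appends more than one trailing dimension, so the two spellings no longer coincide.

```julia
# Assumes Julia >= 1.9, where `stack` is available in Base.
vecs = [[1, 2, 3], [4, 5, 6]]   # a vector of 1-dimensional arrays

a = stack(vecs)                 # default: dims = :
b = stack(vecs; dims=ndims(first(vecs)) + 1)   # explicit trailing dimension (2 here)
c = stack(vecs; dims=1)         # new dimension placed first instead

println(size(a))
println(size(b))
println(size(c))
```

With the default and with `dims=2`, each inner vector becomes a column; with `dims=1`, each becomes a row.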

Try to avoid dispatch on the underscore methods as much as possible... And we also need to accept `Symbol`. I would just define another method like `_maybe_dimnum` that has dispatch for `Colon` and `Int` that don't call `dimnum`, and everything else uses `dimnum`.
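A minimal sketch of the helper described in this comment, assuming DimensionalData's exported `dimnum`. The name `_maybe_dimnum` comes from the comment itself, but the method bodies here are an illustration and not necessarily the code that ended up in the PR.

```julia
using DimensionalData

# Illustrative sketch only: Colon and integers pass through untouched,
# everything else (Dimension, Dimension type, Symbol) is resolved by `dimnum`.
_maybe_dimnum(x, ::Colon) = Colon()
_maybe_dimnum(x, dim::Integer) = dim
_maybe_dimnum(x, dim) = dimnum(x, dim)

da = DimArray([1 2 3; 4 5 6], (X(4.0:5.0), Y(6.0:8.0)))

_maybe_dimnum(da, :)    # Colon is returned as-is
_maybe_dimnum(da, 3)    # integers are returned as-is
_maybe_dimnum(da, Y)    # resolved through dimnum
_maybe_dimnum(da, :Y)   # Symbol, also resolved through dimnum
```

Keeping the dispatch on a package-owned helper means `Base.stack` is the only Base function that gets a new method, which is the point of the suggestion.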
x = DimArray([da, ca], (Dim{:a}(1:2),))
sx = stack(x; dims=1)
sy = @test_nowarn stack(x; dims=:)
sz = @test_nowarn stack(x; dims=X)
What happens with `dims=Z`?

Currently, `dims=Z` (where `Z` is not a dimension in the DimArrays being stacked) falls back to adding the new dimension at the end (so the result is the same as for the default behavior):
_maybe_dimnum(x, dim) = hasdim(x, dim) ? dimnum(x, dim) : ndims(x) + 1
Should we throw an error instead, like when an out-of-range Integer dim is given? I.e.:
function _maybe_dimnum(x, dim::Int)
    if dim < ndims(x) + 2
        return dim
    else
        throw(ArgumentError(LazyString("cannot stack slices ndims(x) = ", ndims(x) + 1, " along dims = ", dim)))
    end
end

I was thinking we should add a dimension like it is currently, but we can set it to be a `Z` dimension.

Using the current minimal approach, stacking with a new dimension `Z` would be:
a = [1 2 3; 4 5 6]
da = DimArray(a, (X(4.0:5.0), Y(6.0:8.0)))
b = [7 8 9; 10 11 12]
db = DimArray(b, (X(4.0:5.0), Y(6.0:8.0)))
x = DimArray([da, db], (Dim{:a}(1:2),))
stack(set(x, :a=>Z)) # Or
set(stack(x), :a=>Z)
We could have this behavior be automatic (without overloading `Base._stack`) with:
function Base.stack(iter::AbstractArray{<:AbstractDimArray}; dims=:)
    x = Base._stack(_maybe_dimnum(first(iter), dims), check_stack_dims(iter))
    if !hasdim(x, dims) && dims isa Union{Dimension,Type{<:Dimension}}
        x = set(x, DimensionalData.dims(x)[end] => dims)
    end
    return x
end
But I wonder if it could be confusing to have this different behavior for when `dims <: Dimension` and `hasdim(x, dims)` (the new dimension is inserted before `dims` but keeps the original name) versus `!hasdim(x, dims)` (the new dimension is inserted at the end and renamed to `dims`); it also means the return type can't be inferred.

> Currently, dims=Z (where Z is not a dimension in the DimArrays being stacked) falls back to adding the new dimension at the end (so the result is the same as for the default behavior)
I'm not sure we are understanding each other... My only ask here was that this new dimension is a `Z` dimension if `dims=Z` was used.
Co-authored-by: Rafael Schouten <[email protected]>
Adds methods for `Base.stack`, and related non-exported functions from Base, that are compatible with `DimArray`s.
Syntax follows Base: stacking `DimArray`s along a given axis `dims` creates a new dimension. However, existing dimension data is preserved, and the new dimension becomes an `AnonDim`.
Optionally, a `Dimension` `dim` can be provided as the first argument to `stack`, in which case the new dimension is assigned as `dim` rather than `AnonDim`.