Refactor AggMode submodule and types #162

Merged
merged 9 commits · Apr 27, 2023
1 change: 1 addition & 0 deletions Project.toml
@@ -5,6 +5,7 @@ version = "0.9.0"
[deps]
CategoricalArrays = "324d7699-5711-5eae-9e2f-1d82baa6b597"
Markdown = "d6f4376e-aef5-505a-96c1-9c027394607a"
Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"

[compat]
CategoricalArrays = "0.10"
36 changes: 7 additions & 29 deletions docs/src/introduction/gettingstarted.md
@@ -78,8 +78,8 @@ julia> true_targets = [ 1, 0, -2];
julia> pred_outputs = [0.5, 2, -1];
julia> value(L2DistLoss(), pred_outputs, true_targets)
3-element Array{Float64,1}:
julia> value.(L2DistLoss(), pred_outputs, true_targets)
3-element Vector{Float64}:
0.25
4.0
1.0
@@ -92,10 +92,10 @@ This will avoid allocating a temporary array and directly
compute the result.

```julia-repl
julia> value(L2DistLoss(), pred_outputs, true_targets, AggMode.Sum())
julia> sum(L2DistLoss(), pred_outputs, true_targets)
5.25
julia> value(L2DistLoss(), pred_outputs, true_targets, AggMode.Mean())
julia> mean(L2DistLoss(), pred_outputs, true_targets)
1.75
```
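
The `sum` and `mean` methods above can be pictured as a single fused loop over the observations. The following is a minimal sketch with a hand-written scalar loss, not the package's actual implementation; `l2` is an assumed stand-in for evaluating `L2DistLoss` on one output/target pair.

```julia
# Minimal sketch of allocation-free aggregation (illustration only).
l2(output, target) = abs2(output - target)   # stand-in for L2DistLoss on scalars

function loss_sum(loss, outputs, targets)
    s = 0.0
    for (o, t) in zip(outputs, targets)
        s += loss(o, t)   # accumulate directly, no temporary array
    end
    return s
end

loss_mean(loss, outputs, targets) = loss_sum(loss, outputs, targets) / length(outputs)

loss_sum(l2, [0.5, 2, -1], [1, 0, -2])   # 5.25
loss_mean(l2, [0.5, 2, -1], [1, 0, -2])  # 1.75
```

The loop reproduces the `5.25` and `1.75` results shown above without materializing the per-element losses.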

@@ -105,33 +105,11 @@ each observation in the predicted outputs and so allow us to give
certain observations a stronger influence over the result.

```julia-repl
julia> value(L2DistLoss(), pred_outputs, true_targets, AggMode.WeightedSum([2,1,1]))
julia> sum(L2DistLoss(), pred_outputs, true_targets, [2,1,1], normalize=false)
5.5
julia> value(L2DistLoss(), pred_outputs, true_targets, AggMode.WeightedMean([2,1,1]))
1.375
```

All these function signatures of [`value`](@ref) also apply for
computing the derivatives using [`deriv`](@ref) and the second
derivatives using [`deriv2`](@ref).

```julia-repl
julia> true_targets = [ 1, 0, -2];
julia> pred_outputs = [0.5, 2, -1];
julia> deriv(L2DistLoss(), pred_outputs, true_targets)
3-element Array{Float64,1}:
-1.0
4.0
2.0
julia> deriv2(L2DistLoss(), pred_outputs, true_targets)
3-element Array{Float64,1}:
2.0
2.0
2.0
julia> mean(L2DistLoss(), pred_outputs, true_targets, [2,1,1], normalize=false)
1.8333333333333333
```

## Getting Help
135 changes: 15 additions & 120 deletions docs/src/user/aggregate.md
@@ -34,13 +34,13 @@ say "naive", because it will not give us an acceptable
performance.

```jldoctest
julia> value(L1DistLoss(), [2,5,-2], [1.,2,3])
julia> value.(L1DistLoss(), [2,5,-2], [1.,2,3])
3-element Vector{Float64}:
1.0
3.0
5.0

julia> sum(value(L1DistLoss(), [2,5,-2], [1.,2,3])) # WARNING: Bad code
julia> sum(value.(L1DistLoss(), [2,5,-2], [1.,2,3])) # WARNING: Bad code
9.0
```
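
The allocation warned about above comes from the broadcast `value.(...)`, which materializes a full result vector before `sum` reduces it. A fused reduction avoids the temporary entirely; the sketch below uses a hand-written scalar `l1` as an assumed stand-in for `L1DistLoss`.

```julia
l1(output, target) = abs(output - target)   # stand-in for L1DistLoss on scalars

outputs, targets = [2, 5, -2], [1., 2, 3]

naive = sum(l1.(outputs, targets))                 # allocates a length-3 temporary
fused = sum(p -> l1(p...), zip(outputs, targets))  # single pass, no temporary

naive == fused   # both give 9.0
```

`sum(f, itr)` applies `f` to each element while reducing, so no intermediate array is ever built.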

@@ -53,52 +53,25 @@ that we don't need in the end and could avoid.

For that reason we provide special methods that compute the
common accumulations efficiently without allocating temporary
arrays. These methods can be invoked using an additional
parameter which specifies how the values should be accumulated /
averaged. The type of this parameter has to be a subtype of
`AggregateMode`.

## Aggregation Modes

Before we discuss these memory-efficient methods, let us briefly
introduce the available aggregation mode types. We provide a number
of different aggregation modes, all of which are contained within
the namespace `AggMode`. An instance of such type can then be
used as additional parameter to [`value`](@ref), [`deriv`](@ref),
and [`deriv2`](@ref), as we will see further down.

Below is a list of the available aggregation modes, each with
a short description of its effect when used as an additional
parameter to the functions mentioned above.

```@docs
AggMode.None
AggMode.Sum
AggMode.Mean
AggMode.WeightedSum
AggMode.WeightedMean
```

## Unweighted Sum and Mean
arrays.

As hinted before, we provide special memory-efficient methods for
computing the **sum** or the **mean** of the element-wise (or
broadcasted) results of [`value`](@ref), [`deriv`](@ref), and
[`deriv2`](@ref). These methods avoid the allocation of a
temporary array and instead compute the result directly.
```jldoctest
julia> sum(L1DistLoss(), [2,5,-2], [1.,2,3])
9.0

## Weighted Sum and Mean
julia> mean(L1DistLoss(), [2,5,-2], [1.,2,3])
3.0
```

Up to this point, all the averaging was performed in an
unweighted manner. That means that each observation was treated
as equal and had thus the same potential influence on the result.
In this sub-section we will consider the situations in which we
In the following we will consider situations in which we
do want to explicitly specify the influence of each observation
(i.e. we want to weigh them). When we say we "weigh" an
observation, what it effectively boils down to is multiplying the
result for that observation (i.e. the computed loss or
derivative) with some number. This is done for every observation
individually.
result for that observation (i.e. the computed loss) with some number.
This is done for every observation individually.

To get a better understanding of what we are talking about, let us
consider performing a weighting scheme manually. The following
@@ -127,88 +100,10 @@ between the different weights. In the example above the second
observation was thus considered twice as important as any of the
other two observations.
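
Concretely, the manual scheme described above amounts to scaling each element-wise loss by its weight before reducing. This sketch uses a hand-written `l1` as an assumed stand-in for `L1DistLoss`; the numbers match the weighted-sum example in this document.

```julia
l1(output, target) = abs(output - target)   # stand-in for L1DistLoss on scalars

losses = l1.([2, 5, -2], [1., 2, 3])   # element-wise losses: [1.0, 3.0, 5.0]
w = [1, 2, 1]                          # 2nd observation counts twice as much

sum(w .* losses)                       # weighted sum: 12.0
```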

In the case of multi-dimensional arrays the process isn't that
simple anymore. In such a scenario, computing the weighted sum
(or weighted mean) can be thought of as having an additional
step. First we either compute the sum or (unweighted) average for
each observation (which results in a vector), and then we compute
the weighted sum of all observations.

The following code snippet demonstrates how to compute the
`AggMode.WeightedSum([2,1])` manually. This is **not** meant as
an example of how to do it, but simply to show what is happening
qualitatively. In this example we assume that we are working in a
multi-variable regression setting, in which our data set has four
observations with two target-variables each.

```jldoctest weight
julia> targets = reshape(1:8, (2, 4)) ./ 8
2×4 Matrix{Float64}:
0.125 0.375 0.625 0.875
0.25 0.5 0.75 1.0

julia> outputs = reshape(1:2:16, (2, 4)) ./ 8
2×4 Matrix{Float64}:
0.125 0.625 1.125 1.625
0.375 0.875 1.375 1.875

julia> # WARNING: BAD CODE - ONLY FOR ILLUSTRATION

julia> tmp = sum(value.(L1DistLoss(), outputs, targets), dims=2)
2×1 Matrix{Float64}:
1.5
2.0

julia> sum(tmp .* [2, 1]) # weigh 1st observation twice as high
5.0
```

To manually compute the result for `AggMode.WeightedMean([2,1])`
we follow a similar approach, but use the normalized weight
vector in the last step.

```jldoctest weight
julia> using Statistics # for access to "mean"

julia> # WARNING: BAD CODE - ONLY FOR ILLUSTRATION

julia> tmp = mean(value.(L1DistLoss(), outputs, targets), dims=2)
2×1 Matrix{Float64}:
0.375
0.5

julia> sum(tmp .* [0.6666, 0.3333]) # weigh 1st observation twice as high
0.416625
```

Note that you can specify explicitly if you want to normalize the
weight vector. That option is supported for computing the
weighted sum, as well as for computing the weighted mean. See the
documentation for [`AggMode.WeightedSum`](@ref) and
[`AggMode.WeightedMean`](@ref) for more information.

The code-snippets above are of course very inefficient, because
they allocate (multiple) temporary arrays. We only included them
to demonstrate what is happening in terms of desired result /
effect. For doing those computations efficiently we provide
special methods for [`value`](@ref), [`deriv`](@ref),
[`deriv2`](@ref) and their mutating counterparts.

```jldoctest weight
julia> value(L1DistLoss(), [2,5,-2], [1.,2,3], AggMode.WeightedSum([1,2,1]))
julia> sum(L1DistLoss(), [2,5,-2], [1.,2,3], [1,2,1], normalize=false)
12.0

julia> value(L1DistLoss(), [2,5,-2], [1.,2,3], AggMode.WeightedMean([1,2,1]))
julia> mean(L1DistLoss(), [2,5,-2], [1.,2,3], [1,2,1])
1.0
```
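
Judging from the numbers shown in this document, the weighted variants behave as if defined like this: the sum accumulates `w[i] * loss[i]`, dividing by `sum(w)` when the weights are normalized, and the mean additionally divides by the number of observations. This is a hedged reconstruction of the semantics, not the package's actual code.

```julia
l1(output, target) = abs(output - target)   # stand-in for L1DistLoss on scalars

# Assumed semantics, reconstructed from the documented results above.
function wsum(loss, outputs, targets, w; normalize=true)
    s = sum(wi * loss(o, t) for (o, t, wi) in zip(outputs, targets, w))
    return normalize ? s / sum(w) : s
end

wmean(loss, outputs, targets, w; normalize=true) =
    wsum(loss, outputs, targets, w; normalize) / length(outputs)

wsum(l1, [2, 5, -2], [1., 2, 3], [1, 2, 1], normalize=false)  # 12.0
wmean(l1, [2, 5, -2], [1., 2, 3], [1, 2, 1])                  # 1.0
```

Both calls reproduce the `12.0` and `1.0` results of the example above under these assumed definitions.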

We also provide this functionality for [`deriv`](@ref) and
[`deriv2`](@ref) respectively.

```jldoctest weight
julia> deriv(L2DistLoss(), [2,5,-2], [1.,2,3], AggMode.WeightedSum([1,2,1]))
4.0

julia> deriv(L2DistLoss(), [2,5,-2], [1.,2,3], AggMode.WeightedMean([1,2,1]))
0.3333333333333333
```
```
12 changes: 6 additions & 6 deletions src/LossFunctions.jl
Original file line number Diff line number Diff line change
@@ -3,8 +3,8 @@ module LossFunctions
using Markdown
using CategoricalArrays: CategoricalValue

# aggregation mode
include("aggmode.jl")
import Base: sum
import Statistics: mean

# trait functions
include("traits.jl")
@@ -31,9 +31,6 @@ export
islipschitzcont, islocallylipschitzcont,
isclipable, isclasscalibrated, issymmetric,

# relevant submodules
AggMode,

# margin-based losses
ZeroOneLoss,
LogitMarginLoss,
@@ -68,6 +65,9 @@ export

# meta losses
ScaledLoss,
WeightedMarginLoss
WeightedMarginLoss,

# reexport mean
mean

end # module
111 changes: 0 additions & 111 deletions src/aggmode.jl

This file was deleted.
