Allowing using NamedTuple as initial_params #632

Merged
merged 12 commits into from
Jul 23, 2024
63 changes: 42 additions & 21 deletions src/sampler.jl
@@ -142,38 +142,59 @@ By default, it returns an instance of [`SampleFromPrior`](@ref).
"""
initialsampler(spl::Sampler) = SampleFromPrior()

function initialize_parameters!!(
vi::AbstractVarInfo, initial_params, spl::Sampler, model::Model
function set_values!!(
varinfo::AbstractVarInfo,
initial_params::AbstractVector{<:Union{Real,Missing}},
Member

I changed the array element type from Real to Union{Real, Missing} because we allow initialization vectors that contain missing entries.

Member Author

Really? In what scenario would this be used?

Member

chain = sample(model, sampler, 1; initial_params=[missing, -1], progress=false)

Member Author

Hmm, not sure how I feel about that. I guess it means to sample that parameter from the prior? But that's so simple to do by hand anyways these days.

Member

I suppose so, although it made sense to me: the initialization vector needs to have the same dimension as the model, so it is reasonable for someone to say "I don't care about these" by setting them to missing.

Member Author

Sure, but you can also just do

rand(Vector, model)

and alter those values 😕

Basically, it's a question of whether we want to maintain this or just leave it up to the user when it is a simple one-liner.

Member

Got your argument 👍
But in this case it's not really a one-liner, as the user also needs to set individual values.
For models with small dimensions, the current syntax can still be useful. I am for keeping this option.

spl::AbstractSampler,
)
@debug "Using passed-in initial variable values" initial_params

# Flatten parameters.
init_theta = mapreduce(vcat, initial_params) do x
vec([x;])
end

# Get all values.
linked = islinked(vi, spl)
if linked
vi = invlink!!(vi, spl, model)
end
theta = vi[spl]
length(theta) == length(init_theta) || throw(
theta = varinfo[spl]
length(theta) == length(initial_params) || throw(
DimensionMismatch(
"Provided initial value size ($(length(init_theta))) doesn't match the model size ($(length(theta)))",
"Provided initial value size ($(length(initial_params))) doesn't match the model size ($(length(theta)))",
),
)

# Update values that are provided.
for i in eachindex(init_theta)
x = init_theta[i]
for i in eachindex(initial_params)
x = initial_params[i]
if x !== missing
theta[i] = x
end
end

# Update in `vi`.
vi = setindex!!(vi, theta, spl)
# Update in `varinfo`.
return setindex!!(varinfo, theta, spl)
end
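
The update loop in the vector method can be illustrated without DynamicPPL. The following is a minimal sketch in plain Julia, assuming the parameters are already available as a flat vector; `overwrite_provided` is a hypothetical helper introduced only for illustration and is not part of the package. Entries given as `missing` keep whatever value they already had.

```julia
# Hypothetical helper (not a DynamicPPL function): copy `current` and overwrite
# only the positions where `initial_params` supplies a non-missing value.
function overwrite_provided(
    current::AbstractVector{<:Real},
    initial_params::AbstractVector{<:Union{Real,Missing}},
)
    length(current) == length(initial_params) || throw(
        DimensionMismatch(
            "Provided initial value size ($(length(initial_params))) doesn't match the model size ($(length(current)))",
        ),
    )
    # Broadcasting `float` allocates a new floating-point vector, so `current`
    # itself is left untouched.
    theta = float.(current)
    for i in eachindex(initial_params)
        x = initial_params[i]
        if x !== missing
            theta[i] = x
        end
    end
    return theta
end

# The first entry keeps its existing value; the second is set to -1.
theta = overwrite_provided([2.5, 0.3], [missing, -1])
```

Here `[2.5, 0.3]` stands in for values already stored in the varinfo (for example a draw from the prior), which is why a `missing` entry amounts to "leave this parameter as sampled".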

# If initialized with a scalar, convert it to a vector.
function set_values!!(varinfo::AbstractVarInfo, initial_params::Real, spl::AbstractSampler)
Member

The reason for adding this method is that, before this PR, initial_params could be a scalar.

Real is not concrete, but that is probably okay given this is likely not on the critical path.

Member Author

Was not aware we allowed this either 🤷 Are we testing this?

Member

Yeah

chain = sample(model, sampler, 1; initial_params=0.2, progress=false)

Member Author

Honestly, I say we drop this. Seems unnecessarily complicated when it's just a difference between 1 and [1].

Member

I see @torfjelde's point about the simplicity of the interface, but it is a breaking change, which makes me wonder if the small effort of keeping it supported is worth it. If we do change it, maybe bundle it with some other breaking changes?

Member Author

Though it's technically breaking, I highly doubt this piece of code exists anywhere else, i.e. it would be a matter of bumping compat bounds 🤔

return set_values!!(varinfo, [initial_params], spl)
end
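
The scalar method discussed in this thread is a one-line forwarding dispatch. A minimal sketch of the same pattern in plain Julia follows; `as_initial_vector` is a hypothetical name used only for illustration and does not exist in DynamicPPL.

```julia
# Hypothetical stand-in for the vector method: here it simply returns a copy.
as_initial_vector(x::AbstractVector{<:Union{Real,Missing}}) = collect(x)

# A scalar is wrapped in a one-element vector and forwarded, so `0.2` and
# `[0.2]` end up on the same code path.
as_initial_vector(x::Real) = as_initial_vector([x])

from_scalar = as_initial_vector(0.2)
from_vector = as_initial_vector([0.2])
```

The trade-off raised in the review is visible here: the extra method costs one line to maintain, while dropping it would require callers to write `[0.2]` instead of `0.2`.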

function set_values!!(
varinfo::AbstractVarInfo, initial_params::NamedTuple, spl::AbstractSampler
Member

Are we happy with spl not doing anything in this function? I get that we need to leave it there to keep a uniform interface, but could this end up being called with varinfo that has not been sampled with spl, which would then produce unexpected results?

)
initial_params = NamedTuple(k => v for (k, v) in pairs(initial_params) if v !== missing)
Member

Not sure this is good, but I changed it to support a more uniform interface: allowing a NamedTuple with missing values.

Member Author

Same as in last comment: when are we initializing using missing?

Member

see reply above

return DynamicPPL.TestUtils.update_values!!(
varinfo, initial_params, map(k -> VarName{k}(), keys(initial_params))
)
end
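
The first step of the NamedTuple method, dropping entries whose value is `missing`, can be sketched on its own in plain Julia. `drop_missing` is a hypothetical helper for illustration only; it uses the documented `NamedTuple{names}(nt)` constructor to select fields rather than the generator form in the diff.

```julia
# Hypothetical helper: keep only the fields whose value is not `missing`.
function drop_missing(nt::NamedTuple)
    keep = Tuple(k for k in keys(nt) if nt[k] !== missing)
    return NamedTuple{keep}(nt)
end

# `s` is dropped, so only `m` would be passed on as an initial value.
inits = drop_missing((; s=missing, m=-1))
```

After filtering, `(; s=missing, m=-1)` and `(; m=-1)` are indistinguishable, which matches the tests below treating both forms as "set only `m`".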

function initialize_parameters!!(
vi::AbstractVarInfo, initial_params, spl::AbstractSampler, model::Model
)
@debug "Using passed-in initial variable values" initial_params

# `invlink` the varinfo if needed.
linked = islinked(vi, spl)
if linked
vi = invlink!!(vi, spl, model)
end

# Set the values in `vi`.
vi = set_values!!(vi, initial_params, spl)

# `link` again if needed.
if linked
vi = link!!(vi, spl, model)
end
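
The order of operations in `initialize_parameters!!` (undo the transform, set the values, reapply the transform) can be sketched with a toy example. Everything below is hypothetical and only mimics the shape of the logic: `ToyVarInfo`, `toy_link`, `toy_invlink`, and `toy_initialize` are not DynamicPPL names, and the sketch assumes that initial values are given on the original (constrained) scale, as the invlink-before-set order suggests.

```julia
# Toy container, NOT DynamicPPL's VarInfo: one positive parameter that is stored
# either on its original scale or on the log scale ("linked").
struct ToyVarInfo
    value::Float64
    linked::Bool
end

# Hypothetical analogues of `link!!` / `invlink!!` for a log transform.
toy_link(vi::ToyVarInfo) = vi.linked ? vi : ToyVarInfo(log(vi.value), true)
toy_invlink(vi::ToyVarInfo) = vi.linked ? ToyVarInfo(exp(vi.value), false) : vi

function toy_initialize(vi::ToyVarInfo, initial_value::Real)
    linked = vi.linked
    # Initial values are on the original scale, so undo the transform first.
    if linked
        vi = toy_invlink(vi)
    end
    # Set the value while the container is in the original space.
    vi = ToyVarInfo(initial_value, vi.linked)
    # Restore the representation the caller started with.
    if linked
        vi = toy_link(vi)
    end
    return vi
end

# Starts linked, so the result stores log(4) and is still flagged as linked.
vi = toy_initialize(ToyVarInfo(log(2.0), true), 4)
```

Setting the value directly on a linked container would store `4` where `log(4)` is expected, which is the mistake the invlink and link steps guard against.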
6 changes: 6 additions & 0 deletions src/varinfo.jl
@@ -892,6 +892,12 @@ Base.keys(vi::TypedVarInfo{<:NamedTuple{()}}) = VarName[]
return expr
end

# FIXME(torfjelde): Don't use `_getvns`.
Member

Is this still relevant? Don't know why _getvns should be shunned.

Member Author

It's just unnecessary; Base.keys should be enough, and IIRC _getvns is overly complicated.

Base.keys(vi::UntypedVarInfo, spl::AbstractSampler) = _getvns(vi, spl)
function Base.keys(vi::TypedVarInfo, spl::AbstractSampler)
return mapreduce(values, vcat, _getvns(vi, spl))
end

"""
setgid!(vi::VarInfo, gid::Selector, vn::VarName)

112 changes: 59 additions & 53 deletions test/sampler.jl
@@ -84,23 +84,25 @@
model = coinflip()
sampler = Sampler(alg)
lptrue = logpdf(Binomial(25, 0.2), 10)
chain = sample(model, sampler, 1; initial_params=0.2, progress=false)
@test chain[1].metadata.p.vals == [0.2]
@test getlogp(chain[1]) == lptrue

# parallel sampling
chains = sample(
model,
sampler,
MCMCThreads(),
1,
10;
initial_params=fill(0.2, 10),
progress=false,
)
for c in chains
@test c[1].metadata.p.vals == [0.2]
@test getlogp(c[1]) == lptrue
for inits in (0.2, (; p=0.2))
chain = sample(model, sampler, 1; initial_params=inits, progress=false)
@test chain[1].metadata.p.vals == [0.2]
@test getlogp(chain[1]) == lptrue

# parallel sampling
chains = sample(
model,
sampler,
MCMCThreads(),
1,
10;
initial_params=fill(inits, 10),
progress=false,
)
for c in chains
@test c[1].metadata.p.vals == [0.2]
@test getlogp(c[1]) == lptrue
end
end

# model with two variables: initialization s = 4, m = -1
@@ -110,45 +112,49 @@
end
model = twovars()
lptrue = logpdf(InverseGamma(2, 3), 4) + logpdf(Normal(0, 2), -1)
chain = sample(model, sampler, 1; initial_params=[4, -1], progress=false)
@test chain[1].metadata.s.vals == [4]
@test chain[1].metadata.m.vals == [-1]
@test getlogp(chain[1]) == lptrue

# parallel sampling
chains = sample(
model,
sampler,
MCMCThreads(),
1,
10;
initial_params=fill([4, -1], 10),
progress=false,
)
for c in chains
@test c[1].metadata.s.vals == [4]
@test c[1].metadata.m.vals == [-1]
@test getlogp(c[1]) == lptrue
for inits in ([4, -1], (; s=4, m=-1))
chain = sample(model, sampler, 1; initial_params=inits, progress=false)
@test chain[1].metadata.s.vals == [4]
@test chain[1].metadata.m.vals == [-1]
@test getlogp(chain[1]) == lptrue

# parallel sampling
chains = sample(
model,
sampler,
MCMCThreads(),
1,
10;
initial_params=fill(inits, 10),
progress=false,
)
for c in chains
@test c[1].metadata.s.vals == [4]
@test c[1].metadata.m.vals == [-1]
@test getlogp(c[1]) == lptrue
end
end

# set only m = -1
chain = sample(model, sampler, 1; initial_params=[missing, -1], progress=false)
@test !ismissing(chain[1].metadata.s.vals[1])
@test chain[1].metadata.m.vals == [-1]

# parallel sampling
chains = sample(
model,
sampler,
MCMCThreads(),
1,
10;
initial_params=fill([missing, -1], 10),
progress=false,
)
for c in chains
@test !ismissing(c[1].metadata.s.vals[1])
@test c[1].metadata.m.vals == [-1]
for inits in ([missing, -1], (; s=missing, m=-1), (; m=-1))
chain = sample(model, sampler, 1; initial_params=inits, progress=false)
@test !ismissing(chain[1].metadata.s.vals[1])
@test chain[1].metadata.m.vals == [-1]

# parallel sampling
chains = sample(
model,
sampler,
MCMCThreads(),
1,
10;
initial_params=fill(inits, 10),
progress=false,
)
for c in chains
@test !ismissing(c[1].metadata.s.vals[1])
@test c[1].metadata.m.vals == [-1]
end
end
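
In the parallel-sampling tests above, `fill(inits, 10)` builds one initial value per chain. A minimal sketch of what that produces, using only Base Julia (the variable names are illustrative):

```julia
# One NamedTuple of initial values, repeated once per chain.
inits = (; s=4, m=-1)
per_chain = fill(inits, 10)

# `fill` does not copy its argument: with a mutable value such as a vector,
# every entry refers to the same array.
shared = fill([4, -1], 10)
same_object = shared[1] === shared[2]
```

Sharing one array across entries is harmless when the values are only read, but it is worth keeping in mind if any code mutates the per-chain initial values in place.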

# specify `initial_params=nothing`