Issue #267: Refactor to allow custom event priors and marginalise the latent likelihood #474

seabbs · 2024-11-24T23:21:45Z

Description

This PR closes #267 by moving to a formula approach for pwindow and swindow. On top of allowing for the correct likelihood this also allows users to set priors on pwindow and swindow. This could be extended to allow for custom formulas (for pwindow this would be a useful feature but not for swindow).

Checklist

My PR is based on a package issue and I have explicitly linked it.
I have included the target issue or issues in the PR title in the form Issue(s) issue-numbers: PR title.
I have read the contribution guidelines.
I have tested my changes locally.
I have added or updated unit tests where necessary.
I have updated the documentation if required.
My code follows the established coding standards.
I have added a news item linked to this PR.
I have reviewed CI checks for this PR and addressed them as far as I am able.

R/latent_model.R

seabbs · 2024-11-25T16:37:56Z

Noting this has been slowed down as I realised that reparameterisation wasn't working currently with general families and .replace_prior didn't appear to work as expected when used in epidist_prior.

seabbs · 2024-11-25T16:57:10Z

Something else to note here is that the matrix multiplication and handling of the uniform priors appears to be very inefficient so in its current form it doesn't really make a great deal of sense.

seabbs · 2024-11-25T17:35:02Z

I don't think that matrix multiplication issue is possible to get around, so I would argue that this PR is repurposed to address the issues found whilst exploring this, and that we hit pause on custom priors and look for another method for correcting the likelihood.

seabbs · 2024-11-27T18:46:40Z

This PR now:

Added enforce_presence argument to epidist_prior() to allow for priors to be specified if they do not match existing parameters. See #474.
Added a merge argument to epidist_prior() to allow for not merging user and package priors. See #474.
Added user settable primary event priors to the latent model. See #474.
Added a marginalised likelihood to the latent model. See #474.
Generalised the Stan reparameterisation feature to work across all distributions without manual specification by generating Stan code with brms and then extracting the reparameterisation. See #474.
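For readers trying to picture how the enforce_presence and merge arguments listed above might interact, here is a minimal sketch in Python (used only for illustration; the package itself is R). The function name `combine_priors`, the dictionary representation of priors, and the exact semantics and defaults are all assumptions made for this example, not the behaviour of epidist_prior() itself.

```python
def combine_priors(user, package, merge=True, enforce_presence=True):
    """Combine user priors with package defaults (hypothetical semantics).

    Both inputs map a parameter key, e.g. ("Intercept", "mu"), to a prior string.
    """
    if enforce_presence:
        # Assumed meaning: reject user priors that match no existing parameter.
        unknown = [key for key in user if key not in package]
        if unknown:
            raise ValueError(f"No matching parameter for priors: {unknown}")
    if not merge:
        # Assumed meaning: opt out of merging and keep only the user priors.
        return dict(user)
    combined = dict(package)
    combined.update(user)  # user priors replace matching package priors
    return combined


package_priors = {
    ("Intercept", "mu"): "normal(0, 10)",
    ("Intercept", "sigma"): "normal(0, 1)",
}
user_priors = {
    ("Intercept", "mu"): "normal(1, 0.5)",
    ("b", "pwindow"): "uniform(0, 1)",  # not among the package priors
}

merged = combine_priors(user_priors, package_priors, enforce_presence=False)
```

Under these assumed semantics, relaxing enforce_presence is what lets a prior through for a parameter the package does not already know about, such as a window parameter.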

I also added #478, #476, and #477 to enhance functionality from this PR. In general, I found quite a few edge cases in places (especially the prior handling) when looking at this that I think might need ongoing work.

Due to the inefficiency of the windows-as-formulas approach, this implements the log likelihood with the windows marginalised out. I think this is more correct than the random unlinked sample version but is potentially not ideal. The only way I can think of to write down the full likelihood is to somehow switch the model from the direct version to the formula version when doing posterior prediction. That feels kind of bonkers though. The marginalised likelihood is currently very slow (due to the issues noted above) so it may feel less of a problem as those are resolved.
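As a rough illustration of what "marginalising out the windows" means, the sketch below integrates a uniform primary event time out of a double interval censored delay probability. It is written in Python for illustration only and is not the package's implementation: the lognormal delay, the uniform primary event, the midpoint quadrature, and all function and argument names are assumptions.

```python
import math


def lognormal_cdf(x, meanlog, sdlog):
    """CDF of a lognormal delay; zero for non-positive delays."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - meanlog) / (sdlog * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


def marginal_log_lik(delay, pwindow, swindow, meanlog, sdlog, n_grid=200):
    """Log probability of a censored delay with the primary window integrated out.

    `delay` is the time between the start of the primary window and the start
    of the secondary window. The primary event is assumed uniform on
    [0, pwindow), and the secondary event falls in [delay, delay + swindow).
    """
    total = 0.0
    for i in range(n_grid):
        p = (i + 0.5) * pwindow / n_grid  # midpoint of the i-th sub-interval
        upper = lognormal_cdf(delay + swindow - p, meanlog, sdlog)
        lower = lognormal_cdf(delay - p, meanlog, sdlog)
        total += upper - lower
    prob = total / n_grid  # average over the uniform primary event time
    return math.log(prob) if prob > 0 else float("-inf")


# Probabilities over all daily delays should sum to (approximately) one.
total_prob = sum(
    math.exp(marginal_log_lik(d, 1.0, 1.0, 1.5, 0.5)) for d in range(200)
)
```

The loop over grid points hints at why a marginalised likelihood can be slow: every observation needs its own numerical integral unless an analytical solution is available.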


athowes

I think the way to do the reparameterisation is cool. I do worry about it being a bit hacky, and relying on regex. Perhaps though the regex is relatively robust and the pattern it's looking for in brms Stan code can be relied on not to change. Also to note I'm unsure about why this feature is a part of this PR. (Is it related to the refactor to use formulas for pwindow and swindow?) Also, if it's not using S3, I suggest dropping S3.
On the log_lik, posterior_predict and posterior_epred I think the generalisation beyond being about the latent model is good. I wonder about how we will extend this to working with the marginal_model (PR Issue #221: Add marginal model #426). For the naive model it already has these methods as it doesn't need a custom family. One could wonder about simulating from the right likelihood after using the naive model, but perhaps it's not going to happen / be a useful feature.
It's exciting to have priors on the pwindow_raw and swindow_raw. I wonder about how easy it will be for users on the journey to changing those priors (knowing which parameters to set).
About the prior infrastructure, it seems to be getting quite complicated and/or brittle. Perhaps you disagree. I wonder about what we might be able to do to simplify it. I think at some level I feel quite bad about including an argument like merge_priors in the main epidist() function as it shows our approach isn't really working that well. I wonder what use cases the "built in approach isn't flexible enough for" -- is it mainly this one about a prior for all coefficients? I was a bit confused by that because in the previous approach I thought we were matching on all entries in the dataframe. Now it seems like we are doing some regex and it's a different approach.
Is there some way we can document how users can make use of the priors on window functionality? And when they would do it? Inclusion in some vignette? Create new issue on this? I think otherwise it is quite a challenging thing to figure out.

R/epidist.R

R/family.R

R/gen.R

R/latent_model.R

vignettes/faq.Rmd

R/latent_model.R

seabbs · 2024-11-28T16:34:54Z

ty for the review @athowes. I think I have addressed your specific concerns and discussed your points below. My high-level view is that I agree some parts of this are a bit complex, but I think we might need some more user testing to see how it all shakes out before making updates.

Stan code can be relied on not to change. Also to note I'm unsure about why this feature is a part of this PR. (Is it related to the refactor to use formulas for pwindow and swindow?) Also, if it's not using S3, I suggest dropping S3.

The S3 option is there in case it isn't working well enough, so that you can write an S3 method as a workaround. If we didn't have this you would just be stuck. As I said in the specific comment, I think we should monitor this.
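The trade-off discussed here (regex extraction by default, with a manual escape hatch) can be sketched as follows. This is a Python illustration only, not the package's R code: the `MANUAL_REPARAM` registry stands in for a hand-written S3 method, and the Stan line used in the example is an illustrative guess at what brms-generated code might look like, not verified against any brms version.

```python
import re

# Hypothetical registry of hand-written overrides, playing the role of an
# S3 method for families where the regex does not work well enough.
MANUAL_REPARAM = {}

# Assumed shape of the likelihood statement in the generated Stan code.
LPDF_PATTERN = re.compile(r"target \+= (\w+)_lpdf\(Y \| ([^;]+)\);")


def extract_reparam(stan_code, family=None):
    """Return the argument expressions passed to the likelihood's lpdf call."""
    if family in MANUAL_REPARAM:
        return MANUAL_REPARAM[family]
    match = LPDF_PATTERN.search(stan_code)
    if match is None:
        raise ValueError("No lpdf call found; a manual override is needed")
    # Naive comma split: breaks if an argument itself contains a comma.
    return [arg.strip() for arg in match.group(2).split(",")]


example_stan = "target += gamma_lpdf(Y | shape, shape ./ mu);"  # illustrative
gamma_args = extract_reparam(example_stan)
```

The naive comma split is one place where a regex approach like this could silently go wrong, which is the kind of case a manual override would cover.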

Also to note I'm unsure about why this feature is a part of this PR.

It's part of this PR as I refactored the Stan code section of the latent model. One part of that was this (it also needed to be done in order to make the model work at all when we were passing the window parameters as dpars, which we no longer are).

I wonder about how we will extend this to working with the marginal_model

Yes, I agree. I think we should make sure that the marginal model has the vreals etc. that this needs, and then it should just be plug and play.

Is there some way we can document how users can make use of the priors on window functionality? And when they would do it? Inclusion in some vignette? Create new issue on this? I think otherwise it is quite a challenging thing to figure out.

As above and yes I think so. I think we might want to wait to see what custom priors looks like in the marginal model (if and when that goes in) as that will have other limitations/features we should probably discuss. I'll make an issue now though.

I was a bit confused by that because in the previous approach I thought we were matching on all entries in the dataframe. Now it seems like we are doing some regex and it's a different approach.

I found multiple instances where this didn't work (with some of them in the new tests). My working with this more generally made me think we might need a list argument per low level function in epidist to pass optional parameters as it felt pretty limited when playing around.

it seems to be getting quite complicated and/or brittle

It got so complicated because the current implementation was pretty brittle when actually trying to use it (mostly when trying to pass pwindow and swindow in the formula which a future model might need to do). I think we might need some more user testing to see where we are but I really can't see how we can do away with the handling of manual priors as a special case for example.

I wonder about how easy it will be for users on the journey to changing those priors (knowing which parameters to set).

Not at all easy, I think, and there are lots of difficult edge cases (also posterior_predict etc. is uniform only). I think we might need another issue to expand on this/add some guard rails.

athowes mentioned this pull request Nov 25, 2024

Add option to pass custom additional priors #123

Closed

athowes reviewed Nov 25, 2024

R/latent_model.R (outdated)

seabbs force-pushed the window-priors branch from aabcfeb to b8e61e9 Compare November 25, 2024 11:27

athowes changed the title ~~Issue 267: Refactor to use formulas for pwindow and swindow~~ Issue #267: Refactor to use formulas for pwindow and swindow Nov 25, 2024

athowes reviewed Nov 25, 2024

R/latent_model.R (outdated)

athowes reviewed Nov 25, 2024

R/latent_model.R (outdated)

This was referenced Nov 27, 2024

Rewrite epidist_gen_log_lik to more efficiently wrap brms log_lik #476

Open

Update epidist_gen_log_lik to use primarycensored analytical solutions #477

Open

Add a warning if non-uniform events or non IID primary events are used in the latent model #478

Open

seabbs added 17 commits November 27, 2024 18:47

first pass at refactoring latent model to use window formulas

0fc1fa9

add docs to stan function

4b3195c

check getting started -drive by fix plotting

190f2e1

update approach to handling formulas

4dd9e4f

get reparameterisation from brms itself vs enforcing manual declaration

8ff54fe

work on regexing:

bf4a910

test manually setting new priors

269e8f6

fix .replace_prior

eb7d502

reset for pause

4b2ae8e

add back in lower bounds

8ece45f

revert pass in via formula

03d24f0

add custom priors pass in

7f4fa97

write priors down more neatly

4381326

add manual prior mode and optout

b0b4834

clean up easy test failures

856aacf

use marginalised log likelihood

57d2dae

debug marginalised likelihood

23dea17

seabbs added 7 commits November 27, 2024 18:47

workaround for liklihood vectorisation

237f48a

further increase prior complexity options

cb0c403

update prior ordering

750ea52

catch printing issue for .replace_prior

b2578a6

add news iteem

ead5f67

add PR links

fb50250

speeed up test

5beebde

seabbs force-pushed the window-priors branch from 2dc537a to 5beebde Compare November 27, 2024 18:48

seabbs marked this pull request as ready for review November 27, 2024 18:48

seabbs added 3 commits November 27, 2024 18:51

code read through

bafa1cf

clean up precommit

df49ef0

turn off priorsense to check theory its numerical instability for extreme log lik values

c425116

seabbs mentioned this pull request Nov 27, 2024

Resolve priorsense error with marginalised likelihood #479

Open

athowes self-requested a review November 28, 2024 14:45

athowes reviewed Nov 28, 2024


review comments

8711ddc

seabbs changed the title ~~Issue #267: Refactor to use formulas for pwindow and swindow~~ Issue #267: Refactor to allow custom event priors and marginalised the latent likelihood Nov 28, 2024

seabbs changed the title ~~Issue #267: Refactor to allow custom event priors and marginalised the latent likelihood~~ Issue #267: Refactor to allow custom event priors and marginalise the latent likelihood Nov 28, 2024

seabbs requested a review from athowes November 28, 2024 16:40

This was referenced Nov 28, 2024

Document approaches for users setting primary event priors and reasons to do so. #481

Open

Do we need functionality to pass arguments from epidist to all or some low level functions #482

Open

Review and refactor prior handling #483

Open

seabbs enabled auto-merge (squash) November 28, 2024 16:46

athowes approved these changes Nov 28, 2024


seabbs merged commit f9e2fc8 into main Nov 28, 2024
10 checks passed

seabbs deleted the window-priors branch November 28, 2024 17:08

athowes mentioned this pull request Dec 3, 2024

Latent event times as distributional parameters? #343

Closed
