Sampling for factored algorithms #733

Working with Alex Ihler and Rina Dechter, we've come up with a new way to sample continuous elements for factored algorithms. Previously we would take N samples from the prior of each continuous element, then propagate those samples through the model. Since there was no consistency of samples between elements, this often resulted in factor explosion as sampled elements were combined (e.g., a mixture of Gaussians would have support k*N, where k is the number of Gaussians).
We changed this procedure to enforce that no more than N samples are ever propagated through the model. This means that, in the mixture case, we squeeze the k*N samples back down to N. We do this by uniformly sampling from the support of the variable that needs paring down, which is equivalent to particle BP under a uniform proposal.
We only do this if a continuous element is detected along the dependency path of an element. When there is a mix of continuous and discrete elements, we still do this, and it's OK because we retain the density factor for the discrete values (i.e., if the discrete values are subsampled, they are renormalized according to the selected samples).
So, do we want to push this change back into the main branch? It seems like a much better way to do sampling than before.
@apfeffer
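For context, the kind of model in question looks like this in Figaro (a minimal sketch; the import paths are from memory of the Figaro library, so treat them as assumptions):

    import com.cra.figaro.language._
    import com.cra.figaro.library.atomic.continuous.Normal
    import com.cra.figaro.library.compound.If

    object MixtureSupport {
      // A k = 2 mixture: with N samples drawn from each Normal's prior, the
      // mixture's sampled support has ~k*N distinct values, since two samples
      // of a continuous distribution almost never coincide.
      val mixture = If(Flip(0.1), Normal(0.0, 1.0), Normal(10.0, 1.0))
    }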
I think the size grows as N**k, which is faster than N*k. More importantly, what do you mean by "the support of the variable that needs paring down," and how do you "squeeze it back down"? Does the procedure only work on Gaussians, or on any mixture of continuous variables?
The size of the factor grows exponentially in k, yes; I was just saying the support of the variable grows linearly in k. This (in theory) works on any mixture of continuous and discrete variables, though I have not fully tested it. To your question about squeezing, say you have this code:
If(Flip(0.1), Normal(0, 1), Normal(10, 1))
The way we do it now, we'd take, say, 10 samples from each Normal. The support of the If statement would then be 10*2 = 20, since the probability that any two samples from the two Normals coincide is ~0. Now, we fix the support of the If statement to 10 samples. Since there are 20 possible values from the two Normals, we subsample those 20 values down to 10. This process repeats itself any time the number of samples grows beyond 10 (in this example).
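To make the squeezing concrete, here is a minimal plain-Scala sketch of the procedure described above (illustrative only, not Figaro's actual implementation; the sample counts and seed are arbitrary):

    import scala.util.Random

    object SupportSubsampling {
      val rand = new Random(42)
      val N = 10 // cap on the number of samples propagated per element

      def main(args: Array[String]): Unit = {
        // Sample each Normal's support once: 10 from N(0,1), 10 from N(10,1)
        val x = Seq.fill(N)(rand.nextGaussian())        // Normal(0, 1)
        val y = Seq.fill(N)(10.0 + rand.nextGaussian()) // Normal(10, 1)

        // The If's raw support is the union: 20 distinct values with probability ~1
        val union = x ++ y

        // Squeeze back down: uniformly subsample N values
        // (particle BP under a uniform proposal)
        val support = rand.shuffle(union).take(N)
        println(s"If support (${support.size} values): ${support.mkString(", ")}")
      }
    }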
Still not sure what you are doing. Are you subsampling the pooled values uniformly, or resampling them in proportion to the branch probabilities? The latter makes more intuitive sense, but I'm not certain either is correct.
Hi Brian,
I like this! If it’s tested (i.e. doesn’t degrade performance on any of our tests and examples), let’s incorporate it into the main branch.
One question: How will this interact with LSFI?
Avi
Glenn: Let X be 10 samples from N(0,1) and Y be 10 samples from N(10,1). The support of the If is now 10 samples taken from X ∪ Y.
Avi: Good question on LSFI. I can't give you an answer right now; I have to look into it. For continuous elements themselves this will have no effect, as this change is only implemented at Chains.
It still seems to me that this is ignoring the Flip(0.1). If the 20 are treated equally, aren't you likely to get a sample from N(5, 2)? Whereas it should be closer to N(9, ...)?
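For reference, the means behind this intuition (treating the 20 pooled samples uniformly versus weighting the two branches by the Flip probabilities):

    \mathbb{E}[\text{uniform pool}] = \tfrac{1}{2} \cdot 0 + \tfrac{1}{2} \cdot 10 = 5,
    \qquad
    \mathbb{E}[\text{Flip-weighted}] = 0.1 \cdot 0 + 0.9 \cdot 10 = 9.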
No, you're not going to ignore the Flip, because you'll still make a factor for the Flip, which will then weight the output samples accordingly (i.e., samples that came from N(10,1) will have more weight).
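A rough sketch of that reweighting, assuming each retained sample is tagged with the branch that produced it (the values here are made up for illustration):

    object FlipReweighting {
      def main(args: Array[String]): Unit = {
        // Hypothetical subsampled If support, tagged by originating branch
        val support = Seq(
          (-0.3, true), (0.8, true), (1.2, true),    // from Normal(0, 1)
          (9.1, false), (9.8, false), (10.4, false)) // from Normal(10, 1)

        // The Flip factor contributes 0.1 to branch-true values and 0.9 to
        // branch-false values; summing out the Flip reweights the samples.
        val raw = support.map { case (_, fromFirst) => if (fromFirst) 0.1 else 0.9 }
        val z = raw.sum
        support.zip(raw).foreach { case ((v, _), w) =>
          println(f"$v%6.2f -> weight ${w / z}%.4f")
        }
      }
    }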
Note, I'm also adding this for Applys. The point Alex made was that this way we have total control over the size of the model that is sampled. (Again, only if a continuous variable is detected.)
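A sketch of why capping at Applys matters (plain Scala again, not Figaro's code; the cap helper and N are illustrative):

    import scala.util.Random

    object ApplyCapping {
      val rand = new Random(7)
      val N = 10 // cap on propagated support size

      // Hypothetical helper: uniformly subsample a support back down to N values
      def cap(values: Seq[Double]): Seq[Double] =
        if (values.size <= N) values else rand.shuffle(values).take(N)

      def main(args: Array[String]): Unit = {
        val a = Seq.fill(N)(rand.nextGaussian())        // one continuous parent
        val b = Seq.fill(N)(5.0 + rand.nextGaussian())  // another

        // An Apply over both parents has up to N*N distinct outputs (100 here);
        // capping keeps the support that is propagated onward at N.
        val summed = for (x <- a; y <- b) yield x + y
        println(s"uncapped: ${summed.size}, capped: ${cap(summed).size}")
      }
    }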
In your example, does the support of the Normals change? Or do, e.g., the 10 values originally sampled from Normal(0,1) get mapped onto the 10 values of the Chain by some sort of approximation? If it's the latter, you might vaguely recall that I once implemented something like this for Gibbs sampling (see here), but I never had the chance to thoroughly test it.
No, the support of the Normals does not change; they are sampled once. The values of the Chain are subsampled from the union of the two Normals, so the output of the Chain is always consistent with some of the samples.