Home
Welcome to the best-practices wiki! Add your knowledge, experiences, and questions. :)
Our goal is to systematize and automate what we do in our everyday computational cognitive science---to level-up psychology. For some of the things we do, better tools won't help very much in the near term: for instance, coming up with brand new theories that change our view of how the mind works, or identifying new classes of phenomena. But for many other research activities we can be faster, be more systematic, and reduce the rate of bugs or errors.
The sketch above is a cartoon flow diagram of the main elements of work in (experimental) computational cognitive science. The cycle of science:
- Intuitive phenomena of interest ("People seem to do this...")
  - -> hypotheses -> formal (PPL) models -> predictions
  - -> space of experiments -> implemented experiment space
    - -> specific experiment design (e.g. OED)
      - -> data (e.g. MTurk)
        - -> data analysis
          - -> model-free analysis (hierarchical regression, etc.; test qualitative predictions)
          - -> single model fitting and evaluation
          - -> model comparison
      - -> residual variation analysis (scatter plots, PPC)
    - -> updated phenomena, hypotheses, or experiment space
Probabilistic programming languages (PPLs) provide a way to formalize hypotheses. Methodologically, having fully formal theories means more of the steps of science can be automated.
Useful PPLs include Church, WebPPL, JAGS, Stan, etc.
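To make this concrete without committing to any particular PPL, here is a minimal sketch in Python/NumPy of the kind of hypothesis these languages let you formalize: a judgment modeled as Bayesian integration of a discrete prior with noisy evidence. All names and numbers are hypothetical; in Church/WebPPL/Stan the same idea would be a few lines of model code plus a generic inference call.

```python
import numpy as np

# Hypothetical example: formalize the hypothesis that a rating reflects
# Bayesian integration of a discrete prior over a latent quantity with
# noisy evidence about that quantity.

hypotheses = np.linspace(0, 1, 11)                    # candidate values of the latent quantity
prior = np.ones_like(hypotheses) / len(hypotheses)    # uniform prior (an assumption)

def likelihood(observation, h, noise_sd=0.1):
    """Gaussian noise model relating the latent quantity to an observation."""
    return np.exp(-0.5 * ((observation - h) / noise_sd) ** 2)

def predicted_rating(observation):
    """Posterior mean of the latent quantity given one noisy observation."""
    post = prior * likelihood(observation, hypotheses)
    post /= post.sum()
    return np.dot(post, hypotheses)

print(predicted_rating(0.7))   # model prediction for one experimental condition
```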
Bootstrapping confidence intervals is simple and useful. It matters especially when predictions are made on the basis of other experimental data (e.g. measured priors), but it is also useful when we don't know how much variance there is in the results of inference.
Some details here about how to bootstrap? Pointers to tools for bootstrapping PPL predictions? Any warnings (e.g. avoiding duplicate labels in resampled Ss)?
One caveat is that the CIs tell you about variance but not bias in your inference algorithms. For unbiased algorithms (such as MCMC) this is not a problem, but for many others it is. A partial solution is to plot predictions and confidence intervals against an inference parameter such that, in the limit, the predictions are unbiased -- e.g. increasing the number of particles for particle filtering.
Suppose you've collected N=70 subjects' ratings of combinations of properties, which you're calling the prior. Naively, we would take the mean of subjects' responses (i.e. each subject gives us a distribution, and we take the mean distribution). When we bootstrap, we resample with replacement n=70 from our sample of N=70 (if this were done without replacement, it would just return our original sample). From this resample, take the mean distribution; this amounts to one bootstrapped sample. Run your model forward. Repeat this 500, 1,000, or 10,000 times, and you've bootstrapped your empirical priors. You can treat the resulting set of predictions as a proxy for the actual distribution, and read off the 2.5th-97.5th percentiles directly to find your confidence interval.
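A minimal sketch of that recipe in Python/NumPy; `run_model_forward` is a hypothetical stand-in for whatever forward model or PPL query you are using, and the rest is generic resampling:

```python
import numpy as np

def bootstrap_predictions(subject_data, run_model_forward, n_boot=1000, seed=0):
    """subject_data: array of shape (N_subjects, n_items), each row one subject's
    empirical distribution (e.g. normalized ratings).
    run_model_forward: function from a mean prior (length n_items) to model
    predictions; stands in for your PPL model."""
    rng = np.random.default_rng(seed)
    n_subjects = subject_data.shape[0]
    preds = []
    for _ in range(n_boot):
        # resample subjects with replacement (same N as the original sample)
        idx = rng.integers(0, n_subjects, size=n_subjects)
        boot_prior = subject_data[idx].mean(axis=0)   # mean distribution of the resample
        preds.append(run_model_forward(boot_prior))
    preds = np.asarray(preds)
    # 95% CI from the 2.5th and 97.5th percentiles of the bootstrap distribution
    lo, hi = np.percentile(preds, [2.5, 97.5], axis=0)
    return preds.mean(axis=0), lo, hi
```

Note that a resample contains duplicate subjects; if a downstream analysis treats subjects as labeled groups (e.g. a hierarchical model), give each resampled row a fresh index rather than reusing the original subject labels.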
We remove outliers -- but the exclusion criteria should be decided before running the experiment.
We often Z-score or normalize data, which is essentially fitting means, variances, etc. of the data set before analysis.
In the longer run we should probably move away from data pre-processing and toward explicit linking-function models between predictions and dependent measures. These linking models include, for example, ordinal regression for Likert scales and non-linear compression for slider scales. This lets us verify our linking assumptions, jointly fit linking parameters with model parameters, and take individual differences into account more clearly.
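As one concrete (hypothetical) example of such a linking model, here is a sketch of an ordered-logit link from a latent model prediction to a K-point Likert response; the cutpoints and scale are free linking parameters that would be fit jointly with the model:

```python
import numpy as np

def ordered_logit_probs(latent_prediction, cutpoints, scale=1.0):
    """Probability of each Likert response category given a latent model
    prediction, under an ordered-logit linking function.
    cutpoints: increasing thresholds dividing the latent scale into K categories."""
    def logistic_cdf(x):
        return 1.0 / (1.0 + np.exp(-x))
    # P(response <= k) for each internal cutpoint, padded with 0 and 1
    cum = logistic_cdf((np.asarray(cutpoints) - latent_prediction) / scale)
    cum = np.concatenate(([0.0], cum, [1.0]))
    return np.diff(cum)   # P(response == k) for each of the K categories

# e.g. a 5-point Likert scale with hypothetical fitted cutpoints
print(ordered_logit_probs(latent_prediction=0.3, cutpoints=[-1.0, -0.3, 0.3, 1.0]))
```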
What should we do with free parameters in our models?
- Optimize R^2 or marginal likelihood? (See the sketch after this list.)
- Integrating out parameters is good because of the Bayes-Occam effect.
- What priors to use? Uninformative priors (though these still depend on the parameterization).
- What is the relation to posterior predictive checks?
- To optimize or to integrate -- is the fully Bayesian way to 'fit and compare' a comparison of the posterior predictive to the data?
- How do we compute the (marginal) likelihood of the data?
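A toy sketch of the optimize-vs-integrate contrast, assuming a single free parameter evaluated on a grid; `likelihood` is a hypothetical stand-in for your model's likelihood function. Optimizing picks the single best-fitting parameter value, while integrating (the marginal likelihood) averages over the prior, which is where the Bayes-Occam penalty for flexibility comes from.

```python
import numpy as np

def fit_or_integrate(data, likelihood, theta_grid, theta_prior=None):
    """likelihood(data, theta) -> p(data | theta); theta_grid: candidate parameter
    values; theta_prior: prior weights over theta_grid (uniform if None).
    Returns the optimized fit and the grid-approximated marginal likelihood."""
    liks = np.array([likelihood(data, th) for th in theta_grid])
    if theta_prior is None:
        theta_prior = np.ones_like(liks) / len(liks)
    best_theta = theta_grid[np.argmax(liks)]
    best_fit = liks.max()                  # optimized likelihood (favors flexible models)
    marginal = np.sum(liks * theta_prior)  # marginal likelihood (Bayes-Occam penalty built in)
    return best_theta, best_fit, marginal
```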
Sets of conditions. Order effects? Number of Ss. Dollars-to-bits conversion. Dependent measures and linking functions. Loose or tight linking assumptions?
Looking at model/data scatter plots (PPCs, condition means, and so on).
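A sketch of the basic model/data scatter plot with matplotlib; `model_predictions`, `condition_means`, and the CI arrays are assumed to come from the analyses above (e.g. the bootstrap sketch):

```python
import numpy as np
import matplotlib.pyplot as plt

def model_data_scatter(model_predictions, condition_means, ci_lo, ci_hi):
    """Scatter model predictions against empirical condition means, with
    bootstrapped 95% CIs as vertical error bars and the y = x line for reference."""
    err = np.vstack([condition_means - ci_lo, ci_hi - condition_means])
    plt.errorbar(model_predictions, condition_means, yerr=err, fmt="o", capsize=3)
    lims = [min(np.min(model_predictions), np.min(ci_lo)),
            max(np.max(model_predictions), np.max(ci_hi))]
    plt.plot(lims, lims, "k--", linewidth=1)    # y = x reference line
    plt.xlabel("model prediction")
    plt.ylabel("human data (condition mean, 95% bootstrap CI)")
    plt.show()
```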
What does "that thing we do" -- measuring all the components of the model and showing they are related in the predicted way -- suggest about tools?