Home
Welcome to the best-practices wiki! Add your knowledge, experiences, and questions. :)
Our goal is to systematize and automate what we do in our everyday computational cognitive science---to level-up psychology. For some of the things we do, better tools won't help very much in the near term: for instance, coming up with brand new theories that change our view of how the mind works, or identifying new classes of phenomena. But for many other research activities we can be faster, be more systematic, and reduce the rate of bugs or errors.
The sketch above is a cartoon flow diagram of the main elements of work in (experimental) computational cognitive science. The cycle of science:
- Intuitive phenomena of interest ("People seem to do this...")
  - -> hypotheses -> formal (PPL) models -> predictions
  - -> space of experiments -> implemented experiment space
    - -> specific experiment design (e.g. OED)
      - -> data (e.g. MTurk)
        - -> data analysis
          - -> model-free analysis (hierarchical regression, etc.; test qualitative predictions)
          - -> single model fitting and evaluation
          - -> model comparison
      - -> residual variation analysis (scatter plots, PPC)
    - -> updated phenomena, hypotheses, or experiment space
Probabilistic programming languages (PPLs) provide a way to formalize hypotheses. Methodologically, having fully formal theories means more of the steps of science can be automated.
Useful PPLs include Church, WebPPL, JAGS, Stan, etc.
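To make this concrete without committing to any particular PPL, here is a minimal sketch in Python/NumPy of the kind of hypothesis these languages let you formalize: a judgment modeled as Bayesian integration of a discrete prior with noisy evidence. All names and numbers are hypothetical; in Church/WebPPL/Stan the same idea would be a few lines of model code plus a generic inference call.

```python
import numpy as np

# Hypothetical example: formalize the hypothesis that a rating reflects
# Bayesian integration of a discrete prior over a latent quantity with
# noisy evidence about that quantity.

hypotheses = np.linspace(0, 1, 11)                    # candidate values of the latent quantity
prior = np.ones_like(hypotheses) / len(hypotheses)    # uniform prior (an assumption)

def likelihood(observation, h, noise_sd=0.1):
    """Gaussian noise model relating the latent quantity to an observation."""
    return np.exp(-0.5 * ((observation - h) / noise_sd) ** 2)

def predicted_rating(observation):
    """Posterior mean of the latent quantity given one noisy observation."""
    post = prior * likelihood(observation, hypotheses)
    post /= post.sum()
    return np.dot(post, hypotheses)

print(predicted_rating(0.7))   # model prediction for one experimental condition
```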
Bootstrapping confidence intervals is simple and useful. It matters especially when predictions are made on the basis of other experimental data (e.g. measured priors), but it is also useful when we don't know how much variance there is in the results of inference.
Some details here about how to bootstrap? Pointers to tools for bootstrapping PPL predictions? Any warnings (e.g. avoiding duplicate labels in resampled Ss)?
One caveat is that the CIs tell you about variance but not bias in your inference algorithms. For unbiased algorithms (such as MCMC) this is not a problem, but for many others it is. A partial solution is to plot predictions and confidence intervals against an inference parameter such that, in the limit, the predictions are unbiased -- e.g. increasing the number of particles for particle filtering.
Suppose you've collected N=70 subjects' ratings of combinations of properties, which you're calling the prior. Naively, we would take the mean of subjects' responses (i.e. each subject gives us a distribution, and we take the mean distribution). When we bootstrap, we resample with replacement n=70 from our sample of N=70 (if this were done without replacement, it would just return our original sample). From this resample, take the mean distribution; this amounts to one bootstrapped sample. Run your model forward. Repeat this 500, 1,000, or 10,000 times, and you've bootstrapped your empirical priors. You can treat the resulting set of predictions as a proxy for the actual distribution, and read off the 2.5th-97.5th percentiles directly to find your confidence interval.
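A minimal sketch of that recipe in Python/NumPy; `run_model_forward` is a hypothetical stand-in for whatever forward model or PPL query you are using, and the rest is generic resampling:

```python
import numpy as np

def bootstrap_predictions(subject_data, run_model_forward, n_boot=1000, seed=0):
    """subject_data: array of shape (N_subjects, n_items), each row one subject's
    empirical distribution (e.g. normalized ratings).
    run_model_forward: function from a mean prior (length n_items) to model
    predictions; stands in for your PPL model."""
    rng = np.random.default_rng(seed)
    n_subjects = subject_data.shape[0]
    preds = []
    for _ in range(n_boot):
        # resample subjects with replacement (same N as the original sample)
        idx = rng.integers(0, n_subjects, size=n_subjects)
        boot_prior = subject_data[idx].mean(axis=0)   # mean distribution of the resample
        preds.append(run_model_forward(boot_prior))
    preds = np.asarray(preds)
    # 95% CI from the 2.5th and 97.5th percentiles of the bootstrap distribution
    lo, hi = np.percentile(preds, [2.5, 97.5], axis=0)
    return preds.mean(axis=0), lo, hi
```

Note that a resample contains duplicate subjects; if a downstream analysis treats subjects as labeled groups (e.g. a hierarchical model), give each resampled row a fresh index rather than reusing the original subject labels.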
We remove outliers -- but the exclusion criteria should be decided before running the experiment.
We often Z-score or normalize data, which is essentially fitting means, variances, etc. of the data set before analysis.
In the longer run we should probably move away from data pre-processing and toward explicit linking-function models between predictions and dependent measures. These linking models include, for example, ordinal regression for Likert scales and non-linear compression for slider scales. This lets us verify our linking assumptions, jointly fit linking parameters with model parameters, and take individual differences into account more clearly.
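As one concrete (hypothetical) example of such a linking model, here is a sketch of an ordered-logit link from a latent model prediction to a K-point Likert response; the cutpoints and scale are free linking parameters that would be fit jointly with the model:

```python
import numpy as np

def ordered_logit_probs(latent_prediction, cutpoints, scale=1.0):
    """Probability of each Likert response category given a latent model
    prediction, under an ordered-logit linking function.
    cutpoints: increasing thresholds dividing the latent scale into K categories."""
    def logistic_cdf(x):
        return 1.0 / (1.0 + np.exp(-x))
    # P(response <= k) for each internal cutpoint, padded with 0 and 1
    cum = logistic_cdf((np.asarray(cutpoints) - latent_prediction) / scale)
    cum = np.concatenate(([0.0], cum, [1.0]))
    return np.diff(cum)   # P(response == k) for each of the K categories

# e.g. a 5-point Likert scale with hypothetical fitted cutpoints
print(ordered_logit_probs(latent_prediction=0.3, cutpoints=[-1.0, -0.3, 0.3, 1.0]))
```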
What should we do with free parameters in our models?
- Optimize R^2 or marginal likelihood? (See the sketch after this list.)
- Integrating out parameters is good because of the Bayes-Occam effect.
- What priors to use? Uninformative priors (though these still depend on the parameterization).
- What is the relation to posterior predictive checks?
- To optimize or to integrate -- is the fully Bayesian way to 'fit and compare' a comparison of the posterior predictive to the data?
- How do we compute the (marginal) likelihood of the data?
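A toy sketch of the optimize-vs-integrate contrast, assuming a single free parameter evaluated on a grid; `likelihood` is a hypothetical stand-in for your model's likelihood function. Optimizing picks the single best-fitting parameter value, while integrating (the marginal likelihood) averages over the prior, which is where the Bayes-Occam penalty for flexibility comes from.

```python
import numpy as np

def fit_or_integrate(data, likelihood, theta_grid, theta_prior=None):
    """likelihood(data, theta) -> p(data | theta); theta_grid: candidate parameter
    values; theta_prior: prior weights over theta_grid (uniform if None).
    Returns the optimized fit and the grid-approximated marginal likelihood."""
    liks = np.array([likelihood(data, th) for th in theta_grid])
    if theta_prior is None:
        theta_prior = np.ones_like(liks) / len(liks)
    best_theta = theta_grid[np.argmax(liks)]
    best_fit = liks.max()                  # optimized likelihood (favors flexible models)
    marginal = np.sum(liks * theta_prior)  # marginal likelihood (Bayes-Occam penalty built in)
    return best_theta, best_fit, marginal
```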
Sets of conditions. Order effects? Number of Ss. Dollars-to-bits conversion. Dependent measures and linking functions. Loose or tight linking assumptions?
Looking at model/data scatter plots (PPCs, condition means, and so on).
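A sketch of the basic model/data scatter plot with matplotlib; `model_predictions`, `condition_means`, and the CI arrays are assumed to come from the analyses above (e.g. the bootstrap sketch):

```python
import numpy as np
import matplotlib.pyplot as plt

def model_data_scatter(model_predictions, condition_means, ci_lo, ci_hi):
    """Scatter model predictions against empirical condition means, with
    bootstrapped 95% CIs as vertical error bars and the y = x line for reference."""
    err = np.vstack([condition_means - ci_lo, ci_hi - condition_means])
    plt.errorbar(model_predictions, condition_means, yerr=err, fmt="o", capsize=3)
    lims = [min(np.min(model_predictions), np.min(ci_lo)),
            max(np.max(model_predictions), np.max(ci_hi))]
    plt.plot(lims, lims, "k--", linewidth=1)    # y = x reference line
    plt.xlabel("model prediction")
    plt.ylabel("human data (condition mean, 95% bootstrap CI)")
    plt.show()
```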
What does "that thing we do" -- measuring all the components of the model and showing they are related in the predicted way -- suggest about tools?