Detailed code description

Code information

Organization

fit - specifies settings for a parameter search, including an experiment, code to use to simulate that experiment, a type of model, parameter ranges, and summary stats to fit. This is used for convenience to define a group of the lower-level settings such as model_type, experiment, etc. in a relatively compact form.
model_type - string indicating model features and fixed parameters.
features - set of features of a model that are not conveniently specified in continuously varying parameters.
experiment - string indicating an experiment to fit and the set of summary statistics to include in calculating goodness-of-fit.
f_stats - function to use for determining summary stats.
analyses - set of analyses to run (must be defined by f_stats)
w - vector indicating the relative weighting to give each analysis in calculating GOF.
curves - a given analysis may generate multiple curves with different names (e.g. SPCs for different stimulus categories).
param_info - AKA gaparam. Indicates free parameters and the range to search over for each parameter.
res_info - indicates the summary stats to include when calculating GOF. Has one field for each curve.
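As a rough illustration of how curves, w, and res_info fit together, here is a minimal Python sketch (the actual code is MATLAB). The function names mirror pack_results.m and get_weight_vector.m, but the data layout, the per-point weighting rule, and the weighted RMSD error measure are all assumptions made for this example; the description above does not state how GOF is actually computed.

```python
import math

def pack_results(results, res_info):
    """Flatten a dict of named curves into one vector, in res_info order.

    res_info is assumed here to be a list of (curve_name, n_points) pairs.
    """
    vector = []
    for curve_name, n_points in res_info:
        values = results[curve_name]
        if len(values) != n_points:
            raise ValueError(f"{curve_name}: expected {n_points} points")
        vector.extend(values)
    return vector

def weight_vector(res_info, curve_weights):
    """Spread each curve's weight evenly over its points (assumed rule)."""
    weights = []
    for curve_name, n_points in res_info:
        weights.extend([curve_weights[curve_name] / n_points] * n_points)
    return weights

def weighted_rmsd(observed, simulated, weights):
    """Weighted root-mean-square deviation; lower means a better fit."""
    total = sum(weights)
    squared = sum(w * (o - s) ** 2
                  for o, s, w in zip(observed, simulated, weights))
    return math.sqrt(squared / total)
```

With this layout, a long curve and a short curve contribute equally when their curve weights are equal, which is one plausible reason to weight by analysis rather than by point.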

Simulating catFR

Given a parameter struct and a data structure, can run a simulation to generate simulated data, optionally running many replications of the experiment to ensure stability. run_catFR_dist.m does a lot of special preparations before calling simulate_fr_gpu.m to simulate each list.

A number of functions in the gpu directory are used; this is not because I'm actually using a GPU to process things (performance benefits were small when I tried this out), but because I did some simplification when I wrote the GPU versions of some functions. On the params struct, there is a gpufun field, which should be set to 1, but gpu is set to 0, indicating that arrays should not actually be transferred to the GPU.

See the Parameters section for a list of most of the actively used parameters.

run_catFR_dist.m - simulates catFR or a catFR-like experiment. Can have state persist over multiple lists, or not (no persistence in most recent simulations).
run_session_pre - iterates over lists within a session, optionally simulating the N lists leading up to that list so that model state reflects some history. Can use prev_env to keep the prototypes persistent between lists.
run_session - runs one session (if the model is not reset between lists) or just one list (otherwise)
study_update.m - sets the rate of context updating, including distraction.
study_lrate.m - sets learning rate schedule, including primacy gradient.
calculate_session_patterns.m - determine how many patterns of context are necessary to run a list.
distrib_units_catFR.m - sets a number of parameters that determine what units are necessary for representing different categories, etc.
create_cfr_distrib_patterns.m - creates distributed patterns associated with different categories, exemplars, and distractors.
create_ortogonal_patterns.m - sets the units that will be used in the feature layer for different items. Output is altered to reflect the different item and context representations.
create_new_cfr_distrib_patterns.m - given saved prototypes from a previous trial, generates new exemplars. This was a hack to allow for persistent prototypes without changing the run_catFR_dist.m code too much.
simulate_fr_gpu.m - simulate one or more lists.
save_sim_session.m - store results from simulating a session or list.
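To make the learning rate schedule more concrete, here is a small Python sketch of a primacy gradient (the actual code is MATLAB). It assumes the exponential form commonly used in CMR-style models, where the learning rate multiplier at serial position i is p_scale * exp(-p_decay * (i - 1)) + 1; whether study_lrate.m uses exactly this form is not confirmed here, and the function name is hypothetical.

```python
import math

def primacy_gradient(list_length, p_scale, p_decay):
    """Learning-rate multiplier for each serial position (assumed form).

    The first item gets a boost of p_scale, which decays exponentially
    toward a baseline multiplier of 1 at later positions.
    """
    return [p_scale * math.exp(-p_decay * (pos - 1)) + 1.0
            for pos in range(1, list_length + 1)]
```

Under this form, p_scale controls how much extra learning the first item receives and p_decay controls how quickly that advantage fades across the list.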

Simulating individual lists

These functions haven't been customized too much for CFR. The main change is that the weights get set at the start of each list to reflect the category structure of the list.

simulate_fr_gpu.m - run one or more lists and return simulated data, optionally including the state of context at each step
present_distraction_gpu.m - applies distraction to change context.
present_item_localist.m - presents an item, with some assumptions that make execution faster.
context_update_gpu.m - updates context, optionally using GPU execution.
weight_update_localist.m - update associations by just adjusting the relevant column/row (this is faster in some cases).
fr_task_gpu.m - simulates free recall for one list.
recall_item_gpu.m - simulates one recall attempt.
decision_accum.m - given context support, runs a recall competition.
reactivate_item_localist.m - if an item was recalled, reactivate it. Makes some assumptions to make execution faster.
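For readers unfamiliar with context updating, the following Python sketch shows the standard rule used in temporal context models (the actual code is MATLAB). It assumes that context and its input are unit-length vectors and that the old context is scaled by a factor rho chosen so the updated context stays unit length; whether context_update_gpu.m matches this in every detail is an assumption, and the function name is hypothetical.

```python
import math

def update_context(context, context_input, B):
    """Blend input into context at rate B, keeping unit length.

    Assumes both vectors already have length 1. rho is the scaling of
    the old context that makes the result have length 1 as well.
    """
    dot = sum(c * i for c, i in zip(context, context_input))
    rho = math.sqrt(1.0 + B ** 2 * (dot ** 2 - 1.0)) - B * dot
    return [rho * c + B * i for c, i in zip(context, context_input)]
```

With B = 0 context does not change at all, and with B = 1 the old context is replaced entirely by the input (for input orthogonal to the old context).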

Group parameter search

Given all the simulation code, which takes parameters and an experiment definition and generates simulated data, we want to optimize parameters to maximize the fit to the data on specific summary statistics. Initially, this was done using a genetic algorithm, so functions often have ga in the name. Now I am using a type of differential evolution.

submit_ga_catFR.m - submits jobs to run a search
- de_dce-test (50 individuals, 1 replication), to make sure code is working
- de_dce-grid (5000 individuals, 15 replications), one generation
- de_dce-standard (50 individuals, 20 replications), until converged
run_ga_catFR.m - runs the actual search
get_model_features.m - sets many options to create different model variants; most options aren't used anymore, so there is a lot of complexity here. Unpacks the model_type string to determine the settings.
ga_param_catFR_cmr.m - determines the range to search for each parameter, for each model variant/search. Has a huge library of different model variants. See the README for what the components of the model_type string mean; get_model_features.m also has some details.
get_fsim_catFR.m - determines the function to use for simulating a given experiment and model type.
run_catFR_dist.m - runs catFR simulations. See above for details.
get_analyses_catFR.m - gets information about how to evaluate a model for a given experiment, including a path to data, a function that calculates multiple stats, and a cell array of strings specifying which stats to use.
stats_catFR.m - calculates a number of summary statistics and makes the corresponding plots. Returns results in struct format.
expand_analyses_catFR.m - splits up some analyses, mainly by category, to get the full set to expect in the results structure.
extract_results_info.m - given a results struct (e.g. based on the observed data), determine how to pack new results to match that format
pack_results.m - converts a results structure into a results vector for calculating GOF.
get_weight_vector.m - determines point weighting based on the results info struct.
get_param_ranges.m - grabs parameter ranges from the param_info struct.
ga_opt_catFR.m - determines how the search is run. Supports GA and DE searches.
eval_catFR.m - converts a supplied parameter vector into a param structure that can be used to run a simulation; generates simulated data; calls the stats function to get a results struct; packs results into a vector that matches up with the actual data; evaluates model fitness
param_conversion_catFR.m - converts a parameter vector into structure format, then sets fixed parameters and sets any missing parameters.
complete_param_catFR_nosrc.m - given a param struct and model features, sets fixed parameters (based on model type) and missing parameters.
fixed_param_cfr.m - sets fixed parameters based on model type.
run_de_dce.m - using a Matlab session on a head node, submits jobs to evaluate individuals in each of a number of generations. Submits a set of jobs, waits, loads them, mutates to get the next generation, then repeats until the desired number of generations is run or until the user terminates the search.
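The generation loop described for run_de_dce.m (evaluate a population, mutate, repeat) can be sketched in Python as follows (the actual code is MATLAB). This is a generic rand/1/bin differential evolution that minimizes an error score; the specific DE variant, the F and CR settings, and the function signature are assumptions for illustration, and the job submission and per-generation saving are left out.

```python
import random

def differential_evolution(fitness, ranges, pop_size=20, n_generations=100,
                           F=0.8, CR=0.9, seed=0):
    """Minimize fitness(params) within ranges, a list of (low, high) pairs."""
    rng = random.Random(seed)
    n_params = len(ranges)
    pop = [[rng.uniform(lo, hi) for lo, hi in ranges] for _ in range(pop_size)]
    scores = [fitness(ind) for ind in pop]
    for _ in range(n_generations):
        # Build one trial per individual from the current generation.
        trials = []
        for i in range(pop_size):
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(n_params)  # always mutate one dimension
            trial = []
            for d, (lo, hi) in enumerate(ranges):
                if d == forced or rng.random() < CR:
                    value = pop[a][d] + F * (pop[b][d] - pop[c][d])
                    value = min(max(value, lo), hi)  # clip to search range
                else:
                    value = pop[i][d]
                trial.append(value)
            trials.append(trial)
        # Evaluate the whole generation, then keep each trial that is
        # at least as good as the individual it competes with.
        trial_scores = [fitness(t) for t in trials]
        for i in range(pop_size):
            if trial_scores[i] <= scores[i]:
                pop[i], scores[i] = trials[i], trial_scores[i]
    best = min(range(pop_size), key=lambda j: scores[j])
    return pop[best], scores[best]
```

In the real search, the fitness call would correspond to something like eval_catFR.m (simulate, compute stats, pack, compare to data), and evaluating all trials for a generation together is what makes it natural to farm them out as separate jobs.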

Group best-fitting summary stats

Once we have the results of a parameter search, we really just have a fitness value and a set of parameters. We don't save each individual simulation run during the search, in order to save disk space. In any case, it's best to run a new simulation: because we selected the best-fitting parameter set in a generation, the fitness estimate from the search will be inflated, and a fresh estimate will show some regression to the mean. submit_best_catFR.m takes a set of best-fitting parameters and runs a simulation with a large number of replications to make sure the results are stable. Optionally, it can also calculate a set of summary statistics (generally the same ones used in the fit), make plots of results and a table with parameters, and recalculate fitness. This new fitness value will be a better estimate than the original one obtained from the search.
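The selection effect can be seen in a toy Python simulation (not part of the MATLAB code). It assumes fitness is an error score where lower is better, and, as a deliberate simplification, that every individual has the same true error, so any apparent advantage of the winner is pure noise. All names and numbers are made up for the example.

```python
import random

def noisy_error(true_error, noise_sd, n_reps, rng):
    """Average error over n_reps noisy replications of a simulation."""
    draws = [true_error + rng.gauss(0.0, noise_sd) for _ in range(n_reps)]
    return sum(draws) / n_reps

def search_then_recheck(n_individuals=50, true_error=10.0, noise_sd=1.0,
                        search_reps=1, recheck_reps=200, seed=1):
    """Return (best error seen in the search, re-estimated error)."""
    rng = random.Random(seed)
    search_estimates = [noisy_error(true_error, noise_sd, search_reps, rng)
                        for _ in range(n_individuals)]
    best_from_search = min(search_estimates)
    # Re-run the "winning" parameters with many replications.
    recheck = noisy_error(true_error, noise_sd, recheck_reps, rng)
    return best_from_search, recheck
```

The best value seen during the search looks better than the true error only because it was selected for being extreme; the re-estimate with many replications lands near the true value.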

submit_best_catFR.m - creates a call to sim_catFR_best_params.m to run a simulation using best-fitting parameters and calculate summary statistics, optionally with some manual changes to the model.
sim_catFR_best_params.m - loads search results and runs a simulation.
load_best_params_catFR.m - loads the best parameters from a search results file and converts them to struct format.
param_conversion_catFR.m - converts parameter vectors to struct format.
complete_param_catFR_nosrc.m - sets parameter defaults and handles various conversions to map raw parameters into the values needed to run simulations. Is very complicated, so make sure to check that the parameters you get out match what you would expect based on the vector.
param_latex_catFR.m - sets LaTeX code for model parameters, for generating tables.
write_param_table.m - writes best-fitting parameters to a table.

Individual parameter search

After the group fits, some parameters were allowed to vary between individual participants.

submit_indiv_ga_catFR.m - submits searches to fit individual subject behavior.
- de-standard - (50 individuals, 10 replications), until converged
- de-standard - (50 individuals, 20 replications), until converged
run_indiv_ga_catFR.m - runs the actual searches, with one task per subject.
get_subj_data_files.m - gets a list of data files for all subjects.
run_de_serial.m - similar to run_de_dce.m, but runs everything within one job/task. Unlike de_search.m, still saves out results on each generation, so can be restarted if interrupted.

Individual best-fitting summary stats

Once we have best-fitting parameters for each subject, need to look at summary stats.

submit_indiv_best_catFR.m
run_indiv_best_catFR.m

Context dynamics

In order to develop neural predictions, can look at how the simulated state of context is changing over time.

run_cmr_cat_context.m - calculates statistics and creates plots designed to match the train position plots from the scalp EEG data.
cmr_item_sim.m - calculates pairwise similarity between simulated states of context within a list.
cmr_cat_context.m - calculates a measure of category strength designed to be analogous to classifier evidence, by comparing each item to states of context corresponding to items of that category, relative to items from other categories.
cmr_cat_context_trainpos.m - category strength as a function of category and train position.
cmr_integration_rate.m - given raw simulated data, calculates category integration rate.
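As an illustration of the kind of calculation described for cmr_item_sim.m and cmr_cat_context.m, here is a Python sketch (the actual code is MATLAB). The similarity measure (cosine) and the way same-category and other-category similarity are combined (a difference of means) are assumptions; the real functions may use different measures.

```python
import math

def cosine(u, v):
    """Cosine similarity between two nonzero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def pairwise_similarity(contexts):
    """Similarity between every pair of context states within a list."""
    return [[cosine(u, v) for v in contexts] for u in contexts]

def category_strength(contexts, categories):
    """Mean same-category similarity minus mean other-category similarity.

    Returns None for an item with no same-category or no other-category
    items to compare against.
    """
    sim = pairwise_similarity(contexts)
    strengths = []
    for i, cat in enumerate(categories):
        same = [sim[i][j] for j, c in enumerate(categories)
                if j != i and c == cat]
        other = [sim[i][j] for j, c in enumerate(categories) if c != cat]
        if not same or not other:
            strengths.append(None)
        else:
            strengths.append(sum(same) / len(same) - sum(other) / len(other))
    return strengths
```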

Parameter sweeps

submit_hand_fits_catFR.m - defines a parameter sweep and creates a set of jobs to run a simulation at each point.
run_hand_fits_catFR.m -
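A parameter sweep of this kind amounts to enumerating every combination of values on a grid. The Python sketch below shows one way to do that (the actual code is MATLAB); the function name and the dict-based layout are hypothetical, and creating a job per point is left out.

```python
import itertools

def make_sweep(grid):
    """Expand {name: [values]} into a list of parameter dicts, one per point."""
    names = list(grid)
    combos = itertools.product(*(grid[name] for name in names))
    return [dict(zip(names, values)) for values in combos]
```

For example, sweeping two values of B_enc against three values of B_rec yields six points, each of which would be simulated separately.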

Parameters

Core parameters and aliases:

B_enc, B_enc_temp - context change rate during encoding.
B_rec, B_rec_temp - context change rate during recall.
sigma_cel, sigma_loc, sigma_obj - amount of normally distributed noise to add to each category prototype to generate exemplars from that category.
gamma_fc, lrate_fc_enc - learning rate for episodic item-context associations.
gamma_cf, lrate_cf_enc - learning rate for episodic context-item associations.
eye_fc - strength of pre-experimental item-context associations.
eye_cf - strength of pre-experimental context-item associations.
p_scale - scale of primacy gradient.
p_decay - decay rate of primacy gradient.
K - leakage of recall competition.
L - strength of lateral inhibition.
eta - noise in recall competition.
tau - time constant.
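To show how K, L, eta, and tau could interact in the recall competition, here is a Python sketch of a leaky, competing accumulator (the actual code is MATLAB). The update rule, the threshold of 1, the clipping at zero, and the way noise enters are assumptions based on the usual form of this kind of model, not a transcription of decision_accum.m.

```python
import random

def decision_competition(support, K, L, eta, tau, threshold=1.0,
                         max_steps=10000, rng=None):
    """Run accumulators until one crosses threshold.

    support: input to each item's accumulator (e.g. context support).
    Returns (winner_index, n_steps), or (None, max_steps) if nothing wins.
    """
    rng = rng or random.Random()
    x = [0.0] * len(support)
    for step in range(1, max_steps + 1):
        total = sum(x)
        new_x = []
        for xi, si in zip(x, support):
            inhibition = L * (total - xi)   # from all other accumulators
            noise = rng.gauss(0.0, eta) if eta > 0 else 0.0
            xi_new = xi + tau * (si - K * xi - inhibition) + noise
            new_x.append(max(xi_new, 0.0))  # activations cannot go negative
        x = new_x
        winner = max(range(len(x)), key=lambda i: x[i])
        if x[winner] >= threshold:
            return winner, step
    return None, max_steps
```

In this sketch, larger K makes evidence leak away faster, larger L makes strong items suppress weak ones more, and larger eta makes the outcome less dependent on support.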

Context representation details:

sigma - cell array of sigmas for each category.
sigma_labels - cell array of labels mapping category to the correct sigma to use.
n_subregions - subregions of the context representations to use (for different categories).
subregions - not to be confused with n_subregions; specifies the number of regions representing different aspects of the stimulus (e.g. item vs. encoding task)
n_subregion_dimensions - number of dimensions per category.
n_category_dimensions - number of category-specific units (i.e. dimensions pooled over all categories, but not including distractor units).
n_item_dimensions - number of item-specific units (not used in current version).
orthog_cat - if true, there are different units for each category.
orthog_distract - true if distractors are orthogonal to the item representations.
index_type - 'session' or 'subject' to indicate when category prototypes should be reset.
fix_sess_proto - if true, keep the same category prototypes throughout a session/subject.
distract_proto - if true, distractors are derived from a prototype.
first_distract - indicates what to set context to before the list. Currently use 'all_proto', which uses an average of all category prototypes.
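The relationship between prototypes, sigma, and exemplars can be sketched in Python as follows (the actual code is MATLAB). Adding independent normal noise with standard deviation sigma to each element follows the description of the sigma parameters; normalizing the result to unit length is an assumption, and the function name is hypothetical.

```python
import math
import random

def make_exemplar(prototype, sigma, rng):
    """Generate one exemplar by adding normal noise to a category prototype."""
    noisy = [p + rng.gauss(0.0, sigma) for p in prototype]
    norm = math.sqrt(sum(v * v for v in noisy))
    if norm == 0.0:
        raise ValueError("exemplar has zero length and cannot be normalized")
    return [v / norm for v in noisy]  # unit length (assumed convention)
```

With a small sigma, exemplars from the same category stay close to their prototype and to each other; with a large sigma, category membership has less influence on the representation.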

Distraction:

B_distract - intercept of curve relating distraction duration to integration rate, i.e. amount of context change with no distraction.
B_distract_slope - slope relating distraction duration to integration rate. Specified as units of B_enc per second of distraction.
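Read literally, the two distraction parameters define a line relating distraction duration to integration rate. The Python sketch below writes that out (the actual code is MATLAB); multiplying the slope by B_enc follows the "units of B_enc per second" description, while clipping the result to the range 0 to 1 and the function name are assumptions.

```python
def distraction_rate(duration, B_enc, B_distract, B_distract_slope):
    """Integration rate for a distraction period of the given duration (s)."""
    if duration < 0:
        raise ValueError("duration must be non-negative")
    rate = B_distract + B_distract_slope * B_enc * duration
    return min(max(rate, 0.0), 1.0)  # keep within valid range (assumed)
```

For instance, with B_enc = 0.5, B_distract = 0.1, and B_distract_slope = 0.2, each second of distraction adds 0.1 to the integration rate on top of the intercept.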
