Detailed code description
- `fit` - specifies settings for a parameter search, including an experiment, code to use to simulate that experiment, a type of model, parameter ranges, and summary stats to fit. This is used for convenience to define a group of the lower-level settings, such as `model_type`, `experiment`, etc., in a relatively compact form.
- `model_type` - string indicating model features and fixed parameters.
- `features` - set of features of a model that are not conveniently specified in continuously varying parameters.
- `experiment` - string indicating an experiment to fit and the set of summary statistics to include in calculating goodness-of-fit (GOF).
- `f_stats` - function to use for determining summary stats.
- `analyses` - set of analyses to run (must be defined by `f_stats`).
- `w` - vector indicating the relative weighting to give each analysis in calculating GOF.
- `curves` - a given analysis may generate multiple curves with different names (e.g. SPCs for different stimulus categories).
- `param_info` - AKA `gaparam`. Indicates free parameters and the range to search over for each parameter.
- `res_info` - indicates the summary stats to include when calculating GOF. Has one field for each curve.
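As a rough illustration of how `w`, `curves`, and `res_info` relate, the Python sketch below packs named curves into one vector and computes a weighted error. The repository code is MATLAB; the error measure (weighted RMSD), the per-curve weighting, the function names, and the example numbers are all assumptions made for illustration and may not match what `pack_results.m` and `get_weight_vector.m` actually do.

```python
import math

def pack_results(results, curve_names):
    """Flatten the named curves into one vector, in a fixed order."""
    vec = []
    for name in curve_names:
        vec.extend(results[name])
    return vec

def weight_vector(results, curve_names, w):
    """Spread each curve's weight evenly over its points (an assumption)."""
    weights = []
    for name, w_curve in zip(curve_names, w):
        n_points = len(results[name])
        weights.extend([w_curve / n_points] * n_points)
    return weights

def weighted_rmsd(observed, simulated, curve_names, w):
    """Weighted root-mean-square deviation between two results structs."""
    obs = pack_results(observed, curve_names)
    sim = pack_results(simulated, curve_names)
    wts = weight_vector(observed, curve_names, w)
    total = sum(wt * (o - s) ** 2 for wt, o, s in zip(wts, obs, sim))
    return math.sqrt(total / sum(wts))

# Hypothetical curves: a 3-point "spc" and a 2-point "crp".
observed = {"spc": [0.6, 0.4, 0.5], "crp": [0.3, 0.1]}
simulated = {"spc": [0.5, 0.4, 0.5], "crp": [0.3, 0.2]}
gof = weighted_rmsd(observed, simulated, ["spc", "crp"], w=[1.0, 1.0])
```

Dividing each weight by the number of points keeps a long curve from dominating a short one; whether the real code normalizes this way is not stated here.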
Given a parameter struct and a data structure, the code can run a simulation to generate simulated data, optionally running many replications of the experiment to ensure stability. `run_catFR_dist.m` does a lot of special preparation before calling `simulate_fr_gpu.m` to simulate each list.

A number of functions in the `gpu` directory are used. This is not because I'm actually using a GPU to process things (performance benefits were small when I tried this out), but because I did some simplification when I wrote the GPU versions of some functions. On the params struct, the `gpufun` field should be set to 1, but `gpu` is set to 0, indicating that arrays should not actually be transferred to the GPU.
See the Parameters section for a list of most of the actively used parameters.
- `run_catFR_dist.m` - simulates catFR or a catFR-like experiment. Can have state persist over multiple lists, or not (no persistence in most recent simulations).
- `run_session_pre` - iterates over lists within a session, optionally simulating the N lists leading up to that list so that model state reflects some history. Can use `prev_env` to keep the prototypes persistent between lists.
- `run_session` - runs one session (if the model is not reset between lists) or just one list (otherwise).
- `study_update.m` - sets the rate of context updating, including distraction.
- `study_lrate.m` - sets the learning rate schedule, including the primacy gradient.
- `calculate_session_patterns.m` - determines how many patterns of context are necessary to run a list.
- `distrib_units_catFR.m` - sets a number of parameters that determine what units are necessary for representing different categories, etc.
- `create_cfr_distrib_patterns.m` - creates distributed patterns associated with different categories, exemplars, and distractors.
- `create_ortogonal_patterns.m` - sets the units that will be used in the feature layer for different items. Output is altered to reflect the different item and context representations.
- `create_new_cfr_distrib_patterns.m` - given saved prototypes from a previous trial, generates new exemplars. This was a hack to allow for persistent prototypes without changing the `run_catFR_dist.m` code too much.
- `simulate_fr_gpu.m` - simulates one or more lists.
- `save_sim_session.m` - stores results from simulating a session or list.
These functions haven't been customized too much for CFR. The main change is that the weights get set at the start of each list to reflect the category structure of the list.
- `simulate_fr_gpu.m` - runs one or more lists and returns simulated data, optionally including the state of context at each step.
- `present_distraction_gpu.m` - applies distraction to change context.
- `present_item_localist.m` - presents an item, with some assumptions that make execution faster.
- `context_update_gpu.m` - updates context, optionally using GPU execution.
- `weight_update_localist.m` - updates associations by just adjusting the relevant column/row (this is faster in some cases).
- `fr_task_gpu.m` - simulates free recall for one list.
- `recall_item_gpu.m` - simulates one recall attempt.
- `decision_accum.m` - given context support, runs a recall competition.
- `reactivate_item_localist.m` - if an item was recalled, reactivates it. Makes some assumptions to make execution faster.
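For orientation, the context update at the heart of these functions can be sketched as follows. This is a Python illustration of the standard context-drift rule used in retrieved-context models (new context is a blend of the old context and the input, rescaled to stay unit length); it has not been checked against `context_update_gpu.m`, and the function names and example vectors are assumptions.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def update_context(c, c_in, B):
    """Blend input c_in into unit-length context c at rate B (0 to 1)."""
    c_in = normalize(c_in)
    d = dot(c, c_in)
    # rho is chosen so that the updated context still has unit length.
    rho = math.sqrt(1 + B ** 2 * (d ** 2 - 1)) - B * d
    return [rho * x + B * y for x, y in zip(c, c_in)]

# Hypothetical three-unit context, updated by an orthogonal input.
c_old = [1.0, 0.0, 0.0]
c_new = update_context(c_old, [0.0, 1.0, 0.0], B=0.6)
```

With an orthogonal input and `B=0.6`, the old context is scaled by 0.8 and the input enters with weight 0.6, so the result still has length 1.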
Given all the simulation code, which takes parameters and an experiment definition and generates simulated data, we want to optimize parameters to maximize the fit to the data on specific summary statistics. Initially, this was done using a genetic algorithm (GA), so functions often have `ga` in the name. Now I am using a type of differential evolution (DE).
- `submit_ga_catFR.m` - submits jobs to run a search.
  - `de_dce-test` (50 individuals, 1 replication), to make sure code is working
  - `de_dce-grid` (5000 individuals, 15 replications), one generation
  - `de_dce-standard` (50 individuals, 20 replications), until converged
- `run_ga_catFR.m` - runs the actual search.
- `get_model_features.m` - sets many options to create different model variants; most options aren't used anymore, so there is a lot of complexity here. Unpacks the `model_type` string to determine the settings.
- `ga_param_catFR_cmr.m` - determines the range to search for each parameter, for each model variant/search. Has a huge library of different model variants. See the README for information about what the components of the `model_type` string mean; `get_model_features.m` also has some details.
- `get_fsim_catFR.m` - determines the function to use for simulating a given experiment and model type.
- `run_catFR_dist.m` - runs catFR simulations. See above for details.
- `get_analyses_catFR.m` - gets information about how to evaluate a model for a given experiment, including a path to data, a function that calculates multiple stats, and a cell array of strings specifying which stats to use.
- `stats_catFR.m` - calculates a number of summary statistics and makes the corresponding plots. Returns results in struct format.
- `expand_analyses_catFR.m` - splits up some analyses, mainly by category, to get the full set to expect in the results structure.
- `extract_results_info.m` - given a results struct (e.g. based on the observed data), determines how to pack new results to match that format.
- `pack_results.m` - converts a results structure into a results vector for calculating GOF.
- `get_weight_vector.m` - determines point weighting based on the results info struct.
- `get_param_ranges.m` - grabs parameter ranges from the `param_info` struct.
- `ga_opt_catFR.m` - determines how the search is run. Supports GA and DE searches.
- `eval_catFR.m` - converts a supplied parameter vector into a param structure that can be used to run a simulation; generates simulated data; calls the stats function to get a results struct; packs results into a vector that matches up with the actual data; evaluates model fitness.
- `param_conversion_catFR.m` - converts a parameter vector into structure format, then sets fixed parameters and sets any missing parameters.
- `complete_param_catFR_nosrc.m` - given a param struct and model features, sets fixed parameters (based on model type) and missing parameters.
- `fixed_param_cfr.m` - sets fixed parameters based on model type.
- `run_de_dce.m` - using a MATLAB session on a head node, submits jobs to evaluate individuals in each of a number of generations. Submits a set of jobs, waits, loads them, mutates to get the next generation, then repeats until the desired number of generations is run or until the user terminates the search.
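The evaluate, mutate, repeat loop described for `run_de_dce.m` can be sketched in a few lines. The Python below is a serial toy version of one common differential evolution variant (DE/rand/1/bin); the variant, the `F` and `CR` values, the clipping to the search range, and the toy fitness function are all assumptions, and the real code distributes the evaluations as cluster jobs rather than running them in a loop.

```python
import random

def de_search(fitness, ranges, n_individuals=20, n_generations=50,
              F=0.8, CR=0.9, seed=0):
    """Minimize fitness(param_vector) with a basic DE/rand/1/bin scheme."""
    rng = random.Random(seed)
    n_param = len(ranges)
    pop = [[rng.uniform(lo, hi) for lo, hi in ranges]
           for _ in range(n_individuals)]
    fit = [fitness(ind) for ind in pop]  # on a cluster: one job per individual
    for _ in range(n_generations):
        new_pop, new_fit = [], []
        for i in range(n_individuals):
            others = [j for j in range(n_individuals) if j != i]
            a, b, c = rng.sample(others, 3)
            forced = rng.randrange(n_param)  # always mutate at least one parameter
            trial = []
            for k, (lo, hi) in enumerate(ranges):
                if k == forced or rng.random() < CR:
                    value = pop[a][k] + F * (pop[b][k] - pop[c][k])
                    value = min(max(value, lo), hi)  # clip to the search range
                else:
                    value = pop[i][k]
                trial.append(value)
            trial_fit = fitness(trial)
            # Keep whichever of parent and trial has the lower error.
            if trial_fit <= fit[i]:
                new_pop.append(trial)
                new_fit.append(trial_fit)
            else:
                new_pop.append(pop[i])
                new_fit.append(fit[i])
        pop, fit = new_pop, new_fit
    best = min(range(n_individuals), key=fit.__getitem__)
    return pop[best], fit[best]

def toy_error(p):
    """Hypothetical stand-in for simulate + stats + GOF."""
    return (p[0] - 0.5) ** 2 + (p[1] - 2.0) ** 2

best_param, best_fit = de_search(toy_error, [(0.0, 1.0), (0.0, 5.0)])
```

In the real search the fitness of each individual is noisy, because it comes from a finite number of simulated replications; the toy function here is deterministic.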
Once we have the results of a parameter search, we really just have a fitness value and a set of parameters. To save disk space, we don't save each individual simulation run during the search. In any case, it's best to run a new simulation: because we selected the best-fitting parameter set in a generation, the fitness estimate from the search will be inflated, and there will be some regression to the mean. `submit_best_catFR.m` takes a set of best-fitting parameters and runs a simulation with a large number of replications to make sure the results are stable. It can optionally calculate a set of summary statistics (generally the same ones used in the fit), make plots of results and a table with parameters, and recalculate fitness. This new fitness value will be a better estimate than the original one obtained from the search.
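The selection effect is easy to demonstrate with a toy example. In the Python sketch below, every individual has the same true error, so any difference between them is pure simulation noise; picking the best one still yields an optimistic score, while a fresh evaluation with many replications lands close to the truth. The noise model and all numbers are assumptions chosen for illustration.

```python
import random
import statistics

def noisy_fitness(true_error, n_rep, rng, noise_sd=1.0):
    """Mean error over n_rep simulated replications of the experiment."""
    return statistics.fmean(rng.gauss(true_error, noise_sd)
                            for _ in range(n_rep))

rng = random.Random(1)
true_error = 10.0

# During the search: 50 individuals, few replications each, keep the best.
search_scores = [noisy_fitness(true_error, 5, rng) for _ in range(50)]
selected = min(search_scores)

# Afterwards: re-evaluate the selected parameters with many replications.
recheck = noisy_fitness(true_error, 2000, rng)
```

Here `selected` comes out below the true error only because it was the luckiest of 50 noisy evaluations, which is why the re-evaluated fitness is the one to report.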
- `submit_best_catFR.m` - creates a call to `sim_catFR_best_params.m` to run a simulation using best-fitting parameters and calculate summary statistics, optionally with some manual changes to the model.
- `sim_catFR_best_params.m` - loads search results and runs a simulation.
- `load_best_params_catFR.m` - loads the best parameters from a search results file and converts them to struct format.
- `param_conversion_catFR.m` - converts parameter vectors to struct format.
- `complete_param_catFR_nosrc.m` - sets parameter defaults and handles various conversions to map raw parameters into the values needed to run simulations. It is very complicated, so make sure to check that the parameters you get out match what you would expect based on the vector.
- `param_latex_catFR.m` - sets LaTeX code for model parameters, for generating tables.
- `write_param_table.m` - writes best-fitting parameters to a table.
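The vector-to-struct conversion and the recommended sanity check can be sketched as below. This Python version uses a dict in place of a MATLAB struct; the function name, the order of operations (free parameters, then fixed, then defaults for anything still missing), and the example values are assumptions, and the real `complete_param_catFR_nosrc.m` performs further conversions not shown here.

```python
def vector_to_param(vector, names, fixed=None, defaults=None):
    """Map a parameter vector onto named fields, then fill in the rest."""
    if len(vector) != len(names):
        raise ValueError("vector and names must have the same length")
    param = dict(zip(names, vector))
    param.update(fixed or {})           # fixed values set by the model type
    for name, value in (defaults or {}).items():
        param.setdefault(name, value)   # only fill parameters still missing
    return param

# Hypothetical search with two free parameters.
names = ["B_enc", "B_rec"]
vector = [0.6, 0.5]
param = vector_to_param(vector, names,
                        fixed={"p_scale": 0.0},
                        defaults={"p_decay": 1.0, "B_enc": 0.9})

# Sanity check: every free parameter should come back unchanged.
for name, value in zip(names, vector):
    assert param[name] == value, f"{name} was altered during conversion"
```

A check like the final loop is cheap to run after loading any set of best-fitting parameters.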
After the group fits, I allowed some parameters to vary between individual participants.
- `submit_indiv_ga_catFR.m` - submits searches to fit individual subject behavior.
  - `de-standard` - (50 individuals, 10 replications), until converged
  - `de-standard` - (50 individuals, 20 replications), until converged
- `run_indiv_ga_catFR.m` - runs the actual searches, with one task per subject.
- `get_subj_data_files.m` - gets a list of data files for all subjects.
- `run_de_serial.m` - similar to `run_de_dce.m`, but runs everything within one job/task. Unlike `de_search.m`, it still saves out results on each generation, so it can be restarted if interrupted.
Once we have best-fitting parameters for each subject, we need to look at summary stats.

- `submit_indiv_best_catFR.m`
- `run_indiv_best_catFR.m`
To develop neural predictions, we can look at how the simulated state of context changes over time.

- `run_cmr_cat_context.m` - calculates statistics and creates plots designed to match the train position plots from the scalp EEG data.
- `cmr_item_sim.m` - calculates pairwise similarity between simulated states of context within a list.
- `cmr_cat_context.m` - calculates a measure of category strength designed to be analogous to classifier evidence, by comparing each item to states of context corresponding to items of that category, relative to items from other categories.
- `cmr_cat_context_trainpos.m` - calculates category strength as a function of category and train position.
- `cmr_integration_rate.m` - given raw simulated data, calculates category integration rate.
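One way to picture the similarity and category-strength measures is the Python sketch below. It uses cosine similarity and defines category strength as mean similarity to same-category states minus mean similarity to other-category states; both choices, along with the function names and the example data, are assumptions and may differ from what `cmr_item_sim.m` and `cmr_cat_context.m` compute.

```python
import math
import statistics

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def pairwise_similarity(states):
    """Similarity between every pair of context states in a list."""
    return [[cosine(a, b) for b in states] for a in states]

def category_strength(states, categories):
    """Same-category minus other-category mean similarity, per item.

    Assumes every item has at least one same-category and one
    other-category partner in the list.
    """
    sim = pairwise_similarity(states)
    strength = []
    for i, cat in enumerate(categories):
        same = [sim[i][j] for j, c in enumerate(categories)
                if j != i and c == cat]
        other = [sim[i][j] for j, c in enumerate(categories) if c != cat]
        strength.append(statistics.fmean(same) - statistics.fmean(other))
    return strength

# Hypothetical two-unit context states for a four-item list.
states = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0], [0.6, 0.8]]
categories = ["cel", "cel", "loc", "loc"]
strength = category_strength(states, categories)
```

In this toy list, the second item sits closer to the other category's states than the first does, so its category strength is much lower.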
- `submit_hand_fits_catFR.m` - defines a parameter sweep and creates a set of jobs to run a simulation at each point.
- `run_hand_fits_catFR.m`
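A parameter sweep of this kind amounts to taking every combination of the values being varied and creating one job per combination. The Python sketch below shows the idea with dicts; the function name, the full-grid design, and the parameter values are assumptions for illustration.

```python
import itertools

def make_sweep(base_param, grid):
    """Return one parameter set per point on the grid."""
    names = list(grid)
    jobs = []
    for values in itertools.product(*(grid[name] for name in names)):
        param = dict(base_param)          # start from the baseline
        param.update(zip(names, values))  # override the swept parameters
        jobs.append(param)
    return jobs

# Hypothetical baseline and a 3 x 2 grid over two parameters.
base = {"B_enc": 0.6, "B_rec": 0.5, "p_scale": 1.0}
grid = {"B_enc": [0.4, 0.6, 0.8], "B_rec": [0.3, 0.7]}
jobs = make_sweep(base, grid)
```

Each entry in `jobs` would then be handed to a simulation run; parameters not named in the grid keep their baseline values.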
Core parameters and aliases:
- `B_enc`, `B_enc_temp` - context change rate during encoding.
- `B_rec`, `B_rec_temp` - context change rate during recall.
- `sigma_cel`, `sigma_loc`, `sigma_obj` - amount of normally distributed noise to add to each category prototype to generate exemplars from that category.
- `gamma_fc`, `lrate_fc_enc` - learning rate for episodic item-context associations.
- `gamma_cf`, `lrate_cf_enc` - learning rate for episodic context-item associations.
- `eye_fc` - strength of pre-experimental item-context associations.
- `eye_cf` - strength of pre-experimental context-item associations.
- `p_scale` - scale of primacy gradient.
- `p_decay` - decay rate of primacy gradient.
- `K` - leakage of recall competition.
- `L` - strength of lateral inhibition.
- `eta` - noise in recall competition.
- `tau` - time constant.
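To make `p_scale` and `p_decay` concrete, the Python sketch below computes a primacy gradient using the exponential form common in retrieved-context models: a boost that starts at `p_scale` for the first item and decays toward a baseline of 1. That functional form is an assumption here and has not been checked against `study_lrate.m`; the example values are arbitrary.

```python
import math

def primacy_gradient(list_length, p_scale, p_decay):
    """Learning-rate multiplier for each serial position (first item = index 0)."""
    return [p_scale * math.exp(-p_decay * i) + 1.0
            for i in range(list_length)]

# Hypothetical 5-item list.
lrate = primacy_gradient(5, p_scale=2.0, p_decay=1.0)
```

Larger `p_scale` strengthens learning for early items, and larger `p_decay` confines that advantage to fewer serial positions.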
Context representation details:
- `sigma` - cell array of sigmas for each category.
- `sigma_labels` - cell array of labels mapping category to the correct sigma to use.
- `n_subregions` - subregions of the context representations to use (for different categories).
- `subregions` - not to be confused with `n_subregions`; specifies the number of regions representing different aspects of the stimulus (e.g. item vs. encoding task).
- `n_subregion_dimensions` - number of dimensions per category.
- `n_category_dimensions` - number of category-specific units (i.e. dimensions pooled over all categories, but not including distractor units).
- `n_item_dimensions` - number of item-specific units (not used in current version).
- `orthog_cat` - if true, there are different units for each category.
- `orthog_distract` - true if distractors are orthogonal to the item representations.
- `index_type` - 'session' or 'subject' to indicate when category prototypes should be reset.
- `fix_sess_proto` - if true, keep the same category prototypes throughout a session/subject.
- `distract_proto` - if true, distractors are derived from a prototype.
- `first_distract` - indicates what to set context to before the list. Currently uses 'all_proto', which is an average of all category prototypes.
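The role of the sigma parameters can be illustrated with a small Python sketch that generates an exemplar by adding normally distributed noise to a category prototype. Rescaling the result to unit length, the function name, and the example values are assumptions; the actual pattern-generation code may treat the noise and normalization differently.

```python
import math
import random

def make_exemplar(prototype, sigma, rng):
    """Add Gaussian noise (sd = sigma) to a prototype, then rescale to unit length."""
    noisy = [x + rng.gauss(0.0, sigma) for x in prototype]
    norm = math.sqrt(sum(x * x for x in noisy))
    return [x / norm for x in noisy]

rng = random.Random(0)
prototype = [0.5, 0.5, 0.5, 0.5]  # hypothetical unit-length category prototype

tight = make_exemplar(prototype, sigma=0.05, rng=rng)  # close to the prototype
loose = make_exemplar(prototype, sigma=1.0, rng=rng)   # far more variable
```

A small sigma gives exemplars that are nearly identical to the prototype (a tightly clustered category), while a large sigma gives exemplars that share little with it.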
Distraction:
- `B_distract` - intercept of curve relating distraction duration to integration rate, i.e. amount of context change with no distraction.
- `B_distract_slope` - slope relating distraction duration to integration rate. Specified in units of `B_enc` per second of distraction.
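One possible reading of these two parameters is a linear relationship between distraction duration and context change, with the slope expressed as a multiple of `B_enc`. The Python sketch below follows that reading; the linear form, the multiplication by `B_enc`, the clipping to the range 0 to 1, and the example values are all assumptions that should be checked against `study_update.m`.

```python
def distraction_rate(duration, B_distract, B_distract_slope, B_enc):
    """Context integration rate for a distraction period of `duration` seconds."""
    rate = B_distract + B_distract_slope * B_enc * duration
    return min(max(rate, 0.0), 1.0)  # keep the rate a valid proportion

# Hypothetical values: 8 s of distraction.
rate_none = distraction_rate(0.0, B_distract=0.1, B_distract_slope=0.05, B_enc=0.6)
rate_8s = distraction_rate(8.0, B_distract=0.1, B_distract_slope=0.05, B_enc=0.6)
```

With no distraction the rate equals the intercept `B_distract`; under this reading, each additional second adds `B_distract_slope * B_enc`.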