From b6e11b9349074ecebf0ad3f65b2990c9df584714 Mon Sep 17 00:00:00 2001 From: "David A. Knowles" Date: Mon, 25 Sep 2017 19:35:28 -0700 Subject: [PATCH] minor doc changes --- docs/articles/Usage.html | 2 +- docs/articles/Visualization.html | 42 +- docs/reference/cluster_results_table.html | 4 +- .../dirichlet_multinomial_anova_mc.html | 8 +- docs/reference/dirichlet_multinomial_glm.html | 2 +- docs/reference/leaf_cutter_effect_sizes.html | 2 +- .../man/dirichlet_multinomial_anova_mc.Rd | 4 +- leafcutter/src/include/models.hpp | 436 +++++++++--------- leafcutter/vignettes/Usage.Rmd | 2 +- leafcutter/vignettes/Visualization.Rmd | 50 +- leafviz/prepare_results.R | 23 +- 11 files changed, 300 insertions(+), 275 deletions(-) diff --git a/docs/articles/Usage.html b/docs/articles/Usage.html index 916baa3..6a94a48 100644 --- a/docs/articles/Usage.html +++ b/docs/articles/Usage.html @@ -154,7 +154,7 @@

RNA.NA06986_CEU.chr1.bam CEU RNA.NA06989_CEU.chr1.bam CEU RNA.NA06994_CEU.chr1.bam CEU -

Having made that file we can run DS (this assumes you have successfully installed the leafcutter R package as described under Installation above)

+

Having made that file we can run DS (this assumes you have successfully installed the leafcutter R package as described under Installation above)

../scripts/leafcutter_ds.R --num_threads 4 --exons_file=../leafcutter/data/gencode19_exons.txt.gz ../example_data/testYRIvsEU_perind_numers.counts.gz ../example_data/test_diff_intron.txt

Running ../scripts/leafcutter_ds.R -h will give usage info for this script. Here we have included an exons_file: this is optional, but allows LeafCutter to annotate clusters according to which gene they correspond to. An example exon file for hg19 derived from GENCODE v19 is included with LeafCutter. The exons file should have the following columns: chr, start, end, strand, gene_name (a minimal sketch of this format is shown below). We provide a helper script gtf_to_exons.R which will convert a .gtf to the required format. I’m also hosting the exon file for GRCh38 GENCODE v.26 here.
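The exons file is simple enough to generate by hand if needed. Here is a minimal sketch in R (the two DDX11L1 rows are illustrative example coordinates, and whether LeafCutter expects a header row is an assumption here, so prefer gtf_to_exons.R for real use):

```r
# sketch: write a tab-separated, gzipped exons file with the five required columns
exons <- data.frame(
  chr       = c("chr1", "chr1"),
  start     = c(11869, 12613),
  end       = c(12227, 12721),
  strand    = c("+", "+"),
  gene_name = c("DDX11L1", "DDX11L1")
)
write.table(exons, gzfile("my_exons.txt.gz"),
            sep = "\t", quote = FALSE, row.names = FALSE)
```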

Two tab-separated text files are output:

diff --git a/docs/articles/Visualization.html b/docs/articles/Visualization.html index 120b2c6..0453f50 100644 --- a/docs/articles/Visualization.html +++ b/docs/articles/Visualization.html @@ -112,6 +112,7 @@

  • dplyr
  • ggplot2
  • shiny
  • +
  • shinyjs
  • DT
  • intervals
  • reshape2
  • @@ -140,26 +141,35 @@

    Step 1. Prepare the LeafCutter differential splicing results for visualisation

    -

    This step annotates each intron in each cluster at a given false discovery rate.

    -
    Rscript prepare_results.R --iFolder <iFolder> \
    -                         --oFolder <oFolder> \
    -                         --support <support> \
    -                         --annotation_code <annotation_code> \
    -                         --code <code> \
    -                         --FDR <FDR> \                       
    -

    iFolder - the folder that contains the results of the differential intron excision analysis. The script assumes all results share the same path and code.

    -

    oFolder - the folder where the results will be written.

    -

    support - the same support file as used for the previous analysis.

    -

    annotation_code - as before.

    -

    code - the same code used for the rest of the analysis, eg testYRIvsEU

    -

    FDR - the benjamini-hochberg false discovery rate with which to filter the results.

    -

    This will create the Rdata object <oFolder>/results/<code>_results.Rdata. The file ‘prepare_example.sh’ shows how this would be done for the example dataset if you wanted to rebuild ‘Brain_vs_Heart_results.Rdata’.

    +

    This step annotates each intron in each cluster at a given false discovery rate and generates a single .RData file with everything the shiny app needs to run.

    +
    ./prepare_results.R [options] <name>_perind_numers.counts.gz <name>_cluster_significance.txt <name>_effect_sizes.txt annotation_code 
    +

    **_perind_numers.counts.gz** The same counts matrix you gave to leafcutter_ds.R. **_cluster_significance.txt** The cluster significance table output by leafcutter_ds.R. **_effect_sizes.txt** The per-junction effect size table output by leafcutter_ds.R. **annotation_code** will be something like annotation_codes/gencode_hg19/gencode_hg19 (see above).

    +

    Options:

    +
        -o OUTPUT, --output=OUTPUT
    +        The output file that will be created ready for loading by run_leafviz.R [leafviz.RData]
    +
    +    -m META_DATA_FILE, --meta_data_file=META_DATA_FILE
    +        The support file used in the differential splicing analysis. Columns should be file name and condition
    +
    +    -f FDR, --FDR=FDR
    +        the adjusted p value threshold to use [0.05]
    +
    +    -c CODE, --code=CODE
    +        A name for this analysis (will be available in leafviz through the Summary tab). [leafcutter_ds]
    +
    +    -h, --help
    +        Show this help message and exit
    +

    This will create the RData file at the path given by --output. The file ‘prepare_example.sh’ shows how this would be done for the example dataset if you wanted to rebuild ‘Brain_vs_Heart_results.Rdata’.
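    For reference, the --FDR filter corresponds to the standard Benjamini-Hochberg adjustment. A toy illustration in R (the p-values here are made up):

    ```r
    # keep results whose BH-adjusted p-value is below the chosen threshold
    p <- c(1e-04, 0.004, 0.03, 0.2, 0.6)
    keep <- p.adjust(p, method = "BH") < 0.05
    ```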

    +

    As a concrete example, let’s assume you have just run the example at Usage and are currently in the /example_data subdirectory; then running

    +
    ../leafviz/prepare_results.R --meta_data_file test_diff_intron.txt --code leafcutter testYRIvsEU_perind_numers.counts.gz leafcutter_ds_cluster_significance.txt leafcutter_ds_effect_sizes.txt ../leafviz/annotation_codes/gencode_hg19/gencode_hg19
    +

    should create a leafviz.RData file.
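    If you want to sanity-check the output before launching the app, you can inspect it from R. The object names stored in the file are not documented on this page, so this is just a generic inspection sketch:

    ```r
    # list whatever objects prepare_results.R saved into the file
    load("leafviz.RData")
    ls()
    ```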

    Step 2. Visualise the results

    -
    cd leafvis/
    -Rscript run_leafvis.R <oFolder>/results/<code>_results.Rdata
    +

    From the example_data directory, run:

    +
    cd ../leafviz/
    +./run_leafviz.R ../example_data/leafviz.RData

    This will load the RData object and run the LeafCutter Visualisation App in your browser.

    diff --git a/docs/reference/cluster_results_table.html b/docs/reference/cluster_results_table.html index a6f8151..822f970 100644 --- a/docs/reference/cluster_results_table.html +++ b/docs/reference/cluster_results_table.html @@ -90,7 +90,7 @@

    Parse output of differential_splicing

    -

    Parse output of differential_splicing and make a per cluster results table

    +

    Parse output of differential_splicing and make a per-cluster results table

    cluster_results_table(results)
    @@ -100,7 +100,7 @@

    Arguments: results -

    From differential_splicing

    +

    From differential_splicing
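    A minimal usage sketch, assuming ds holds the output of an earlier differential_splicing() call:

    ```r
    library(leafcutter)
    # ds <- differential_splicing(counts, x)   # from a previous step
    cluster_tab <- cluster_results_table(ds)
    head(cluster_tab)
    ```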

    diff --git a/docs/reference/dirichlet_multinomial_anova_mc.html b/docs/reference/dirichlet_multinomial_anova_mc.html index a689d18..8e6c0c1 100644 --- a/docs/reference/dirichlet_multinomial_anova_mc.html +++ b/docs/reference/dirichlet_multinomial_anova_mc.html @@ -146,8 +146,12 @@

    Arguments

    Can be one of "smart", "random". "smart" uses a method of moments estimator to get a reasonable initialization. The seed for "random" can be set through the ... arguments passed to rstan::optimizing.

    - #' -

    @param ... will be passed on the rstan::optimizing, so can be used for example to set the algorithm used (default is LBFGS) or the random seed if random initialization is requested.

    + smart_init_regularizer +

    Used to protect against collinear covariates.

    + + + ... +

    will be passed on to rstan::optimizing, so it can be used, for example, to set the algorithm used (default is LBFGS) or the random seed if random initialization is requested.
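    As a sketch of the ... passthrough (toy data; the design-matrix construction and the samples-by-junctions orientation of y below are assumptions for illustration, not taken from this page):

    ```r
    library(leafcutter)
    set.seed(1)
    N <- 10; K <- 3                      # toy numbers of samples and junctions
    y <- matrix(rpois(N * K, 20), N, K)  # toy junction count matrix (samples x junctions assumed)
    group <- rep(0:1, each = N / 2)
    xFull <- cbind(1, group)             # full design: intercept + condition
    xNull <- matrix(1, N, 1)             # null design: intercept only
    # seed and algorithm are forwarded through ... to rstan::optimizing
    fit <- dirichlet_multinomial_anova_mc(xFull, xNull, y, init = "random",
                                          seed = 42, algorithm = "LBFGS")
    ```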

    diff --git a/docs/reference/dirichlet_multinomial_glm.html b/docs/reference/dirichlet_multinomial_glm.html index 1cfc11a..4960791 100644 --- a/docs/reference/dirichlet_multinomial_glm.html +++ b/docs/reference/dirichlet_multinomial_glm.html @@ -90,7 +90,7 @@

    Dirichlet multinomial GLM (single overdispersion parameter)

    -

    We recommend using dirichlet_multinomial_anova_mc instead.

    +

    We recommend using dirichlet_multinomial_anova_mc instead.

    dirichlet_multinomial_glm(x, y, concShape = 1.0001, concRate = 1e-04)
    diff --git a/docs/reference/leaf_cutter_effect_sizes.html b/docs/reference/leaf_cutter_effect_sizes.html index b8174c7..0ba6a1c 100644 --- a/docs/reference/leaf_cutter_effect_sizes.html +++ b/docs/reference/leaf_cutter_effect_sizes.html @@ -100,7 +100,7 @@

    Arguments: results -

    From differential_splicing

    +

    From differential_splicing
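    Usage follows the same pattern as cluster_results_table; a sketch, with ds again standing in for the output of differential_splicing():

    ```r
    # per-junction effect sizes from a differential_splicing() result
    effect_tab <- leaf_cutter_effect_sizes(ds)
    head(effect_tab)
    ```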

    diff --git a/leafcutter/man/dirichlet_multinomial_anova_mc.Rd b/leafcutter/man/dirichlet_multinomial_anova_mc.Rd index f3183cf..f012120 100644 --- a/leafcutter/man/dirichlet_multinomial_anova_mc.Rd +++ b/leafcutter/man/dirichlet_multinomial_anova_mc.Rd @@ -32,7 +32,9 @@ dirichlet_multinomial_anova_mc(xFull, xNull, y, concShape = 1.0001, \item{init}{Can be one of {"smart", "random"}. smart uses an method of moments estimator to get a reasonable initialization. The seed for "random" can be set through the ... arguments passed to rstan::optimizing.} -\item{#'}{@param ... will be passed on the rstan::optimizing, so can be used for example to set the algorithm used (default is LBFGS) or the random seed if random initialization is requested.} +\item{smart_init_regularizer}{Used to protect against colinear covariates.} + +\item{...}{will be passed on the rstan::optimizing, so can be used for example to set the algorithm used (default is LBFGS) or the random seed if random initialization is requested.} } \description{ Dirichlet multinomial GLM likelihood ratio test for a single cluster diff --git a/leafcutter/src/include/models.hpp b/leafcutter/src/include/models.hpp index 0539dd3..7c6398a 100644 --- a/leafcutter/src/include/models.hpp +++ b/leafcutter/src/include/models.hpp @@ -23,7 +23,7 @@ #include -namespace model_bb_glm_fix_conc_namespace { +namespace model_bb_glm_namespace { using std::istream; using std::string; @@ -42,29 +42,28 @@ static int current_statement_begin__; stan::io::program_reader prog_reader__() { stan::io::program_reader reader; - reader.add_event(0, 0, "start", "model_bb_glm_fix_conc"); - reader.add_event(28, 28, "end", "model_bb_glm_fix_conc"); + reader.add_event(0, 0, "start", "model_bb_glm"); + reader.add_event(28, 28, "end", "model_bb_glm"); return reader; } -class model_bb_glm_fix_conc : public prob_grad { +class model_bb_glm : public prob_grad { private: int N; int P; matrix_d x; vector ys; vector ns; - double conc; double concShape; double concRate; public: - model_bb_glm_fix_conc(stan::io::var_context& context__, + model_bb_glm(stan::io::var_context& context__, std::ostream* pstream__ = 0) : prob_grad(0) { ctor_body(context__, 0, pstream__); } - model_bb_glm_fix_conc(stan::io::var_context& context__, + model_bb_glm(stan::io::var_context& context__, unsigned int random_seed__, std::ostream* pstream__ = 0) : prob_grad(0) { @@ -80,7 +79,7 @@ class model_bb_glm_fix_conc : public prob_grad { current_statement_begin__ = -1; - static const char* function__ = "model_bb_glm_fix_conc_namespace::model_bb_glm_fix_conc"; + static const char* function__ = "model_bb_glm_namespace::model_bb_glm"; (void) function__; // dummy to suppress unused var warning size_t pos__; (void) pos__; // dummy to suppress unused var warning @@ -135,11 +134,6 @@ class model_bb_glm_fix_conc : public prob_grad { for (size_t i_0__ = 0; i_0__ < ns_limit_0__; ++i_0__) { ns[i_0__] = vals_i__[pos__++]; } - context__.validate_dims("data initialization", "conc", "double", context__.to_vec()); - conc = double(0); - vals_r__ = context__.vals_r("conc"); - pos__ = 0; - conc = vals_r__[pos__++]; context__.validate_dims("data initialization", "concShape", "double", context__.to_vec()); concShape = double(0); vals_r__ = context__.vals_r("concShape"); @@ -160,7 +154,6 @@ class model_bb_glm_fix_conc : public prob_grad { for (int k0__ = 0; k0__ < N; ++k0__) { check_greater_or_equal(function__,"ns[k0__]",ns[k0__],0); } - check_greater_or_equal(function__,"conc",conc,0); 
check_greater_or_equal(function__,"concShape",concShape,0); check_greater_or_equal(function__,"concRate",concRate,0); // initialize data variables @@ -177,11 +170,12 @@ class model_bb_glm_fix_conc : public prob_grad { // validate, set parameter ranges num_params_r__ = 0U; param_ranges_i__.clear(); + ++num_params_r__; validate_non_negative_index("beta", "P", P); num_params_r__ += P; } - ~model_bb_glm_fix_conc() { } + ~model_bb_glm() { } void transform_inits(const stan::io::var_context& context__, @@ -194,6 +188,20 @@ class model_bb_glm_fix_conc : public prob_grad { std::vector vals_r__; std::vector vals_i__; + if (!(context__.contains_r("conc"))) + throw std::runtime_error("variable conc missing"); + vals_r__ = context__.vals_r("conc"); + pos__ = 0U; + context__.validate_dims("initialization", "conc", "double", context__.to_vec()); + // generate_declaration conc + double conc(0); + conc = vals_r__[pos__++]; + try { + writer__.scalar_lb_unconstrain(0,conc); + } catch (const std::exception& e) { + throw std::runtime_error(std::string("Error transforming variable conc: ") + e.what()); + } + if (!(context__.contains_r("beta"))) throw std::runtime_error("variable beta missing"); vals_r__ = context__.vals_r("beta"); @@ -240,6 +248,13 @@ class model_bb_glm_fix_conc : public prob_grad { // model parameters stan::io::reader in__(params_r__,params_i__); + T__ conc; + (void) conc; // dummy to suppress unused var warning + if (jacobian__) + conc = in__.scalar_lb_constrain(0,lp__); + else + conc = in__.scalar_lb_constrain(0); + Eigen::Matrix beta; (void) beta; // dummy to suppress unused var warning if (jacobian__) @@ -321,6 +336,7 @@ class model_bb_glm_fix_conc : public prob_grad { void get_param_names(std::vector& names__) const { names__.resize(0); + names__.push_back("conc"); names__.push_back("beta"); } @@ -329,6 +345,8 @@ class model_bb_glm_fix_conc : public prob_grad { dimss__.resize(0); std::vector dims__; dims__.resize(0); + dimss__.push_back(dims__); + dims__.resize(0); dims__.push_back(P); dimss__.push_back(dims__); } @@ -343,10 +361,12 @@ class model_bb_glm_fix_conc : public prob_grad { std::ostream* pstream__ = 0) const { vars__.resize(0); stan::io::reader in__(params_r__,params_i__); - static const char* function__ = "model_bb_glm_fix_conc_namespace::write_array"; + static const char* function__ = "model_bb_glm_namespace::write_array"; (void) function__; // dummy to suppress unused var warning // read-transform, write parameters + double conc = in__.scalar_lb_constrain(0); vector_d beta = in__.vector_constrain(P); + vars__.push_back(conc); for (int k_0__ = 0; k_0__ < P; ++k_0__) { vars__.push_back(beta[k_0__]); } @@ -408,7 +428,7 @@ class model_bb_glm_fix_conc : public prob_grad { } static std::string model_name() { - return "model_bb_glm_fix_conc"; + return "model_bb_glm"; } @@ -416,6 +436,9 @@ class model_bb_glm_fix_conc : public prob_grad { bool include_tparams__ = true, bool include_gqs__ = true) const { std::stringstream param_name_stream__; + param_name_stream__.str(std::string()); + param_name_stream__ << "conc"; + param_names__.push_back(param_name_stream__.str()); for (int k_0__ = 1; k_0__ <= P; ++k_0__) { param_name_stream__.str(std::string()); param_name_stream__ << "beta" << '.' 
<< k_0__; @@ -432,6 +455,9 @@ class model_bb_glm_fix_conc : public prob_grad { bool include_tparams__ = true, bool include_gqs__ = true) const { std::stringstream param_name_stream__; + param_name_stream__.str(std::string()); + param_name_stream__ << "conc"; + param_names__.push_back(param_name_stream__.str()); for (int k_0__ = 1; k_0__ <= P; ++k_0__) { param_name_stream__.str(std::string()); param_name_stream__ << "beta" << '.' << k_0__; @@ -454,7 +480,7 @@ class model_bb_glm_fix_conc : public prob_grad { #include -namespace model_bb_glm_namespace { +namespace model_bb_glm_fix_conc_namespace { using std::istream; using std::string; @@ -473,28 +499,29 @@ static int current_statement_begin__; stan::io::program_reader prog_reader__() { stan::io::program_reader reader; - reader.add_event(0, 0, "start", "model_bb_glm"); - reader.add_event(28, 28, "end", "model_bb_glm"); + reader.add_event(0, 0, "start", "model_bb_glm_fix_conc"); + reader.add_event(28, 28, "end", "model_bb_glm_fix_conc"); return reader; } -class model_bb_glm : public prob_grad { +class model_bb_glm_fix_conc : public prob_grad { private: int N; int P; matrix_d x; vector ys; vector ns; + double conc; double concShape; double concRate; public: - model_bb_glm(stan::io::var_context& context__, + model_bb_glm_fix_conc(stan::io::var_context& context__, std::ostream* pstream__ = 0) : prob_grad(0) { ctor_body(context__, 0, pstream__); } - model_bb_glm(stan::io::var_context& context__, + model_bb_glm_fix_conc(stan::io::var_context& context__, unsigned int random_seed__, std::ostream* pstream__ = 0) : prob_grad(0) { @@ -510,7 +537,7 @@ class model_bb_glm : public prob_grad { current_statement_begin__ = -1; - static const char* function__ = "model_bb_glm_namespace::model_bb_glm"; + static const char* function__ = "model_bb_glm_fix_conc_namespace::model_bb_glm_fix_conc"; (void) function__; // dummy to suppress unused var warning size_t pos__; (void) pos__; // dummy to suppress unused var warning @@ -565,6 +592,11 @@ class model_bb_glm : public prob_grad { for (size_t i_0__ = 0; i_0__ < ns_limit_0__; ++i_0__) { ns[i_0__] = vals_i__[pos__++]; } + context__.validate_dims("data initialization", "conc", "double", context__.to_vec()); + conc = double(0); + vals_r__ = context__.vals_r("conc"); + pos__ = 0; + conc = vals_r__[pos__++]; context__.validate_dims("data initialization", "concShape", "double", context__.to_vec()); concShape = double(0); vals_r__ = context__.vals_r("concShape"); @@ -585,6 +617,7 @@ class model_bb_glm : public prob_grad { for (int k0__ = 0; k0__ < N; ++k0__) { check_greater_or_equal(function__,"ns[k0__]",ns[k0__],0); } + check_greater_or_equal(function__,"conc",conc,0); check_greater_or_equal(function__,"concShape",concShape,0); check_greater_or_equal(function__,"concRate",concRate,0); // initialize data variables @@ -601,12 +634,11 @@ class model_bb_glm : public prob_grad { // validate, set parameter ranges num_params_r__ = 0U; param_ranges_i__.clear(); - ++num_params_r__; validate_non_negative_index("beta", "P", P); num_params_r__ += P; } - ~model_bb_glm() { } + ~model_bb_glm_fix_conc() { } void transform_inits(const stan::io::var_context& context__, @@ -619,20 +651,6 @@ class model_bb_glm : public prob_grad { std::vector vals_r__; std::vector vals_i__; - if (!(context__.contains_r("conc"))) - throw std::runtime_error("variable conc missing"); - vals_r__ = context__.vals_r("conc"); - pos__ = 0U; - context__.validate_dims("initialization", "conc", "double", context__.to_vec()); - // generate_declaration conc - double 
conc(0); - conc = vals_r__[pos__++]; - try { - writer__.scalar_lb_unconstrain(0,conc); - } catch (const std::exception& e) { - throw std::runtime_error(std::string("Error transforming variable conc: ") + e.what()); - } - if (!(context__.contains_r("beta"))) throw std::runtime_error("variable beta missing"); vals_r__ = context__.vals_r("beta"); @@ -679,13 +697,6 @@ class model_bb_glm : public prob_grad { // model parameters stan::io::reader in__(params_r__,params_i__); - T__ conc; - (void) conc; // dummy to suppress unused var warning - if (jacobian__) - conc = in__.scalar_lb_constrain(0,lp__); - else - conc = in__.scalar_lb_constrain(0); - Eigen::Matrix beta; (void) beta; // dummy to suppress unused var warning if (jacobian__) @@ -767,7 +778,6 @@ class model_bb_glm : public prob_grad { void get_param_names(std::vector& names__) const { names__.resize(0); - names__.push_back("conc"); names__.push_back("beta"); } @@ -776,8 +786,6 @@ class model_bb_glm : public prob_grad { dimss__.resize(0); std::vector dims__; dims__.resize(0); - dimss__.push_back(dims__); - dims__.resize(0); dims__.push_back(P); dimss__.push_back(dims__); } @@ -792,12 +800,10 @@ class model_bb_glm : public prob_grad { std::ostream* pstream__ = 0) const { vars__.resize(0); stan::io::reader in__(params_r__,params_i__); - static const char* function__ = "model_bb_glm_namespace::write_array"; + static const char* function__ = "model_bb_glm_fix_conc_namespace::write_array"; (void) function__; // dummy to suppress unused var warning // read-transform, write parameters - double conc = in__.scalar_lb_constrain(0); vector_d beta = in__.vector_constrain(P); - vars__.push_back(conc); for (int k_0__ = 0; k_0__ < P; ++k_0__) { vars__.push_back(beta[k_0__]); } @@ -859,7 +865,7 @@ class model_bb_glm : public prob_grad { } static std::string model_name() { - return "model_bb_glm"; + return "model_bb_glm_fix_conc"; } @@ -867,9 +873,6 @@ class model_bb_glm : public prob_grad { bool include_tparams__ = true, bool include_gqs__ = true) const { std::stringstream param_name_stream__; - param_name_stream__.str(std::string()); - param_name_stream__ << "conc"; - param_names__.push_back(param_name_stream__.str()); for (int k_0__ = 1; k_0__ <= P; ++k_0__) { param_name_stream__.str(std::string()); param_name_stream__ << "beta" << '.' << k_0__; @@ -886,9 +889,6 @@ class model_bb_glm : public prob_grad { bool include_tparams__ = true, bool include_gqs__ = true) const { std::stringstream param_name_stream__; - param_name_stream__.str(std::string()); - param_name_stream__ << "conc"; - param_names__.push_back(param_name_stream__.str()); for (int k_0__ = 1; k_0__ <= P; ++k_0__) { param_name_stream__.str(std::string()); param_name_stream__ << "beta" << '.' 
<< k_0__; @@ -911,7 +911,7 @@ class model_bb_glm : public prob_grad { #include -namespace model_dm_glm_multi_conc_namespace { +namespace model_dm_glm_namespace { using std::istream; using std::string; @@ -930,12 +930,12 @@ static int current_statement_begin__; stan::io::program_reader prog_reader__() { stan::io::program_reader reader; - reader.add_event(0, 0, "start", "model_dm_glm_multi_conc"); - reader.add_event(44, 44, "end", "model_dm_glm_multi_conc"); + reader.add_event(0, 0, "start", "model_dm_glm"); + reader.add_event(41, 41, "end", "model_dm_glm"); return reader; } -class model_dm_glm_multi_conc : public prob_grad { +class model_dm_glm : public prob_grad { private: int N; int P; @@ -945,13 +945,13 @@ class model_dm_glm_multi_conc : public prob_grad { double concShape; double concRate; public: - model_dm_glm_multi_conc(stan::io::var_context& context__, + model_dm_glm(stan::io::var_context& context__, std::ostream* pstream__ = 0) : prob_grad(0) { ctor_body(context__, 0, pstream__); } - model_dm_glm_multi_conc(stan::io::var_context& context__, + model_dm_glm(stan::io::var_context& context__, unsigned int random_seed__, std::ostream* pstream__ = 0) : prob_grad(0) { @@ -967,7 +967,7 @@ class model_dm_glm_multi_conc : public prob_grad { current_statement_begin__ = -1; - static const char* function__ = "model_dm_glm_multi_conc_namespace::model_dm_glm_multi_conc"; + static const char* function__ = "model_dm_glm_namespace::model_dm_glm"; (void) function__; // dummy to suppress unused var warning size_t pos__; (void) pos__; // dummy to suppress unused var warning @@ -1058,11 +1058,10 @@ class model_dm_glm_multi_conc : public prob_grad { num_params_r__ += (K - 1) * P; validate_non_negative_index("beta_scale", "P", P); num_params_r__ += P; - validate_non_negative_index("conc", "K", K); - num_params_r__ += K; + ++num_params_r__; } - ~model_dm_glm_multi_conc() { } + ~model_dm_glm() { } void transform_inits(const stan::io::var_context& context__, @@ -1115,15 +1114,12 @@ class model_dm_glm_multi_conc : public prob_grad { throw std::runtime_error("variable conc missing"); vals_r__ = context__.vals_r("conc"); pos__ = 0U; - validate_non_negative_index("conc", "K", K); - context__.validate_dims("initialization", "conc", "double", context__.to_vec(K)); + context__.validate_dims("initialization", "conc", "double", context__.to_vec()); // generate_declaration conc - std::vector conc(K,double(0)); - for (int i0__ = 0U; i0__ < K; ++i0__) - conc[i0__] = vals_r__[pos__++]; - for (int i0__ = 0U; i0__ < K; ++i0__) - try { - writer__.scalar_lb_unconstrain(0,conc[i0__]); + double conc(0); + conc = vals_r__[pos__++]; + try { + writer__.scalar_lb_unconstrain(0,conc); } catch (const std::exception& e) { throw std::runtime_error(std::string("Error transforming variable conc: ") + e.what()); } @@ -1178,15 +1174,12 @@ class model_dm_glm_multi_conc : public prob_grad { beta_scale.push_back(in__.scalar_constrain()); } - vector conc; - size_t dim_conc_0__ = K; - conc.reserve(dim_conc_0__); - for (size_t k_0__ = 0; k_0__ < dim_conc_0__; ++k_0__) { - if (jacobian__) - conc.push_back(in__.scalar_lb_constrain(0,lp__)); - else - conc.push_back(in__.scalar_lb_constrain(0)); - } + T__ conc; + (void) conc; // dummy to suppress unused var warning + if (jacobian__) + conc = in__.scalar_lb_constrain(0,lp__); + else + conc = in__.scalar_lb_constrain(0); // transformed parameters @@ -1253,18 +1246,9 @@ class model_dm_glm_multi_conc : public prob_grad { stan::math::initialize(lGaA, DUMMY_VAR__); stan::math::fill(lGaA,DUMMY_VAR__); - 
validate_non_negative_index("s", "K", K); - Eigen::Matrix s(static_cast(K)); - (void) s; // dummy to suppress unused var warning - - stan::math::initialize(s, DUMMY_VAR__); - stan::math::fill(s,DUMMY_VAR__); - stan::math::assign(s, softmax(multiply(beta,get_base1(x,n,"x",1)))); - for (int k = 1; k <= K; ++k) { - stan::math::assign(get_base1_lhs(a,k,"a",1), (get_base1(conc,k,"conc",1) * get_base1(s,k,"s",1))); - } + stan::math::assign(a, multiply(conc,softmax(multiply(beta,get_base1(x,n,"x",1))))); stan::math::assign(suma, sum(a)); stan::math::assign(aPlusY, add(a,get_base1(y,n,"y",1))); for (int k = 1; k <= K; ++k) { @@ -1318,7 +1302,6 @@ class model_dm_glm_multi_conc : public prob_grad { dims__.push_back(P); dimss__.push_back(dims__); dims__.resize(0); - dims__.push_back(K); dimss__.push_back(dims__); } @@ -1332,7 +1315,7 @@ class model_dm_glm_multi_conc : public prob_grad { std::ostream* pstream__ = 0) const { vars__.resize(0); stan::io::reader in__(params_r__,params_i__); - static const char* function__ = "model_dm_glm_multi_conc_namespace::write_array"; + static const char* function__ = "model_dm_glm_namespace::write_array"; (void) function__; // dummy to suppress unused var warning // read-transform, write parameters vector beta_raw; @@ -1345,11 +1328,7 @@ class model_dm_glm_multi_conc : public prob_grad { for (size_t k_0__ = 0; k_0__ < dim_beta_scale_0__; ++k_0__) { beta_scale.push_back(in__.scalar_constrain()); } - vector conc; - size_t dim_conc_0__ = K; - for (size_t k_0__ = 0; k_0__ < dim_conc_0__; ++k_0__) { - conc.push_back(in__.scalar_lb_constrain(0)); - } + double conc = in__.scalar_lb_constrain(0); for (int k_1__ = 0; k_1__ < K; ++k_1__) { for (int k_0__ = 0; k_0__ < P; ++k_0__) { vars__.push_back(beta_raw[k_0__][k_1__]); @@ -1358,9 +1337,7 @@ class model_dm_glm_multi_conc : public prob_grad { for (int k_0__ = 0; k_0__ < P; ++k_0__) { vars__.push_back(beta_scale[k_0__]); } - for (int k_0__ = 0; k_0__ < K; ++k_0__) { - vars__.push_back(conc[k_0__]); - } + vars__.push_back(conc); if (!include_tparams__) return; // declare and define transformed parameters @@ -1419,7 +1396,7 @@ class model_dm_glm_multi_conc : public prob_grad { } static std::string model_name() { - return "model_dm_glm_multi_conc"; + return "model_dm_glm"; } @@ -1439,11 +1416,9 @@ class model_dm_glm_multi_conc : public prob_grad { param_name_stream__ << "beta_scale" << '.' << k_0__; param_names__.push_back(param_name_stream__.str()); } - for (int k_0__ = 1; k_0__ <= K; ++k_0__) { - param_name_stream__.str(std::string()); - param_name_stream__ << "conc" << '.' << k_0__; - param_names__.push_back(param_name_stream__.str()); - } + param_name_stream__.str(std::string()); + param_name_stream__ << "conc"; + param_names__.push_back(param_name_stream__.str()); if (!include_gqs__ && !include_tparams__) return; @@ -1467,11 +1442,9 @@ class model_dm_glm_multi_conc : public prob_grad { param_name_stream__ << "beta_scale" << '.' << k_0__; param_names__.push_back(param_name_stream__.str()); } - for (int k_0__ = 1; k_0__ <= K; ++k_0__) { - param_name_stream__.str(std::string()); - param_name_stream__ << "conc" << '.' 
<< k_0__; - param_names__.push_back(param_name_stream__.str()); - } + param_name_stream__.str(std::string()); + param_name_stream__ << "conc"; + param_names__.push_back(param_name_stream__.str()); if (!include_gqs__ && !include_tparams__) return; @@ -1489,7 +1462,7 @@ class model_dm_glm_multi_conc : public prob_grad { #include -namespace model_dm_glm_robust_namespace { +namespace model_dm_glm_multi_conc_namespace { using std::istream; using std::string; @@ -1508,12 +1481,12 @@ static int current_statement_begin__; stan::io::program_reader prog_reader__() { stan::io::program_reader reader; - reader.add_event(0, 0, "start", "model_dm_glm_robust"); - reader.add_event(52, 52, "end", "model_dm_glm_robust"); + reader.add_event(0, 0, "start", "model_dm_glm_multi_conc"); + reader.add_event(44, 44, "end", "model_dm_glm_multi_conc"); return reader; } -class model_dm_glm_robust : public prob_grad { +class model_dm_glm_multi_conc : public prob_grad { private: int N; int P; @@ -1522,16 +1495,14 @@ class model_dm_glm_robust : public prob_grad { vector y; double concShape; double concRate; - double outlier_prior_a; - double outlier_prior_b; public: - model_dm_glm_robust(stan::io::var_context& context__, + model_dm_glm_multi_conc(stan::io::var_context& context__, std::ostream* pstream__ = 0) : prob_grad(0) { ctor_body(context__, 0, pstream__); } - model_dm_glm_robust(stan::io::var_context& context__, + model_dm_glm_multi_conc(stan::io::var_context& context__, unsigned int random_seed__, std::ostream* pstream__ = 0) : prob_grad(0) { @@ -1547,7 +1518,7 @@ class model_dm_glm_robust : public prob_grad { current_statement_begin__ = -1; - static const char* function__ = "model_dm_glm_robust_namespace::model_dm_glm_robust"; + static const char* function__ = "model_dm_glm_multi_conc_namespace::model_dm_glm_multi_conc"; (void) function__; // dummy to suppress unused var warning size_t pos__; (void) pos__; // dummy to suppress unused var warning @@ -1612,16 +1583,6 @@ class model_dm_glm_robust : public prob_grad { vals_r__ = context__.vals_r("concRate"); pos__ = 0; concRate = vals_r__[pos__++]; - context__.validate_dims("data initialization", "outlier_prior_a", "double", context__.to_vec()); - outlier_prior_a = double(0); - vals_r__ = context__.vals_r("outlier_prior_a"); - pos__ = 0; - outlier_prior_a = vals_r__[pos__++]; - context__.validate_dims("data initialization", "outlier_prior_b", "double", context__.to_vec()); - outlier_prior_b = double(0); - vals_r__ = context__.vals_r("outlier_prior_b"); - pos__ = 0; - outlier_prior_b = vals_r__[pos__++]; // validate, data variables check_greater_or_equal(function__,"N",N,0); @@ -1629,8 +1590,6 @@ class model_dm_glm_robust : public prob_grad { check_greater_or_equal(function__,"K",K,0); check_greater_or_equal(function__,"concShape",concShape,0); check_greater_or_equal(function__,"concRate",concRate,0); - check_greater_or_equal(function__,"outlier_prior_a",outlier_prior_a,0); - check_greater_or_equal(function__,"outlier_prior_b",outlier_prior_b,0); // initialize data variables try { @@ -1652,10 +1611,9 @@ class model_dm_glm_robust : public prob_grad { num_params_r__ += P; validate_non_negative_index("conc", "K", K); num_params_r__ += K; - ++num_params_r__; } - ~model_dm_glm_robust() { } + ~model_dm_glm_multi_conc() { } void transform_inits(const stan::io::var_context& context__, @@ -1721,20 +1679,6 @@ class model_dm_glm_robust : public prob_grad { throw std::runtime_error(std::string("Error transforming variable conc: ") + e.what()); } - if 
(!(context__.contains_r("outlier_prob"))) - throw std::runtime_error("variable outlier_prob missing"); - vals_r__ = context__.vals_r("outlier_prob"); - pos__ = 0U; - context__.validate_dims("initialization", "outlier_prob", "double", context__.to_vec()); - // generate_declaration outlier_prob - double outlier_prob(0); - outlier_prob = vals_r__[pos__++]; - try { - writer__.scalar_lub_unconstrain(0,1,outlier_prob); - } catch (const std::exception& e) { - throw std::runtime_error(std::string("Error transforming variable outlier_prob: ") + e.what()); - } - params_r__ = writer__.data_r(); params_i__ = writer__.data_i(); } @@ -1795,13 +1739,6 @@ class model_dm_glm_robust : public prob_grad { conc.push_back(in__.scalar_lb_constrain(0)); } - T__ outlier_prob; - (void) outlier_prob; // dummy to suppress unused var warning - if (jacobian__) - outlier_prob = in__.scalar_lub_constrain(0,1,lp__); - else - outlier_prob = in__.scalar_lub_constrain(0,1); - // transformed parameters @@ -1835,7 +1772,6 @@ class model_dm_glm_robust : public prob_grad { stan::math::assign(get_base1_lhs(beta,k,p,"beta",1), (get_base1(beta_scale,p,"beta_scale",1) * (get_base1(get_base1(beta_raw,p,"beta_raw",1),k,"beta_raw",2) - (1.0 / K)))); } } - lp_accum__.add(beta_log(outlier_prob, outlier_prior_a, outlier_prior_b)); lp_accum__.add(gamma_log(conc, concShape, concRate)); for (int n = 1; n <= N; ++n) { { @@ -1850,11 +1786,6 @@ class model_dm_glm_robust : public prob_grad { stan::math::initialize(suma, DUMMY_VAR__); stan::math::fill(suma,DUMMY_VAR__); - T__ sumy; - (void) sumy; // dummy to suppress unused var warning - - stan::math::initialize(sumy, DUMMY_VAR__); - stan::math::fill(sumy,DUMMY_VAR__); validate_non_negative_index("aPlusY", "K", K); Eigen::Matrix aPlusY(static_cast(K)); (void) aPlusY; // dummy to suppress unused var warning @@ -1867,12 +1798,6 @@ class model_dm_glm_robust : public prob_grad { stan::math::initialize(lGaPlusY, DUMMY_VAR__); stan::math::fill(lGaPlusY,DUMMY_VAR__); - validate_non_negative_index("lG1PlusY", "K", K); - Eigen::Matrix lG1PlusY(static_cast(K)); - (void) lG1PlusY; // dummy to suppress unused var warning - - stan::math::initialize(lG1PlusY, DUMMY_VAR__); - stan::math::fill(lG1PlusY,DUMMY_VAR__); validate_non_negative_index("lGaA", "K", K); Eigen::Matrix lGaA(static_cast(K)); (void) lGaA; // dummy to suppress unused var warning @@ -1892,14 +1817,13 @@ class model_dm_glm_robust : public prob_grad { stan::math::assign(get_base1_lhs(a,k,"a",1), (get_base1(conc,k,"conc",1) * get_base1(s,k,"s",1))); } stan::math::assign(suma, sum(a)); - stan::math::assign(sumy, sum(get_base1(y,n,"y",1))); + stan::math::assign(aPlusY, add(a,get_base1(y,n,"y",1))); for (int k = 1; k <= K; ++k) { - stan::math::assign(get_base1_lhs(lGaPlusY,k,"lGaPlusY",1), stan::math::lgamma((get_base1(a,k,"a",1) + get_base1(get_base1(y,n,"y",1),k,"y",2)))); + stan::math::assign(get_base1_lhs(lGaPlusY,k,"lGaPlusY",1), stan::math::lgamma(get_base1(aPlusY,k,"aPlusY",1))); stan::math::assign(get_base1_lhs(lGaA,k,"lGaA",1), stan::math::lgamma(get_base1(a,k,"a",1))); - stan::math::assign(get_base1_lhs(lG1PlusY,k,"lG1PlusY",1), stan::math::lgamma((1.0 + get_base1(get_base1(y,n,"y",1),k,"y",2)))); } - lp_accum__.add(log_sum_exp(((((log((1.0 - outlier_prob)) + stan::math::lgamma(suma)) + sum(lGaPlusY)) - stan::math::lgamma((suma + sumy))) - sum(lGaA)),(((log(outlier_prob) + stan::math::lgamma(K)) + sum(lG1PlusY)) - stan::math::lgamma((K + sumy))))); + lp_accum__.add((((stan::math::lgamma(suma) + sum(lGaPlusY)) - stan::math::lgamma((suma + 
sum(get_base1(y,n,"y",1))))) - sum(lGaA))); } } } @@ -1931,7 +1855,6 @@ class model_dm_glm_robust : public prob_grad { names__.push_back("beta_raw"); names__.push_back("beta_scale"); names__.push_back("conc"); - names__.push_back("outlier_prob"); } @@ -1948,8 +1871,6 @@ class model_dm_glm_robust : public prob_grad { dims__.resize(0); dims__.push_back(K); dimss__.push_back(dims__); - dims__.resize(0); - dimss__.push_back(dims__); } template @@ -1962,7 +1883,7 @@ class model_dm_glm_robust : public prob_grad { std::ostream* pstream__ = 0) const { vars__.resize(0); stan::io::reader in__(params_r__,params_i__); - static const char* function__ = "model_dm_glm_robust_namespace::write_array"; + static const char* function__ = "model_dm_glm_multi_conc_namespace::write_array"; (void) function__; // dummy to suppress unused var warning // read-transform, write parameters vector beta_raw; @@ -1980,7 +1901,6 @@ class model_dm_glm_robust : public prob_grad { for (size_t k_0__ = 0; k_0__ < dim_conc_0__; ++k_0__) { conc.push_back(in__.scalar_lb_constrain(0)); } - double outlier_prob = in__.scalar_lub_constrain(0,1); for (int k_1__ = 0; k_1__ < K; ++k_1__) { for (int k_0__ = 0; k_0__ < P; ++k_0__) { vars__.push_back(beta_raw[k_0__][k_1__]); @@ -1992,7 +1912,6 @@ class model_dm_glm_robust : public prob_grad { for (int k_0__ = 0; k_0__ < K; ++k_0__) { vars__.push_back(conc[k_0__]); } - vars__.push_back(outlier_prob); if (!include_tparams__) return; // declare and define transformed parameters @@ -2051,7 +1970,7 @@ class model_dm_glm_robust : public prob_grad { } static std::string model_name() { - return "model_dm_glm_robust"; + return "model_dm_glm_multi_conc"; } @@ -2076,9 +1995,6 @@ class model_dm_glm_robust : public prob_grad { param_name_stream__ << "conc" << '.' << k_0__; param_names__.push_back(param_name_stream__.str()); } - param_name_stream__.str(std::string()); - param_name_stream__ << "outlier_prob"; - param_names__.push_back(param_name_stream__.str()); if (!include_gqs__ && !include_tparams__) return; @@ -2107,9 +2023,6 @@ class model_dm_glm_robust : public prob_grad { param_name_stream__ << "conc" << '.' 
<< k_0__; param_names__.push_back(param_name_stream__.str()); } - param_name_stream__.str(std::string()); - param_name_stream__ << "outlier_prob"; - param_names__.push_back(param_name_stream__.str()); if (!include_gqs__ && !include_tparams__) return; @@ -2127,7 +2040,7 @@ class model_dm_glm_robust : public prob_grad { #include -namespace model_dm_glm_namespace { +namespace model_dm_glm_robust_namespace { using std::istream; using std::string; @@ -2146,12 +2059,12 @@ static int current_statement_begin__; stan::io::program_reader prog_reader__() { stan::io::program_reader reader; - reader.add_event(0, 0, "start", "model_dm_glm"); - reader.add_event(41, 41, "end", "model_dm_glm"); + reader.add_event(0, 0, "start", "model_dm_glm_robust"); + reader.add_event(52, 52, "end", "model_dm_glm_robust"); return reader; } -class model_dm_glm : public prob_grad { +class model_dm_glm_robust : public prob_grad { private: int N; int P; @@ -2160,14 +2073,16 @@ class model_dm_glm : public prob_grad { vector y; double concShape; double concRate; + double outlier_prior_a; + double outlier_prior_b; public: - model_dm_glm(stan::io::var_context& context__, + model_dm_glm_robust(stan::io::var_context& context__, std::ostream* pstream__ = 0) : prob_grad(0) { ctor_body(context__, 0, pstream__); } - model_dm_glm(stan::io::var_context& context__, + model_dm_glm_robust(stan::io::var_context& context__, unsigned int random_seed__, std::ostream* pstream__ = 0) : prob_grad(0) { @@ -2183,7 +2098,7 @@ class model_dm_glm : public prob_grad { current_statement_begin__ = -1; - static const char* function__ = "model_dm_glm_namespace::model_dm_glm"; + static const char* function__ = "model_dm_glm_robust_namespace::model_dm_glm_robust"; (void) function__; // dummy to suppress unused var warning size_t pos__; (void) pos__; // dummy to suppress unused var warning @@ -2248,6 +2163,16 @@ class model_dm_glm : public prob_grad { vals_r__ = context__.vals_r("concRate"); pos__ = 0; concRate = vals_r__[pos__++]; + context__.validate_dims("data initialization", "outlier_prior_a", "double", context__.to_vec()); + outlier_prior_a = double(0); + vals_r__ = context__.vals_r("outlier_prior_a"); + pos__ = 0; + outlier_prior_a = vals_r__[pos__++]; + context__.validate_dims("data initialization", "outlier_prior_b", "double", context__.to_vec()); + outlier_prior_b = double(0); + vals_r__ = context__.vals_r("outlier_prior_b"); + pos__ = 0; + outlier_prior_b = vals_r__[pos__++]; // validate, data variables check_greater_or_equal(function__,"N",N,0); @@ -2255,6 +2180,8 @@ class model_dm_glm : public prob_grad { check_greater_or_equal(function__,"K",K,0); check_greater_or_equal(function__,"concShape",concShape,0); check_greater_or_equal(function__,"concRate",concRate,0); + check_greater_or_equal(function__,"outlier_prior_a",outlier_prior_a,0); + check_greater_or_equal(function__,"outlier_prior_b",outlier_prior_b,0); // initialize data variables try { @@ -2274,10 +2201,12 @@ class model_dm_glm : public prob_grad { num_params_r__ += (K - 1) * P; validate_non_negative_index("beta_scale", "P", P); num_params_r__ += P; + validate_non_negative_index("conc", "K", K); + num_params_r__ += K; ++num_params_r__; } - ~model_dm_glm() { } + ~model_dm_glm_robust() { } void transform_inits(const stan::io::var_context& context__, @@ -2330,16 +2259,33 @@ class model_dm_glm : public prob_grad { throw std::runtime_error("variable conc missing"); vals_r__ = context__.vals_r("conc"); pos__ = 0U; - context__.validate_dims("initialization", "conc", "double", 
context__.to_vec()); + validate_non_negative_index("conc", "K", K); + context__.validate_dims("initialization", "conc", "double", context__.to_vec(K)); // generate_declaration conc - double conc(0); - conc = vals_r__[pos__++]; - try { - writer__.scalar_lb_unconstrain(0,conc); + std::vector conc(K,double(0)); + for (int i0__ = 0U; i0__ < K; ++i0__) + conc[i0__] = vals_r__[pos__++]; + for (int i0__ = 0U; i0__ < K; ++i0__) + try { + writer__.scalar_lb_unconstrain(0,conc[i0__]); } catch (const std::exception& e) { throw std::runtime_error(std::string("Error transforming variable conc: ") + e.what()); } + if (!(context__.contains_r("outlier_prob"))) + throw std::runtime_error("variable outlier_prob missing"); + vals_r__ = context__.vals_r("outlier_prob"); + pos__ = 0U; + context__.validate_dims("initialization", "outlier_prob", "double", context__.to_vec()); + // generate_declaration outlier_prob + double outlier_prob(0); + outlier_prob = vals_r__[pos__++]; + try { + writer__.scalar_lub_unconstrain(0,1,outlier_prob); + } catch (const std::exception& e) { + throw std::runtime_error(std::string("Error transforming variable outlier_prob: ") + e.what()); + } + params_r__ = writer__.data_r(); params_i__ = writer__.data_i(); } @@ -2390,12 +2336,22 @@ class model_dm_glm : public prob_grad { beta_scale.push_back(in__.scalar_constrain()); } - T__ conc; - (void) conc; // dummy to suppress unused var warning + vector conc; + size_t dim_conc_0__ = K; + conc.reserve(dim_conc_0__); + for (size_t k_0__ = 0; k_0__ < dim_conc_0__; ++k_0__) { + if (jacobian__) + conc.push_back(in__.scalar_lb_constrain(0,lp__)); + else + conc.push_back(in__.scalar_lb_constrain(0)); + } + + T__ outlier_prob; + (void) outlier_prob; // dummy to suppress unused var warning if (jacobian__) - conc = in__.scalar_lb_constrain(0,lp__); + outlier_prob = in__.scalar_lub_constrain(0,1,lp__); else - conc = in__.scalar_lb_constrain(0); + outlier_prob = in__.scalar_lub_constrain(0,1); // transformed parameters @@ -2430,6 +2386,7 @@ class model_dm_glm : public prob_grad { stan::math::assign(get_base1_lhs(beta,k,p,"beta",1), (get_base1(beta_scale,p,"beta_scale",1) * (get_base1(get_base1(beta_raw,p,"beta_raw",1),k,"beta_raw",2) - (1.0 / K)))); } } + lp_accum__.add(beta_log(outlier_prob, outlier_prior_a, outlier_prior_b)); lp_accum__.add(gamma_log(conc, concShape, concRate)); for (int n = 1; n <= N; ++n) { { @@ -2444,6 +2401,11 @@ class model_dm_glm : public prob_grad { stan::math::initialize(suma, DUMMY_VAR__); stan::math::fill(suma,DUMMY_VAR__); + T__ sumy; + (void) sumy; // dummy to suppress unused var warning + + stan::math::initialize(sumy, DUMMY_VAR__); + stan::math::fill(sumy,DUMMY_VAR__); validate_non_negative_index("aPlusY", "K", K); Eigen::Matrix aPlusY(static_cast(K)); (void) aPlusY; // dummy to suppress unused var warning @@ -2456,23 +2418,39 @@ class model_dm_glm : public prob_grad { stan::math::initialize(lGaPlusY, DUMMY_VAR__); stan::math::fill(lGaPlusY,DUMMY_VAR__); + validate_non_negative_index("lG1PlusY", "K", K); + Eigen::Matrix lG1PlusY(static_cast(K)); + (void) lG1PlusY; // dummy to suppress unused var warning + + stan::math::initialize(lG1PlusY, DUMMY_VAR__); + stan::math::fill(lG1PlusY,DUMMY_VAR__); validate_non_negative_index("lGaA", "K", K); Eigen::Matrix lGaA(static_cast(K)); (void) lGaA; // dummy to suppress unused var warning stan::math::initialize(lGaA, DUMMY_VAR__); stan::math::fill(lGaA,DUMMY_VAR__); + validate_non_negative_index("s", "K", K); + Eigen::Matrix s(static_cast(K)); + (void) s; // dummy to suppress unused 
var warning + + stan::math::initialize(s, DUMMY_VAR__); + stan::math::fill(s,DUMMY_VAR__); - stan::math::assign(a, multiply(conc,softmax(multiply(beta,get_base1(x,n,"x",1))))); + stan::math::assign(s, softmax(multiply(beta,get_base1(x,n,"x",1)))); + for (int k = 1; k <= K; ++k) { + stan::math::assign(get_base1_lhs(a,k,"a",1), (get_base1(conc,k,"conc",1) * get_base1(s,k,"s",1))); + } stan::math::assign(suma, sum(a)); - stan::math::assign(aPlusY, add(a,get_base1(y,n,"y",1))); + stan::math::assign(sumy, sum(get_base1(y,n,"y",1))); for (int k = 1; k <= K; ++k) { - stan::math::assign(get_base1_lhs(lGaPlusY,k,"lGaPlusY",1), stan::math::lgamma(get_base1(aPlusY,k,"aPlusY",1))); + stan::math::assign(get_base1_lhs(lGaPlusY,k,"lGaPlusY",1), stan::math::lgamma((get_base1(a,k,"a",1) + get_base1(get_base1(y,n,"y",1),k,"y",2)))); stan::math::assign(get_base1_lhs(lGaA,k,"lGaA",1), stan::math::lgamma(get_base1(a,k,"a",1))); + stan::math::assign(get_base1_lhs(lG1PlusY,k,"lG1PlusY",1), stan::math::lgamma((1.0 + get_base1(get_base1(y,n,"y",1),k,"y",2)))); } - lp_accum__.add((((stan::math::lgamma(suma) + sum(lGaPlusY)) - stan::math::lgamma((suma + sum(get_base1(y,n,"y",1))))) - sum(lGaA))); + lp_accum__.add(log_sum_exp(((((log((1.0 - outlier_prob)) + stan::math::lgamma(suma)) + sum(lGaPlusY)) - stan::math::lgamma((suma + sumy))) - sum(lGaA)),(((log(outlier_prob) + stan::math::lgamma(K)) + sum(lG1PlusY)) - stan::math::lgamma((K + sumy))))); } } } @@ -2504,6 +2482,7 @@ class model_dm_glm : public prob_grad { names__.push_back("beta_raw"); names__.push_back("beta_scale"); names__.push_back("conc"); + names__.push_back("outlier_prob"); } @@ -2518,6 +2497,9 @@ class model_dm_glm : public prob_grad { dims__.push_back(P); dimss__.push_back(dims__); dims__.resize(0); + dims__.push_back(K); + dimss__.push_back(dims__); + dims__.resize(0); dimss__.push_back(dims__); } @@ -2531,7 +2513,7 @@ class model_dm_glm : public prob_grad { std::ostream* pstream__ = 0) const { vars__.resize(0); stan::io::reader in__(params_r__,params_i__); - static const char* function__ = "model_dm_glm_namespace::write_array"; + static const char* function__ = "model_dm_glm_robust_namespace::write_array"; (void) function__; // dummy to suppress unused var warning // read-transform, write parameters vector beta_raw; @@ -2544,7 +2526,12 @@ class model_dm_glm : public prob_grad { for (size_t k_0__ = 0; k_0__ < dim_beta_scale_0__; ++k_0__) { beta_scale.push_back(in__.scalar_constrain()); } - double conc = in__.scalar_lb_constrain(0); + vector conc; + size_t dim_conc_0__ = K; + for (size_t k_0__ = 0; k_0__ < dim_conc_0__; ++k_0__) { + conc.push_back(in__.scalar_lb_constrain(0)); + } + double outlier_prob = in__.scalar_lub_constrain(0,1); for (int k_1__ = 0; k_1__ < K; ++k_1__) { for (int k_0__ = 0; k_0__ < P; ++k_0__) { vars__.push_back(beta_raw[k_0__][k_1__]); @@ -2553,7 +2540,10 @@ class model_dm_glm : public prob_grad { for (int k_0__ = 0; k_0__ < P; ++k_0__) { vars__.push_back(beta_scale[k_0__]); } - vars__.push_back(conc); + for (int k_0__ = 0; k_0__ < K; ++k_0__) { + vars__.push_back(conc[k_0__]); + } + vars__.push_back(outlier_prob); if (!include_tparams__) return; // declare and define transformed parameters @@ -2612,7 +2602,7 @@ class model_dm_glm : public prob_grad { } static std::string model_name() { - return "model_dm_glm"; + return "model_dm_glm_robust"; } @@ -2632,8 +2622,13 @@ class model_dm_glm : public prob_grad { param_name_stream__ << "beta_scale" << '.' 
<< k_0__; param_names__.push_back(param_name_stream__.str()); } + for (int k_0__ = 1; k_0__ <= K; ++k_0__) { + param_name_stream__.str(std::string()); + param_name_stream__ << "conc" << '.' << k_0__; + param_names__.push_back(param_name_stream__.str()); + } param_name_stream__.str(std::string()); - param_name_stream__ << "conc"; + param_name_stream__ << "outlier_prob"; param_names__.push_back(param_name_stream__.str()); if (!include_gqs__ && !include_tparams__) return; @@ -2658,8 +2653,13 @@ class model_dm_glm : public prob_grad { param_name_stream__ << "beta_scale" << '.' << k_0__; param_names__.push_back(param_name_stream__.str()); } + for (int k_0__ = 1; k_0__ <= K; ++k_0__) { + param_name_stream__.str(std::string()); + param_name_stream__ << "conc" << '.' << k_0__; + param_names__.push_back(param_name_stream__.str()); + } param_name_stream__.str(std::string()); - param_name_stream__ << "conc"; + param_name_stream__ << "outlier_prob"; param_names__.push_back(param_name_stream__.str()); if (!include_gqs__ && !include_tparams__) return; diff --git a/leafcutter/vignettes/Usage.Rmd b/leafcutter/vignettes/Usage.Rmd index 567d25b..ba4212e 100644 --- a/leafcutter/vignettes/Usage.Rmd +++ b/leafcutter/vignettes/Usage.Rmd @@ -112,7 +112,7 @@ RNA.NA06989_CEU.chr1.bam CEU RNA.NA06994_CEU.chr1.bam CEU ``` -Having made that file we can run DS (this assumes you have successfully installed the `leafcutter` R package as described under Installation above) +Having made that file we can run DS (this assumes you have successfully installed the `leafcutter` R package as described under [Installation](./Installation.html) above) ``` ../scripts/leafcutter_ds.R --num_threads 4 --exons_file=../leafcutter/data/gencode19_exons.txt.gz ../example_data/testYRIvsEU_perind_numers.counts.gz ../example_data/test_diff_intron.txt ``` diff --git a/leafcutter/vignettes/Visualization.Rmd b/leafcutter/vignettes/Visualization.Rmd index 9cf4263..ad0e4af 100755 --- a/leafcutter/vignettes/Visualization.Rmd +++ b/leafcutter/vignettes/Visualization.Rmd @@ -59,6 +59,7 @@ These should all have been installed when you installed leafcutter: * dplyr * ggplot2 * shiny +* shinyjs * DT * intervals * reshape2 @@ -93,39 +94,48 @@ We've only tested with the GENCODE V26 human GRCh37/GRCh38 and mouse GRCm38.p5 G ### Step 1. Prepare the LeafCutter differential splicing results for visualisation -This step annotates each intron in each cluster at a given false discovery rate. - +This step annotates each intron in each cluster at a given false discovery rate and generates a single `.RData` file with everything the shiny app needs to run. ``` -Rscript prepare_results.R --iFolder \ - --oFolder \ - --support \ - --annotation_code \ - --code \ - --FDR \ +./prepare_results.R [options] _perind_numers.counts.gz _cluster_significance.txt _effect_sizes.txt annotation_code ``` +** _perind_numers.counts.gz ** The same counts matrix you gave to `leafcutter_ds.R`. +** _cluster_significance.txt ** The cluster significant table output by `leafcutter_ds.R`. +** _effect_sizes.txt ** The per junction effect size table output by `leafcutter_ds.R`. +** annotation_code ** will be something like `annotation_codes/gencode_hg19/gencode_hg19` (see above) -**iFolder** - the folder that contains the results of the differential intron excision analysis. The script assumes all results share the same path and **code**. - -**oFolder** - the folder where the results will be written. 
+Options: +``` + -o OUTPUT, --output=OUTPUT + The output file that will be created ready for loading by run_leafviz.R [leafviz.RData] -**support** - the same support file as used for the previous analysis. + -m META_DATA_FILE, --meta_data_file=META_DATA_FILE + The support file used in the differential splicing analysis. Columns should be file name and condition -**annotation_code** - as before. + -f FDR, --FDR=FDR + the adjusted p value threshold to use [0.05] -**code** - the same code used for the rest of the analysis, eg `testYRIvsEU` + -c CODE, --code=CODE + A name for this analysis (will be available in leafviz through the Summary tab). [leafcutter_ds] -**FDR** - the benjamini-hochberg false discovery rate with which to filter the results. + -h, --help + Show this help message and exit +``` +This will create the Rdata file wherever `--output` is pointed. The file 'prepare_example.sh' shows how this would be done for the example dataset if you wanted to rebuild 'Brain_vs_Heart_results.Rdata'. -This will create the Rdata object `/results/_results.Rdata`. The file 'prepare_example.sh' shows how this would be done for the example dataset if you wanted to rebuild 'Brain_vs_Heart_results.Rdata'. +As a concrete example, let's assume you just ran the example at [Usage](./Usage.html) and you're currently in the `/example_data` subdirectory, then running +``` +../leafviz/prepare_results.R --meta_data_file test_diff_intron.txt --code leafcutter testYRIvsEU_perind_numers.counts.gz leafcutter_ds_cluster_significance.txt leafcutter_ds_effect_sizes.txt ../leafviz/annotation_codes/gencode_hg19/gencode_hg19 +``` +should create an `leafviz.RData` file. ### Step 2. Visualise the results +From the `example_data` directory. ``` -cd leafvis/ -Rscript run_leafvis.R /results/_results.Rdata +cd ../leafviz/ +./run_leafviz.R ../example_data/leafviz.Rdata ``` - -This will load in the Rdata object and run the LeafCutter Visualisation App in your browser. +This will load in the Rdata object and run the LeafCutter Visualisation App in your browser. ### Features diff --git a/leafviz/prepare_results.R b/leafviz/prepare_results.R index 6fae0a9..230e9b5 100755 --- a/leafviz/prepare_results.R +++ b/leafviz/prepare_results.R @@ -5,21 +5,14 @@ ## and prepare for visualisation library(optparse) -require(leafcutter) -library(data.table) -library(stringr) -library(dplyr) -library(magrittr) - -options(echo=TRUE) option_parser=OptionParser( - usage="%prog [options] _perind_numers.counts.gz _cluster_significance.txt _effect_sizes.txt annotation_code \nThe annotation_code should be something like annotation_codes/annotation_codes/gencode_hg19/gencode_hg19.\nresults.RData", + usage="%prog [options] _perind_numers.counts.gz _cluster_significance.txt _effect_sizes.txt annotation_code \nThe annotation_code should be something like annotation_codes/gencode_hg19/gencode_hg19.", option_list=list( - make_option( c("-o","--output"), default="leafviz.RData", help="The output file that will be created ready for loading by run_leafviz.R"), - make_option( "--meta_data_file", default=NULL, help="The support file used in the differential splicing analysis. 
Columns should be file name and condition"), - make_option( c("-f","--FDR"), default=0.05, help = "the adjusted p value threshold to use"), - make_option( "--code", default="leafcutter_ds", help = "A name for this analysis (will be available in leafviz through the Summary tab.")) + make_option( c("-o","--output"), default="leafviz.RData", help="The output file that will be created ready for loading by run_leafviz.R [%default]"), + make_option( c("-m","--meta_data_file"), default=NULL, help="The support file used in the differential splicing analysis. Columns should be file name and condition"), + make_option( c("-f","--FDR"), default=0.05, help = "the adjusted p value threshold to use [%default]"), + make_option( c("-c","--code"), default="leafcutter_ds", help = "A name for this analysis (will be available in leafviz through the Summary tab). [%default]")) ) parsed_args <- parse_args(option_parser, positional_arguments = 4) @@ -34,6 +27,12 @@ results_file = parsed_args$options$output groups_file <- parsed_args$options$meta_data_file FDR_limit <- parsed_args$options$FDR +require(leafcutter) +library(data.table) +library(stringr) +library(dplyr) +library(magrittr) + cat("Preparing for visualisation\n") # annotation
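For readers unfamiliar with optparse's positional_arguments mode used above, here is a self-contained sketch of the pattern (the argument values are made up so the snippet runs outside Rscript; in the real script they come from the command line):

```r
library(optparse)
parser <- OptionParser(option_list = list(
  make_option(c("-f", "--FDR"), default = 0.05,
              help = "the adjusted p value threshold to use [%default]")
))
# pass args explicitly for this sketch
parsed <- parse_args(parser,
                     args = c("counts.gz", "sig.txt", "es.txt", "annot_code"),
                     positional_arguments = 4)
parsed$options$FDR  # named option values
parsed$args         # the four positional arguments, in order
```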