Releases: myles-lewis/nestedcv

v0.7.12

29 Nov 09:29

New features

  • Analyse and plot variable importance by ranking of variables across outer CV
    folds and repeats.
  • Changed repeatcv() so that fitted models from the outer CV can be returned
    for variable importance or SHAP value calculation.

v0.7.10

16 Aug 17:24
  • Fixed oversized SVG figures in vignette.
  • Fixed bug in computing multi-class balanced accuracy. This is now calculated as the mean of the recall for each class.
  • Added multi-class Matthews correlation coefficient (MCC) and multi-class F1 macro score.
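
The balanced accuracy definition above (mean of the per-class recall) can be illustrated with a minimal, language-agnostic sketch. This is not the nestedcv implementation; the function name `balanced_accuracy` and the toy labels are assumptions made for illustration only.

```python
from collections import Counter


def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recall (illustrative helper, not nestedcv code)."""
    support = Counter(y_true)  # number of true cases in each class
    # Correct predictions counted per true class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    recalls = [hits[c] / support[c] for c in support]
    return sum(recalls) / len(recalls)


# Imbalanced toy data: class "a" dominates and is always predicted correctly
y_true = ["a"] * 6 + ["b"] * 2 + ["c"] * 2
y_pred = ["a"] * 6 + ["b", "a"] + ["c", "a"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy)                           # plain accuracy
print(balanced_accuracy(y_true, y_pred))  # mean of recalls 1, 0.5 and 0.5
```

In this toy example plain accuracy is 0.8, while balanced accuracy is about 0.67, because each class contributes equally regardless of its size.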

v0.7.9

04 Jul 15:59

Important change

  • The Rsquared performance metric for regression/continuous outcomes was
    previously calculated using the defaultSummary() function from caret, which
    returns the square of the Pearson correlation coefficient (r-squared)
    rather than the coefficient of determination. The coefficient of
    determination is calculated as 1 - rss/tss, where rss = residual sum of
    squares and tss = total sum of squares. This correct formula for R-squared
    is now applied.
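
The difference between the two quantities can be shown with a small, language-agnostic sketch. This is not code from nestedcv or caret; the function names `r_squared` and `pearson_r_squared` and the toy data are assumptions for illustration. Predictions that are perfectly correlated with the outcome but systematically biased score 1 on squared Pearson correlation, yet score poorly on 1 - rss/tss.

```python
def r_squared(y, pred):
    """Coefficient of determination: 1 - rss/tss."""
    mean_y = sum(y) / len(y)
    rss = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))  # residual sum of squares
    tss = sum((yi - mean_y) ** 2 for yi in y)             # total sum of squares
    return 1 - rss / tss


def pearson_r_squared(y, pred):
    """Square of the Pearson correlation between y and pred."""
    n = len(y)
    mean_y, mean_p = sum(y) / n, sum(pred) / n
    cov = sum((yi - mean_y) * (pi - mean_p) for yi, pi in zip(y, pred))
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    var_p = sum((pi - mean_p) ** 2 for pi in pred)
    return cov ** 2 / (var_y * var_p)


y = [1.0, 2.0, 3.0, 4.0, 5.0]
pred = [yi + 2.0 for yi in y]  # perfectly correlated, but every prediction is 2 too high

print(pearson_r_squared(y, pred))  # 1.0: correlation ignores the bias
print(r_squared(y, pred))          # -1.0: rss = 20, tss = 10
```

Unlike squared correlation, the coefficient of determination can be negative when predictions are worse than simply predicting the mean of the outcome.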

Bugfix

  • Prevent bug if x is a single predictor.

Other updates

  • Updated documentation.

v0.7.8

13 Mar 21:06
13/03/2024
  • Added prc() which enables easy building of precision-recall curves from 'nestedcv' models and repeatcv() results.
  • Added predict method for cva.glmnet.
  • Removed magrittr as an imported package. The standard R pipe |> can be used instead.
  • Added metrics() which gives additional performance metrics for binary classification models such as F1 score, Matthews correlation coefficient and precision-recall AUC.
  • Added pls_filter() which uses partial least squares regression to filter features.
  • Enabled parallelisation over repeats in repeatcv(), leading to a significant improvement in speed.

v0.7.4

30 Jan 11:24
  • Fixed issue with xgboost on Linux/Windows in nestcv.train() with cv.cores > 1.
  • Fixed major bug in multivariate Gaussian and Cox models in nestcv.glmnet().

v0.7.3

05 Dec 08:37
30/11/2023
  • Added new feature repeatcv() to apply repeated nested CV to the main
    nestedcv model functions for robust measurement of model performance.
  • Added new feature via modifyX argument to all nestedcv models. This allows
    more powerful manipulation of the predictors such as scaling, imputing missing
    values, adding extra columns through variable manipulations. Importantly these
    are applied to train and test input data separately.
  • Added predict() function for nestcv.SuperLearner().
  • Added pred_SuperLearner wrapper for use with fastshap::explain.
  • Fixed parallelisation of nestcv.SuperLearner() on Windows.
  • Added support for multivariate Gaussian and Cox models in nestcv.glmnet().

v0.6.9

25 Aug 21:49

New features

  • Added argument verbose in nestcv.train(), nestcv.glmnet() and
    outercv() to show progress.
  • Added argument multicore_fork in nestcv.train() and outercv() to allow
    choice of parallelisation between forked multicore processing using mclapply
    or non-forked using parLapply. This can help prevent errors with certain
    multithreaded caret models e.g. model = "xgbTree".
  • In one_hot() changed all_levels argument default to FALSE to be
    compatible with regression models by default.
  • Added coefficient column to lm_filter() full results table.
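
Why encoding fewer than all levels suits regression models can be sketched as follows. This is an illustrative, language-agnostic example and not the nestedcv one_hot() implementation: the Python function below, its `name` parameter, and the choice to drop the first level in sorted order are all assumptions. Only the `all_levels` argument name is taken from the note above.

```python
def one_hot(values, name, all_levels=False):
    """Dummy-encode a categorical vector (hypothetical helper, not nestedcv code).

    With all_levels=False the first level (in sorted order) is dropped and
    acts as the reference category.
    """
    levels = sorted(set(values))
    kept = levels if all_levels else levels[1:]
    return {f"{name}.{lvl}": [int(v == lvl) for v in values] for lvl in kept}


colour = ["red", "green", "blue", "green"]

full = one_hot(colour, "colour", all_levels=True)
reduced = one_hot(colour, "colour")

# With every level encoded, the dummy columns sum to 1 in each row, which makes
# them perfectly collinear with a regression intercept.
row_sums = [sum(col[i] for col in full.values()) for i in range(len(colour))]
print(row_sums)
print(list(reduced))  # one column fewer; "blue" is the reference level
```

Dropping one level removes that exact linear dependence, so a model with an intercept remains identifiable.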

Bug fixes

  • Fixed significant bug in lm_filter() where variables with zero variance were
    incorrectly reporting very low p-values in linear models instead of returning
    NA. This is due to how rank deficient models are handled by
    RcppEigen::fastLmPure. Default method for fastLmPure has been changed to 0
    to allow detection of rank deficient models.
  • Fixed bug in weight() caused by NA. Allow weight() to tolerate character
    vectors.

Latest release to CRAN

02 Jul 21:57

New features

  • Better handling of dataframes in filters. keep_factors option has been added to filters to control filtering of factors with 3 or more levels.
  • Added one_hot() for fast one-hot encoding of factors and character columns by creating dummy variables.
  • Added stat_filter() which applies univariate filtering to dataframes with mixed datatype (continuous & categorical combined).
  • Changed one-way ANOVA test in anova_filter() from Rfast::ftests() to matrixTests::col_oneway_welch() for much better accuracy.

Bug fixes

  • Fixed bug caused by use of weights with nestcv.train().

v0.6.6

08 Jun 07:57

Latest release to CRAN

  • Fixed bug with fastshap package v0.1.0.
  • Fixed bug with categorical variables in nestcv.train().

v0.6.4

30 May 12:38

Latest release to CRAN