Version 4.6.0 updates #205

Merged 48 commits on Nov 13, 2024
Commits
c3903ed
Cleaning
ngreifer Oct 16, 2024
099df17
Improvements to subclass_scoot (now in Rcpp) and exactify
ngreifer Oct 16, 2024
1adff4b
dist_to_matrix removed (as.matrix now faster); Rcpp for computing n1x…
ngreifer Oct 16, 2024
1995398
Cleaning
ngreifer Oct 16, 2024
da05977
Improvements to NN matching algorithms including better support for m…
ngreifer Oct 16, 2024
37be607
Improvements
ngreifer Oct 16, 2024
4d2bb99
Improvements to NN matching using Rcpp; added caliper splitting featu…
ngreifer Oct 16, 2024
729b30f
Updates and improvements
ngreifer Oct 24, 2024
eb6a7a5
Added fast mm2subclassC
ngreifer Oct 24, 2024
0c44e4d
Code cleaning and small improvements
ngreifer Oct 24, 2024
c136ee7
Moved some aux_functions to utils; utils incorporates some from WeightIt
ngreifer Oct 24, 2024
bfe7300
Improvements
ngreifer Oct 24, 2024
1c45608
Added progressbar with ETA and EMA estimation
ngreifer Oct 24, 2024
8f1f131
Utility to speed up processing on large vectors
ngreifer Oct 24, 2024
abe0fb6
Reordering and minor cleaning
ngreifer Oct 24, 2024
8195b1d
Improvements and updates
ngreifer Oct 24, 2024
85fa5f9
Support for long vectors, ETA progress bar
ngreifer Oct 24, 2024
3b24a08
New matching for m.order = "closest" with mahcovs; computing full dis…
ngreifer Oct 24, 2024
18841e7
Updates to support new matching infrastructure in Rcpp
ngreifer Oct 24, 2024
cc0eb38
Doc and metadata updates
ngreifer Oct 24, 2024
ac11929
Rcpp updates
ngreifer Oct 24, 2024
d259d70
Improved tests
ngreifer Oct 24, 2024
87aad0e
Rcpp updates
ngreifer Oct 24, 2024
3ca53e4
Metadata updates
ngreifer Nov 12, 2024
16e9e60
Cleaning and improvements
ngreifer Nov 12, 2024
5e532e4
Improvements, added normalize option
ngreifer Nov 12, 2024
61df63a
k2k now uses nnmatch and allows m.order
ngreifer Nov 12, 2024
3beef17
New faster helper functions and improvements.
ngreifer Nov 12, 2024
f7645ac
test updates
ngreifer Nov 12, 2024
ad43269
Rcpp cleaning and updates
ngreifer Nov 12, 2024
8f900dc
Improved ETA estimation and speed for progress bar
ngreifer Nov 12, 2024
89d5f02
Improvements to matching algorithms, mostly using vector instead of R…
ngreifer Nov 12, 2024
f2e9482
Rewrote distmat matching to mirror other matching algorithms
ngreifer Nov 12, 2024
70aebd2
Rewrote matching for mahcovs and vec
ngreifer Nov 12, 2024
5086c71
Updates for performance and to use more STL
ngreifer Nov 12, 2024
1a86754
Rcpp updates
ngreifer Nov 12, 2024
583a687
Vignette updates
ngreifer Nov 12, 2024
2c69592
Doc updates
ngreifer Nov 12, 2024
d8a36f8
Rcpp updates
ngreifer Nov 12, 2024
2656f98
README updates
ngreifer Nov 12, 2024
d84ebf1
Added rhub workflow
ngreifer Nov 12, 2024
bcee422
Temp update without gurobi
ngreifer Nov 12, 2024
ccfb496
Vignette updates to improve checks
ngreifer Nov 12, 2024
b04a795
Vignette updates
ngreifer Nov 12, 2024
9282bb9
Cleaning and organization
ngreifer Nov 12, 2024
57426b3
Doc and vignette updates
ngreifer Nov 12, 2024
6c32225
Doc and vignette updates
ngreifer Nov 12, 2024
d3dfd92
Prep for submission
ngreifer Nov 12, 2024
95 changes: 95 additions & 0 deletions .github/workflows/rhub.yaml
@@ -0,0 +1,95 @@
# R-hub's generic GitHub Actions workflow file. Its canonical location is at
# https://github.com/r-hub/actions/blob/v1/workflows/rhub.yaml
# You can update this file to a newer version using the rhub2 package:
#
# rhub::rhub_setup()
#
# It is unlikely that you need to modify this file manually.

name: R-hub
run-name: "${{ github.event.inputs.id }}: ${{ github.event.inputs.name || format('Manually run by {0}', github.triggering_actor) }}"

on:
  workflow_dispatch:
    inputs:
      config:
        description: 'A comma separated list of R-hub platforms to use.'
        type: string
        default: 'linux,windows,macos'
      name:
        description: 'Run name. You can leave this empty now.'
        type: string
      id:
        description: 'Unique ID. You can leave this empty now.'
        type: string

jobs:

  setup:
    runs-on: ubuntu-latest
    outputs:
      containers: ${{ steps.rhub-setup.outputs.containers }}
      platforms: ${{ steps.rhub-setup.outputs.platforms }}

    steps:
    # NO NEED TO CHECKOUT HERE
    - uses: r-hub/actions/setup@v1
      with:
        config: ${{ github.event.inputs.config }}
      id: rhub-setup

  linux-containers:
    needs: setup
    if: ${{ needs.setup.outputs.containers != '[]' }}
    runs-on: ubuntu-latest
    name: ${{ matrix.config.label }}
    strategy:
      fail-fast: false
      matrix:
        config: ${{ fromJson(needs.setup.outputs.containers) }}
    container:
      image: ${{ matrix.config.container }}

    steps:
      - uses: r-hub/actions/checkout@v1
      - uses: r-hub/actions/platform-info@v1
        with:
          token: ${{ secrets.RHUB_TOKEN }}
          job-config: ${{ matrix.config.job-config }}
      - uses: r-hub/actions/setup-deps@v1
        with:
          token: ${{ secrets.RHUB_TOKEN }}
          job-config: ${{ matrix.config.job-config }}
      - uses: r-hub/actions/run-check@v1
        with:
          token: ${{ secrets.RHUB_TOKEN }}
          job-config: ${{ matrix.config.job-config }}

  other-platforms:
    needs: setup
    if: ${{ needs.setup.outputs.platforms != '[]' }}
    runs-on: ${{ matrix.config.os }}
    name: ${{ matrix.config.label }}
    strategy:
      fail-fast: false
      matrix:
        config: ${{ fromJson(needs.setup.outputs.platforms) }}

    steps:
      - uses: r-hub/actions/checkout@v1
      - uses: r-hub/actions/setup-r@v1
        with:
          job-config: ${{ matrix.config.job-config }}
          token: ${{ secrets.RHUB_TOKEN }}
      - uses: r-hub/actions/platform-info@v1
        with:
          token: ${{ secrets.RHUB_TOKEN }}
          job-config: ${{ matrix.config.job-config }}
      - uses: r-hub/actions/setup-deps@v1
        with:
          job-config: ${{ matrix.config.job-config }}
          token: ${{ secrets.RHUB_TOKEN }}
      - uses: r-hub/actions/run-check@v1
        with:
          job-config: ${{ matrix.config.job-config }}
          token: ${{ secrets.RHUB_TOKEN }}
13 changes: 7 additions & 6 deletions DESCRIPTION
@@ -1,5 +1,5 @@
Package: MatchIt
Version: 4.5.5.9000
Version: 4.6.0
Title: Nonparametric Preprocessing for Parametric Causal Inference
Description: Selects matched samples of the original treated and
control groups with similar covariate distributions -- can be
@@ -33,7 +33,8 @@ Imports:
Rcpp,
utils,
stats,
graphics
graphics,
grDevices
Suggests:
optmatch (>= 0.10.6),
Matching,
@@ -43,20 +44,20 @@ Suggests:
rpart,
mgcv,
CBPS (>= 0.17),
dbarts,
dbarts (>= 0.9-28),
randomForest (>= 4.7-1),
glmnet (>= 4.0),
gbm (>= 2.1.7),
gurobi,
cobalt (>= 4.2.3),
boot,
marginaleffects (>= 0.11.0),
marginaleffects (>= 0.19.0),
sandwich (>= 2.5-1),
survival,
RcppProgress (>= 0.4.2),
highs,
Rglpk,
Rsymphony,
gurobi,
knitr,
rmarkdown,
testthat (>= 3.0.0)
@@ -71,5 +72,5 @@ URL: https://kosukeimai.github.io/MatchIt/,
BugReports: https://github.com/kosukeimai/MatchIt/issues
VignetteBuilder: knitr
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.2.3
RoxygenNote: 7.3.2
Config/testthat/edition: 3
4 changes: 1 addition & 3 deletions NAMESPACE
@@ -21,13 +21,11 @@ export(scaled_euclidean_dist)
import(graphics)
import(stats)
importFrom(Rcpp,evalCpp)
importFrom(Rcpp,sourceCpp)
importFrom(grDevices,devAskNewPage)
importFrom(grDevices,nclass.FD)
importFrom(grDevices,nclass.Sturges)
importFrom(grDevices,nclass.scott)
importFrom(utils,capture.output)
importFrom(utils,combn)
importFrom(utils,setTxtProgressBar)
importFrom(utils,txtProgressBar)
importFrom(utils,hasName)
useDynLib(MatchIt, .registration = TRUE)
28 changes: 27 additions & 1 deletion NEWS.md
@@ -6,10 +6,36 @@ output:
`MatchIt` News and Updates
======

# MatchIt (development version)
# MatchIt 4.6.0

Most improvements are related to performance. Some of these dramatically improve speeds for large datasets. Most come from improvements to `Rcpp` code.

* When using `method = "nearest"`, `m.order` can now be set to `"farthest"` to prioritize hard-to-match treated units. Note this **does not** implement "far matching" but simply changes the order in which the closest matches are selected.

* Speed improvements to `method = "nearest"`, especially when matching on a propensity score.

* Speed improvements to `summary()` when `pair.dist = TRUE` and a `match.matrix` component is not included in the output (e.g., for `method = "full"` or `method = "quick"`).

* Speed improvements to `method = "subclass"` with `min.n` greater than 0.

* A new `normalize` argument has been added to `matchit()`. When set to `TRUE` (the default, which used to be the only option), the nonzero weights in each treatment group are rescaled to have an average of 1. When `FALSE`, the weights generated directly by the matching are returned instead.

* When using `method = "nearest"` with `m.order = "closest"`, the full distance matrix is no longer computed, which increases support for larger samples. This uses an adaptation of an algorithm described by [Rassen et al. (2012)](https://doi.org/10.1002/pds.3263).

* When using `method = "nearest"` with `verbose = TRUE`, the progress bar now displays an estimate of how much time remains.

* When using `method = "nearest"` with `m.order = "closest"` and `ratio` greater than 1, all eligible units will receive their first match before any receive their second, etc. Previously, the closest pairs would be matched regardless of whether other units had been matched. This ensures consistency with other `m.order` arguments.

* Speed and memory improvements to `method = "cem"` with many covariates and a large sample size. Previous versions used a Cartesian expansion of all levels of factor variables, which could easily explode.

* When using `method = "cem"` with `k2k = TRUE`, `m.order` can be set to select the matching order. Allowable options include `"data"` (the default), `"closest"`, `"farthest"`, and `"random"`. `"closest"` is recommended, but `"data"` is the default for now to remain consistent with previous versions.

* Documentation updates.

* Fixed a bug when using `method = "optimal"` or `method = "full"` with `discard` specified and `data` given as a tibble (`tbl_df` object). (#185)

* Fixed a bug when using `method = "cardinality"` with a single covariate. (#194)
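
To illustrate the `normalize` entry above, here is a minimal base-R sketch of the rescaling it describes: within each treatment group, the nonzero weights are divided by their mean so that they average 1. The function name `normalize_weights` and the toy vectors are hypothetical and exist only for this illustration; this is not MatchIt's internal code, and the package may handle edge cases differently.

```r
# Toy inputs (hypothetical): 1 = treated, 0 = control; weight 0 = unmatched
treat <- c(1, 1, 1, 0, 0, 0, 0, 0)
w_raw <- c(1, 1, 0, 0.5, 0.5, 1, 0, 0)

# Rescale the nonzero weights in each treatment group to have mean 1.
# Hypothetical helper for illustration only.
normalize_weights <- function(w, treat) {
  for (g in unique(treat)) {
    in_g <- treat == g & w > 0            # matched units in group g
    if (any(in_g)) {
      w[in_g] <- w[in_g] / mean(w[in_g])  # divide by the group mean
    }
  }
  w
}

w_norm <- normalize_weights(w_raw, treat)
print(w_norm)
```

Under this reading, `normalize = TRUE` corresponds to something like `w_norm`, while `normalize = FALSE` would return the unscaled weights, as in `w_raw`. Unmatched units keep a weight of 0 in both cases.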

# MatchIt 4.5.5

* When using `method = "cardinality"`, a new solver, HiGHS, can be requested by setting `solver = "highs"`, which relies on the `highs` package. This is much faster and more reliable than GLPK and is free and easy to install as a regular R package with no additional requirements.
5 changes: 1 addition & 4 deletions R/MatchIt-package.R
@@ -2,19 +2,16 @@
"_PACKAGE"

## usethis namespace: start
#'
#' @import graphics
#' @import stats
#' @importFrom grDevices devAskNewPage
#' @importFrom grDevices nclass.FD
#' @importFrom grDevices nclass.scott
#' @importFrom grDevices nclass.Sturges
#' @importFrom Rcpp evalCpp
#' @importFrom Rcpp sourceCpp
#' @importFrom utils capture.output
#' @importFrom utils combn
#' @importFrom utils setTxtProgressBar
#' @importFrom utils txtProgressBar
#' @importFrom utils hasName
#' @useDynLib MatchIt, .registration = TRUE
## usethis namespace: end
NULL
60 changes: 46 additions & 14 deletions R/RcppExports.R
@@ -1,36 +1,68 @@
# Generated by using Rcpp::compileAttributes() -> do not edit by hand
# Generator token: 10BE3573-1514-4C36-9D1C-5A225CD40393

dist_to_matrixC <- function(d) {
.Call(`_MatchIt_dist_to_matrixC`, d)
all_equal_to <- function(x, y) {
.Call(`_MatchIt_all_equal_to`, x, y)
}

nn_matchC <- function(treat_, ord_, ratio, discarded, reuse_max, distance_ = NULL, distance_mat_ = NULL, exact_ = NULL, caliper_dist_ = NULL, caliper_covs_ = NULL, caliper_covs_mat_ = NULL, mah_covs_ = NULL, antiexact_covs_ = NULL, unit_id_ = NULL, disl_prog = FALSE) {
.Call(`_MatchIt_nn_matchC`, treat_, ord_, ratio, discarded, reuse_max, distance_, distance_mat_, exact_, caliper_dist_, caliper_covs_, caliper_covs_mat_, mah_covs_, antiexact_covs_, unit_id_, disl_prog)
eucdistC_N1xN0 <- function(x, t) {
.Call(`_MatchIt_eucdistC_N1xN0`, x, t)
}

nn_matchC_closest <- function(distance_mat, treat, ratio, discarded, reuse_max, exact_ = NULL, caliper_dist_ = NULL, caliper_covs_ = NULL, caliper_covs_mat_ = NULL, antiexact_covs_ = NULL, unit_id_ = NULL, disl_prog = FALSE) {
.Call(`_MatchIt_nn_matchC_closest`, distance_mat, treat, ratio, discarded, reuse_max, exact_, caliper_dist_, caliper_covs_, caliper_covs_mat_, antiexact_covs_, unit_id_, disl_prog)
get_splitsC <- function(x, caliper) {
.Call(`_MatchIt_get_splitsC`, x, caliper)
}

nn_matchC_vec <- function(treat_, ord_, ratio_, discarded_, reuse_max, distance_, exact_ = NULL, caliper_dist_ = NULL, caliper_covs_ = NULL, caliper_covs_mat_ = NULL, antiexact_covs_ = NULL, unit_id_ = NULL, disl_prog = FALSE) {
.Call(`_MatchIt_nn_matchC_vec`, treat_, ord_, ratio_, discarded_, reuse_max, distance_, exact_, caliper_dist_, caliper_covs_, caliper_covs_mat_, antiexact_covs_, unit_id_, disl_prog)
has_n_unique <- function(x, n) {
.Call(`_MatchIt_has_n_unique`, x, n)
}

pairdistsubC <- function(x_, t_, s_, num_sub) {
.Call(`_MatchIt_pairdistsubC`, x_, t_, s_, num_sub)
nn_matchC_distmat <- function(treat_, ord, ratio, discarded, reuse_max, focal_, distance_mat, exact_ = NULL, caliper_dist_ = NULL, caliper_covs_ = NULL, caliper_covs_mat_ = NULL, antiexact_covs_ = NULL, unit_id_ = NULL, disl_prog = FALSE) {
.Call(`_MatchIt_nn_matchC_distmat`, treat_, ord, ratio, discarded, reuse_max, focal_, distance_mat, exact_, caliper_dist_, caliper_covs_, caliper_covs_mat_, antiexact_covs_, unit_id_, disl_prog)
}

subclass2mmC <- function(subclass, treat, focal) {
.Call(`_MatchIt_subclass2mmC`, subclass, treat, focal)
nn_matchC_distmat_closest <- function(treat, ratio, discarded, reuse_max, distance_mat, exact_ = NULL, caliper_dist_ = NULL, caliper_covs_ = NULL, caliper_covs_mat_ = NULL, antiexact_covs_ = NULL, unit_id_ = NULL, close = TRUE, disl_prog = FALSE) {
.Call(`_MatchIt_nn_matchC_distmat_closest`, treat, ratio, discarded, reuse_max, distance_mat, exact_, caliper_dist_, caliper_covs_, caliper_covs_mat_, antiexact_covs_, unit_id_, close, disl_prog)
}

nn_matchC_mahcovs <- function(treat_, ord, ratio, discarded, reuse_max, focal_, mah_covs, distance_ = NULL, exact_ = NULL, caliper_dist_ = NULL, caliper_covs_ = NULL, caliper_covs_mat_ = NULL, antiexact_covs_ = NULL, unit_id_ = NULL, disl_prog = FALSE) {
.Call(`_MatchIt_nn_matchC_mahcovs`, treat_, ord, ratio, discarded, reuse_max, focal_, mah_covs, distance_, exact_, caliper_dist_, caliper_covs_, caliper_covs_mat_, antiexact_covs_, unit_id_, disl_prog)
}

nn_matchC_mahcovs_closest <- function(treat, ratio, discarded, reuse_max, mah_covs, distance_ = NULL, exact_ = NULL, caliper_dist_ = NULL, caliper_covs_ = NULL, caliper_covs_mat_ = NULL, antiexact_covs_ = NULL, unit_id_ = NULL, close = TRUE, disl_prog = FALSE) {
.Call(`_MatchIt_nn_matchC_mahcovs_closest`, treat, ratio, discarded, reuse_max, mah_covs, distance_, exact_, caliper_dist_, caliper_covs_, caliper_covs_mat_, antiexact_covs_, unit_id_, close, disl_prog)
}

nn_matchC_vec <- function(treat_, ord, ratio, discarded, reuse_max, focal_, distance, exact_ = NULL, caliper_dist_ = NULL, caliper_covs_ = NULL, caliper_covs_mat_ = NULL, antiexact_covs_ = NULL, unit_id_ = NULL, disl_prog = FALSE) {
.Call(`_MatchIt_nn_matchC_vec`, treat_, ord, ratio, discarded, reuse_max, focal_, distance, exact_, caliper_dist_, caliper_covs_, caliper_covs_mat_, antiexact_covs_, unit_id_, disl_prog)
}

nn_matchC_vec_closest <- function(treat, ratio, discarded, reuse_max, distance, exact_ = NULL, caliper_dist_ = NULL, caliper_covs_ = NULL, caliper_covs_mat_ = NULL, antiexact_covs_ = NULL, unit_id_ = NULL, close = TRUE, disl_prog = FALSE) {
.Call(`_MatchIt_nn_matchC_vec_closest`, treat, ratio, discarded, reuse_max, distance, exact_, caliper_dist_, caliper_covs_, caliper_covs_mat_, antiexact_covs_, unit_id_, close, disl_prog)
}

pairdistsubC <- function(x, t, s) {
.Call(`_MatchIt_pairdistsubC`, x, t, s)
}

subclass2mmC <- function(subclass_, treat, focal) {
.Call(`_MatchIt_subclass2mmC`, subclass_, treat, focal)
}

mm2subclassC <- function(mm, treat, focal = NULL) {
.Call(`_MatchIt_mm2subclassC`, mm, treat, focal)
}

subclass_scootC <- function(subclass_, treat_, x_, min_n) {
.Call(`_MatchIt_subclass_scootC`, subclass_, treat_, x_, min_n)
}

tabulateC <- function(bins, nbins = NULL) {
.Call(`_MatchIt_tabulateC`, bins, nbins)
}

weights_matrixC <- function(mm, treat) {
.Call(`_MatchIt_weights_matrixC`, mm, treat)
weights_matrixC <- function(mm, treat_, focal = NULL) {
.Call(`_MatchIt_weights_matrixC`, mm, treat_, focal)
}

# Register entry points for exported C++ functions