
Constraint use cases

Ben edited this page Aug 21, 2013 · 4 revisions

This page lists several use cases for users specifying constraints on their matching problem. Assume the user has built a reasonably sparse distance matrix D (e.g., through the use of caliper() or exactMatch()) from a data.frame named mydata.
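As a minimal sketch of that setup, the following builds such a D with optmatch. The data frame contents (treatment indicator z, factor site, covariates x1 and x2) and the caliper width are hypothetical, chosen only for illustration.

```r
library(optmatch)

# Hypothetical data: 16 treated (z = 1) and 20 control (z = 0) units at two sites.
set.seed(2013)
mydata <- data.frame(
  z    = rep(c(1, 0), times = c(16, 20)),
  site = factor(rep(c("a", "b", "a", "b"), times = c(8, 8, 10, 10))),
  x1   = rnorm(36),
  x2   = rnorm(36)
)

# Mahalanobis distances on x1 and x2, allowing matches only within a site.
D <- match_on(z ~ x1 + x2, data = mydata,
              within = exactMatch(z ~ site, data = mydata))

# Optionally also forbid pairs further apart than a chosen width
# (the width of 2 is arbitrary here).
D_caliper <- D + caliper(D, width = 2)
```

Note that a tight caliper can leave some treated units with no eligible controls, which may make the constrained matches below infeasible.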

Balanced or roughly balanced data

Assume the user has roughly equal numbers of treated and control units.

1:1 groups

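If the user wants 1:1 matched groups, the pairmatch function will compute the proper fraction of the treated or control data to drop, and the match will likely succeed: pairmatch(D, data = mydata). (The data frame should be passed by name, since the second positional argument of pairmatch is controls.)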
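A small illustration of the 1:1 case, assuming hypothetical data with slightly more controls than treated units in each exact-matching stratum (all variable names are made up for the example):

```r
library(optmatch)

# Hypothetical data: 8 treated and 10 controls at each of two sites.
set.seed(2013)
mydata <- data.frame(
  z    = rep(c(1, 0), times = c(16, 20)),
  site = factor(rep(c("a", "b", "a", "b"), times = c(8, 8, 10, 10))),
  x1   = rnorm(36),
  x2   = rnorm(36)
)
D <- match_on(z ~ x1 + x2, data = mydata,
              within = exactMatch(z ~ site, data = mydata))

# 1:1 matching; surplus controls are left unmatched (NA in the result).
pm <- pairmatch(D, data = mydata)

summary(pm)            # matched-set structure
table(is.na(pm))       # how many units were dropped
```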

Minimum average distance

The user wants to minimize average distance within groups and is willing to throw away controls, subject to a minimal sample size level. For example, the user would be willing to throw away up to 1/2 of the observations to get the minimal average distance. The exact number of treated and control units per match can be allowed to vary, perhaps without restriction or perhaps within user-given constraints (specified using min.controls and/or max.controls).

The user may use fullmatch, communicating the fraction of the control group to be included in some match using either the omit.fraction or the mean.controls argument. Having found the optimal solution in this way, such a user is often well advised to try two refinements. The first is to increase the fraction of the control group that is included: this may lead to solutions that are near-optimal in terms of average matched distance but that use more of the control group. The second is to increase the min.controls argument to fullmatch, which tends to increase effective sample size.
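The following sketch shows both ways of shaping such a match, on hypothetical data. The specific values (omit.fraction = 0.2, min.controls = 1/2) are illustrative assumptions, not recommendations.

```r
library(optmatch)

# Hypothetical data: 8 treated and 10 controls at each of two sites.
set.seed(2013)
mydata <- data.frame(
  z    = rep(c(1, 0), times = c(16, 20)),
  site = factor(rep(c("a", "b", "a", "b"), times = c(8, 8, 10, 10))),
  x1   = rnorm(36),
  x2   = rnorm(36)
)
D <- match_on(z ~ x1 + x2, data = mydata,
              within = exactMatch(z ~ site, data = mydata))

# Full matching that discards 20% of the controls in each stratum.
# (mean.controls is the alternative way to say this; supply one or the other.)
fm <- fullmatch(D, omit.fraction = 0.2, data = mydata)

# Same problem, but with at most 2 treated units sharing a control
# (min.controls = 1/2), which tends to raise effective sample size.
fm_restricted <- fullmatch(D, min.controls = 1/2, omit.fraction = 0.2,
                           data = mydata)

summary(fm)
summary(fm_restricted)
```

Comparing the two summaries gives a sense of how much average distance is traded for effective sample size.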

Controls greatly outnumber treated

In this set of use cases, the number of control units is much larger than the number of treated units. Naively running fullmatch may result in matched sets with tens or hundreds of controls assigned to a single treated unit. In addition to the conditions above, the user may also be interested in some of the following cases:

1:k matches

Here the user wants precisely 1:k matches, discarding no treated units. This is the analog of pair matching in the unbalanced case, and the pairmatch function can be applied by setting the controls = k argument.
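For instance, with k = 2 on hypothetical data that has four controls per treated unit (names and sizes are assumptions for illustration):

```r
library(optmatch)

# Hypothetical data: 5 treated and 20 controls at each of two sites.
set.seed(2013)
mydata <- data.frame(
  z    = rep(c(1, 0), times = c(10, 40)),
  site = factor(rep(c("a", "b", "a", "b"), times = c(5, 5, 20, 20))),
  x1   = rnorm(50),
  x2   = rnorm(50)
)
D <- match_on(z ~ x1 + x2, data = mydata,
              within = exactMatch(z ~ site, data = mydata))

# Exactly two controls per treated unit; remaining controls are unmatched.
pm2 <- pairmatch(D, controls = 2, data = mydata)

summary(pm2)
```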

1:1 to 1:k matches

Here the user must use fullmatch and set the min.controls and max.controls arguments. The user may specify mean.controls to throw away additional control cases, or may omit that argument and allow the infeasibility recovery mechanism to find a mean.controls level that allows matching. (Note that infeasibility recovery does not guarantee that it will find the largest possible mean.controls, only that it will attempt to find one that works. A user who wants to include more controls could start with the mean.controls that was used and attempt to find a higher value manually.)
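A sketch of the 1:1 to 1:3 case on hypothetical data, supplying mean.controls explicitly so the example does not depend on infeasibility recovery. It assumes the installed optmatch exports optmatch_restrictions(), which reports the restrictions a match was computed under.

```r
library(optmatch)

# Hypothetical data: 5 treated and 20 controls at each of two sites.
set.seed(2013)
mydata <- data.frame(
  z    = rep(c(1, 0), times = c(10, 40)),
  site = factor(rep(c("a", "b", "a", "b"), times = c(5, 5, 20, 20))),
  x1   = rnorm(50),
  x2   = rnorm(50)
)
D <- match_on(z ~ x1 + x2, data = mydata,
              within = exactMatch(z ~ site, data = mydata))

# Between 1 and 3 controls per treated unit, averaging 2 per treated unit.
fm <- fullmatch(D, min.controls = 1, max.controls = 3,
                mean.controls = 2, data = mydata)

summary(fm)

# Inspect the restrictions actually used; useful as a starting point
# when searching manually for a larger feasible mean.controls.
optmatch_restrictions(fm)
```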

k:1 to 1:k matches

In this use case, the user is willing to accept matched groups of size k + 1, with either controls or treatments being in the majority. Here the user can use fullmatch(D, min.controls = 1/k, max.controls = k, data = mydata), and again allow the infeasibility recovery to choose a suitable (though potentially non-optimal, as discussed above) mean.controls.