adjust ID of duplicate NA patterns #107

Open
benthestatistician opened this issue May 16, 2019 · 0 comments

Collaborator

benthestatistician commented May 16, 2019

One of the tasks of design_matrix() (aka model_matrix()) is to identify terms with common missingness patterns, so as to avoid storing the same information in two places. Some of these operations would be more succinct if they used duplicated.matrix(..., MARGIN=2) to identify replicate columns of the data frame that is cast but not stored here --

null.record <- rowSums(as.data.frame(ccs.by.term))==0

-- rather than applying duplicated.default() to a list. They may well also be more efficient; the duplicated() help page bears a "Warning" reading:

Using this for lists is potentially slow, especially if the
elements are not atomic vectors (see ‘vector’) or differ only in
their attributes. In the worst case it is O(n^2).
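
A rough sketch of the comparison, for illustration only. The toy `ccs.by.term` below is a made-up stand-in (a named list of logical complete-case indicators, one per term); the real object in the package may be shaped differently. Note that the argument is spelled `MARGIN` in base R, and that the data frame has to be cast with `as.matrix()` before the matrix method applies.

```r
# Hypothetical stand-in for ccs.by.term: one logical vector per term,
# TRUE where the row is a complete case for that term.
ccs.by.term <- list(
  x1 = c(TRUE, TRUE, FALSE, TRUE),
  x2 = c(TRUE, FALSE, TRUE, TRUE),
  x3 = c(TRUE, TRUE, FALSE, TRUE),  # same pattern as x1
  x4 = c(TRUE, FALSE, TRUE, TRUE)   # same pattern as x2
)

# Approach described above: duplicated.default() applied to a list.
dup.list <- duplicated(ccs.by.term)

# Suggested alternative: cast to a matrix and compare columns (MARGIN = 2).
cc.mat <- as.matrix(as.data.frame(ccs.by.term))
dup.cols <- duplicated(cc.mat, MARGIN = 2)

# On this toy input the two approaches flag the same terms as duplicates.
stopifnot(all(dup.cols == dup.list))

# Terms whose missingness pattern needs to be stored once.
names(ccs.by.term)[!dup.cols]
```

Whether the matrix route is actually faster on realistic inputs would need to be benchmarked; the sketch only shows that the two calls agree on which columns are replicates.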
