Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SparseM based approach to issue 134 #136

Merged
merged 14 commits into from
Jun 27, 2024

Conversation

adamrauh
Copy link
Collaborator

Hi all,

I've implemented the approach outlined by @benthestatistician here to solve that issue. Ben's description there captures what's happening pretty well, but to recap things broadly:

  1. slm.fit.csr.fixed has been renamed slm_fit_csr, per Ben's recommendation

  2. The slm_fit_csr function now ensures that the matrix (specifically $x^\prime x$ or xprimex := t(x) %*% x )passed to SparseM::chol() is positive definite. To do this, scan the diagonal of xprimex for 0s and then remove the corresponding rows/columns from xprimex and xy. This is done by creating a sparse matrix (via SparseM) that can be used to subset xprimex and xy via some matrix multiplication. Things are then adjusted after the fact in order to fill in the removed elements. The helper functions SparseM_solve() and create_SparseM_reduction_matrix() assist with these procedures. The main selling point here is that everything is handled with sparse matrices.

  3. I've added some tests for this second helper function to test.utils.R and updated the documentation to reflect the new names/functions. All tests pass for me now.

(This PR also includes a fix for issue 124 -- perhaps we can close the request here and just include those changes here if that works.)

Copy link
Collaborator

@benthestatistician benthestatistician left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this nice new implementation, slm_fit_csr() returns a Cholesky decomp (chol) and coefficients corresponding to the $X'X$ submatrix with nonzero diagonal values. It would be good for it also to return info with which one could identify that submatrix of $X'X$; unless I'm missing something, it doesn't currently do this. So I'd like to suggest revising it and downstream functions in order to present this information in the named list, e.g. as an integer vector gramian_reduction_indices (set equal to which(!zeroes), with zeroes as in the body of SparseM_solve().

Relatedly, I'll suggest renaming create_SparseM_reduction_matrix() to gramian_reduction() or SparseM_gramian_reduction().

I appreciate the testing of create_SparseM_reduction_matrix(), and the code factoring that allowed it to happen. Thanks!

@adamrauh
Copy link
Collaborator Author

Including the index as part of the returned information is a good idea. I've added it as another element to the named list that gets returned from slm_fit_csr(). Thanks!

One thing to note -- As currently implemented, this index gets returned even when there are no zeroes on the diagonal. I can see an argument for only including it when necessary -- that is, when the reduction process is actually needed -- but my instinct would be to keep it there for consistency, even if slightly redundant.

@benthestatistician
Copy link
Collaborator

No, thank you! All this looks good to me now. I'll invite Jake to take a quick look, and if he's OK too then this can go into master.

Regarding Adam's "one thing to note," I think that in cases where no reduction of the Gramian was necessary, it's helpful to have this affirmative reflected in a gramian_reduction_index value of integer(0).

Copy link
Collaborator

@jwbowers jwbowers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great. I especially like the tests.

@jwbowers jwbowers merged commit a660143 into markmfredrickson:main Jun 27, 2024
5 checks passed
benthestatistician added a commit that referenced this pull request Jun 27, 2024
@benthestatistician
Copy link
Collaborator

Thanks, Jake and Adam!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants