SparseM based approach to issue 134 #136

adamrauh · 2024-06-25T04:21:35Z

Hi all,

I've implemented the approach outlined by @benthestatistician here to solve that issue. Ben's description there captures what's happening pretty well, but to recap things broadly:

- slm.fit.csr.fixed has been renamed slm_fit_csr, per Ben's recommendation.
- The slm_fit_csr function now ensures that the matrix passed to SparseM::chol() (specifically $x^\prime x$, or xprimex := t(x) %*% x) is positive definite. To do this, it scans the diagonal of xprimex for 0s and removes the corresponding rows/columns from xprimex and xy. The removal is done by creating a sparse matrix (via SparseM) that subsets xprimex and xy through matrix multiplication. The results are then adjusted after the fact to fill in the removed elements. The helper functions SparseM_solve() and create_SparseM_reduction_matrix() assist with these procedures. The main selling point here is that everything is handled with sparse matrices.
- I've added tests for the second helper function, create_SparseM_reduction_matrix(), to test.utils.R and updated the documentation to reflect the new names/functions. All tests pass for me now.
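To make the reduction step concrete, here is a minimal dense, base-R sketch of the idea described above. It is not the code in this PR: the PR works on SparseM objects throughout, and the names reduction_matrix() and solve_reduced() are hypothetical. Filling the removed positions with 0 is also an assumption made only for illustration.

```r
# Dense base-R sketch of the reduce / solve / fill-in idea.
# reduction_matrix() and solve_reduced() are hypothetical names.

# Build a p x k selection matrix R whose columns pick out the kept indices,
# so that t(R) %*% A %*% R is the submatrix of A on those indices.
reduction_matrix <- function(keep, p) {
  R <- matrix(0, nrow = p, ncol = length(keep))
  R[cbind(keep, seq_along(keep))] <- 1
  R
}

solve_reduced <- function(x, y) {
  xprimex <- crossprod(x)       # t(x) %*% x
  xy <- crossprod(x, y)         # t(x) %*% y
  zeroes <- diag(xprimex) == 0  # TRUE where a column of x is all zero
  keep <- which(!zeroes)
  coef <- numeric(ncol(x))      # removed positions stay 0 (an assumption)
  if (length(keep) > 0) {
    R <- reduction_matrix(keep, ncol(x))
    xpx_red <- t(R) %*% xprimex %*% R   # drop zero rows/columns
    xy_red <- t(R) %*% xy
    U <- chol(xpx_red)                  # upper triangular, t(U) %*% U == xpx_red
    coef_red <- backsolve(U, forwardsolve(t(U), xy_red))
    coef <- drop(R %*% coef_red)        # expand back to full length
  }
  list(coef = coef, keep = keep)
}

# The second column is all zero, so it is dropped before factoring.
x <- cbind(c(1, 1, 1, 1), c(0, 0, 0, 0), c(1, 2, 3, 4))
y <- c(2, 4, 6, 8)
fit <- solve_reduced(x, y)
```

Multiplying by the selection matrix, rather than indexing directly, mirrors the matrix-multiplication subsetting described above, which keeps every intermediate object a matrix.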

(This PR also includes a fix for issue 124 -- perhaps we can close the separate request for that fix and just include those changes here, if that works.)

man/create_SparseM_reduction_matrix.Rd

benthestatistician

In this nice new implementation, slm_fit_csr() returns a Cholesky decomposition (chol) and coefficients corresponding to the $X'X$ submatrix with nonzero diagonal values. It would be good for it also to return information with which one could identify that submatrix of $X'X$; unless I'm missing something, it doesn't currently do this. So I'd like to suggest revising it and downstream functions to present this information in the named list, e.g. as an integer vector gramian_reduction_indices (set equal to which(!zeroes), with zeroes as in the body of SparseM_solve()).
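As a rough illustration of why the index is useful, the sketch below shows how a caller might use it to recover the submatrix that the stored factor corresponds to. It uses dense base-R matrices rather than SparseM, and the list built here is a hypothetical stand-in for the return value of slm_fit_csr(), not its actual structure.

```r
# Hypothetical stand-in for the returned named list; only the element name
# gramian_reduction_indices comes from the suggestion above.
x <- cbind(c(1, 1, 1), c(0, 0, 0), c(1, 2, 4))
xprimex <- crossprod(x)
zeroes <- diag(xprimex) == 0
fit <- list(
  chol = chol(xprimex[!zeroes, !zeroes, drop = FALSE]),
  gramian_reduction_indices = which(!zeroes)
)

# With the index in hand, the relevant submatrix of X'X can be identified.
idx <- fit$gramian_reduction_indices
sub_gramian <- xprimex[idx, idx, drop = FALSE]

# The stored factor reproduces exactly that submatrix.
stopifnot(isTRUE(all.equal(crossprod(fit$chol), sub_gramian)))
```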

Relatedly, I'll suggest renaming create_SparseM_reduction_matrix() to gramian_reduction() or SparseM_gramian_reduction().

I appreciate the testing of create_SparseM_reduction_matrix(), and the code factoring that allowed it to happen. Thanks!

adamrauh · 2024-06-26T19:16:49Z

Including the index as part of the returned information is a good idea. I've added it as another element to the named list that gets returned from slm_fit_csr(). Thanks!

One thing to note -- As currently implemented, this index gets returned even when there are no zeroes on the diagonal. I can see an argument for only including it when necessary -- that is, when the reduction process is actually needed -- but my instinct would be to keep it there for consistency, even if slightly redundant.

benthestatistician · 2024-06-26T20:42:46Z

No, thank you! All this looks good to me now. I'll invite Jake to take a quick look, and if he's OK too then this can go into master.

Regarding Adam's "one thing to note," I think that in cases where no reduction of the Gramian was necessary, it's helpful to have this affirmative reflected in a gramian_reduction_index value of integer(0).

jwbowers

This looks great. I especially like the tests.

benthestatistician · 2024-06-27T22:04:02Z

Thanks, Jake and Adam!

adamrauh added 10 commits May 19, 2024 00:50

potential patch for markmfredrickson#124

88d1f95

simplifying code for markmfredrickson#124 patch

edbe742

Merge branch 'markmfredrickson:main' into main

dc62789

implementing SparseM based fix to markmfredrickson#134

0323435

documentation changes

d73b972

handling edge case where all xprimex diagonal values are zero

58a6936

adding tests for helper function related to dimension reduction of SparseM matrices

57246d3

renaming slm.fit.csr.fixed to slm_fit_csr

4fa526a

fixing naming issue

e3c46fc

fixing documentation issues

af95a34

benthestatistician reviewed Jun 25, 2024

man/create_SparseM_reduction_matrix.Rd (outdated, resolved)

documentation fixes

5927976

benthestatistician requested changes Jun 25, 2024

adamrauh added 3 commits June 26, 2024 14:49

return gramian reduction index

ea17c62

integrating renamed function

832bbc1

updating docs with new name

5728a27

benthestatistician approved these changes Jun 26, 2024

benthestatistician requested a review from jwbowers June 26, 2024 20:43

jwbowers approved these changes Jun 27, 2024

jwbowers merged commit a660143 into markmfredrickson:main Jun 27, 2024
5 checks passed

benthestatistician added a commit that referenced this pull request Jun 27, 2024

Docs edits

93e03b5

cf #134, #136
