You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Hi! I have a clunky enhancement that I made to make_cicero_cds that requires the use of Rcpp. I'm not the best at pull requests yet, so I'll report it here and if you choose to implement it, it's up to you :)
Basically, I am working with a very large set of imputed data (14,000 cells x 256,000 peaks). Normally this data is very sparse (~95%) but after imputation, less than half a percent is a 0. Previously make_cicero_cds has been able to (although not super efficiently) work with this by converting sparse matrices to dense ones within R. The specific step that was overloading my system (64 core 256GB RAM) was the Matrix multiplication on line 73. new_exprs <- exprs_old %*% mask
So, I decided to spend some time trying to find alternatives.
Rcpp Implementation
I figured that a more efficient way of accomplishing this would be to use C++, implemented through Rcpp. I searched the internet and found this script.
// [[Rcpp::depends(RcppArmadillo, RcppEigen)]]
#include <RcppArmadillo.h>
#include <RcppEigen.h>
// [[Rcpp::export]]
SEXP armaMatMult(arma::mat A, arma::mat B){
arma::mat C = A * B;
return Rcpp::wrap(C);
}
// [[Rcpp::export]]
SEXP eigenMatMult(Eigen::MatrixXd A, Eigen::MatrixXd B){
Eigen::MatrixXd C = A * B;
return Rcpp::wrap(C);
}
// [[Rcpp::export]]
SEXP eigenMapMatMult(const Eigen::Map<Eigen::MatrixXd> A, Eigen::Map<Eigen::MatrixXd> B){
Eigen::MatrixXd C = A * B;
return Rcpp::wrap(C);
}
Following the article, I saved this as MatrixMtp.cpp and loaded it into my R session.
Next, I edited the make_cicero_cds function:
Now, this function eigenMatMult works, but the faster version eigenMapMatMult didn't, due to the error
Error in eigenMapMatMult(as(exprs_old, "matrix"), mask) :
Wrong R type for mapped matrix
I'm not a C++ expert so I just used the other version.
It worked! Although it took between 50-125GB of RAM to complete the function, it was able to actually make the dataset. I think there may be other people struggling with this problem, so maybe this will help.
The text was updated successfully, but these errors were encountered:
Hi! I have a clunky enhancement that I made to make_cicero_cds that requires the use of Rcpp. I'm not the best at pull requests yet, so I'll report it here and if you choose to implement it, it's up to you :)
Basically, I am working with a very large set of imputed data (14,000 cells x 256,000 peaks). Normally this data is very sparse (~95%) but after imputation, less than half a percent is a 0. Previously make_cicero_cds has been able to (although not super efficiently) work with this by converting sparse matrices to dense ones within R. The specific step that was overloading my system (64 core 256GB RAM) was the Matrix multiplication on line 73.
new_exprs <- exprs_old %*% mask
So, I decided to spend some time trying to find alternatives.
Rcpp Implementation
I figured that a more efficient way of accomplishing this would be to use C++, implemented through Rcpp. I searched the internet and found this script.
Following the article, I saved this as
MatrixMtp.cpp
and loaded it into my R session.Next, I edited the make_cicero_cds function:
Now, this function
eigenMatMult
works, but the faster versioneigenMapMatMult
didn't, due to the errorI'm not a C++ expert so I just used the other version.
It worked! Although it took between 50-125GB of RAM to complete the function, it was able to actually make the dataset. I think there may be other people struggling with this problem, so maybe this will help.
The text was updated successfully, but these errors were encountered: