'rand > 0.9 is not TRUE' error in BioC 3.15 and 3.16 #4

Closed
hpages opened this issue May 13, 2022 · 7 comments

hpages commented May 13, 2022

https://bioconductor.org/checkResults/3.15/books-LATEST/OSCA.basic/nebbiolo1-buildsrc.html
https://bioconductor.org/checkResults/3.16/books-LATEST/OSCA.basic/nebbiolo2-buildsrc.html

Using this issue to share progress on this and discuss a fix.

Code from the OSCA.basic book that leads to this error:

# Required packages: scRNAseq, scater, org.Mm.eg.db, scran, GSEABase, AUCell, bluster.

# Data loading

library(scRNAseq)
sce.zeisel <- ZeiselBrainData()

library(scater)
sce.zeisel <- aggregateAcrossFeatures(sce.zeisel,
    id=sub("_loc[0-9]+$", "", rownames(sce.zeisel)))

library(org.Mm.eg.db)
rowData(sce.zeisel)$Ensembl <- mapIds(org.Mm.eg.db,
    keys=rownames(sce.zeisel), keytype="SYMBOL", column="ENSEMBL")

# Quality control

stats <- perCellQCMetrics(sce.zeisel, subsets=list(
    Mt=rowData(sce.zeisel)$featureType=="mito"))
qc <- quickPerCellQC(stats, percent_subsets=c("altexps_ERCC_percent",
    "subsets_Mt_percent"))
sce.zeisel <- sce.zeisel[,!qc$discard]

# Normalization

library(scran)
set.seed(1000)
clusters <- quickCluster(sce.zeisel)
sce.zeisel <- computeSumFactors(sce.zeisel, cluster=clusters)
sce.zeisel <- logNormCounts(sce.zeisel)

# Assigning cell labels from gene sets

library(scran)
wilcox.z <- pairwiseWilcox(sce.zeisel, sce.zeisel$level1class, 
    lfc=1, direction="up")
markers.z <- getTopMarkers(wilcox.z$statistics, wilcox.z$pairs,
    pairwise=FALSE, n=50)
lengths(markers.z)

library(scRNAseq)
sce.tasic <- TasicBrainData()

library(GSEABase)
all.sets <- lapply(names(markers.z), function(x) {
    GeneSet(markers.z[[x]], setName=x)
})
all.sets <- GeneSetCollection(all.sets)

library(AUCell)
rankings <- AUCell_buildRankings(counts(sce.tasic),
    plotStats=FALSE, verbose=FALSE)
cell.aucs <- AUCell_calcAUC(all.sets, rankings)
results <- t(assay(cell.aucs))
head(results)

new.labels <- colnames(results)[max.col(results)]

library(bluster)
rand <- pairwiseRand(new.labels, sce.tasic$broad_type, mode="index")
rand
# [1] 0.02957151

stopifnot(rand > 0.9)

Runs in about 1 min on my laptop (Ubuntu 22.04 LTS, 16 GB of RAM).

The new rand value (~0.03) is a drastic drop from the original one, which had to exceed 0.9 to pass the check above!
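
To put the two values in perspective, here is a minimal base R sketch of the adjusted Rand index on toy labels. It assumes that `pairwiseRand(..., mode="index")` reports the adjusted variant by default; the helper `adjusted_rand()` and the toy label vectors are hypothetical and only meant for illustration, not taken from bluster.

```r
# Hypothetical helper: adjusted Rand index between two label vectors.
adjusted_rand <- function(x, y) {
    tab <- table(x, y)                        # contingency table of the two labelings
    choose2 <- function(n) n * (n - 1) / 2    # number of pairs among n items
    sum.ij <- sum(choose2(tab))               # pairs grouped together in both labelings
    sum.i <- sum(choose2(rowSums(tab)))       # pairs grouped together in x
    sum.j <- sum(choose2(colSums(tab)))       # pairs grouped together in y
    expected <- sum.i * sum.j / choose2(sum(tab))
    max.index <- (sum.i + sum.j) / 2
    (sum.ij - expected) / (max.index - expected)
}

# Toy data (made up): six cells, three reference types.
reference <- c("a", "a", "b", "b", "c", "c")
matching  <- c("x", "x", "y", "y", "z", "z")  # same partition, different names
unrelated <- c("x", "y", "z", "x", "y", "z")  # partition unrelated to the reference

adjusted_rand(reference, matching)   # 1
adjusted_rand(reference, unrelated)  # -0.25
```

A value near 1 means the two labelings agree on almost every pair of cells, while a value near 0 (or below) is what unrelated labelings give, so ~0.03 suggests the assigned labels carry almost no information about `broad_type`.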

hpages commented May 13, 2022

sessionInfo():

R version 4.2.0 Patched (2022-05-04 r82318)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04 LTS

Matrix products: default
BLAS:   /home/hpages/R/R-4.2.r82318/lib/libRblas.so
LAPACK: /home/hpages/R/R-4.2.r82318/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_GB              LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] bluster_1.7.0               AUCell_1.19.0              
 [3] GSEABase_1.59.0             graph_1.75.0               
 [5] annotate_1.75.0             XML_3.99-0.9               
 [7] scran_1.25.0                org.Mm.eg.db_3.15.0        
 [9] AnnotationDbi_1.59.0        scater_1.25.1              
[11] ggplot2_3.3.6               scuttle_1.7.0              
[13] scRNAseq_2.11.0             SingleCellExperiment_1.19.0
[15] SummarizedExperiment_1.27.1 Biobase_2.57.0             
[17] GenomicRanges_1.49.0        GenomeInfoDb_1.33.3        
[19] IRanges_2.31.0              S4Vectors_0.35.0           
[21] BiocGenerics_0.43.0         MatrixGenerics_1.9.0       
[23] matrixStats_0.62.0         

loaded via a namespace (and not attached):
  [1] AnnotationHub_3.5.0           BiocFileCache_2.5.0          
  [3] igraph_1.3.1                  lazyeval_0.2.2               
  [5] BiocParallel_1.31.3           digest_0.6.29                
  [7] ensembldb_2.21.1              htmltools_0.5.2              
  [9] viridis_0.6.2                 fansi_1.0.3                  
 [11] magrittr_2.0.3                memoise_2.0.1                
 [13] ScaledMatrix_1.5.0            cluster_2.1.3                
 [15] limma_3.53.0                  Biostrings_2.65.0            
 [17] R.utils_2.11.0                prettyunits_1.1.1            
 [19] colorspace_2.0-3              blob_1.2.3                   
 [21] rappdirs_0.3.3                ggrepel_0.9.1                
 [23] dplyr_1.0.9                   crayon_1.5.1                 
 [25] RCurl_1.98-1.6                glue_1.6.2                   
 [27] gtable_0.3.0                  zlibbioc_1.43.0              
 [29] XVector_0.37.0                DelayedArray_0.23.1          
 [31] BiocSingular_1.13.0           scales_1.2.0                 
 [33] DBI_1.1.2                     edgeR_3.39.1                 
 [35] Rcpp_1.0.8.3                  viridisLite_0.4.0            
 [37] xtable_1.8-4                  progress_1.2.2               
 [39] dqrng_0.3.0                   bit_4.0.4                    
 [41] rsvd_1.0.5                    metapod_1.5.0                
 [43] httr_1.4.3                    ellipsis_0.3.2               
 [45] pkgconfig_2.0.3               R.methodsS3_1.8.1            
 [47] dbplyr_2.1.1                  locfit_1.5-9.5               
 [49] utf8_1.2.2                    tidyselect_1.1.2             
 [51] rlang_1.0.2                   later_1.3.0                  
 [53] munsell_0.5.0                 BiocVersion_3.16.0           
 [55] tools_4.2.0                   cachem_1.0.6                 
 [57] cli_3.3.0                     generics_0.1.2               
 [59] RSQLite_2.2.14                ExperimentHub_2.5.0          
 [61] stringr_1.4.0                 fastmap_1.1.0                
 [63] yaml_2.3.5                    bit64_4.0.5                  
 [65] purrr_0.3.4                   KEGGREST_1.37.0              
 [67] AnnotationFilter_1.21.0       sparseMatrixStats_1.9.0      
 [69] mime_0.12                     R.oo_1.24.0                  
 [71] xml2_1.3.3                    biomaRt_2.53.1               
 [73] compiler_4.2.0                beeswarm_0.4.0               
 [75] filelock_1.0.2                curl_4.3.2                   
 [77] png_0.1-7                     interactiveDisplayBase_1.35.0
 [79] tibble_3.1.7                  statmod_1.4.36               
 [81] stringi_1.7.6                 GenomicFeatures_1.49.1       
 [83] lattice_0.20-45               ProtGenerics_1.29.0          
 [85] Matrix_1.4-1                  vctrs_0.4.1                  
 [87] pillar_1.7.0                  lifecycle_1.0.1              
 [89] BiocManager_1.30.17           BiocNeighbors_1.15.0         
 [91] data.table_1.14.2             bitops_1.0-7                 
 [93] irlba_2.3.5                   httpuv_1.6.5                 
 [95] rtracklayer_1.57.0            R6_2.5.1                     
 [97] BiocIO_1.7.1                  promises_1.2.0.1             
 [99] gridExtra_2.3                 vipor_0.4.5                  
[101] assertthat_0.2.1              rjson_0.2.21                 
[103] withr_2.5.0                   GenomicAlignments_1.33.0     
[105] Rsamtools_2.13.1              GenomeInfoDbData_1.2.8       
[107] parallel_4.2.0                hms_1.1.1                    
[109] grid_4.2.0                    beachmat_2.13.0              
[111] DelayedMatrixStats_1.19.0     shiny_1.7.1                  
[113] ggbeeswarm_0.6.0              restfulr_0.0.13              

hpages commented May 13, 2022

And the culprit is... a revamping of the AUCell::AUCell_buildRankings() generic and methods between AUCell 1.17.0 and 1.18.0! What's scary is that the function now seems to produce completely different results. Taking a closer look now...

hpages commented May 14, 2022

Found it! AUCell::AUCell_buildRankings() now ranks genes from lowest to highest expression instead of from highest to lowest expression. See aertslab/AUCell#27
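
A minimal base R sketch of why the ranking direction matters so much. The data below is a toy vector, not output from AUCell, and the top-2 cutoff is an arbitrary stand-in for the way AUCell scores a gene set using only the top of each cell's ranking.

```r
# Toy expression values for five genes in one cell (hypothetical data).
expr <- c(geneA=10, geneB=0, geneC=5, geneD=7, geneE=1)

# Descending ranking: the most highly expressed gene gets rank 1.
rank.desc <- rank(-expr, ties.method="first")

# Ascending ranking: the least expressed gene gets rank 1.
rank.asc <- rank(expr, ties.method="first")

# Genes that land in the top 2 of each ranking.
top.desc <- names(rank.desc)[rank.desc <= 2]
top.asc <- names(rank.asc)[rank.asc <= 2]

top.desc  # "geneA" "geneD"
top.asc   # "geneB" "geneE"
```

With the direction flipped, the genes at the top of the ranking are the least expressed ones, so marker genes for the cell's true type no longer contribute to its score and the assigned labels become close to arbitrary.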

hpages commented May 19, 2022

@LTLA @vjcitn They say they've fixed AUCell. Let's check tomorrow's build report for OSCA.basic 🤞

hpages commented May 20, 2022

Still broken 😞 Now it's because of this other regression introduced in the latest AUCell.

hpages commented May 23, 2022

@PeteHaitch @lgeistlinger @alanocallaghan As mentioned on Slack, this issue is the last thing preventing the OSCA sub-books from being all green on the build reports.

I've tried one more time to convince the AUCell developers to avoid the kind of breaking change that they've introduced in the latest version of their package. But that's it. My 2-week interim of maintaining the OSCA book ends here 😉

I hope you guys can take it from there.

Thanks again for volunteering and let me know here or on Slack if you have any questions.

H.

P.S.: Also please don't forget to update the maintainer in the DESCRIPTION files of the sub-books. Thanks again!

hpages commented Oct 2, 2024

This seems to have been addressed for a while. Closing now...

hpages closed this as completed Oct 2, 2024