Cellij (pronounced "zillīj" and derived from Zellij, a style of mosaic tilework made from individually hand-chiseled tile pieces) is a versatile framework for rapidly building and training a wide range of factor analysis models on multi-omics data. Cellij builds on a Bayesian factor analysis skeleton that is designed to be customisable at all levels, from likelihoods and optimisation procedures to sparsity-inducing priors.
Cellij is designed for rapid prototyping of custom factor analysis models, allowing users to efficiently define new models in an iterative fashion. The following code snippet shows an example of how to set up and train a model with a predefined sparsity prior.
import cellij

mdata = cellij.Importer().load_CLL()
# 1. We create a new Factor Analysis model
model = cellij.FactorModel(n_factors=10)
# 2. We add a MuData object to the model
model.add_data(mdata)
# 3. We can add some options if we wish
model.set_model_options(
    weight_priors={
        "drugs": "Horseshoe",
        "methylation": "Horseshoe",
        "mrna": "Horseshoe",
    },
)
# 4. We train the model
model.fit(epochs=10000)
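For intuition about what a horseshoe prior on the weights does, here is a minimal NumPy sketch. It is not Cellij's implementation: the function name `sample_horseshoe_weights`, the fixed `global_scale`, and the matrix sizes are assumptions chosen for illustration only.

```python
import numpy as np


def sample_horseshoe_weights(n_features, n_factors, global_scale=0.1, seed=0):
    """Draw a weight matrix from a horseshoe-style prior (illustrative only)."""
    rng = np.random.default_rng(seed)
    # Local scales: one half-Cauchy draw per weight. The heavy tail lets a
    # few weights escape shrinkage while most stay close to zero.
    local_scale = np.abs(rng.standard_cauchy(size=(n_features, n_factors)))
    # The global scale is kept fixed here for simplicity; a full horseshoe
    # prior would typically place a half-Cauchy prior on it as well.
    return rng.normal(size=(n_features, n_factors)) * global_scale * local_scale


# Hypothetical sizes: 200 features, 10 factors.
weights = sample_horseshoe_weights(200, 10)
abs_weights = np.abs(weights)
print("median |w|:", np.median(abs_weights))
print("max |w|:   ", abs_weights.max())
```

The gap between the median and the maximum absolute weight is what makes the resulting factors sparse: most features contribute little to a factor, while a few contribute strongly.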
For basic tutorials on real-world data, please have a look at our notebook repository.
Cellij is a batteries-included framework:
- Sparsity priors: Cellij comes with a variety of sparsity priors that you can directly leverage to obtain interpretable results.
- Integration of Covariates: Cellij can incorporate metadata, such as spatial or temporal dependencies between samples, to structure and align the latent space.
- Rapid Prototyping: Cellij is designed for rapid prototyping of custom FA models, allowing users, including inexperienced ones, to efficiently define new models in an iterative fashion.
- Flexibility: Through our interface, we provide a wide range of options to customize your factor analysis model at all levels.
- Missing values: You do not need to impute missing entries in your data with (potentially unreasonable) values, because Cellij can simply ignore them during inference.
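The idea behind ignoring missing values during inference can be sketched as a masked likelihood, where only observed entries contribute. The following NumPy example is a simplified illustration under assumed names (`masked_gaussian_log_likelihood`, `noise_std`) and a Gaussian likelihood; it does not show how Cellij implements this internally.

```python
import numpy as np


def masked_gaussian_log_likelihood(y, z, w, noise_std=1.0):
    """Gaussian log-likelihood of y under the reconstruction z @ w.T,
    counting only the entries of y that are not NaN."""
    observed = ~np.isnan(y)  # True where a value was actually measured
    # Missing cells get a residual of zero, so they add nothing to the sum.
    residual = np.where(observed, y - z @ w.T, 0.0)
    n_observed = observed.sum()
    return -0.5 * (
        np.sum(residual**2) / noise_std**2
        + n_observed * np.log(2 * np.pi * noise_std**2)
    )


# Toy data: 5 samples, 4 features, 2 factors, one missing measurement.
rng = np.random.default_rng(0)
z = rng.normal(size=(5, 2))  # factor scores
w = rng.normal(size=(4, 2))  # feature weights
y = z @ w.T
y[0, 1] = np.nan             # mark one entry as missing
log_lik = masked_gaussian_log_likelihood(y, z, w)
print(log_lik)
```

Because the missing cell is masked out rather than filled in, no imputed value can bias the fit.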
Please refer to the documentation. In particular, the
You need to have Python 3.8 or newer installed on your system. If you don't have Python installed, we recommend installing Mambaforge.
There are several options to install Cellij:
- Install the latest development version:
pip install git+https://github.com/bioFAM/cellij.git@main
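To confirm that the requirements above are met, a quick check along these lines can be run in the target environment. The helper `check_environment` is a hypothetical name introduced here; it relies only on the Python standard library and assumes the package is importable as `cellij`, as in the example above.

```python
import importlib.util
import sys


def check_environment(min_version=(3, 8), package="cellij"):
    """Return (python_ok, package_found) for the current interpreter."""
    # Compare the running interpreter against the minimum supported version.
    python_ok = sys.version_info >= min_version
    # find_spec returns None when a top-level package cannot be located.
    package_found = importlib.util.find_spec(package) is not None
    return python_ok, package_found


python_ok, package_found = check_environment()
print(f"Python >= 3.8: {python_ok}; cellij importable: {package_found}")
```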
See the changelog.
We appreciate all contributions. If you found a bug, feel free to contribute a fix without any further discussion.
If you intend to introduce new features, utility functions, or extensions to the core, please open an issue first to start a discussion. This allows us to align the proposed changes with our current development direction. A pull request submitted without prior discussion may be rejected if it does not fit the intended direction of the core, which you may not be aware of.
Cellij has a BSD-style license, as found in the LICENSE file.
If you use Cellij, please consider citing:
@inproceedings{rohbeckcellij,
  author = {Rohbeck, Martin and Qoku, Arber and Treis, Tim and Theis, Fabian J and Velten, Britta and Buettner, Florian and Stegle, Oliver},
  title = {Cellij: A Modular Factor Model Framework for Interpretable and Accelerated Multi-Omics Data Integration},
  booktitle = {ICML Workshop on Computational Biology},
  year = {2023},
  url = {https://icml-compbio.github.io/2023/papers/WCBICML2023_paper124.pdf}
}