Bioconductor is an open-source initiative that is widely used in bioinformatics and computational biology. It supports researchers by providing a large ecosystem of software packages and resources for the analysis and interpretation of high-throughput biological data.
Visualizations, especially of dimension reductions, are the workhorse of scRNA-seq data analysis. They matter at every stage of analysis, from exploring the data to presenting results. They often shape analysis decisions, such as which batch correction approach to use, and can point to discoveries such as new cell type marker genes or even new cell types. Unfortunately, currently employed visualization strategies struggle with the scale and sparsity of scRNA-seq data, sometimes with disastrous consequences.
Overplotting, where points are drawn on top of one another and hide each other, is not a new problem. It has plagued many different fields, many of which have come up with their own solutions. Here, we suggest the use of hexagonal binning (Carr, 1990; Carr et al., 1987) for dimension reduction representations of scRNA-seq data.
To make hexagonal plotting readily available to the scientific community, we developed schex (single-cell hexagonal plotting), an R package that allows users to produce hexagonal binning representations for dimension reductions of scRNA-seq data.
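To illustrate the idea behind hexagonal binning, here is a minimal sketch in Python using only the standard library. It is not the schex implementation or its API; the function name `hexbin_counts`, the `size` parameter, and the simulated points are assumptions made for illustration. The sketch assigns each 2D point to the nearest centre of a hexagonal grid and counts the points per hexagon.

```python
import math
import random
from collections import Counter


def hexbin_counts(points, size=1.0):
    """Count 2D points per hexagonal bin.

    Hypothetical helper for illustration only. `size` is the distance
    between neighbouring hexagon centres. Bin centres form two
    interleaved rectangular lattices; each point goes to the nearest
    centre, which yields hexagonal cells.
    """
    dx = size                   # spacing of centres along x
    dy = size * math.sqrt(3.0)  # spacing of centres along y within one lattice
    counts = Counter()
    for x, y in points:
        # Nearest centre in lattice 0: centres at (i * dx, j * dy).
        i0, j0 = round(x / dx), round(y / dy)
        d0 = (x - i0 * dx) ** 2 + (y - j0 * dy) ** 2
        # Nearest centre in lattice 1: centres at ((i + 0.5) * dx, (j + 0.5) * dy).
        i1, j1 = math.floor(x / dx), math.floor(y / dy)
        d1 = (x - (i1 + 0.5) * dx) ** 2 + (y - (j1 + 0.5) * dy) ** 2
        # Keep whichever of the two candidate centres is closer.
        key = (0, i0, j0) if d0 <= d1 else (1, i1, j1)
        counts[key] += 1
    return counts


# Simulated 2D embedding (assumed data): 10,000 "cells" around one centre.
random.seed(0)
cells = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(10_000)]
bins = hexbin_counts(cells, size=0.5)
print(len(cells), "points summarised in", len(bins), "hexagons")
```

In a plot, each hexagon could then be coloured by its count, or by a summary of a value across the cells it contains, so that dense regions no longer hide individual points.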
However, maintaining research software and keeping it easy to use can be difficult for researchers in the long run. This project aims to find out which areas of software maintenance can be abstracted away from the researcher and placed into the hands of a Research Software Engineer (RSE).
This project would suit a candidate who is interested in maintaining research software, wants to gain experience within an open source software environment like Bioconductor, and enjoys learning about complex processes.
This role is challenging, as the person will need to learn about the data, the package, and how to interpret its output, as well as examine the code and the software requirements of open source frameworks such as Bioconductor.
The role will include:
- Documenting how Bioconductor and the software maintenance process currently work,
- Identifying problem areas that could be reviewed,
- Brainstorming and prototyping ways to reduce the researcher's maintenance costs, and
- Suggesting other options and prototyping them as needed.
To excel in this internship project, the ideal candidate should have a willingness to research and learn about Bioconductor and schex, an ability to learn quickly, an enthusiasm for trying new things, and a willingness to share and communicate information.
The benefits for students whilst undertaking the internship include:
- The student will gain practical Research Software Engineering (RSE) experience, with potential exposure to Bioconductor, single-cell RNA-seq, and how to set up workflows for reproducibility.
- The student will gain an understanding of how real-world software is assessed and developed, and how priorities and requirements are established within a research environment.
- The student will have an opportunity to self-direct and be proactive in their approach to a new environment.