Cloud-SPAN/nerc-metagenomics01-file-directories

Project Status: Active – The project has reached a stable, usable state and is being actively developed.

Getting started with High Performance Computing: FAIR training for environmental scientists

Cloud-SPAN is a collaboration between the Department of Biology at the University of York and The Software Sustainability Institute, funded by the UKRI Innovation Scholars award. It aims to train researchers to effectively generate and analyse environmental 'omics data using cloud computing resources.

This "Getting started with High Performance Computing: FAIR training for environmental scientists" course is additionally funded by a NERC Advanced Training: Short Courses award.

This hands-on, online course teaches data analysis for metagenomics projects. It is aimed at environmental scientists with little or no experience of using high performance computing (HPC) for data analysis. In the course we will cover*:

  • navigating file directories and using the command line
  • logging into a remote cloud instance
  • using common commands and running analysis programs in the command line
  • what is metagenomics?
  • following a metagenomics analysis workflow including:
    • performing quality control on reads
    • assembly of reads into a metagenome
    • improving your assembly with polishing
    • binning into species/metagenome-assembled genomes (MAGs)
    • taxonomic assignment and functional annotation using your binned reads

*(Objectives highlighted in bold are covered in this repo's lessons).

The course is taught as a mixture of live coding, online lectures, self-study and drop-in sessions.

Prerequisites

This course assumes no prior experience with the tools covered in the workshop, but learners are expected to have some familiarity with biological concepts, including genomes and microbiomes. Participants should bring their own laptops and plan to participate actively.

To get started, follow the directions in the Setup tab to get access to the required software and data for this workshop.

Acknowledgments

The site infrastructure is based on The Carpentries. The first two lessons are based on our Prenomics course; the rest are based on our Metagenomics course.