GitHub Actions for Scientific Data Workflows

Tutorial presented at the US-RSE'23 Conference.

Author: Valentina Staneva, eScience Institute, University of Washington

Abstract:

In this tutorial we introduce GitHub Actions as a tool for lightweight automation of scientific data workflows. GitHub Actions has become a key tool in the software development lifecycle; however, many scientific programmers who are not involved in software deployment may be unfamiliar with its functionality or may not know how to apply it within their data pipelines. Through a sequence of examples, we will demonstrate applications of GitHub Actions to automating data processing tasks, such as scheduled deployment of algorithms to streaming data, updating visualizations based on new data, model versioning, and performance benchmarking. For the demonstration we will access a public hydrophone stream, then compute and visualize statistics of sound patterns. The goal is for participants to leave with their own ideas on how to integrate GitHub Actions into their work.

Prerequisites: a GitHub account; basic familiarity with Git, GitHub, and version control; programming experience in a scripting language such as Python or R.

Audience: scientific programmers interested in automating components of their workflows through existing tools for software continuous integration/deployment.

Key Learning Objectives:

  • Learners can distinguish between GitHub Actions and Workflows and understand their role within the software development cycle.
  • Learners can trigger GitHub Actions Workflows in several different ways and can determine which method is useful in typical data science applications.
  • Learners can export (data) outputs of GitHub Actions Workflows, e.g. tables and plots.

GitHub Actions Introduction

Setup

GitHub Actions Python Environment Workflow

First, we will run a basic workflow that creates a Python environment with a few scientific packages and prints out their versions.
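As a rough illustration of the "print the versions" step, the script below is a minimal sketch of what such a workflow might run once the environment exists. The package list and the `report_versions` helper are assumptions made for this example; the tutorial does not name the packages it installs.

```python
from importlib import metadata

# Hypothetical package list: the tutorial does not say which packages it installs.
PACKAGES = ["numpy", "scipy", "pandas", "matplotlib"]


def report_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Report missing packages instead of failing the whole step.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in report_versions(PACKAGES).items():
        print(f"{name}: {version or 'not installed'}")
```

Reading versions through `importlib.metadata` avoids importing each package, so the check stays fast and does not fail when one package is absent.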

Orcasound Spectrogram Visualization Workflow

Next, we will demonstrate how GitHub Actions can be used to display a spectrogram of a snippet from an underwater audio stream.

After the workflow runs, the spec.png file in the repo is updated and displayed below.

[Image: spec.png]
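To make the spectrogram step concrete, here is a minimal sketch of how one could be computed with NumPy alone. It is not the tutorial's actual script: the `spectrogram` function, its frame and hop sizes, and the synthetic 1 kHz tone standing in for the hydrophone audio are all assumptions chosen for illustration.

```python
import numpy as np


def spectrogram(signal, sample_rate, frame_len=256, hop=128):
    """Return (times, freqs, power_db) from a Hann-windowed short-time FFT."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames and taper each one.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    power_db = 10 * np.log10(power + 1e-12)  # small offset avoids log(0)
    freqs = np.fft.rfftfreq(frame_len, d=1 / sample_rate)
    times = (np.arange(n_frames) * hop + frame_len / 2) / sample_rate
    return times, freqs, power_db.T  # rows are frequencies, columns are frames


if __name__ == "__main__":
    sample_rate = 8000  # assumed rate for the synthetic example
    t = np.arange(sample_rate) / sample_rate  # one second of samples
    tone = np.sin(2 * np.pi * 1000 * t)  # stand-in for a real audio snippet
    times, freqs, power_db = spectrogram(tone, sample_rate)
    peak = freqs[power_db.mean(axis=1).argmax()]
    print(f"strongest frequency: {peak:.0f} Hz")
```

In a workflow, a follow-up step would typically render `power_db` with a plotting library and write the image to a file such as spec.png, which the job can then commit back to the repo.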
