Local Humpback Whales Vocalization

This repository contains a series of Jupyter notebooks that guide you through preparing data for a humpback whale vocalization model. The notebooks cover setting up the development environment, data acquisition, data revision, and data preprocessing.

This repository uses the labeled data of humpback whale vocalizations from Orcasound's AWS open data repository. The dataset was prepared by Emily Vierling. It includes ~9,000 labels and is based on ~YY hours of audio data from 3 days during October 03-28, 2021.

Introduction

Humpback whales are known for their complex vocalizations. Understanding these vocalizations can provide valuable insights into their behavior, social structure, and even their emotional states. This project aims to facilitate the building of a machine learning model to predict and retrieve humpback whale vocalizations from raw audio files.

Getting Started

Prerequisites

Python 3.x
IDE: Jupyter Notebook, JupyterLab, Visual Studio Code, a web IDE (e.g. Google Colaboratory), or any other environment that can run notebooks

Installation

If you are using a local development environment, follow the steps below:

Clone this repository:

git clone https://github.com/LianaN/local_humpback_vocalization.git

Navigate to the project directory:
```
cd local_humpback_vocalization
```

Create and activate a new Python virtual environment:

python -m venv venv
source venv/bin/activate

Install the required packages:
```
pip install -r requirements.txt
```

Launch your preferred IDE to open the notebooks. Or, if you are using Linux, simply run:
```
jupyter notebook notebooks/
```

If you are using Google Colaboratory as your web IDE, follow the instructions in notebooks/0_dev_environment_setup.ipynb to get started.

Notebooks

0_dev_environment_setup.ipynb

Note: Execute this notebook only if you are using Google Colaboratory as your development environment.

This notebook walks you through setting up the development environment on Google Colaboratory. It includes instructions for installing the necessary packages and setting up Google Drive for data storage.

1_data_acquisition.ipynb

This notebook covers the steps required to acquire humpback whale vocalization data. It includes code for downloading the datasets (annotation and raw audio files).
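
As a rough illustration of the download step, here is a minimal sketch using only the Python standard library. The helper name `download_file` and the example URL in the comment are hypothetical and not taken from the notebook; the actual locations of the annotation and audio files are defined in the notebook itself.

```python
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def download_file(url: str, dest_dir: str) -> Path:
    """Download `url` into `dest_dir`, skipping files that already exist.

    Hypothetical helper for illustration; not the notebook's own code.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Use the last path component of the URL as the local file name.
    target = dest / Path(urlparse(url).path).name
    if target.exists():
        return target  # already downloaded; do not fetch it again
    # Write to a temporary name first so an interrupted download
    # does not leave a partial file that looks complete.
    partial = target.with_name(target.name + ".part")
    with urllib.request.urlopen(url) as response, open(partial, "wb") as out:
        shutil.copyfileobj(response, out)
    partial.replace(target)
    return target


# Example call with a placeholder URL (replace with a real file location):
# download_file("https://example.com/annotations.tsv", "data/raw")
```

Skipping existing files makes the notebook cell safe to re-run, which may matter when the raw audio files are large.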

2_data_revision.ipynb

This notebook provides starter code for reviewing the acquired data. This includes visualizing audio waveforms, listening to audio samples, and identifying potential issues in the dataset.
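
One simple way to spot potential issues before plotting or listening is to summarize each file's basic properties. The sketch below assumes 16-bit PCM WAV input and uses only the standard library; the function name `summarize_wav` and the choice of checks (silence, clipping) are illustrative assumptions, not the notebook's actual checks.

```python
import sys
import wave
from array import array


def summarize_wav(path: str) -> dict:
    """Return basic properties of a 16-bit PCM WAV file for a quick sanity check."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM audio")
        rate = wav.getframerate()
        channels = wav.getnchannels()
        n_frames = wav.getnframes()
        samples = array("h", wav.readframes(n_frames))
    # WAV data is little-endian; swap bytes on big-endian machines.
    if sys.byteorder == "big":
        samples.byteswap()
    peak = max((abs(s) for s in samples), default=0)
    return {
        "sample_rate": rate,
        "channels": channels,
        "duration_s": n_frames / rate,
        "peak": peak,
        "is_silent": peak == 0,        # file contains only zeros
        "is_clipped": peak >= 32767,   # at least one sample hits full scale
    }
```

Running this over all downloaded files and comparing sample rates and durations can reveal truncated or inconsistent recordings early.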

3_data_preprocessing.ipynb

This notebook focuses on extracting the humpback whale vocalizations from the raw audio data to prepare the data for machine learning.
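
The extraction step can be pictured as cutting a labeled time span out of a longer recording. The sketch below assumes each label provides a start time and a duration in seconds, which is an assumption about the annotation format, and that the audio is stored as WAV. The function name `extract_clip` is hypothetical, and the notebook may use a different audio library.

```python
import wave


def extract_clip(src_path: str, dst_path: str, start_s: float, duration_s: float) -> int:
    """Copy `duration_s` seconds starting at `start_s` into a new WAV file.

    Returns the number of frames written.
    """
    with wave.open(src_path, "rb") as src:
        rate = src.getframerate()
        frame_size = src.getnchannels() * src.getsampwidth()
        start_frame = int(round(start_s * rate))
        n_frames = int(round(duration_s * rate))
        if start_frame < 0 or start_frame >= src.getnframes():
            raise ValueError("start time falls outside the recording")
        src.setpos(start_frame)
        # readframes returns fewer frames if the span runs past the end.
        frames = src.readframes(n_frames)
        with wave.open(dst_path, "wb") as dst:
            dst.setnchannels(src.getnchannels())
            dst.setsampwidth(src.getsampwidth())
            dst.setframerate(rate)
            dst.writeframes(frames)
    return len(frames) // frame_size
```

Applying this to every label yields one short clip per annotated vocalization, which is a common starting point for building a training set.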

Contributing

Contributions are welcome! Please read CONTRIBUTING.md for details on how to contribute to this project.

For any questions or concerns, please open an issue or submit a pull request. Happy modeling!
