WIP: dockerize
gordonkoehn committed Nov 18, 2024
1 parent 0f59b15 commit f05e352
Showing 7 changed files with 85 additions and 13 deletions.
29 changes: 29 additions & 0 deletions Dockerfile
@@ -0,0 +1,29 @@
# Use the official Miniconda3 image as a parent image
FROM continuumio/miniconda3

# Set the working directory in the container
WORKDIR /app

# Copy the environment.yml file into the container at /app
COPY environment.yml /app/environment.yml

# Create the environment and activate it
RUN conda env create -f environment.yml

# Make RUN commands use the new environment
SHELL ["conda", "run", "-n", "sr2silo", "/bin/bash", "-c"]

# Copy the current directory contents into the container at /app
COPY . /app

# Install the sr2silo package
RUN pip install -e .

# Make port 80 available to the world outside this container
EXPOSE 80

# Define environment variable
ENV NAME sr2silo

# Ensure the environment is activated and run vp_daemon.py when the container launches
ENTRYPOINT ["bash", "-c", "source activate sr2silo && python scripts/vp_daemon.py --config /app/scripts/vp_config.json"]
3 changes: 3 additions & 0 deletions README.md
@@ -21,6 +21,9 @@

This project will wrangle short-read genomic alignments, for example from wastewater-sampling, into a format for easy import into the SILO sequencing database.

### Usage of the V-Pipe Daemon
`sr2silo` provides a daemon to process files as they arrive. See `scripts/README.md` for details.

## Project Organization

- `.github/workflows`: Contains GitHub Actions used for building, testing, and publishing.
4 changes: 4 additions & 0 deletions docker-compose.env
@@ -0,0 +1,4 @@
SAMPLE_DIR=../../../data/sr2silo/daemon_test/samples
TIMELINE_FILE=../../../data/sr2silo/daemon_test/timeline.tsv
DATABASE_DIR=database
BACKUP_DIR=backups
22 changes: 22 additions & 0 deletions docker-compose.yml
@@ -0,0 +1,22 @@
version: '3.8'

services:
  sr2silo:
    build: .
    volumes:
      - ${SAMPLE_DIR}:/app/samples
      - ${TIMELINE_FILE}:/app/timeline.tsv
      - ${DATABASE_DIR}:/app/database
      - ${BACKUP_DIR}:/app/backups
      - ./scripts/vp_config.json:/app/scripts/vp_config.json
    environment:
      - PYTHONUNBUFFERED=1
      - SAMPLE_DIR=${SAMPLE_DIR}
      - TIMELINE_FILE=${TIMELINE_FILE}
      - DATABASE_FILE=/app/database/processed_files.db
      - BACKUP_DIR=${BACKUP_DIR}
    command: python scripts/vp_daemon.py --config /app/scripts/vp_config.json

volumes:
  database:
  backups:
32 changes: 23 additions & 9 deletions scripts/README.md
@@ -2,9 +2,9 @@

This directory contains scripts for processing and managing sample data. Below is an explanation of the two main scripts in this directory.

## vp_deamon.py
## vp_daemon.py

`vp_deamon.py` is a daemon script that processes new samples from the timeline file and stores the processed samples in the result directory. It performs the following tasks:
`vp_daemon.py` is a daemon script that processes new samples from the timeline file and stores the processed samples in the result directory. It performs the following tasks:

1. **Load Configuration**: Loads configuration settings from a JSON file using Pydantic for validation.
2. **Initialize Database**: Initializes a SQLite database to keep track of processed samples.
@@ -17,20 +17,24 @@ This directory contains scripts for processing and managing sample data. Below i
To run the daemon script, execute the following command:

```sh
python vp_deamon.py --config scripts/vp_config.json
python vp_daemon.py --config scripts/vp_config.json
```
Ensure that the configuration file vp_config.json is present in the scripts directory with the necessary settings.
Ensure that the configuration file `vp_config.json` is present in the scripts directory with the necessary settings.
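
The "Initialize Database" task described for the daemon could look roughly like the following. This is a minimal sketch, not the code in `vp_daemon.py`: the table name `processed_samples`, its columns, and the helper names `init_db`, `is_processed`, and `mark_processed` are assumptions made for illustration.

```python
import sqlite3


def init_db(conn: sqlite3.Connection) -> None:
    """Create the tracking table if it does not exist yet (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_samples ("
        "sample_id TEXT PRIMARY KEY, "
        "processed_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    conn.commit()


def is_processed(conn: sqlite3.Connection, sample_id: str) -> bool:
    """Return True if the sample has already been recorded."""
    row = conn.execute(
        "SELECT 1 FROM processed_samples WHERE sample_id = ?", (sample_id,)
    ).fetchone()
    return row is not None


def mark_processed(conn: sqlite3.Connection, sample_id: str) -> None:
    """Record a sample as processed; recording it twice is a no-op."""
    conn.execute(
        "INSERT OR IGNORE INTO processed_samples (sample_id) VALUES (?)",
        (sample_id,),
    )
    conn.commit()


if __name__ == "__main__":
    # An in-memory database keeps the example free of side effects.
    conn = sqlite3.connect(":memory:")
    init_db(conn)
    mark_processed(conn, "sample_A")
    print(is_processed(conn, "sample_A"), is_processed(conn, "sample_B"))
```

Keying the table on the sample identifier means a sample that shows up again in the timeline file is skipped rather than processed twice.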

## vp_transformer.py
`vp_transformer.py` is a script that contains the core processing logic for transforming sample data. This script is used by `vp_deamon.py` to process new samples.

## Usage
This script is typically not run directly. Instead, it is imported and used by `vp_deamon.py`.
`vp_transformer.py` is a script that contains the core processing logic for transforming sample data. This script is used by `vp_daemon.py` to process new samples.

### Usage

This script is typically not run directly. Instead, it is imported and used by `vp_daemon.py`.

## Legacy Notice

The core processing logic in these scripts is based on the dgivec scripts, which were the foundation of this package. These scripts are retained here for legacy reasons and to ensure compatibility with existing workflows.

## Configuration

The configuration file `vp_config.json` should have the following structure:

```json
@@ -41,15 +45,25 @@ The configuration file `vp_config.json` should have the following structure:
"nextclade_reference": "The reference to use for nextclade.",
"database_file": "The path to the database file.",
"backup_dir": "The directory where the backups are stored.",
"deamon_interval_m": "The interval in minutes to run the daemon."
"daemon_interval_m": "The interval in minutes to run the daemon."
}
```

- `sample_dir`: The directory where the samples are stored.
- `timeline_file`: The path to the timeline file.
- `result_dir`: The directory where the results are stored.
- `nextclade_reference`: The reference to use for nextclade.
- `database_file`: The path to the database file.
- `backup_dir`: The directory where the backups are stored.
- `deamon_interval_m`: The interval in minutes to run the daemon.
- `daemon_interval_m`: The interval in minutes to run the daemon.

Ensure that all paths are correctly set in the configuration file before running the scripts.
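
To illustrate how the keys listed above could be checked before the daemon starts, here is a minimal sketch. The daemon itself is described as using Pydantic; this example swaps in a standard-library dataclass so it has no dependencies. The class name `DaemonConfig` and the functions `parse_config` and `load_config` are hypothetical.

```python
import json
from dataclasses import dataclass, fields
from pathlib import Path


@dataclass(frozen=True)
class DaemonConfig:
    """Hypothetical mirror of the documented configuration keys."""

    sample_dir: Path
    timeline_file: Path
    result_dir: Path
    nextclade_reference: str
    database_file: Path
    backup_dir: Path
    daemon_interval_m: int


def parse_config(raw: dict) -> DaemonConfig:
    """Validate a decoded JSON object and convert path-like values to Path."""
    expected = {f.name for f in fields(DaemonConfig)}
    missing = expected - set(raw)
    unknown = set(raw) - expected
    if missing or unknown:
        raise ValueError(
            f"missing keys: {sorted(missing)}, unknown keys: {sorted(unknown)}"
        )
    interval = raw["daemon_interval_m"]
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(interval, bool) or not isinstance(interval, int) or interval <= 0:
        raise ValueError("daemon_interval_m must be a positive integer")
    return DaemonConfig(
        sample_dir=Path(raw["sample_dir"]),
        timeline_file=Path(raw["timeline_file"]),
        result_dir=Path(raw["result_dir"]),
        nextclade_reference=str(raw["nextclade_reference"]),
        database_file=Path(raw["database_file"]),
        backup_dir=Path(raw["backup_dir"]),
        daemon_interval_m=interval,
    )


def load_config(path: str) -> DaemonConfig:
    """Read and validate a JSON configuration file."""
    return parse_config(json.loads(Path(path).read_text()))


if __name__ == "__main__":
    example = {
        "sample_dir": "/app/samples",
        "timeline_file": "/app/timeline.tsv",
        "result_dir": "daemon_test/results/",
        "nextclade_reference": "sars-cov-2",
        "database_file": "processed_files.db",
        "backup_dir": "backups/",
        "daemon_interval_m": 1,
    }
    print(parse_config(example))
```

A strict key check like this would also flag a configuration file that still uses the older `deamon_interval_m` spelling.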

## Run from Docker

Adjust the paths in `docker-compose.env` to match those in `scripts/vp_config.json`, then build and start the container:

```sh
docker-compose --env-file docker-compose.env build
docker-compose --env-file docker-compose.env up
```
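
Before starting the container, it can help to confirm that the host paths named in `docker-compose.env` actually exist. The following is a minimal sketch of such a check and is not part of `sr2silo`. The function names `parse_env_text` and `missing_paths` are hypothetical, and it assumes plain `KEY=VALUE` lines with relative paths resolved against a base directory you pass in.

```python
from pathlib import Path


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    env: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


def missing_paths(env: dict[str, str], base_dir: str) -> list[str]:
    """Return the keys whose values do not exist relative to base_dir."""
    return [key for key, value in env.items() if not (Path(base_dir) / value).exists()]


if __name__ == "__main__":
    env_file = Path("docker-compose.env")
    if env_file.exists():
        env = parse_env_text(env_file.read_text())
        for key in missing_paths(env, "."):
            print(f"{key} points to a path that does not exist: {env[key]}")
    else:
        print("docker-compose.env not found in the current directory")
```

Note that `DATABASE_DIR` and `BACKUP_DIR` may legitimately not exist before the first run, so a missing entry for those keys is a hint rather than an error.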
8 changes: 4 additions & 4 deletions scripts/vp_config.json
@@ -1,9 +1,9 @@
{
"sample_dir": "../../../data/sr2silo/deamon_test/samples/",
"result_dir": "deamon_test/results/",
"timeline_file": "../../../data/sr2silo/deamon_test/timeline.tsv",
"sample_dir": "/app/samples",
"result_dir": "daemon_test/results/",
"timeline_file": "/app/timeline.tsv",
"nextclade_reference": "sars-cov-2",
"database_file": "processed_files.db",
"backup_dir": "backups/",
"deamon_interval_m": 1
}
}
File renamed without changes.
