
SODA Image Segmentation

This repository contains the code and resources necessary to replicate the image segmentation experiments for SourceData-NLP multimodal segmentation of compound figures. The project includes scripts for training and evaluating models, managing dependencies, and running the experiments in a Dockerized environment.

Table of Contents

  • Project Overview
  • Installation
  • Usage
  • Project Structure
  • License

Project Overview

The SODA Image Segmentation project aims to train a multimodal model to separate compound scientific figures into their constituent panels and match them to their corresponding panel captions. We followed a two-step procedure to achieve this goal. First, we used object detection algorithms to separate the figure into its panels. Second, we used a multimodal LLM to extract the corresponding panel description from the figure caption, ensuring that each panel caption is understandable on its own, without needing the context of the full figure caption. This repository provides all necessary scripts, notebooks, and configurations to replicate the experiments conducted in this project.
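
To illustrate the two-step procedure, the sketch below uses an off-the-shelf object detector to propose panel bounding boxes and then prompts a chat LLM to extract a self-contained caption for one panel. The model names, prompt, and client here are illustrative assumptions, not the project's actual pipeline; the real training and inference code lives in `src/`.

    # Illustrative sketch only: model names, prompt, and API client are assumptions,
    # not the implementation found in src/.
    from PIL import Image
    from transformers import pipeline
    from openai import OpenAI

    # Step 1: propose panel bounding boxes with a generic object detector.
    # (The project finetunes its own detector; "facebook/detr-resnet-50" is a placeholder.)
    detector = pipeline("object-detection", model="facebook/detr-resnet-50")
    figure = Image.open("data/example_figure.png")  # hypothetical figure path
    panels = detector(figure)  # list of {"score": ..., "label": ..., "box": {...}}

    # Step 2: ask an LLM to extract a standalone caption for one panel.
    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
    figure_caption = "(A) Western blot of ... (B) Quantification of ..."
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[{
            "role": "user",
            "content": (
                "From the figure caption below, extract the caption of panel A "
                "and rewrite it so it is understandable on its own.\n\n"
                + figure_caption
            ),
        }],
    )
    print(len(panels), "panel candidates detected")
    print(response.choices[0].message.content)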

Installation

Prerequisites

  • Python 3.9 or higher
  • Docker (optional, but recommended for environment consistency)

Steps

  1. Clone the repository:

    git clone https://github.com/source-data/soda_image_segmentation.git
    cd soda_image_segmentation
  2. Set up the environment:

    You can set up the environment using `virtualenv` or `conda`, or you can use Docker.

    Using `virtualenv` or `conda`:

    python -m venv venv
    source venv/bin/activate  # On Windows, use venv\Scripts\activate
    pip install -r requirements.txt

    Using Docker:

    docker-compose up --build

    This will build and start a Docker container with all dependencies installed.

    The Docker container will start a JupyterLab server that can be used to interact with the code.

Usage

Running the Experiment

  1. Dataset:

    The data needed for the experiments is contained in the data/ folder. The original data can be found on the EMBO Hugging Face Hub.

    The data can also be generated by running (see the sketch after this list):

    python src/extract_figure_captions.py --input data/annotated_data.json --output data/figure_captions.jsonl

  2. Finetune the object detection model:

    From the Docker environment:

    python src/train_object_detection.py

    Evaluate the model performance on the SourceData dataset:

    python src/evaluate_on_soda.py
  3. Match the extracted panels to their corresponding panel captions:

     python src/panel_label_matching.py

    Then check the results using the notebook provided at notebooks/panel_matching_accuracy.ipynb.
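
For reference, a minimal sketch of the caption-extraction step from item 1 is shown below. It only assumes that data/annotated_data.json is a JSON list of records containing a figure identifier and a caption; the actual field names and logic are defined in src/extract_figure_captions.py.

    # Minimal sketch of the caption-extraction step; the field names
    # ("figure_id", "caption") are assumptions and may differ from the real data.
    import json

    with open("data/annotated_data.json") as f:
        records = json.load(f)

    with open("data/figure_captions.jsonl", "w") as out:
        for record in records:
            out.write(json.dumps({
                "figure_id": record.get("figure_id"),  # assumed field name
                "caption": record.get("caption"),      # assumed field name
            }) + "\n")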

Project Structure

  • `src/`: Contains the main source code for training and evaluating the model.
  • `notebooks/`: Jupyter notebooks for analysis and evaluation.
  • `data/`: Directory where datasets should be placed.
  • `runs/`: Contains the outputs of the training runs, including model weights and evaluation metrics.
  • `Dockerfile` and `docker-compose.yml`: Docker configurations for setting up the environment.

License

This project is licensed under the MIT License. See the LICENSE file for more details.
