This repository contains examples of DocETL pipelines. Currently, it includes:
A pipeline that analyzes reviews of ICLR 2024 conference submissions to identify common themes in reviewer feedback, focusing on the strengths and weaknesses mentioned across papers.
The review data is stored using Git LFS (Large File Storage) due to its size. To work with the data:
- Install Git LFS if you haven't already (the command below assumes Homebrew on macOS; use your platform's package manager otherwise):
  brew install git-lfs
  git lfs install
- Clone the repository with LFS support:
  git clone https://github.com/ucbepic/docetl-examples.git
- Pull the large files from inside the cloned repository:
  cd docetl-examples
  git lfs pull
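If a pipeline later fails to parse the review data, one possible cause is that a data file is still a Git LFS pointer stub rather than the real content. The sketch below is one way to check for that, relying on the fact that pointer files begin with a `version https://git-lfs.github.com/spec/v1` line. The function name `is_lfs_pointer` and the path in the usage note are hypothetical and not part of this repository.

```shell
# Minimal sketch: succeed (exit 0) if the given file is still a Git LFS
# pointer stub, fail otherwise. `is_lfs_pointer` is a hypothetical helper.
is_lfs_pointer() {
  # Pointer stubs are small text files whose first line names the LFS spec.
  # Only the first bytes are read, so this stays fast on large data files.
  head -c 100 "$1" 2>/dev/null \
    | grep -q '^version https://git-lfs\.github\.com/spec/v1'
}
```

For example, `is_lfs_pointer path/to/reviews.json && echo "still a pointer, run git lfs pull"` (with the placeholder path replaced by an actual data file) would flag a file that has not been downloaded yet. Running `git lfs ls-files` from the repository root gives a similar overview for every tracked file.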