readme... again
bonjarlow committed Jun 19, 2024
1 parent dfffa32 commit 31b74fb
Showing 1 changed file with 8 additions and 8 deletions.
16 changes: 8 additions & 8 deletions annotation_pipeline/README.md
@@ -18,15 +18,15 @@ This Python script automates the process of crawling for relevant URLs, scraping
`pip install pandas argparse huggingface-hub`

2. Set up environment variables in annotation_pipeline/dev.env
LABEL_STUDIO_ACCESS_TOKEN=...
LABEL_STUDIO_PROJECT_ID=...
LABEL_STUDIO_ORGANIZATION=...
- LABEL_STUDIO_ACCESS_TOKEN=...
- LABEL_STUDIO_PROJECT_ID=...
- LABEL_STUDIO_ORGANIZATION=...

As well as in data_source_identification/.env
HUGGINGFACE_ACCESS_TOKEN=...
LABEL_STUDIO_ACCESS_TOKEN=...
LABEL_STUDIO_PROJECT_ID=...
LABEL_STUDIO_ORGANIZATION=...
- HUGGINGFACE_ACCESS_TOKEN=...
- LABEL_STUDIO_ACCESS_TOKEN=...
- LABEL_STUDIO_PROJECT_ID=...
- LABEL_STUDIO_ORGANIZATION=...
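
A minimal sketch of checking that an env file defines the variables listed above before running the pipeline. It assumes the files hold plain `KEY=VALUE` lines; `parse_env_file` and `missing_keys` are hypothetical helpers written for illustration, not functions from this repository, and the pipeline itself may load these files differently.

```python
from pathlib import Path

# Variable names taken from the README; the helpers below are illustrative only.
REQUIRED_KEYS = (
    "HUGGINGFACE_ACCESS_TOKEN",
    "LABEL_STUDIO_ACCESS_TOKEN",
    "LABEL_STUDIO_PROJECT_ID",
    "LABEL_STUDIO_ORGANIZATION",
)


def parse_env_file(path):
    """Read KEY=VALUE pairs from a file, skipping blank lines and # comments."""
    values = {}
    for raw in Path(path).read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty, in declaration order."""
    return [key for key in required if not values.get(key)]
```

Calling `missing_keys(parse_env_file("data_source_identification/.env"))` would then return an empty list when every variable is set.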

## Usage

@@ -38,4 +38,4 @@ This Python script automates the process of crawling for relevant URLs, scraping
- `--pages num_pages`: Number of pages to search
- `--record-type record_type` (optional): Assumed record type for pre-annotation.

e.g. `python annotation_pipeline.py CC-MAIN-2024-10 '*.gov' arrest --pages 2 --record-type Arrest Records`
e.g. `python annotation_pipeline.py CC-MAIN-2024-10 '*.gov' arrest --pages 2 --record-type 'Arrest Records'`
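
The quoting change in the example matters because the shell splits an unquoted `Arrest Records` into two separate arguments. The sketch below reproduces that with `shlex` and `argparse`; the positional argument names (`crawl_id`, `url_pattern`, `keyword`) are assumptions chosen for illustration, and the real script's parser may be defined differently.

```python
import argparse
import shlex


def build_parser():
    # Hypothetical parser mirroring the options shown in the README.
    parser = argparse.ArgumentParser(prog="annotation_pipeline.py")
    parser.add_argument("crawl_id")
    parser.add_argument("url_pattern")
    parser.add_argument("keyword")
    parser.add_argument("--pages", type=int, default=1)
    parser.add_argument("--record-type", default=None)
    return parser


base = "CC-MAIN-2024-10 '*.gov' arrest --pages 2 --record-type "

# Quoted: the shell passes "Arrest Records" as a single argument.
quoted_args = build_parser().parse_args(shlex.split(base + "'Arrest Records'"))

# Unquoted: "Records" becomes a stray extra argument.
# parse_known_args collects it instead of exiting with an error.
unquoted_args, extras = build_parser().parse_known_args(
    shlex.split(base + "Arrest Records")
)

print(quoted_args.record_type)            # Arrest Records
print(unquoted_args.record_type, extras)  # Arrest ['Records']
```

With a plain `parse_args` call, the unquoted form would stop with an "unrecognized arguments" error rather than silently using `Arrest` as the record type.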
