Corpus related to the laws of the Republic of China

yezhengkai/roc-law-corpus

roc-law-corpus

Install the project and its runtime dependencies (no development dependencies)

Use Poetry

poetry install --without dev
roc-law-corpus jl-instantiate

Use pip

pip install -r requirements.txt
pip install -e .
roc-law-corpus jl-instantiate

Install the project and all dependencies, including development dependencies

Use Poetry

poetry install
roc-law-corpus jl-instantiate

Use pip

pip install -r requirements_dev.txt
pip install -e .
roc-law-corpus jl-instantiate
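
After either install route, a quick check that the `roc-law-corpus` command is reachable can save some confusion. The following is a minimal sketch, not part of the project: the helper name `cli_available` is hypothetical, and it only tests whether an executable with that name is on `PATH`.

```python
import shutil


def cli_available(name: str = "roc-law-corpus") -> bool:
    """Return True if an executable called `name` is found on PATH.

    Hypothetical helper for illustration; it does not run the command.
    """
    return shutil.which(name) is not None
```

If this returns `False` after a Poetry install, the command most likely lives inside the project's virtual environment, so run the check (and the CLI) through `poetry run` or from an activated environment.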

Working with the Judicial Yuan QA corpus

Scrape the corpus

roc-law-corpus judicial-yuan-qa scraping data/judicial_yuan_qa_raw.json

Clean the corpus

roc-law-corpus judicial-yuan-qa clean data/judicial_yuan_qa_raw.json data/judicial_yuan_qa.json
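
Once the clean step has written its output, it may be useful to look at the shape of the file before building anything on top of it. The sketch below makes no claim about the project's actual schema: it assumes only that the output is a single JSON document (suggested by the `.json` extension), and `summarize_json` is a hypothetical helper name.

```python
import json
from pathlib import Path
from typing import Any


def summarize_json(path: str) -> dict[str, Any]:
    """Load a JSON file and describe its top-level shape without assuming a schema."""
    with Path(path).open(encoding="utf-8") as f:
        data = json.load(f)

    summary: dict[str, Any] = {"type": type(data).__name__}
    if isinstance(data, (list, dict)):
        summary["length"] = len(data)
    if isinstance(data, list) and data and isinstance(data[0], dict):
        # For a list of records, report the field names of the first record.
        summary["first_record_keys"] = sorted(data[0].keys())
    elif isinstance(data, dict):
        # For a mapping, report up to ten top-level keys.
        summary["keys"] = sorted(data)[:10]
    return summary
```

For example, `summarize_json("data/judicial_yuan_qa.json")` would report whether the cleaned corpus is a list or a mapping and how many entries it holds.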

Working with the moex exam corpus

Scrape the PDFs

roc-law-corpus moex scraping data/moex/ data/moex.json
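
To confirm that scraping left PDFs on disk, a short directory listing is enough. This is a sketch under the assumption that the scraped files are saved with a `.pdf` extension somewhere under the directory passed to the command; `list_pdfs` is a hypothetical helper, not part of the project.

```python
from pathlib import Path


def list_pdfs(directory: str) -> list[Path]:
    """Return all .pdf files under `directory` (searched recursively), sorted by path."""
    return sorted(
        p
        for p in Path(directory).rglob("*")
        if p.is_file() and p.suffix.lower() == ".pdf"
    )
```

For example, `len(list_pdfs("data/moex/"))` gives the number of downloaded files.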

Extract PDF content

roc-law-corpus moex extract data/moex/ data/moex.json
