WIP: Update Inception format list. #20

Merged 3 commits on Dec 1, 2019
31 changes: 21 additions & 10 deletions docs/api/formats.md
@@ -4,22 +4,33 @@ Documents, annotations and exports can be downloaded/created in different format

INCEpTION doesn't specify in their documentation which formats are supported, but the following have been found and included in `pycaprio`:

* `webanno`: Webanno. This is the default format INCEpTION uses
* `nif`: NIF
* `bin`: Binary.
* `conll2000`: CONLL 2000
* `conll2006`: CONLL 2006
* `conll2009`: CONLL 2009
* `conllcorenlp`: CONLL Core NLP
* `conllu`: CONLLu
* `ctsv`: CTSV
* `ctsv3`: CTSV3
* `dkpro-core-tei`: Dkpro Core TEI
* `html`: HTML
* `lif`: LIF
* `dkpro-core-tei`: TEI
* `perseus_2.1`: Perseus
* `conllu`: Conllu
* `text`: Plain text
* `json`: Json
* `xmi`: XMI
* `nif`: NIF
* `pdf`: PDF
* `perseus_2.1`: Perseus 2.1
* `pubannotation-sections`: Pubannotation sections
* `tcf`: TCF
* `text`: Plain text (**DEFAULT**)
* `textlines`: Text lines
* `tsv`: TSV - Webanno format


You can find a class with all the formats in `pycaprio.core.mappings.DocumentFormats`:
You can find a class with all the formats in `pycaprio.core.mappings.InceptionFormat`:

```python
from pycaprio.core.mappings import InceptionFormat

InceptionFormat.DEFAULT
InceptionFormat.DEFAULT # Defaults to `text`
InceptionFormat.TEI
...
```
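As a rough illustration of how the renamed constants and their aliases relate, here is a minimal sketch. It uses a local stand-in class that copies only a handful of values from this diff rather than importing `pycaprio`, and it assumes the short aliases (`TEI`, `PERSEUS`, `WEBANNO`) remain in the merged version. The `known_formats` helper is hypothetical and not part of `pycaprio`.

```python
class InceptionFormat:
    """Local stand-in mirroring a few constants from this diff (not the real pycaprio class)."""
    DEFAULT = 'text'

    DKPRO_CORE_TEI = 'dkpro-core-tei'
    PERSEUS_2_1 = 'perseus_2.1'
    TEXT = 'text'
    TSV = 'tsv'

    # Assumed backwards-compatible aliases for the older, shorter names
    TEI = 'dkpro-core-tei'
    PERSEUS = 'perseus_2.1'
    WEBANNO = 'tsv'


def known_formats(cls):
    """Hypothetical helper: collect the distinct format strings defined on the class."""
    return sorted({
        value
        for name, value in vars(cls).items()
        if name.isupper() and isinstance(value, str)
    })


# Aliases resolve to the same format string as the new names.
print(InceptionFormat.WEBANNO == InceptionFormat.TSV)
print(known_formats(InceptionFormat))
```

Because the constants are plain strings, an alias and its canonical name are interchangeable wherever a format is expected, and collecting the values into a set collapses them to one entry per format.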
32 changes: 25 additions & 7 deletions pycaprio/core/mappings.py
@@ -4,17 +4,35 @@


class InceptionFormat:
DEFAULT = 'webanno'
WEBANNO = 'webanno'
NIF = 'nif'
LIF = 'lif'
TEI = 'dkpro-core-tei'
PERSEUS = 'perseus_2.1'
DEFAULT = 'text'

BIN = 'bin'
CONLL2000 = 'conll2000'
CONLL2006 = 'conll2006'
CONLL2009 = 'conll2009'
CONLLCORENLP = 'conllcorenlp'
CONLLU = 'conllu'
CTSV = 'ctsv'
CTSV3 = 'ctsv3'
DKPRO_CORE_TEI = 'dkpro-core-tei'
TEI = 'dkpro-core-tei'
HTML = 'html'
LIF = 'lif'
NIF = 'nif'
PDF = 'pdf'
PERSEUS_2_1 = 'perseus_2.1'
PUBANNOTATION_SECTIONS = 'pubannotation-sections'
TCF = 'tcf'
TEXT = 'text'
JSON = 'json'
TEXTLINES = 'textlines'
TSV = 'tsv'

XMI = 'xmi'

PERSEUS = 'perseus_2.1'
WEBANNO = 'tsv'
JSON = 'json'


class AnnotationState:
DEFAULT = 'NEW'