better introductory documentation
proycon committed Oct 18, 2023
1 parent 9cbf9d6 commit f7de534
Showing 3 changed files with 67 additions and 4 deletions.
28 changes: 24 additions & 4 deletions README.md
@@ -11,15 +11,35 @@

# STAM Python binding

[STAM](https://github.com/annotation/stam) is a data model for stand-off text annotation and described in detail [here](https://github.com/annotation/stam). This is a python library (to be more specific; a python binding written in Rust) to work with the model.

This library offers a higher-level interface than the underlying Rust library. Implementation is currently in a preliminary stage. We aim to implement the full model and most extensions.
[STAM](https://github.com/annotation/stam) is a data model for stand-off text
annotation and is described in detail [here](https://github.com/annotation/stam).
This is a Python library (to be more specific: a Python binding written in
Rust) to work with the model.

**What can you do with this library?**

* Keep, build and manipulate an efficient in-memory store of texts and annotations on texts
* Search in annotations, data and text:
    * Search annotations by data, textual content, relations between text fragments (overlap, embedding, adjacency, etc.).
    * Search in text (incl. via regular expressions) and find annotations targeting found text selections.
    * Search in data (set, key, value) and find annotations that use the data.
    * Elementary text operations with regard for text offsets (splitting text on a delimiter, stripping text).
    * Convert between different kinds of offsets (absolute, relative to other structures, UTF-8 bytes vs Unicode codepoints, etc.)
* Read and write resources and annotations from/to STAM JSON, STAM CSV, or an optimised binary (CBOR) representation
    * The underlying [STAM model](https://github.com/annotation/stam) aims to be clear and simple. It is flexible and
      does not commit to any vocabulary or annotation paradigm other than stand-off annotation.

This STAM library is intended as a foundation upon which further applications
can be built that deal with stand-off annotations on text. We implement all
the low-level logic for dealing with this, so you no longer have to and can
focus on your actual application.
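
To make the idea of stand-off annotation and offset conversion concrete, here is a minimal plain-Python sketch. It deliberately does not use the `stam` API: the `StandoffAnnotation` class and the helper functions are hypothetical names invented for this illustration, and the real library handles these concerns far more efficiently and completely.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class StandoffAnnotation:
    """Hypothetical record (not a stam class): refers to text[begin:end]
    by offset and carries a single key/value pair."""
    begin: int  # Unicode codepoint offset, inclusive
    end: int    # Unicode codepoint offset, exclusive
    key: str
    value: str


def annotate_matches(text, pattern, key, value):
    """Create one annotation per regex match; the text itself is never modified."""
    return [
        StandoffAnnotation(m.start(), m.end(), key, value)
        for m in re.finditer(pattern, text)
    ]


def codepoint_to_byte_offset(text, offset):
    """Convert a Unicode codepoint offset into a UTF-8 byte offset."""
    return len(text[:offset].encode("utf-8"))


def byte_to_codepoint_offset(text, byte_offset):
    """Convert a UTF-8 byte offset into a Unicode codepoint offset.

    Raises UnicodeDecodeError if byte_offset falls inside a multi-byte character.
    """
    return len(text.encode("utf-8")[:byte_offset].decode("utf-8"))


text = "Café au lait, café noir"
annotations = annotate_matches(text, r"[Cc]afé", "category", "drink")

for annotation in annotations:
    # The annotated text is recovered from the offsets, not stored in the annotation.
    fragment = text[annotation.begin:annotation.end]
    byte_begin = codepoint_to_byte_offset(text, annotation.begin)
    print(fragment, annotation.begin, annotation.end, byte_begin)
```

Because "é" occupies two bytes in UTF-8, the byte offset of the second match differs from its codepoint offset. This kind of bookkeeping is what the library is meant to take off your hands.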

## Installation

``$ pip install stam``

Or if you feel adventurous and have the necessary build-time dependencies installed (Rust), you can try the latest development release from Github:
Or if you feel adventurous and have the necessary build-time dependencies
installed (Rust), you can try the latest development release from GitHub:

``$ pip install git+https://github.com/annotation/stam-python``

25 changes: 25 additions & 0 deletions stam.pyi
@@ -2,6 +2,31 @@ from __future__ import annotations

from typing import Iterator, List, Optional, Union

"""
[STAM](https://github.com/annotation/stam) is a data model for stand-off text
annotation and is described in detail [here](https://github.com/annotation/stam).
This is a Python library (to be more specific: a Python binding written in
Rust) to work with the model.

**What can you do with this library?**

* Keep, build and manipulate an efficient in-memory store of texts and annotations on texts
* Search in annotations, data and text:
    * Search annotations by data, textual content, relations between text fragments (overlap, embedding, adjacency, etc.).
    * Search in text (incl. via regular expressions) and find annotations targeting found text selections.
    * Search in data (set, key, value) and find annotations that use the data.
    * Elementary text operations with regard for text offsets (splitting text on a delimiter, stripping text).
    * Convert between different kinds of offsets (absolute, relative to other structures, UTF-8 bytes vs Unicode codepoints, etc.)
* Read and write resources and annotations from/to STAM JSON, STAM CSV, or an optimised binary (CBOR) representation
    * The underlying [STAM model](https://github.com/annotation/stam) aims to be clear and simple. It is flexible and
      does not commit to any vocabulary or annotation paradigm other than stand-off annotation.

This STAM library is intended as a foundation upon which further applications
can be built that deal with stand-off annotations on text. We implement all
the low-level logic for dealing with this, so you no longer have to and can
focus on your actual application.
"""

class AnnotationStore:
"""
An Annotation Store is an unordered collection of annotations, resources and
18 changes: 18 additions & 0 deletions tutorial.ipynb
@@ -54,6 +54,24 @@
"that allow you to work with it directly. In this tutorial we will be using Python and\n",
"the Python library `stam`.\n",
"\n",
"**What can you do with this library?**\n",
"\n",
"* Keep, build and manipulate an efficient in-memory store of texts and annotations on texts\n",
"* Search in annotations, data and text:\n",
" * Search annotations by data, textual content, relations between text fragments (overlap, embedding, adjacency, etc),\n",
" * Search in text (incl. via regular expressions) and find annotations targeting found text selections.\n",
" * Search in data (set,key,value) and find annotations that use the data.\n",
" * Elementary text operations with regard for text offsets (splitting text on a delimiter, stripping text).\n",
" * Convert between different kinds of offsets (absolute, relative to other structures, UTF-8 bytes vs unicode codepoints, etc)\n",
"* Read and write resources and annotations from/to STAM JSON, STAM CSV, or an optimised binary (CBOR) representation\n",
" * The underlying [STAM model](https://github.com/annotation/stam) aims to be clear and simple. It is flexible and \n",
" does not commit to any vocabulary or annotation paradigm other than stand-off annotation.\n",
"\n",
"This STAM library is intended as a foundation upon which further applications\n",
"can be built that deal with stand-off annotations on text. We implement all \n",
"the low-level logic in dealing with this so you no longer have to and can focus on your \n",
"actual application.\n",
"\n",
"**Note**: The STAM Python library is a so-called Python binding to a STAM library\n",
"written in Rust. This means the library is not written in Python but is\n",
"compiled to machine code and as such offers much better performance. Nevertheless, it\n",