docs: provide QUDT annotation demonstration
Add a demonstration of the QUDT unit annotation process for EML records,
providing a practical guide for users.
clnsmth authored Nov 13, 2024
1 parent e5a96a5 commit f52a448
Showing 1 changed file with 52 additions and 0 deletions.
52 changes: 52 additions & 0 deletions docs/source/user/quickstart.rst
@@ -19,3 +19,55 @@ For the latest development version::

    $ pip install git+https://github.com/EDIorg/spinneret.git@development


Adding QUDT Annotations to an EML File
---------------------------------------

First, import the functions used in this demonstration.

.. code-block:: python

    from spinneret.datasets import get_example_eml_dir
    from spinneret.workbook import create, delete_unannotated_rows
    from spinneret.utilities import load_eml, write_eml
    from spinneret.annotator import (
        add_qudt_annotations_to_workbook,
        annotate_eml,
        get_qudt_annotation,
    )

We start with an example EML file that has no QUDT annotations.

.. code-block:: python

    eml_file = get_example_eml_dir() + "/edi.3.9.xml"

We initialize a "workbook" to store the annotations in a tabular format. The
workbook contents are later added to the EML file as annotation elements.

.. code-block:: python

    workbook = create(eml_file, elements=["attribute"])

A QUDT workbook "annotator" then searches the EML for data entity attributes
whose EML standard units (or custom units) can be mapped to QUDT equivalents
via https://vocab.lternet.edu/webservice/unitsws.php, and adds the successful
matches to the workbook.
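
The matching and cleanup steps described above can be pictured with a minimal,
self-contained sketch in plain Python. It does not use spinneret or call the
units web service. The lookup table ``UNIT_TO_QUDT``, the row keys
(``element``, ``unit``, ``object_label``, ``object_id``), and both helper
functions are hypothetical stand-ins chosen for illustration, and the only
mapping included is the ``degree`` result shown at the end of this guide. The
actual workbook layout may differ.

.. code-block:: python

    # Hypothetical stand-in for the units web service lookup.
    UNIT_TO_QUDT = {
        "degree": {"label": "Degree", "uri": "http://qudt.org/vocab/unit/DEG"},
    }


    def add_matches(rows):
        """Return copies of the rows, filling in label and URI where a unit matches."""
        annotated = []
        for row in rows:
            new_row = dict(row)  # copy so the input rows are left untouched
            match = UNIT_TO_QUDT.get(row["unit"])
            if match is not None:
                new_row["object_label"] = match["label"]
                new_row["object_id"] = match["uri"]
            annotated.append(new_row)
        return annotated


    def drop_unannotated(rows):
        """Keep only the rows that received an annotation URI."""
        return [row for row in rows if row.get("object_id")]


    # Two illustrative rows: one unit with a known mapping, one without.
    rows = [
        {"element": "attribute", "unit": "degree",
         "object_label": None, "object_id": None},
        {"element": "attribute", "unit": "someCustomUnit",
         "object_label": None, "object_id": None},
    ]
    cleaned = drop_unannotated(add_matches(rows))
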

.. code-block:: python

    eml = load_eml(eml_file)
    workbook = add_qudt_annotations_to_workbook(workbook, eml)
    workbook = delete_unannotated_rows(workbook)  # a little cleanup

We can now transfer the annotations from the workbook to the EML and write it
to file.

.. code-block:: python

    annotated_eml = annotate_eml(eml_file, workbook)
    output_path = "/Users/me/Data/edi.3.9.xml"  # or wherever you'd like
    write_eml(annotated_eml, output_path)

If you prefer a more rudimentary approach, you can look up the QUDT label and
URI for a specified input text directly.

.. code-block:: python

    >>> get_qudt_annotation("degree")
    [{'label': 'Degree', 'uri': 'http://qudt.org/vocab/unit/DEG'}]
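
The result shown above is a list of dictionaries with ``label`` and ``uri``
keys. As a small sketch of how one might unpack it, the example below uses that
result as literal data rather than calling spinneret. The helper
``first_annotation`` is hypothetical and not part of the library, and treating
an empty list or ``None`` as "no match" is an assumption.

.. code-block:: python

    def first_annotation(results):
        """Return (label, uri) for the first result, or None if there is none."""
        if not results:  # assumed: no match yields an empty list or None
            return None
        first = results[0]
        return first["label"], first["uri"]


    # Literal copy of the example result from this guide.
    example = [{"label": "Degree", "uri": "http://qudt.org/vocab/unit/DEG"}]
    label, uri = first_annotation(example)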
