Merge remote-tracking branch 'origin/feature/TED-1422-1' into feature…
…/TED-1422
duprijil committed Sep 28, 2023
2 parents 726a299 + 7c6df08 commit dc20e8d
Showing 2 changed files with 79 additions and 42 deletions.
59 changes: 17 additions & 42 deletions docs/antora/modules/ROOT/pages/index.adoc
@@ -1,68 +1,43 @@
= TED-SWS End-User Documentation

Although TED notice data is already available to the general public
through the search API provided by the TED website, the current offering
has many limitations that impede access to and reuse of the data. One
such important impediment is for example the current format of the data.

Historical TED data come in various XML formats that evolved together
with the standard TED XML schema. The imminent introduction of eForms
will also introduce further diversity in the XML data formats available
through TED's search API. This makes it practically impossible for users
to consume and process data that span across several years, as
their information systems must be able to process several different
flavours of the available XML schemas as well as to keep up with the
schema's continuous evolution. Their search capabilities are therefore
confined to a very limited set of metadata.

The TED Semantic Web Service will remove these barriers by providing one
common format for accessing and reusing all TED data. Coupled with the
eProcurement Ontology, the TED data will also have semantics attached to
them allowing users to directly link them with other datasets.
Moreover, users will now be able to perform much more elaborate
queries directly on the data source (through the SPARQL endpoint). This
will reduce their need for data warehousing in order to perform complex
queries.

These developments, by lowering the barriers, will give rise to a vast
number of new use-cases that will enable stakeholders and end-users to
benefit from increased availability of analytics. The ability to perform
complex queries on public procurement data will be equally open to large
information systems as well as to simple desktop users with a copy of
Excel and an internet connection.

To summarize, the TED Semantic Web Service (TED SWS) is a pipeline
system that continuously converts the public procurement notices (in XML
format) available on the TED Website into RDF format, publishes them
into CELLAR and makes them available to the public through CELLAR’s
SPARQL endpoint.
The TED Semantic Web Service (TED-SWS) is a pipeline system that continuously
converts the public procurement notices (in XML format) available on the
TED Website into RDF format based on the eProcurement Ontology, and publishes
them into the CELLAR repository, hence making them available to the public
through CELLAR’s SPARQL endpoint.

The TED Semantic Web Service (TED-SWS) connects
the TED infrastructure for the collection and publication of public procurement
notices with the infrastructure of http://data.europa.eu/[data.europa.eu]
in order to make public procurement data accessible and reusable as
Linked Open Data (LOD) by users and stakeholders (see xref:motivation.adoc[the detailed motivation]).

== Audience

This documentation is written for a wide audience with different interests in the TED-SWS project and different levels of expertise in the Semantic Web, EU eProcurement and software infrastructure. More specifically, this documentation can be of interest to:

- *Semantic Engineers* interested in understanding and writing mappings from XML to RDF, in particular in the EU eProcurement domain;
- *Software Engineers* interested in integrating mapping suite packages into processing pipelines;
- *End-Users*, such as *Semantic Web Practitioners* or *Experts in the eProcurement Domain*, who are interested in understanding what the RDF representation of the eProcurement notices looks like, and how this representation conforms to the eProcurement Ontology (ePO);
- *Software Engineers* interested in integrating mapping suite packages into processing pipelines;
- *Semantic Engineers* interested in understanding and writing mappings from XML to RDF, in particular in the EU eProcurement domain.

== Contents

[.tile-container]
--

[.tile]
.Mapping Suite
.Mapping Suites
****
The TED-RDF Mappings are mainly the transformation rules needed by the TED-RDF Conversion Pipeline (both of which are part of the TED Semantic Web Services, aka TED-SWS system) to convert TED notices available in XML format to RDF.
The TED-RDF Mappings are the transformation rules needed by the TED-RDF Conversion Pipeline (both of which are part of the TED Semantic Web Services, aka TED-SWS system) to convert TED notices available in XML format to RDF.
<<ted-rdf-docs:ROOT:mapping_suite/index.adoc#, Read the docs>>
****


[.tile]
.Sample application
.Sample applications
****
The sample application is a set of examples showing how to interact with TED-SWS Data using tools like Python, R or Excel.
The sample applications are a set of examples showing how to interact with TED RDF Data (available in CELLAR) using tools like Python, R or Excel.
<<ted-rdf-docs:ROOT:sample_app/index.adoc#, Read the docs>>
****
62 changes: 62 additions & 0 deletions docs/antora/modules/ROOT/pages/motivation.adoc
@@ -0,0 +1,62 @@
# TED-SWS motivation

In its Strategic Plan for 2020-2024, the Publications Office has
defined specific Objective 1 on the "European public procurement space"
as part of its general Objective 2 "A Europe fit for the digital age".

In this context the Publications Office has identified the need for reliable and
complete data on public procurement in the EU as being essential
for transparency and accountability of public spending. The ongoing
investments of the Publications Office for the transition to eForms,
and the continued development of the eProcurement Ontology are
identified by the Strategic Plan as being central for
improved data quality and enhanced automation of data processing
and interoperability.

Additionally, in the context of specific objective 2 on the
"European data space", the Publications Office identifies the gap that
still exists between the available wealth of open data, spread across
multiple outlets, and the effort required to discover, access and reuse it.

To bridge this gap, the Strategic Plan for 2020-2024 commits to
generating and sharing new knowledge as linked open data, through
an ecosystem of datasets, data models, ontologies and specialised services
accessible through a single entry point (http://data.europa.eu/[data.europa.eu])
following a "data-as-a-public-service" approach.

Although TED notice data is already available to the general public
through the search API provided by the TED website, the current offering
has many limitations that impede access to and reuse of the data. One
important impediment is the current format of the data.

Historical TED data come in various XML formats that evolved together
with the standard TED XML schema. The imminent introduction of eForms
will also introduce further diversity in the XML data formats available
through TED's search API. This makes it practically impossible for users
to consume and process data that span across several years, as
their information systems must be able to process several different
flavours of the available XML schemas as well as to keep up with the
schema's continuous evolution. Their search capabilities are therefore
confined to a very limited set of metadata.

The TED Semantic Web Service removes these barriers by providing one
common format for accessing and reusing all TED data. Coupled with the
eProcurement Ontology, the TED data will also have semantics attached to
them allowing users to directly link them with other datasets.
Moreover, users will now be able to perform much more elaborate
queries directly on the data source (through the SPARQL endpoint). This
will reduce their need for data warehousing in order to perform complex
queries.
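
As a minimal sketch of what querying the data source through a SPARQL endpoint could look like, the Python snippet below builds a query and the corresponding request URL without sending it. The endpoint URL, the ePO namespace, the `epo:Notice` class and the helper names `build_notice_query` and `build_request_url` are assumptions made for illustration and are not taken from this documentation; check them against the CELLAR and ePO documentation before use.

```python
from urllib.parse import urlencode

# Assumption: URL of CELLAR's public SPARQL endpoint; verify before use.
CELLAR_SPARQL_ENDPOINT = "https://publications.europa.eu/webapi/rdf/sparql"

# Assumption: namespace of the eProcurement Ontology (ePO).
EPO_NS = "http://data.europa.eu/a4g/ontology#"


def build_notice_query(limit: int = 10) -> str:
    """Return a SPARQL query listing up to `limit` resources typed as epo:Notice.

    The class name epo:Notice is an assumption used for illustration.
    """
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    return (
        f"PREFIX epo: <{EPO_NS}>\n"
        "SELECT ?notice WHERE { ?notice a epo:Notice } "
        f"LIMIT {limit}"
    )


def build_request_url(query: str, endpoint: str = CELLAR_SPARQL_ENDPOINT) -> str:
    """Encode the query in the `query` parameter of a GET request URL."""
    return f"{endpoint}?{urlencode({'query': query})}"


if __name__ == "__main__":
    # Only prints the URL; no network request is made here.
    print(build_request_url(build_notice_query(limit=5)))
```

The resulting URL can be fetched with any HTTP client. Under the SPARQL 1.1 Protocol the result format is negotiated through the `Accept` header, for example `application/sparql-results+json`.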

These developments, by lowering the barriers, will give rise to a vast
number of new use-cases that will enable stakeholders and end-users to
benefit from increased availability of analytics. The ability to perform
complex queries on public procurement data will be equally open to large
information systems as well as to simple desktop users with a copy of
Excel and an internet connection.

To summarize, the TED Semantic Web Service (TED-SWS) is a pipeline
system that continuously converts the public procurement notices (in XML
format) available on the TED Website into RDF format, publishes them
into CELLAR and makes them available to the public through CELLAR’s
SPARQL endpoint.
