
RDF Format

Dan LaRocque edited this page Sep 5, 2014 · 9 revisions
This is the documentation for Faunus 0.4.
Faunus was merged into Titan and renamed Titan-Hadoop in version 0.5.
Documentation for the latest Titan version is available at http://s3.thinkaurelius.com/docs/titan/current.

  • InputFormat: com.thinkaurelius.faunus.formats.edgelist.rdf.RDFInputFormat

The Semantic Web community is one of the original promoters of the graph as an approach to data modeling. Their efforts have led to the development of the RDF data model. While there are many serialization formats for RDF, an RDF graph is composed of RDF triples, in which a subject is connected to an object by a predicate. For instance:

<http://thinkaurelius.com#hercules> <http://thinkaurelius.com#father> <http://thinkaurelius.com#jupiter> .

In this way, RDF is an edge list data model. Faunus, on the other hand, represents a graph as an adjacency list. Therefore, for the two formats to interoperate, the RDFInputFormat provided by Faunus contains a MapReduce job that converts an edge list into an adjacency list.
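The edge-list-to-adjacency-list idea can be illustrated with a small in-memory sketch. This is not the Faunus MapReduce job; it only groups triples by subject, keeps outgoing edges, and uses short names instead of URIs. The function name and the triples other than hercules/father/jupiter are made up for illustration.

```python
from collections import defaultdict


def edge_list_to_adjacency_list(triples):
    """Group (subject, predicate, object) triples by their subject vertex."""
    adjacency = defaultdict(list)
    for subject, predicate, obj in triples:
        # Each triple becomes one outgoing edge stored on its subject.
        adjacency[subject].append((predicate, obj))
        # The object is a vertex too, even if it has no outgoing edges.
        adjacency.setdefault(obj, [])
    return dict(adjacency)


# Illustrative edge list; only the first triple comes from the text above.
triples = [
    ("hercules", "father", "jupiter"),
    ("hercules", "mother", "alcmene"),
    ("jupiter", "brother", "neptune"),
]
adjacency = edge_list_to_adjacency_list(triples)
for vertex, edges in adjacency.items():
    print(vertex, edges)
```

In a distributed setting the same grouping is naturally expressed as a map step that emits each triple keyed by vertex and a reduce step that collects the edges per vertex.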

Conversion Parameters

RDF Format

faunus.graph.input.rdf.format

There are numerous RDF serialization formats. Faunus currently supports the following formats:

NOTE: Faunus uses LineRecordReader to read statements from an RDF file one line at a time. If a line (terminated by \n) does not contain a complete, legal RDF fragment, the RDF parser throws an exception.
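Because of this line-oriented reading, it can help to check input files before a job is submitted. The following is a rough pre-check sketch, not the parser Faunus uses: the regular expression is a deliberately simplified approximation of an N-Triples statement (no blank nodes, language tags, or escaped quotes), and the function name is hypothetical.

```python
import re

# Simplified shape of one statement: <subject> <predicate> <object-or-literal> .
TRIPLE_LINE = re.compile(
    r'^\s*<[^>]+>\s+<[^>]+>\s+(<[^>]+>|"[^"]*"(\^\^<[^>]+>)?)\s*\.\s*$'
)


def is_complete_statement(line):
    """Return True if the line looks like one complete triple."""
    return TRIPLE_LINE.match(line) is not None


whole = ('<http://thinkaurelius.com#hercules> '
         '<http://thinkaurelius.com#father> '
         '<http://thinkaurelius.com#jupiter> .')
# The same statement broken across two lines: neither half stands alone.
first_half = '<http://thinkaurelius.com#hercules> <http://thinkaurelius.com#father>'
second_half = '<http://thinkaurelius.com#jupiter> .'

for line in (whole, first_half, second_half):
    print(is_complete_statement(line), line)
```

A statement that is legal when spread over several lines in the serialization format can still fail here, since each line is handed to the parser on its own.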

Literal as Property

faunus.graph.input.rdf.literal-as-property

There are two types of triples to be aware of: one in which the object is a URI or blank node, and one in which the object is a literal value. Both types are shown below.

<http://thinkaurelius.com#hercules> <http://thinkaurelius.com#father> <http://thinkaurelius.com#jupiter> .
<http://thinkaurelius.com#hercules> <http://thinkaurelius.com#age> "32"^^<http://www.w3.org/2001/XMLSchema#int> .

If the above Faunus property is set to true, then the Hercules vertex has an age property with an integer value of 32.
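The effect of this setting can be pictured with a small sketch. It is an illustration under assumptions, not Faunus code: the graph is a plain dictionary, a literal is modeled as a (lexical value, datatype URI) tuple, and only the xsd:int datatype is converted. What happens when the property is false is not described here, so the sketch simply skips the literal triple in that case.

```python
XSD_INT = "http://www.w3.org/2001/XMLSchema#int"
NS = "http://thinkaurelius.com#"


def new_vertex():
    return {"properties": {}, "out_edges": []}


def add_triple(graph, subject, predicate, obj, literal_as_property):
    """Add one triple; obj is a URI string or a (lexical, datatype) literal tuple."""
    vertex = graph.setdefault(subject, new_vertex())
    if isinstance(obj, tuple):
        lexical, datatype = obj
        if literal_as_property:
            # Convert the lexical form according to its datatype (only xsd:int here).
            value = int(lexical) if datatype == XSD_INT else lexical
            vertex["properties"][predicate] = value
        # Assumption: with the flag off, the literal triple is ignored in this sketch.
    else:
        # URI object: create the target vertex and connect it with an edge.
        graph.setdefault(obj, new_vertex())
        vertex["out_edges"].append((predicate, obj))


graph = {}
add_triple(graph, NS + "hercules", NS + "father", NS + "jupiter", True)
add_triple(graph, NS + "hercules", NS + "age", ("32", XSD_INT), True)
print(graph[NS + "hercules"])
```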

Use Local Name

faunus.graph.input.rdf.use-localname

The theoretically infinite RDF graph is embedded within the infinite address space of URIs. In many situations the full URI is not desired. If the above property is set to true, then the triple

<http://thinkaurelius.com#hercules> <http://thinkaurelius.com#father> <http://thinkaurelius.com#jupiter> .

generates vertices named hercules and jupiter connected by a father edge.
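A rough way to picture local-name extraction is to keep whatever follows the last `#` or `/` in the URI. The helper below is a hypothetical sketch of that rule, not the routine Faunus calls, and real RDF libraries may split URIs slightly differently.

```python
def local_name(uri):
    """Return the part of the URI after its last '#' or '/'."""
    cut = max(uri.rfind("#"), uri.rfind("/"))
    return uri[cut + 1:]


triple = (
    "http://thinkaurelius.com#hercules",
    "http://thinkaurelius.com#father",
    "http://thinkaurelius.com#jupiter",
)
print(tuple(local_name(term) for term in triple))
```

One consequence worth keeping in mind: two different URIs that share a local name become indistinguishable once the namespace is dropped.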

As Properties

faunus.graph.input.rdf.as-properties

RDF is a triple data model: there are no properties, only vertices and edges. In some situations, however, the object URI of a triple should be treated as a property value of the subject vertex. For instance, when http://www.w3.org/1999/02/22-rdf-syntax-ns#type is included in the comma-separated String list given for the property above, then the triple

<http://thinkaurelius.com#hercules> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://thinkaurelius.com#demigod> .

yields a Hercules vertex with a type property whose value is demigod. A typical setting for this property is below.

faunus.graph.input.rdf.as-properties=http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.w3.org/2000/01/rdf-schema#label
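The way such a comma-separated value drives the edge-versus-property decision can be sketched as follows. This is an illustration only: the helper names are hypothetical, and the actual parsing inside Faunus may differ in details such as whitespace handling.

```python
AS_PROPERTIES_VALUE = (
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type,"
    "http://www.w3.org/2000/01/rdf-schema#label"
)


def parse_as_properties(value):
    """Split the comma-separated setting into a set of predicate URIs."""
    return {uri.strip() for uri in value.split(",") if uri.strip()}


def classify(predicate, as_properties):
    """Decide whether a triple with this predicate becomes a property or an edge."""
    return "property" if predicate in as_properties else "edge"


as_properties = parse_as_properties(AS_PROPERTIES_VALUE)
print(classify("http://www.w3.org/1999/02/22-rdf-syntax-ns#type", as_properties))
print(classify("http://thinkaurelius.com#father", as_properties))
```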