Caching
In order to get closer to the performance of relational database-backed Web applications, we developed an approach for improving the performance of triple stores by caching query results and even complete application objects. The selective invalidation of cache objects, following updates of the underlying knowledge bases, is based on analysing the graph patterns of cached SPARQL queries in order to obtain information about what kind of updates will change the query result.
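The selective invalidation idea can be illustrated with a small, self-contained sketch. It is not Erfurt's implementation: the class `PatternCacheSketch` and its methods are hypothetical, the triple patterns are assumed to be already extracted from the query, and SPARQL features such as OPTIONAL or FILTER are ignored. A `null` in a pattern stands for a variable.

```php
<?php
// Hypothetical model: each cached query is stored together with its triple
// patterns; null stands for a variable (e.g. ?s), a string for a fixed term.
final class PatternCacheSketch
{
    private array $entries = [];

    public function save(string $query, array $patterns, $result): void
    {
        $this->entries[md5($query)] = ['patterns' => $patterns, 'result' => $result];
    }

    public function has(string $query): bool
    {
        return isset($this->entries[md5($query)]);
    }

    /** Drops every entry that has a pattern matched by the changed triple. */
    public function invalidateFor(array $triple): int
    {
        $dropped = 0;
        foreach ($this->entries as $id => $entry) {
            foreach ($entry['patterns'] as $pattern) {
                if (self::matches($pattern, $triple)) {
                    unset($this->entries[$id]);
                    $dropped++;
                    break;
                }
            }
        }
        return $dropped;
    }

    /** A pattern matches when every fixed position equals the triple's term. */
    private static function matches(array $pattern, array $triple): bool
    {
        foreach ($pattern as $i => $term) {
            if ($term !== null && $term !== $triple[$i]) {
                return false;
            }
        }
        return true;
    }
}

$cache = new PatternCacheSketch();
$cache->save('SELECT ?s WHERE { ?s a ex:Person }', [[null, 'rdf:type', 'ex:Person']], ['ex:alice']);
$cache->save('SELECT ?o WHERE { ex:bob ex:name ?o }', [['ex:bob', 'ex:name', null]], ['Bob']);

// Adding a new ex:Person can only affect the first query.
$dropped = $cache->invalidateFor(['ex:carol', 'rdf:type', 'ex:Person']);
```

In this sketch only the first entry is dropped, while the unrelated query about `ex:bob` stays cached.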
The PHP implementation of the cache is integrated into the Erfurt layer of OntoWiki. The caching component is part of the latest release and is also used by other web applications built on the Erfurt middleware. The implementation also supports application-specific object caching with SPARQL dependencies.
/owRoot/libraries/Erfurt/Erfurt/Cache/Frontend/
For developers who want to use the cache, here is a short example that uses only the Query Cache:
    $query = "SELECT * WHERE {?s ?p ?o .}";
    $queryCache = Erfurt_App::getInstance()->getQueryCache();
    $sparqlResult = $queryCache->load((string) $query, 'plain');
    if ($sparqlResult === Erfurt_Cache_Frontend_QueryCache::ERFURT_CACHE_NO_HIT) {
        // cache miss: run the query and store the result with its execution time
        $startTime = microtime(true);
        $sparqlResult = $this->_sparqlQuery($query);
        $duration = microtime(true) - $startTime;
        $queryCache->save((string) $query, 'plain', $sparqlResult, $duration);
    }
    // do whatever you want with your SPARQL result
If you use OntoWiki and Erfurt SPARQL interfaces, the cache is used by default.
The more interesting part for OntoWiki and extension developers is the Object Cache. If application objects created from a SPARQL query result have to be cached, it is recommended to use this cache, because it automatically keeps its entries up to date. If a SPARQL query result is invalidated because of store changes, the Object Cache is notified and also invalidates all object cache entries that were created from that query. This works only if the data is stored in the cache in the right way, as illustrated in the following figure.
The following example shows how to create a set of application objects with the help of the cache, which could be used, for instance, to build a resource list. Done this way, each resource is created only once. After creation, it is retrieved from the cache until it is invalidated because of store changes.
    $erfurt = Erfurt_App::getInstance();
    $queryCache = $erfurt->getQueryCache();
    $objectCache = $erfurt->getCache();
    foreach ($uris as $uri) {
        if ($resource = $objectCache->load(md5($uri))) {
            // cache hit: reuse the stored object
            $collection[$uri] = $resource;
        } else {
            $queryCache->startTransaction(md5($uri));
            // Creating the object from its URI triggers many SPARQL queries
            // behind the scenes.
            $resource = new Model_Resource($uri);
            $collection[$uri] = $resource;
            $objectCache->save($resource, md5($uri));
            $queryCache->endTransaction(md5($uri));
        }
    }
Two lines of this example are essential:
    $queryCache->startTransaction(md5($uri));
    ...
    $queryCache->endTransaction(md5($uri));
These two lines create the association between the object cache entry and all SPARQL queries used to build it.
In the same way, calls to startTransaction and endTransaction can be nested, for example to associate multiple cached application objects with the same query cache entry.
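The bookkeeping behind such transactions can be pictured with the following minimal sketch. It is a hypothetical model and not Erfurt's actual code: `DependencyTrackerSketch`, `recordQuery` and `invalidateQuery` are names invented for illustration. Every query executed while one or more transactions are open is linked to all of them, so invalidating that query yields every dependent object key.

```php
<?php
// Hypothetical bookkeeping model; not Erfurt's actual implementation.
final class DependencyTrackerSketch
{
    /** Keys of the currently open transactions, innermost last. */
    private array $open = [];
    /** Query id => set of dependent object keys. */
    private array $dependents = [];

    public function startTransaction(string $objectKey): void
    {
        $this->open[] = $objectKey;
    }

    public function endTransaction(string $objectKey): void
    {
        $index = array_search($objectKey, $this->open, true);
        if ($index !== false) {
            array_splice($this->open, $index, 1);
        }
    }

    /** Called whenever a query runs: link it to every open transaction. */
    public function recordQuery(string $query): void
    {
        foreach ($this->open as $objectKey) {
            $this->dependents[md5($query)][$objectKey] = true;
        }
    }

    /** Returns the object keys to drop when the query result is invalidated. */
    public function invalidateQuery(string $query): array
    {
        $id = md5($query);
        $keys = array_keys($this->dependents[$id] ?? []);
        unset($this->dependents[$id]);
        return $keys;
    }
}

$tracker = new DependencyTrackerSketch();
$tracker->startTransaction('list');        // outer object, e.g. a resource list
$tracker->recordQuery('SELECT ?s WHERE { ?s a ex:Person }');
$tracker->startTransaction('resource-1');  // nested object
$tracker->recordQuery('SELECT ?p ?o WHERE { ex:alice ?p ?o }');
$tracker->endTransaction('resource-1');
$tracker->endTransaction('list');

// The nested query belongs to both objects, so both must be invalidated.
$affected = $tracker->invalidateQuery('SELECT ?p ?o WHERE { ex:alice ?p ?o }');
```

Here the query issued inside the nested transaction is tied to both `list` and `resource-1`, while the first query is tied to `list` only.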
The OntoWiki default.ini, located in the folder owRoot/application/config/, contains a section with the default values of the cache configuration. Copy these settings into your personal OntoWiki config.ini if you want to change them.
    ;; Erfurt Query Cache
    cache.query.enable = true
    ;; logging is not recommended (performance)
    ;cache.query.logging = 0
    cache.query.type = database ; only database caching at the moment

    ;; Erfurt Object Cache
    cache.enable = true ; clear the cache if you switch from 0 to 1!
    cache.type = database ; database, sqllite
If you want to count how often cache hits and invalidations occur, enable the cache.query.logging option.
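As a sketch, and assuming that the option is switched on with the value 1 (the default file only shows it commented out with 0), the corresponding line in config.ini might look like this:

```ini
;; assumption: 1 enables logging; expect a performance cost
cache.query.logging = 1
```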
The Query Cache Frontend returns a constant if no cache hit was found:
const ERFURT_CACHE_NO_HIT = "fae0b27c451c728867a567e8c1bb4e53";
A SPARQL query whose result has a hash equal to this value is therefore not cached. TODO: does something like "SELECT 123" return the value itself?
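The limitation follows from the sentinel pattern itself, which the following stand-alone sketch illustrates. `SentinelCacheSketch` is a hypothetical stand-in and not the Erfurt class. It only shows that a stored value identical to the constant cannot be told apart from a miss, while an ordinary array result never collides under a strict comparison. It does not answer the open question about Erfurt's own behaviour.

```php
<?php
// Hypothetical stand-in for a cache frontend that signals a miss with a sentinel.
final class SentinelCacheSketch
{
    const NO_HIT = 'fae0b27c451c728867a567e8c1bb4e53';

    private array $store = [];

    public function save(string $query, $result): void
    {
        $this->store[md5($query)] = $result;
    }

    public function load(string $query)
    {
        return $this->store[md5($query)] ?? self::NO_HIT;
    }
}

$cache = new SentinelCacheSketch();
$cache->save('q1', [['s' => 'ex:alice']]);        // ordinary array result
$cache->save('q2', SentinelCacheSketch::NO_HIT);  // value equal to the sentinel

// Strict comparison against the sentinel, as in the Query Cache example.
$hitArray    = $cache->load('q1') !== SentinelCacheSketch::NO_HIT;
$hitSentinel = $cache->load('q2') !== SentinelCacheSketch::NO_HIT; // looks like a miss
$hitMissing  = $cache->load('q3') !== SentinelCacheSketch::NO_HIT; // real miss
```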