
DIP D2.5: Ontology Repository

DIP WP 2 Deliverable 31 December 2005

This version:
http://sw.deri.org/2005/03/diprdf/wp2.5-20051231
Latest version:
http://sw.deri.org/2005/03/diprdf/wp2.5
Previous version:
n/a
Authors:
Andreas Harth, Erel Sharf
Editor:
Andreas Harth

Copyright © 2005 DIP. All Rights Reserved. DIP liability, trademark, document use, and software licensing rules apply.


Executive Summary

Ontologies provide more expressive modeling primitives than traditional relational database systems. Relational database schemas are rigid, and instances must strictly adhere to the schema. Ontologies have a less restrictive notion of a schema, which matters when integrating data in an open environment such as the Web. In D2.5, we have implemented three repositories for storing and retrieving ontologies. The main requirement is the ability to handle large-scale data sets with millions of concepts.

One piece of the WP2 development is ORDI, an abstraction layer and API that facilitates the use of WSML ontologies in Java programs. Handling and processing large-scale ontologies is one of the main requirements for WP2. To this end, we have implemented three repositories for persistent storage of WSML ontologies. All three repositories use the ORDI API as their interface to other parts of the WP2 ontology management suite.

There are three repository implementations available: Sesame, an already available repository, which serves as the reference implementation for ORDI; FOR, which focuses on corporate environments where commercial databases are used; and YARS, which provides provenance tracking, a feature that is especially important in data integration scenarios.

Sesame is an example of an RDF repository based on open source relational databases such as MySQL or PostgreSQL. Sirma has implemented an ORDI Repository API interface to leverage Sesame as a DIP ontology repository. The Sesame ORDI link serves as a reference point for the other developments; using an existing repository allowed for rapid implementation of the ORDI persistence layer.

Unicorn has implemented FOR (Flexible Ontology Repository), a scalable RDB-backed repository that works together with commercial database solutions. Being able to store and retrieve WSML ontologies in commercial systems is useful in environments where systems such as Oracle, DB2 and SQL Server are already in production use, because it eases the transition and migration towards ontology-based systems. The flexibility of the repository is achieved by implementing a generic schema that can store any meta-modeled objects. In addition, the Hibernate ORM tool is used to streamline application development and to create an abstraction layer between the application logic and the actual RDB vendor.
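To make the idea of a generic schema more concrete, the following minimal Java sketch stores objects of different meta-types as rows in a single generic table and reassembles an object from its rows. It is an illustration under stated assumptions, not FOR's actual schema: the class names GenericRow and GenericTable, the four-column layout, and the example identifiers are all hypothetical, and an in-memory list stands in for the relational table.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; this is not FOR's schema or code.
// One row holds one property value of one object, whatever its meta-type.
final class GenericRow {
    final String objectId;
    final String metaType; // e.g. "Concept" or "Instance" (assumed labels)
    final String property;
    final String value;

    GenericRow(String objectId, String metaType, String property, String value) {
        this.objectId = objectId;
        this.metaType = metaType;
        this.property = property;
        this.value = value;
    }
}

// In-memory stand-in for a single generic relational table.
final class GenericTable {
    private final List<GenericRow> rows = new ArrayList<>();

    void put(String objectId, String metaType, String property, String value) {
        rows.add(new GenericRow(objectId, metaType, property, value));
    }

    // Reassemble one object as a property-to-value map (later rows overwrite earlier ones).
    Map<String, String> propertiesOf(String objectId) {
        Map<String, String> properties = new LinkedHashMap<>();
        for (GenericRow row : rows) {
            if (row.objectId.equals(objectId)) {
                properties.put(row.property, row.value);
            }
        }
        return properties;
    }

    // Distinct identifiers of all objects with the given meta-type, in insertion order.
    Set<String> idsOfType(String metaType) {
        Set<String> ids = new LinkedHashSet<>();
        for (GenericRow row : rows) {
            if (row.metaType.equals(metaType)) {
                ids.add(row.objectId);
            }
        }
        return ids;
    }
}

class GenericSchemaDemo {
    public static void main(String[] args) {
        GenericTable table = new GenericTable();
        table.put("ex:Person", "Concept", "label", "Person");
        table.put("ex:alice", "Instance", "memberOf", "ex:Person");
        table.put("ex:alice", "Instance", "name", "Alice");
        System.out.println(table.propertiesOf("ex:alice"));
        System.out.println(table.idsOfType("Concept"));
    }
}
```

In this sketch, adding a new kind of modeled object requires no new table or column, only rows with a new metaType value, which is one way a schema can remain independent of the objects it stores.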

NUIG is implementing YARS, an RDF ontology repository written in Java. The main use case that triggered the development of YARS is data integration. Current repositories cannot track the provenance of data, which is mandatory in data integration scenarios where data quality is judged by the origin of the data. In addition to provenance tracking, YARS is easy to install and can be embedded into applications with only a minor space overhead. YARS is packaged either as a JAR file or as a web application, and is released under a BSD-style open-source license.
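The role of provenance in data integration can be sketched in a few lines of Java. The example below assumes that each statement is stored together with the source it came from, so that an application can ask which sources assert a statement or keep only data from a given source. This is a hedged illustration, not the YARS implementation or API: the classes Quad and ProvenanceStore, the four-field layout, and the example.org identifiers are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these classes belong to YARS or ORDI.
// A statement (subject, predicate, object) extended with the source it came from.
final class Quad {
    final String subject;
    final String predicate;
    final String object;
    final String source; // origin of the statement, assumed here to be a URI

    Quad(String subject, String predicate, String object, String source) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
        this.source = source;
    }
}

final class ProvenanceStore {
    private final List<Quad> quads = new ArrayList<>();

    void add(String subject, String predicate, String object, String source) {
        quads.add(new Quad(subject, predicate, object, source));
    }

    // All sources that assert the given statement.
    List<String> sourcesOf(String subject, String predicate, String object) {
        List<String> sources = new ArrayList<>();
        for (Quad q : quads) {
            if (q.subject.equals(subject)
                    && q.predicate.equals(predicate)
                    && q.object.equals(object)) {
                sources.add(q.source);
            }
        }
        return sources;
    }

    // All statements contributed by one source, e.g. to keep only data from a trusted origin.
    List<Quad> fromSource(String source) {
        List<Quad> result = new ArrayList<>();
        for (Quad q : quads) {
            if (q.source.equals(source)) {
                result.add(q);
            }
        }
        return result;
    }
}

class ProvenanceDemo {
    public static void main(String[] args) {
        ProvenanceStore store = new ProvenanceStore();
        store.add("ex:acme", "ex:locatedIn", "ex:Galway", "http://example.org/a");
        store.add("ex:acme", "ex:locatedIn", "ex:Galway", "http://example.org/b");
        store.add("ex:acme", "ex:employs", "ex:alice", "http://example.org/b");
        System.out.println(store.sourcesOf("ex:acme", "ex:locatedIn", "ex:Galway"));
        System.out.println(store.fromSource("http://example.org/b").size());
    }
}
```

Because the source travels with every statement, an integrating application can weigh or discard data by origin without consulting a separate bookkeeping structure.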

One reason to develop multiple repositories is to address the different requirements and usage scenarios for ontology management. As an additional benefit, we are able to compare the scalability of alternative architectures. All repositories are integrated in DIP by implementing the ORDI Repository API layer, so applications built on top of ORDI can switch between repositories easily.
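The benefit of a shared API layer can be illustrated with a small Java sketch in which application code depends only on an interface and the storage backend is chosen at construction time. The interface OntologyStore and the classes MapBackedStore, LogBackedStore and OntologyClient are hypothetical names introduced for this example; they are not the ORDI Repository API, and the two in-memory backends merely stand in for real repositories.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical interface for illustration; this is NOT the ORDI Repository API.
interface OntologyStore {
    void save(String iri, String wsmlText);

    String load(String iri); // returns null when the ontology is unknown
}

// Stand-in for one backend: keeps ontologies in a hash map.
final class MapBackedStore implements OntologyStore {
    private final Map<String, String> data = new HashMap<>();

    public void save(String iri, String wsmlText) {
        data.put(iri, wsmlText);
    }

    public String load(String iri) {
        return data.get(iri);
    }
}

// Stand-in for a second backend with a different internal layout: an append-only log.
final class LogBackedStore implements OntologyStore {
    private final List<String[]> log = new ArrayList<>();

    public void save(String iri, String wsmlText) {
        log.add(new String[] {iri, wsmlText});
    }

    public String load(String iri) {
        // Scan backwards so that the most recently saved version wins.
        for (int i = log.size() - 1; i >= 0; i--) {
            if (log.get(i)[0].equals(iri)) {
                return log.get(i)[1];
            }
        }
        return null;
    }
}

// Application code that knows only the interface, never a concrete backend.
final class OntologyClient {
    private final OntologyStore store;

    OntologyClient(OntologyStore store) {
        this.store = store;
    }

    String publishAndReadBack(String iri, String wsmlText) {
        store.save(iri, wsmlText);
        return store.load(iri);
    }
}

class SwitchDemo {
    public static void main(String[] args) {
        // Switching repositories changes only this array; OntologyClient is untouched.
        OntologyStore[] backends = {new MapBackedStore(), new LogBackedStore()};
        for (OntologyStore backend : backends) {
            OntologyClient client = new OntologyClient(backend);
            System.out.println(client.publishAndReadBack("http://example.org/onto", "placeholder WSML document"));
        }
    }
}
```

The design choice shown here is that the choice of backend is confined to one line of setup code, which is what makes side-by-side scalability comparisons between architectures practical.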

YARS Fact Sheet
http://sw.deri.org/2005/03/diprdf/FactSheet
FOR Fact Sheet
http://www.unicorn.com/dip/for/v0.2/20060101/UnicornRepositoryFactSheet.html


$Id: wp2.5-20051231.html 2076 2006-01-01 15:14:45Z aharth $
