
DIP D2.5: Ontology Repository

DIP WP 2 Deliverable 30 June 2006

This version:
http://sw.deri.org/2005/03/diprdf/wp2.5-20060630
Latest version:
http://sw.deri.org/2005/03/diprdf/wp2.5
Previous version:
http://sw.deri.org/2005/03/diprdf/wp2.5-20051231 

Author:
Andreas Harth, NUIG, andreas.harth@deri.org
Reviewers:
Gábor Nagypál, FZI Karlsruhe, Nagypal@fzi.de
Damyan Ognyanoff, Sirma, damyan@sirma.bg

This document is also available in a non-normative PDF version.
Copyright © 2006 by DIP. All Rights Reserved. DIP liability, trademark, document use, and software licensing rules apply.


Document Information

IST Project Number FP6 – 507483 Acronym DIP
Full Title Data, Information, and Process Integration with Semantic Web Services
Project URL http://dip.semanticweb.org
Document URL http://sw.deri.org/2005/03/diprdf/wp2.5
EU Project Officer Kai Tullius

Deliverable Number 2.5 Title Ontology Repository
Work package Number 2 Title Ontology Management

Date of Delivery contractual M30 actual 30-June-2006
Status version 1.0 final
Nature
Prototype Report Dissemination Ontology
Dissemination Level
Public Consortium

Authors Andreas Harth (National University of Ireland, Galway)
Responsible Author
Andreas Harth Email andreas.harth@deri.org
Partner NUIG Phone +353 85 702 1881

Abstract
(for dissemination)
This document contains an executive summary covering two ontology repository implementations that can be used in conjunction with the Ontology Representation and Data Integration (ORDI) Framework.
Keywords Ontology repository

Version Log
issue date (dd-mm-yy) revision no. author change
31-12-05 001 Andreas Harth first internal version (version 1.0)
09-06-06 002 Andreas Harth final version for internal review
30-06-06 003 Andreas Harth final submitted version (version 3.0)

Reviewer Information
1
Gábor Nagypál Email Nagypal@fzi.de
Partner FZI Karlsruhe Phone +49-721-9654-714
2
Damyan Ognyanoff Email damyan@sirma.bg
Partner Sirma Phone +359 2 9768 303

 

Executive Summary

Ontologies provide more expressive modeling primitives than traditional relational database systems. Relational database schemas are rigid, and instances have to adhere strictly to the schema. Ontologies have a less restrictive notion of a schema, which is important for integrating data in an open environment such as the Web. In D2.5, we have implemented several ways of storing and retrieving ontologies. The main requirement is to be able to handle large-scale data sets with millions of concepts.

One piece of the WP2 development is ORDI, an abstraction layer and API facilitating the use of WSML ontologies in Java programs. Being able to handle and process large-scale ontologies is one of the main requirements for WP2. To this end, three repositories providing persistent storage of WSML ontologies have been integrated. All three repositories use the ORDI API as their interface to other parts of the WP2 ontology management suite.

As part of WP2.5, we developed two repository implementations: FOR, which focuses on corporate environments where commercial databases are used; and YARS, which provides provenance tracking, a feature that is especially important in data integration scenarios. The reference implementation used by ORDI is Sesame, which was already available at the beginning of the project. The Sesame ORDI link serves as a reference point for the other developments; using an existing repository allowed for rapid implementation of the ORDI persistence layer.

Unicorn implemented FOR (Flexible Ontology Repository), a scalable RDB-backed repository that works together with commercial database solutions. Being able to store and retrieve WSML ontologies in commercial systems is useful in environments where systems such as Oracle, DB2, and SQL Server are already in production use, which allows for easy transition and migration towards ontology-based systems. The flexibility of the repository is achieved by implementing a generic schema that can store any ORDI object. In addition, the Hibernate ORM tool is used to streamline application development and to create an abstraction layer between the application logic and the actual RDB vendor.

NUIG implemented YARS, an RDF ontology repository written in Java. The main use case that triggered the development of YARS is in the area of data integration. Current repositories lack provenance tracking, which is mandatory in data integration scenarios where data quality is judged by the origin of the data. In addition to the provenance tracking feature, YARS is easy to install and can be embedded into applications, incurring only a minor space overhead. YARS is packaged either as a JAR file or as a web application, and is released under a BSD-style open-source license.
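To make the idea of provenance tracking concrete, the following is a minimal sketch in which every statement is stored together with the source it came from, so that a query can be restricted to trusted origins. The class names (SourcedStatement, ProvenanceStore), the method names, and the example data are hypothetical and chosen for illustration only; they do not reflect the actual YARS or ORDI interfaces or the internal YARS storage layout.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical types for illustration; these are NOT YARS or ORDI classes.
final class SourcedStatement {
    final String subject;
    final String predicate;
    final String object;
    final String source; // origin of the statement (assumed here to be a URI string)

    SourcedStatement(String subject, String predicate, String object, String source) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
        this.source = source;
    }
}

final class ProvenanceStore {
    private final List<SourcedStatement> statements = new ArrayList<>();

    // Records a statement together with the source it was obtained from.
    void add(String subject, String predicate, String object, String source) {
        statements.add(new SourcedStatement(subject, predicate, object, source));
    }

    // Returns the objects asserted for (subject, predicate), keeping only
    // statements whose source is contained in the trusted set.
    List<String> objectsFrom(String subject, String predicate, Set<String> trustedSources) {
        List<String> result = new ArrayList<>();
        for (SourcedStatement st : statements) {
            if (st.subject.equals(subject)
                    && st.predicate.equals(predicate)
                    && trustedSources.contains(st.source)) {
                result.add(st.object);
            }
        }
        return result;
    }
}
```

With such a structure, two sources may assert conflicting values for the same subject and predicate, and an integrating application can decide which value to use by choosing which origins it trusts. A linear scan is used here only to keep the sketch short; a real repository would rely on indexes.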

One reason to develop multiple repositories is to address the different requirements and usage scenarios for ontology management. For example, YARS provides easy setup and installation, keyword-based searches, and transaction processing features that allow multiple threads to access the repository. FOR is built on top of relational databases, which have higher setup and installation costs but are well-accepted in industry. All repositories are integrated in DIP by implementing the ORDI Repository API layer, and therefore applications on top of ORDI can switch between repositories easily.
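The benefit of a shared API layer can be sketched as follows: application code is written against one interface, and the concrete backend is chosen at construction time. The interface OntologyStore, the two backends, and OntologyClient are hypothetical names invented for this sketch; they are not the ORDI Repository API, and both backends are simple in-memory stand-ins rather than real repositories.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a common repository interface; NOT the real ORDI API.
interface OntologyStore {
    void save(String ontologyId, String serialized);
    String load(String ontologyId); // returns null when the id is unknown
}

// First illustrative backend, backed by a hash map.
final class HashBackedStore implements OntologyStore {
    private final Map<String, String> data = new HashMap<>();

    public void save(String ontologyId, String serialized) {
        data.put(ontologyId, serialized);
    }

    public String load(String ontologyId) {
        return data.get(ontologyId);
    }
}

// Second illustrative backend, backed by a sorted map.
final class SortedStore implements OntologyStore {
    private final Map<String, String> data = new TreeMap<>();

    public void save(String ontologyId, String serialized) {
        data.put(ontologyId, serialized);
    }

    public String load(String ontologyId) {
        return data.get(ontologyId);
    }
}

// Application code depends only on the interface, never on a concrete backend.
final class OntologyClient {
    private final OntologyStore store;

    OntologyClient(OntologyStore store) {
        this.store = store;
    }

    // Stores an ontology and reads it back through the same interface.
    String roundTrip(String ontologyId, String serialized) {
        store.save(ontologyId, serialized);
        return store.load(ontologyId);
    }
}
```

Switching backends then amounts to changing a single constructor argument, for example new OntologyClient(new SortedStore()) instead of new OntologyClient(new HashBackedStore()), while the rest of the application stays unchanged.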

ORDI Fact Sheet
http://www.omwg.org/tools/ordi/v0.21/FactSheet.html
YARS Fact Sheet
http://sw.deri.org/2005/03/diprdf/FactSheet
FOR Fact Sheet
http://sw.deri.org/2005/03/diprdf/UnicornRepositoryFactSheet.html

$Id: wp2.5.html 3831 2006-06-29 13:36:52Z aharth $