MultiCrawler: A framework for crawling and indexing semantic web data

Introduction

MultiCrawler is a highly modular framework for crawling and indexing semantic web data.
To index the data, we transform all of it into RDF and store it in YARS.

The framework is based on a five-step processing pipeline. Each module in this pipeline can run on its own server, so we can easily raise the performance of the pipeline by adding new servers.

It is also possible to run the whole framework on a single server.
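
As a rough illustration of the pipeline idea, the following Python sketch chains five stages through queues inside one process, which corresponds to the single-server setup. It is not MultiCrawler's actual code: the stage names (step1 to step5), the helper functions and the use of threads are all assumptions made for this example, since the five steps are not named here. In a distributed setup each stage would run on its own server and the in-memory queues would be replaced by some network transport.

```python
import queue
import threading

# Sentinel that tells a stage to shut down and pass the signal on.
_STOP = object()


def make_stage(name):
    """Build a placeholder stage (hypothetical) that tags each item with its name."""
    def stage(item):
        return item + [name]
    return stage


# Hypothetical stage names: the text only says there are five steps.
STAGES = [make_stage(f"step{i}") for i in range(1, 6)]


def worker(func, inbox, outbox):
    """Read items from inbox, apply func, write results to outbox."""
    while True:
        item = inbox.get()
        if item is _STOP:
            outbox.put(_STOP)  # propagate shutdown to the next stage
            break
        outbox.put(func(item))


def run_pipeline(items, stages=STAGES):
    """Run all stages in one process, one thread per stage."""
    # One queue in front of each stage plus one for the final output.
    queues = [queue.Queue() for _ in range(len(stages) + 1)]
    threads = [
        threading.Thread(target=worker, args=(func, queues[i], queues[i + 1]))
        for i, func in enumerate(stages)
    ]
    for thread in threads:
        thread.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(_STOP)

    results = []
    while True:
        out = queues[-1].get()
        if out is _STOP:
            break
        results.append(out)
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    print(run_pipeline([["doc-a"], ["doc-b"]]))
```

Because the stages only communicate through queues, a slow stage could in principle be given more workers or a separate machine without changing the other stages, which is the scaling property described above.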


Contributors

Subversion

A Subversion repository with the full source code is available.



$Id: index.html 3590 2006-06-13 10:05:43Z juergen $