MultiCrawler:
A framework for crawling and indexing semantic web data
Introduction
MultiCrawler is a highly modular framework for crawling and indexing
semantic web data.
To index the data, we transform all of it into RDF and store it in
YARS.
The framework is based on a five-step processing pipeline. Each module
in this pipeline can run on its own server, so the performance of the
pipeline can easily be raised by adding new servers.
It is also possible to run the whole framework on a single server.
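The pipeline idea can be illustrated with a minimal sketch. This page does not name the five steps or describe how modules communicate, so everything below is an assumption made for illustration: the steps are hypothetical placeholders numbered 1 to 5, threads stand in for separate servers, and in-memory queues stand in for whatever transport the real modules use. It is not MultiCrawler's actual code.

```python
import queue
import threading

SENTINEL = object()  # marks the end of the input stream


def run_stage(func, inbox, outbox):
    """Take items from inbox, apply func, and pass the results to outbox."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)  # propagate shutdown to the next stage
            break
        outbox.put(func(item))


def build_pipeline(stages):
    """Connect the stages with queues; each stage runs in its own thread.

    A thread here stands in for a server (an assumption of this sketch).
    """
    queues = [queue.Queue() for _ in range(len(stages) + 1)]
    threads = [
        threading.Thread(target=run_stage, args=(func, queues[i], queues[i + 1]))
        for i, func in enumerate(stages)
    ]
    for t in threads:
        t.start()
    return queues[0], queues[-1], threads


def make_step(n):
    """Hypothetical placeholder step: records that step n handled the item."""
    return lambda item: item + [n]


def run_demo(items):
    """Push items through five placeholder steps and collect the results."""
    stages = [make_step(n) for n in range(1, 6)]
    source, sink, threads = build_pipeline(stages)
    for item in items:
        source.put(item)
    source.put(SENTINEL)

    results = []
    while True:
        out = sink.get()
        if out is SENTINEL:
            break
        results.append(out)
    for t in threads:
        t.join()
    return results
```

Because the stages only share queues, moving one of them to another machine would, in this sketch, mean replacing a queue with a network channel while the stage function stays the same. Running all five threads in one process corresponds to the single-server setup.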
Contributors
- Stefan Decker
- Andreas Harth
- Jürgen Umbrich
Subversion
A Subversion repository with the full source code is available.
$Id: index.html 3590 2006-06-13 10:05:43Z juergen $