In this paper, we propose the "Democratic Classifier", a simple, democracy-inspired patternbased classification algorithm that uses very short patterns for classificatio...
It is necessary to provide a method to store Web information effectively so it can be utilised as a future knowledge resource. A commonly adopted approach is to classify the retri...
This paper is concerned with automatic extraction of titles from the bodies of HTML documents. Titles of HTML documents should be correctly defined in the title fields; however, i...
We present a web computing library (PUBWCL) in Java that allows to execute tightly coupled, massively parallel algorithms in the bulk-synchronous (BSP) style on PCs distributed ove...
Olaf Bonorden, Joachim Gehweiler, Friedhelm Meyer ...
Current search engines rely on centralized page ranking algorithms which compute page rank values as single (global) values for each Web page. Recent work on topic-sensitive PageRa...
Paul-Alexandru Chirita, Daniel Olmedilla, Wolfgang...