The proliferation of digital libraries and the large amount of existing documents raise important issues in efficient handling of documents. Printed texts in documents need to be...
We present an approach for detecting link spam common in blog comments by comparing the language models used in the blog post, the comment, and pages linked by the comments. In co...
Organizing Web search results into clusters facilitates users' quick browsing through search results. Traditional clustering techniques are inadequate since they don't g...
Rooted in electronic publishing, XML is now widely used for modelling and storing structured text documents. Especially in the WWW, retrieval of XML documents is most useful in co...
Felix Weigel, Holger Meuss, Klaus U. Schulz, Fran&...
This paper focuses on the creation of a first order predicate calculus based regulation compliance-assistance system built upon an XML framework. Two areas of research that suppor...