We present a probabilistic model for a document corpus that combines many of the desirable features of previous models. The model is called “GaP” for Gamma-Poisson, the distri...
The semantic web is based on ontologies and on metadata that indexes resources using those ontologies. This indexing is called annotation. Ontology-based information retrieval is an operati...
The cultural heritage domain, which deals with digital surrogates of rare and fragile historic artifacts, is one of the most promising areas for establishing collaboratories, i.e. shared...
Defining the boundaries of a website, for example for archiving or information retrieval purposes, is an important but complicated task. In this paper, a web-page clustering approach to...
We introduce a method for learning query transformations that improves the ability to retrieve answers to questions from an information retrieval system. During the training stage...