We argue that the advent of large volumes of full-length text, as opposed to short texts such as abstracts and newswire, should be accompanied by corresponding new approaches to information ...
In this paper we will briefly describe the approaches taken by Berkeley for the main GeoCLEF 2007 tasks (Mono and Bilingual retrieval). This year we used only a single system in ...
Many information retrieval systems use the inverted file as an indexing structure. The inverted file, however, is not suited to supporting incremental updates when new documents are ...
In the past few years, the fast proliferation of available XML documents has stimulated a great deal of interest in discovering hidden and nontrivial knowledge from XML repositori...
Ling Chen 0002, Sourav S. Bhowmick, Liang-Tien Chi...
Low-dimensional topic models have been proven very useful for modeling a large corpus of documents that share a relatively small number of topics. Dimensionality reduction tools s...