Automated detection of the first document reporting each new event in temporally-sequenced streams of documents is an open challenge. In this paper we propose a new approach which...
Yiming Yang, Jian Zhang, Jaime G. Carbonell, Chun ...
We propose a generative model based on latent Dirichlet allocation for mining distinct topics in document collections by integrating the temporal ordering of documents into the ge...
Levent Bolelli, Seyda Ertekin, Ding Zhou, C. Lee G...
Abstract. Social bookmarking has become an important web2.0 application recently, which is concerned with the dual user behavior to search - tagging. Although social bookmarking we...
Similarity measures for text have historically been an important tool for solving information retrieval problems. In many interesting settings, however, documents are often closel...
The XML language is a W3C standard sustained by both the industry and the scientific community. Therefore, the available information annotated in XML keeps and will keep increasing...
Eugen Popovici, Pierre-Francois Marteau, Gildas M&...