Sciweavers

» Web Document Modeling
ERCIMDL
2009
Springer
16 years 1 months ago
A Visualization Tool of Probabilistic Models for Information Access Components
An effective graphic interface is a key tool to improve the fruition of the results retrieved by an Information Retrieval (IR) system. In this work, we describe a two-dimensional...
Lorenzo De Stefani, Giorgio Maria Di Nunzio, Giorg...
ICML
2006
IEEE
16 years 7 months ago
Clustering documents with an exponential-family approximation of the Dirichlet compound multinomial distribution
The Dirichlet compound multinomial (DCM) distribution, also called the multivariate Polya distribution, is a model for text documents that takes into account burstiness: the fact ...
Charles Elkan
KDD
2005
ACM
16 years 7 months ago
A hybrid unsupervised approach for document clustering
We propose a hybrid, unsupervised document clustering approach that combines a hierarchical clustering algorithm with Expectation Maximization. We developed several heuristics to ...
Mihai Surdeanu, Jordi Turmo, Alicia Ageno
SDM
2009
SIAM
16 years 4 months ago
Topic Evolution in a Stream of Documents.
Document collections evolve over time: new topics emerge and old ones decline. At the same time, the terminology evolves as well. Much literature is devoted to topic evolution in ...
Alexander Hinneburg, André Gohr, Myra Spili...
ICDAR
2009
IEEE
16 years 1 months ago
Spatial and Spectral Based Segmentation of Text in Multispectral Images of Ancient Documents
In this paper, we propose a character segmentation method for multispectral images of ancient documents. Due to the low quality of the images, the main idea of this study is to comb...
Martin Lettner, Robert Sablatnig