Sciweavers

3530 search results - page 502 / 706
» Technology of Text Mining
IRAL
2000
ACM
15 years 11 months ago
Construction of a Chinese-English WordNet and its application to CLIR
This paper integrates five linguistic resources, including Cilin, a Chinese-English dictionary, ASBC corpus, SemCor, and WordNet, to construct a Chinese-English WordNet. The resul...
Hsin-Hsi Chen, Chi-Ching Lin, Wen-Cheng Lin
SIGIR
2000
ACM
15 years 11 months ago
An investigation of linguistic features and clustering algorithms for topical document clustering
We investigate four hierarchical clustering methods (single-link, complete-link, groupwise-average, and single-pass) and two linguistically motivated text features (noun phrase he...
Vasileios Hatzivassiloglou, Luis Gravano, Ankineed...
SIGIR
1999
ACM
15 years 11 months ago
Probabilistic Latent Semantic Indexing
Probabilistic Latent Semantic Indexing is a novel approach to automated document indexing which is based on a statistical latent class model for factor analysis of count data. Fit...
Thomas Hofmann
SIGIR
1999
ACM
15 years 11 months ago
The Decomposition of Human-Written Summary Sentences
We define the problem of decomposing human-written summary sentences and propose a novel Hidden Markov Model solution to the problem. Human summarizers often rely on cutting and ...
Hongyan Jing, Kathleen McKeown
CIKM
1999
Springer
15 years 10 months ago
Metadata and Data Structures for the Historical Newspaper Digital Library
We examine metadata and data-structure issues for the Historical Newspaper Digital Library. This project proposes to digitize and then do OCR and linguistic processing on several...
Robert B. Allen, John Schalow