Sciweavers

» Models of English Text
ACL
2003
15 years 8 months ago
tRuEcasIng
Truecasing is the process of restoring case information to badly-cased or noncased text. This paper explores truecasing issues and proposes a statistical, language modeling based ...
Lucian Vlad Lita, Abraham Ittycheriah, Salim Rouko...
EMNLP
2009
15 years 4 months ago
Learning Term-weighting Functions for Similarity Measures
Measuring the similarity between two texts is a fundamental problem in many NLP and IR applications. Among the existing approaches, the cosine measure of the term vectors represen...
Wen-tau Yih
KDD
2009
ACM
16 years 7 months ago
Extracting discriminative concepts for domain adaptation in text mining
One common predictive modeling challenge in text mining problems is that the training data and the operational (testing) data are drawn from different underlying distributi...
Bo Chen, Wai Lam, Ivor Tsang, Tak-Lam Wong
KDD
2009
ACM
16 years 7 months ago
Sentiment analysis of blogs by combining lexical knowledge with text classification
The explosion of user-generated content on the Web has led to new opportunities and significant challenges for companies that are increasingly concerned about monitoring the disc...
Prem Melville, Wojciech Gryc, Richard D. Lawrence
SIGMOD
2009
ACM
16 years 6 months ago
Optimizing complex extraction programs over evolving text data
Most information extraction (IE) approaches have considered only static text corpora, over which we apply IE only once. Many real-world text corpora, however, are dynamic. They evol...
Fei Chen 0002, Byron J. Gao, AnHai Doan, Jun Yang ...