Sciweavers

» Marking Text Documents
ICDE
2012
IEEE
13 years 9 months ago
Optimizing Statistical Information Extraction Programs over Evolving Text
Statistical information extraction (IE) programs are increasingly used to build real-world IE systems such as Alibaba, CiteSeer, Kylin, and YAGO. Current statistical IE approach...
Fei Chen, Xixuan Feng, Christopher Re, Min Wang
WSDM
2009
ACM
16 years 1 month ago
Adaptive subjective triggers for opinionated document retrieval
This paper proposes a novel application of a statistical language model to opinionated document retrieval targeting weblogs (blogs). In particular, we explore the use of the trigg...
Kazuhiro Seki, Kuniaki Uehara
CIKM
2008
Springer
15 years 8 months ago
Identifying table boundaries in digital documents via sparse line detection
Most prior work on information extraction has focused on extracting information from text in digital documents. However, often, the most important information being reported in an...
Ying Liu, Prasenjit Mitra, C. Lee Giles
WWW
2009
ACM
16 years 7 months ago
Extracting article text from the web with maximum subsequence segmentation
Much of the information on the Web is found in articles from online news outlets, magazines, encyclopedias, review collections, and other sources. However, extracting this content...
Jeff Pasternack, Dan Roth
KDD
2002
ACM
16 years 7 months ago
Enhanced word clustering for hierarchical text classification
In this paper we propose a new information-theoretic divisive algorithm for word clustering applied to text classification. In previous work, such "distributional clustering"...
Inderjit S. Dhillon, Subramanyam Mallela, Rahul Ku...