Statistical information extraction (IE) programs are increasingly used to build real-world IE systems such as Alibaba, CiteSeer, Kylin, and YAGO. Current statistical IE approach...
This paper proposes a novel application of a statistical language model to opinionated document retrieval targeting weblogs (blogs). In particular, we explore the use of the trigg...
Most prior work on information extraction has focused on extracting information from text in digital documents. Often, however, the most important information being reported in an...
Much of the information on the Web is found in articles from online news outlets, magazines, encyclopedias, review collections, and other sources. However, extracting this content...
In this paper we propose a new information-theoretic divisive algorithm for word clustering applied to text classification. In previous work, such "distributional clustering"...
Inderjit S. Dhillon, Subramanyam Mallela, Rahul Ku...