Sciweavers

» Unweaving a web of documents
SIGIR
2004
ACM
16 years 1 day ago
Constructing a text corpus for inexact duplicate detection
As online document collections continue to expand, both on the Web and in proprietary environments, the need for duplicate detection becomes more critical. The goal of this work i...
Jack G. Conrad, Cindy P. Schriber
DOCENG
2003
ACM
15 years 12 months ago
Set-at-a-time access to XML through DOM
To support the rapid growth of the web and e-commerce, W3C developed DOM as an application programming interface that provides the abstract, logical tree structure of an XML document. In t...
Hai Chen, Frank Wm. Tompa
AIRS
2006
Springer
15 years 10 months ago
Learning to Separate Text Content and Style for Classification
Many text documents naturally have two kinds of labels. For example, we may label web pages from universities according to their categories, such as "student" or "fa...
Dell Zhang, Wee Sun Lee
EMNLP
2008
15 years 8 months ago
One-Class Clustering in the Text Domain
Having seen a news title "Alba denies wedding reports", how do we infer that it is primarily about Jessica Alba, rather than about weddings or reports? We probably reali...
Ron Bekkerman, Koby Crammer
IALP
2010
15 years 1 month ago
Multiple Factors-Based Opinion Retrieval and Coarse-to-Fine Sentiment Classification
With more and more reviews on the web, browsing through a mass of related reviews becomes heavy work. How to effectively analyze and organize these reviews attracts more...
Shu Zhang, Wen-Jie Jia, Yingju Xia, Yao Meng, Hao ...