In many Web applications, such as blog classification and newsgroup classification, labeled data are in short supply. It often happens that obtaining labeled data in a new domain ...
The web contains a wealth of product reviews, but sifting through them is a daunting task. Ideally, an opinion mining tool would process a set of search results for a given item, ...
Given a set D = {d1, d2, ..., dD} of D strings of total length n, our task is to report the "most relevant" strings for a given query pattern P. This involves somewhat mo...
Text superimposed on the video frames provides supplemental but important information for video indexing and retrieval. Many efforts have been made for videotext detection and rec...
In this paper, we present an analysis based on linguistic and typographic features that allows for the identification of titles in web documents. We focus in particular on procedu...