This paper presents BlogBuster, a tool for extracting a corpus from the blogosphere. The topic of cleaning arbitrary web pages with the goal of extracting a corpus from web data, ...
Blog posts that contain many personal experiences or perspectives on specific subjects are useful. Blogs allow readers to interact with bloggers by placing comments on specific ...
Scientific equations embedded in computer programs must obey the rules for dimensional consistency. Many efforts have been made to enforce these rules within computer programs. So...
We examine the precision with which the cumulative score from a suite of test cases ranks participants in the International Olympiad in Informatics (IOI). Our concern is the abilit...
The field of market basket analysis, the search for meaningful associations in customer purchase data, is one of the oldest areas of data mining. The typical solution involves th...