As performance gains in automatic speech recognition systems plateau, improvements to existing applications of speech recognition technology seem more likely to come from better u...
As an effective technique for improving retrieval effectiveness, relevance feedback (RF) has been widely studied in both monolingual and cross-language information retrieval (CLIR...
SALSA is a link-based ranking algorithm that takes the result set of a query as input, extends the set to include additional neighboring documents in the web graph, and performs a...
Most prior work on information extraction has focused on extracting information from text in digital documents. However, often, the most important information being reported in an...
To find near-duplicate documents, fingerprint-based paradigms such as Broder's shingling and Charikar's simhash algorithms have been recognized as effective approaches a...