Large-scale web and text retrieval systems deal with amounts of data that greatly exceed the capacity of any single machine. To handle the necessary data volumes and query through...
In this paper we examine the retrieval performance of adjacent and concurrent n-grams generated from polyphonic music data. We deploy a method to index polyphonic music using a wo...
Versioned textual collections are collections that retain multiple versions of a document as it evolves over time. Important large-scale examples are Wikipedia and the web collect...
A weakly-supervised extraction method identifies concepts within conceptual hierarchies, at the appropriate level of specificity (e.g., Bank vs. Institution), to which attribute...
Abstract. Latent semantic indexing (LSI) is an application of numerical method called singular value decomposition (SVD), which discovers latent semantic in documents by creating c...