A methodology for automatically identifying and clustering semantic features or topics in a heterogeneous text collection is presented. Textual data is encoded using a low rank no...
Farial Shahnaz, Michael W. Berry, V. Paul Pauca, R...
The retrieval performance of an information retrieval system usually increases when it uses the relationships among the terms contained in a given document collection. However, th...
We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...
For the convenient reuse of large-scale 3D motion capture data, browsing and searching methods for the data should be explored. In this paper, an efficient indexing and retrieval...
Word Sense Disambiguation (WSD) is an intermediate task that serves as a means to an end defined by the application in which it is to be used. However, different applications have...