In this paper, we report our work on building a POS tagger for a morphologically rich language, Hindi. The theme of the research is to vindicate the stand that if morphology is st...
This paper presents the Topic-Aspect Model (TAM), a Bayesian mixture model which jointly discovers topics and aspects. We broadly define an aspect of a document as a characteristi...
A thesaurus is a collection of words classified according to some measure of relatedness among them. In this paper, we lay the theoretical foundations of thesaurus construction through...
Given an unstructured collection of captioned images of cluttered scenes featuring a variety of objects, our goal is to simultaneously learn the names and appearances o...
Michael Jamieson, Afsaneh Fazly, Suzanne Stevenson...
Inverted indexes are the most fundamental and widely used data structures in information retrieval. For each unique word occurring in a document collection, the inverted index sto...
Manish Patil, Sharma V. Thankachan, Rahul Shah, Wi...
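
To illustrate the data structure named in the preceding abstract, here is a minimal sketch of a textbook inverted index in Python. It assumes the common formulation in which each unique word maps to the sorted list of documents containing it; it is not the indexing scheme proposed in the paper, and the names `build_inverted_index`, `search_all`, and the sample `docs` are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each unique word to the sorted list of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        # Naive tokenization (assumption): lowercase, split on whitespace.
        for word in text.lower().split():
            index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}


def search_all(index, words):
    """Return the ids of documents that contain every query word."""
    postings = [set(index.get(w.lower(), ())) for w in words]
    if not postings:
        return []
    return sorted(set.intersection(*postings))


# Hypothetical toy collection; document ids are list positions.
docs = ["the quick brown fox", "the lazy dog", "quick dog"]
index = build_inverted_index(docs)
```

With this toy collection, `index["quick"]` holds the ids of the first and third documents, and a conjunctive query such as `search_all(index, ["quick", "dog"])` intersects the two posting lists.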