The word error rate of any optical character recognition (OCR) system is usually substantially below its component or character error rate. This is especially true of Indic langua...
Venkat Rasagna, Anand Kumar, C. V. Jawahar, R...
In this paper we explore robustness and domain adaptation issues for Word Sense Disambiguation (WSD) using Singular Value Decomposition (SVD) and unlabeled data. We focus on the s...
This paper proposes a novel method to extract named entities, including unfamiliar words that do not occur or occur only a few times in a training corpus, using a large unannotated corpus...
Word discovery is the task of discovering and collecting occurrences of repeating words in the absence of prior acoustic and linguistic knowledge, or training material. The capabi...
We introduce perturbation kernels, a new class of similarity measure for information retrieval that casts word similarity in terms of multi-task learning. Perturbation kernels mode...