Mihalcea [1] discusses self-training and co-training in the context of word sense disambiguation and shows that parameter optimization on individual words was important to obtain g...
Many researchers are trying to use information extraction (IE) to create large-scale knowledge bases from natural language text on the Web. However, the primary approach (supervis...
We propose two hashing-based solutions to the problem of fast and effective spelling correction of personal names in People Search applications. The key idea behind our methods is to...
In many applications, replacing a complex word form by its stem can reduce sparsity, revealing connections in the data that would not otherwise be apparent. In this paper, we focu...
Shane Bergsma, Aditya Bhargava, Hua He, Grzegorz K...
We examine the effects that empty categories have on machine translation. Empty categories are elements in parse trees that lack corresponding overt surface forms (words), such as drop...