We report grammar inference experiments on partially parsed sentences taken from the Wall Street Journal corpus using the inside-outside algorithm for stochastic context-free gram...
Abstract. The Mongue-Elkan method is a general text string comparison method based on an internal character-based similarity measure (e.g. edit distance) combined with a token leve...
Sergio Jimenez, Claudia Becerra, Alexander F. Gelb...
We present a flexible skeleton for implementing distributed work pools in our parallel functional language Eden. The skeleton manages a pool of tasks (work pool) in a distributed ...
Abstract. Most common feature selection techniques for document categorization are supervised and require lots of training data in order to accurately capture the descriptive and d...
Most supervised language processing systems show a significant drop-off in performance when they are tested on text that comes from a domain significantly different from the domai...