Typical statistical machine translation systems are trained with static parallel corpora. Here we account for scenarios with a continuous incoming stream of parallel training data...
Abby Levenberg, Chris Callison-Burch, Miles Osborn...
This paper presents and evaluates several original techniques for the latent classification of biographic attributes such as gender, age and native language, in diverse genres (co...
In this paper we show how to train statistical machine translation systems on reallife tasks using only non-parallel monolingual data from two languages. We present a modificatio...
The problem of finding quality information and services on the Web is analyzed. We present two user-centered evaluation methodologies to characterize the quality of the Web docume...
With the increase in popularity of online review sites comes a corresponding need for tools capable of extracting the information most important to the user from the plain text da...