While many visualization tools exist that offer sophisticated functions for charting complex data, they still expect users to possess a high degree of expertise in wielding the to...
Yiwen Sun, Jason Leigh, Andrew E. Johnson, Sangyoo...
Frequency counts from very large corpora, such as the Web 1T dataset, have recently become available for language modeling. Omission of low frequency n-gram counts is a practical ...
We describe our participation in the TREC 2004 Web and Terabyte tracks. For the web track, we employ mixture language models based on document full-text, incoming anchortext, and...
We present a novel discriminative training algorithm for n-gram language models for use in large vocabulary continuous speech recognition. The algorithm uses large margin estimati...
We describe a preliminary implementation of the high-level modelling language Zinc. This language supports a modelling methodology in which the same Zinc model can be automatically...