Many contemporary language technology systems are characterized by long pipelines of tools with complex dependencies. Too often, these workflows are implemented by ad hoc scripts;...
In this paper, a new language model, the Multi-Class Composite N-gram, is proposed to avoid a data sparseness problem for spoken language in that it is difficult to collect traini...
The standard approach to the classification of objects is to consider the examples as independent and identically distributed (iid). In many real world settings, however, this ass...
Unlike conventional data or text, Web pages typically contain a large amount of information that is not part of the main contents of the pages, e.g., banner ads, navigation bars, ...
Data intensive service functions such as memory allocation/de-allocation, data prefetching, and data relocation can pollute processor cache in conventional systems since the same ...