We address the problem of formatting the output of an automatic speech recognition (ASR) system for readability while preserving word-level timing information of the transcript. O...
It is well known that pragmatic knowledge is useful and necessary in many difficult language processing tasks, but because this knowledge is difficult to acquire and process autom...
Machine Science, or Data-driven Research, is a new and interesting scientific methodology that uses advanced computational techniques to identify, retrieve, classify and analyse da...
We present a potentially large improvement for semi-supervised learning tasks on training data sets in which only a small fraction of the data pairs is labeled. In...
The paper describes the data model used to implement the SIM Content Management Server, an SGML/XML-native content server designed to support extremely fast data access to and dyn...
Timothy Arnold-Moore, Michael Fuller, Alan J. Kent...