We describe a set of supervised machine learning experiments centering on the construction of statistical models of WH-questions. These models, which are built from shallow lingui...
I present a method of identifying cognates in the vocabularies of related languages. I show that a measure of phonetic similarity based on multivalued features performs better tha...
Trigrams'n'Tags (TnT) is an efficient statistical part-of-speech tagger. Contrary to claims found elsewhere in the literature, we argue that a tagger based on Markov mod...
Background: A critical step in processing oligonucleotide microarray data is combining the information in multiple probes to produce a single number that best captures the express...
Activity recognition in video is dominated by low- and mid-level features, and while demonstrably capable, by nature, these features carry little semantic meaning. Inspired by the...