Seed sampling is critical in semi-supervised learning. This paper proposes a clustering-based stratified seed sampling approach to semi-supervised learning. First, various clusteri...
It is well known that parsing accuracies drop significantly on out-of-domain data. What is less known is that some parsers suffer more from domain shifts than others. We show that...
Slav Petrov, Pi-Chuan Chang, Michael Ringgaard, Hi...
Syntactic consistency is the preference to reuse a syntactic construction shortly after its appearance in a discourse. We present an analysis of the WSJ portion of the Penn Treeba...
We present a new phrase-based conditional exponential family translation model for statistical machine translation. The model operates on a feature representation in which sentenc...
Recently, system combination has been shown to be an effective way to improve translation quality over single machine translation systems. In this paper, we present a simple and ef...