We show that the automatically induced latent variable grammars of Petrov et al. (2006) vary widely in their underlying representations, depending on their EM initialization point...
This paper analyzes the topic identification stage of single-document automatic text summarization across four different domains, consisting of newswire, literary, scientific and ...
Hakan Ceylan, Rada Mihalcea, Umut Özertem, Elena ...
It is well-known that, given a probability distribution over n characters, in the worst case it takes Θ(n log n) bits to store a prefix code with minimum expected codeword length. H...
By analogy with an approach widely used in physics, we consider a discrete set of base stations (BS) as a continuum of transmitters. This model allows us to establish a simple clos...
In this paper we present a confidence measure for word alignment based on the posterior probability of alignment links. We introduce a sentence alignment confidence measure and alig...