We prove that longest common prefix (LCP) information can be stored in much less space than previously known. More precisely, we show that in the presence of the text and the su...
A novel backwards viewpoint of Principal Component Analysis is proposed. In a wide variety of cases that fall into the area of Object Oriented Data Analysis, this viewpoint is se...
Wikipedia articles in different languages are connected by interwiki links that are increasingly being recognized as a valuable source of cross-lingual information. Unfortunately,...
We describe a Chinese temporal annotation experiment that produced a sizable data set for the TempEval-2 evaluation campaign. We show that while we have achieved high inter-annota...
We perform a study of existing dialogue corpora to establish the theoretical maximum performance of the selection approach to simulating human dialogue behavior in unseen dialogue...