We argue that groups of unannotated texts with overlapping and non-contradictory semantics represent a valuable source of information for learning semantic representations. A simp...
Self-indexes can represent a text in asymptotically optimal space under the k-th order entropy model, give access to text substrings, and support indexed pattern searches. Their ti...
Text summarization is a data reduction process. The use of text summarization enables users to reduce the amount of text that must be read while still assimilating the core inform...
Lawrence H. Reeve, Hyoil Han, Saya V. Nagori, Jona...
We present an algorithm that minimizes the expected cost of indirect binary search for data with non-constant access costs, such as disk data. Indirect binary search means that sor...
Eduardo F. Barbosa, Gonzalo Navarro, Ricardo A. Ba...
Finding, in texts, biological entities (such as genes or proteins) that satisfy certain conditions is an important and challenging task in biomedical information retrieval and tex...