Government regulations are semi-structured text documents that are often voluminous, heavily cross-referenced between provisions, and sometimes ambiguous. Multiple sources of regulation...
In an outsourced data framework, we introduce and demonstrate mechanisms for securely storing a set of data items (documents) on an untrusted server, while allowing for subsequen...
Radu Sion, Sumeet Bajaj, Bogdan Carbunar, Stefan K...
The availability of summary data for XML documents has many applications, from providing users with quick feedback about their queries, to cost-based storage design and query opti...
Juliana Freire, Jayant R. Haritsa, Maya Ramanath, ...
Ink-bleed interference is undesirable as it reduces the legibility and aesthetics of affected documents. We present a novel approach to reduce ink-bleed interference using functio...
Grani Adiwena Hanasusanto, Michael Brown, Zheng Wu
Statistical topic models such as Latent Dirichlet Allocation (LDA) have emerged as an attractive framework to model, visualize and summarize large document collections in a co...
Ramesh Nallapati, Amr Ahmed, William W. Cohen, Eri...