Exact substring matching queries on large data collections can be answered using q-gram indices, which store for each occurring q-byte pattern an (ordered) posting list with the po...
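To make the idea of a q-gram index concrete, here is a minimal sketch in Python. It is not the method of the paper summarized above, whose abstract is truncated here. It assumes that each posting list holds the ascending byte offsets at which a q-gram occurs, and the names build_qgram_index and find_all are hypothetical helpers introduced only for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def build_qgram_index(data: bytes, q: int) -> Dict[bytes, List[int]]:
    """Map every q-byte pattern occurring in data to its ascending offsets.

    Assumption: a posting list stores byte offsets into data.
    """
    if q <= 0:
        raise ValueError("q must be positive")
    index: Dict[bytes, List[int]] = defaultdict(list)
    for i in range(len(data) - q + 1):
        # Offsets are appended in increasing order, so each list stays sorted.
        index[data[i:i + q]].append(i)
    return dict(index)


def find_all(index: Dict[bytes, List[int]], q: int, pattern: bytes) -> List[int]:
    """Return all offsets where pattern occurs, using only the index.

    The pattern occurs at offset p exactly when, for every k, the q-gram
    pattern[k:k+q] occurs at offset p + k. Each posting list is therefore
    shifted back by k and the shifted lists are intersected.
    """
    if len(pattern) < q:
        raise ValueError("pattern must be at least q bytes long")
    result = None
    for k in range(len(pattern) - q + 1):
        shifted = {p - k for p in index.get(pattern[k:k + q], [])}
        result = shifted if result is None else result & shifted
        if not result:
            return []  # some q-gram has no compatible occurrence
    return sorted(result)


index = build_qgram_index(b"abracadabra", 3)
print(find_all(index, 3, b"abra"))  # [0, 7]
```

A production index would typically intersect the sorted lists by merging rather than building sets, and would compress the posting lists; both are left out of this sketch.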
Enhanced biomedical image scanning technology and growing network accessibility have created a need for faster and more efficient data exchange over the Internet and in closed net...
Finding good representations of text documents is crucial in information retrieval and classification systems. Today the most popular document representation is based on a vector ...
What makes template content on the Web so special that we need to remove it? In this paper I present a large-scale aggregate analysis of textual Web content, corroborating statist...
In this paper, we propose a novel approach for composing existing web services to satisfy the correctness constraints of the design, including freedom from deadlock and unspecified...
Ting Deng, Jinpeng Huai, Xianxian Li, Zongxia Du, ...