An increasing number of tasks require people to explore, navigate and search extremely complex data sets visualized as graphs. Examples include electrical and telecommunication ne...
Nelson Wong, M. Sheelagh T. Carpendale, Saul Green...
String comparison is a critical issue in many application domains, including speech recognition, contents search, and bioinformatics. The similarity between two strings of lengths...
In this paper we present StarNT, a dictionary-based fast lossless text transform algorithm. With a static generic dictionary, StarNT achieves a superior compression ratio than alm...
The number of software systems is increasing at a rapid rate. For example, SourceForge currently has about sixty thousand software systems registered, twenty-two thousand of which...
Shinji Kawaguchi, Pankaj K. Garg, Makoto Matsushit...
This paper expands on a 1997 study of the amount and distribution of near-duplicate pages on the World Wide Web. We downloaded a set of 150 million web pages on a weekly basis ove...