Modern systems are often composed of a variety of interacting services running across multiple machines, in such a way that most developers do not really unders...
To find near-duplicate documents, fingerprint-based paradigms such as Broder's shingling and Charikar's simhash algorithms have been recognized as effective approaches a...
Random projection (RP) is a common technique for dimensionality reduction under the L2 norm, for which many significant space embedding results have been demonstrated. However, many si...
Real-time materialized view maintenance has become increasingly popular, especially in real-time data warehousing and data streaming environments. Upon updates to base relations, ...
Background: Accurate estimation of the statistical significance of a pairwise alignment is an important problem in sequence comparison. Recently, a comparative study of pairwise stati...