This paper offers a novel look at using a dimensionality reduction technique called simhash [8] to detect similar document pairs in large-scale collections. We show that this algo...
This paper presents a systematic approach based on robust statistical techniques for the development of a data-driven soft sensor, which is an important component of the process analy...
Managing large-scale software projects involves a number of activities such as viewpoint extraction, feature detection, and requirements management, all of which require a human a...
Many applications in surveillance, monitoring, scientific discovery, and data cleaning require the identification of anomalies. Although many methods have been developed to iden...
In functional connectivity analysis, networks of interest are defined based on correlation with the mean time course of a user-selected 'seed' region. In this work we propose ...