We describe a large-scale application of methods for finding plagiarism and self-plagiarism in research document collections. The methods are applied to a collection of 284,834 d...
Daria Sorokina, Johannes Gehrke, Simeon Warner, Pa...
We have analyzed 607 sentences of spontaneous human-computer speech data containing repairs, drawn from a total corpus of 10,718 sentences. We present here criteria and techniques...
This paper presents a robust adaptive goodness-of-fit (GOF) χ² test event detector for non-intrusive load monitoring applications. We derive a closed form for the decision thres...
Yuanwei Jin, Eniye Tebekaemi, Mario Berges, Lucio ...
Outlier detection can uncover malicious behavior in fields like intrusion detection and fraud analysis. Although there has been a significant amount of work in outlier detection, ...
We present a diff algorithm for XML data. This work is motivated by the support for change control in the context of the Xyleme project that is investigating dynamic warehouses ca...