We develop a novel framework for the page-level template detection problem. Our framework is built on two main ideas. The first is the automatic generation of training data for a ...
Discovering associations between elements occurring in a stream is useful in numerous applications, including predictive caching and fraud detection. These applications requir...
In this paper, a novel architecture for a streaming intrusion detection system for Grid computing environments is presented. Detection mechanisms based on traditional lo...
Matthew Smith, Fabian Schwarzer, Marian Harbach, T...
Defining the boundaries of a web-site, for (say) archiving or information retrieval purposes, is an important but complicated task. In this paper a web-page clustering approach to...
Rare category detection is an open challenge for active learning, especially in the de-novo case (no labeled examples), but of significant practical importance for data mining - ...