Many classification tasks, such as spam filtering, intrusion detection, and terrorism detection, are complicated by an adversary who wishes to avoid detection. Previous work on ad...
Dyadic data matrices, such as co-occurrence matrix, rating matrix, and proximity matrix, arise frequently in various important applications. A fundamental problem in dyadic data a...
In this paper, we discuss a problem of finding risk patterns in medical data. We define risk patterns by a statistical metric, relative risk, which has been widely used in epidemi...
Jiuyong Li, Ada Wai-Chee Fu, Hongxing He, Jie Chen...
How do real graphs evolve over time? What are "normal" growth patterns in social, technological, and information networks? Many studies have discovered patterns in stati...
Jure Leskovec, Jon M. Kleinberg, Christos Faloutso...
In the paper we show that diagnostic classes in cancer gene expression data sets, which most often include thousands of features (genes), may be effectively separated with simple ...
Gregor Leban, Minca Mramor, Ivan Bratko, Blaz Zupa...
In this paper, we show that stylistic text features can be exploited to determine an anonymous author's native language with high accuracy. Specifically, we first use automat...
Web users display their preferences implicitly by navigating through a sequence of pages or by providing numeric ratings to some items. Web usage mining techniques are used to ext...
The problem of finding frequent patterns from graph-based datasets is an important one that finds applications in drug discovery, protein structure analysis, XML querying, and soc...