Clio is a joint research project between the University of Toronto and IBM Almaden Research Center started in 1999 to address both foundational and systems issues related to the ma...
Data mining aims to extract previously unidentified information from large databases. It can be viewed as an automated application of algorithms to discover hidden patterns a...
Similarity search methods are widely used as kernels in various data mining and machine learning applications, including computational biology and web search and clustering. Near...
Random-walk graph and Markov-chain-based models are used heavily in many data and system analysis domains, including the web, bioinformatics, and queuing. These models enable the desc...
From Proc. CAiSE05, LNCS 3520, pages 460-474. © Springer-Verlag 2005. Semi-structured data sources, such as XML, HTML or CSV files, present special problems when performing data int...