Given a quarter of petabyte click log data, how can we estimate the relevance of each URL for a given query? In this paper, we propose the Bayesian Browsing Model (BBM), a new mod...
We study an algorithm for feature selection that clusters attributes using a special metric and then makes use of the dendrogram of the resulting cluster hierarchy to choose the m...
Richard Butterworth, Gregory Piatetsky-Shapiro, Da...
Frequent coherent subgraphscan provide valuable knowledgeabout the underlying internal structure of a graph database, and mining frequently occurring coherent subgraphs from large...
Zhiping Zeng, Jianyong Wang, Lizhu Zhou, George Ka...
Most data mining algorithms and tools stop at discovered customer models, producing distribution information on customer profiles. Such techniques, when applied to industrial pro...
This paper introduces a new approach to a problem of data sharing among multiple parties, without disclosing the data between the parties. Our focus is data sharing among two parti...