Implementations of map-reduce are being used to perform many operations on very large data. We examine strategies for joining several relations in the map-reduce environment. Our ...
This paper introduces LDA-G, a scalable Bayesian approach to finding latent group structures in large real-world graph data. Existing Bayesian approaches for group discovery (suc...
A lift curve, with the true positive rate on the y-axis and the customer pull (or contact) rate on the x-axis, is often used to depict the model performance in many data mining ap...
Recently the concept of a clarity score was introduced in order to measure the ambiguity of a query in relation to the collection in which the query issuer is seeking information ...
Abstract. The term Deep Web (sometimes also called Hidden Web) refers to the data content that is created dynamically as the result of a specific search on the Web. In this respec...