Various approaches for plagiarism detection exist. All are based on more or less sophisticated text analysis methods such as string matching, fingerprinting or style comparison. I...
Finding latent patterns in high dimensional data is an important research problem with numerous applications. Existing approaches can be summarized into 3 categories: feature selec...
The problem of measuring similarity between web pages arises in many important Web applications, such as search engines and Web directories. In this paper, we propose a novel neig...
Optimal Component Analysis (OCA) is a linear method for feature extraction and dimension reduction. It has been widely used in many applications such as face and object recognitio...
We present a new ranking algorithm that combines the strengths of two previous methods: boosted tree classification, and LambdaRank, which has been shown to be empiricall...
Qiang Wu, Christopher J. C. Burges, Krysta Marie S...