Collecting large consistent data sets for real world software projects is problematic. Therefore, we explore how little data are required before the predictor performance plateaus...
In this paper, we present a novel near-duplicate document detection method that can easily be tuned for a particular domain. Our method represents each document as a real-valued s...
Hannaneh Hajishirzi, Wen-tau Yih, Aleksander Kolcz
We describe an application of machine learning to the problem of geomorphic mapping of planetary surfaces. Mapping landforms on planetary surfaces is an important task and the fi...
Tomasz F. Stepinski, Soumya Ghosh, Ricardo Vilalta
One of the main difficulties in facial age estimation is the lack of sufficient training data for many ages. Fortunately, the faces at close ages look similar since aging is a slo...
Regularized Least Squares (RLS) algorithms have the ability to avoid over-fitting problems and to express solutions as kernel expansions. However, we observe that the current RLS ...