In this paper, we present a novel near-duplicate document detection method that can easily be tuned for a particular domain. Our method represents each document as a real-valued s...
Hannaneh Hajishirzi, Wen-tau Yih, Aleksander Kolcz
Data warehousing involves complex processes that transform source data through several stages to deliver suitable information ready to be analysed. Though many techniques for visua...
Employees depend on other people in the enterprise for rapid access to important information. But current systems for finding experts do not adequately address the social implicat...
Kate Ehrlich, Ching-Yung Lin, Vicky Griffiths-Fish...
One of the reasons large-scale software development is difficult is the number of dependencies that software engineers face. These dependencies create a need for communication and...
Cleidson R. B. de Souza, Stephen Quirk, Erik Train...
Topic distillation aims to find key resources, that is, high-quality pages for certain topics. Based on an analysis of the non-content features of key resources, a pre-selection method is ...