In this paper, we present a novel near-duplicate document detection method that can easily be tuned for a particular domain. Our method represents each document as a real-valued s...
Hannaneh Hajishirzi, Wen-tau Yih, Aleksander Kolcz
Semantic Web endeavors have mainly focused on issues pertaining to knowledge representation and ontology design. However, besides understanding information metadata stated by subj...
The ease of deployment and the infrastructure-less nature of Mobile Ad hoc Networks (MANETs) make them highly desirable for present-day multimedia communications. Traditional ...
A recently proposed approach to address privacy concerns in storing web search query logs is bundling logs of multiple users together. In this work we investigate privacy leaks tha...
Our first objective in participating in this domain-specific evaluation campaign is to propose and evaluate various indexing and search strategies for the German, English and Russ...