New types of document collections are being developed by various web services. The service providers keep track of non-textual features such as click counts. In this paper, we pre...
Jiwoon Jeon, W. Bruce Croft, Joon Ho Lee, Soyeon P...
We describe an evaluation of result set filtering techniques for providing ultra-high precision in the task of presenting related news for general web queries. In this task, the n...
Steven M. Beitzel, Eric C. Jensen, Abdur Chowdhury...
Web spam is behavior that attempts to deceive search engine ranking algorithms. TrustRank is a recent algorithm that can combat web spam. However, TrustRank is vulnerable in the s...
Current search engines do not fully leverage semantically rich datasets, or specialise in indexing just one domainspecific dataset. We present a search engine that uses the RDF da...
While semantic search technologies have been proven to work well in specific domains, they still have to confront two main challenges to scale up to the Web in its entirety. In th...