—A huge portion of todays Web consists of web pages filled with information from myriads of online databases. This part of the Web, known as the deep Web, is to date relatively ...
We consider the problem of building a P2P-based search engine for massive document collections. We describe a prototype system called ODISSEA (Open DIStributed Search Engine Archi...
Users’ past search behaviour provides a rich context that an information retrieval system can use to tailor its search results to suit an individual’s or a community’s infor...
Crawl selection policy has a direct influence on Web search effectiveness, because a useful page that is not selected for crawling will also be absent from search results. Yet th...
Web-search queries are known to be short, but little else is known about their structure. In this paper we investigate the applicability of part-of-speech tagging to typical Engli...