Large web search engines have to answer thousands of queries per second with interactive response times. Due to the sizes of the data sets involved, often in the range of multiple...
In this paper we present the Infocious Web search engine [23]. Our goal in creating Infocious is to improve the way people find information on the Web by resolving ambiguities pre...
The huge amount of data available from Internet information sources has focused much attention on the sharing of distributed information through Peer Data Management Systems (PDMS...
The term web genre denotes the type of a given web resource, in contrast to the topic of its content. In this research, we focus on recognizing the web genres blog, wiki and forum...
How often do tags recur? How hard is predicting tag recurrence? Which tags are likely to recur? We try to answer these questions by analysing the RSDC08 dataset, in both individual...