An indexing model is the heart of an Information Retrieval (IR) system. Data structures such as term based inverted indices have proved to be very effective for IR using vector sp...
Abstract. Recent research has suggested that there is no general similarity measure, which can be applied on arbitrary databases without any parameterization. Hence, the optimal co...
With the increasing importance of search in guiding today's web traffic, more and more effort has been spent to create search engine spam. Since link analysis is one of the m...
Our work examines Web revisitation patterns. Everybody revisits Web pages, but their reasons for doing so can differ depending on the particular Web page, their topic of interest,...
Abstract. Automated Text Categorization has reached the levels of accuracy of human experts. Provided that enough training data is available, it is possible to learn accurate autom...