Large-scale text categorization is an important research topic for Web data mining. One of the challenges in large-scale text categorization is how to reduce the amount of human e...
Defining the boundaries of a web-site, for (say) archiving or information retrieval purposes, is an important but complicated task. In this paper a web-page clustering approach to...
Missing web pages, URIs that return the 404 “Page Not Found” error or the HTTP response code 200 but dereference unexpected content, are ubiquitous in today’s browsing exper...
Martin Klein, Jeffery L. Shipman, Michael L. Nelso...
The increasing number of personal digital photos on the Web makes their management, retrieval and visualization a difficult task. To annotate these images using Semantic Web techno...
For the CLEF 2004 ImageCLEF St Andrew’s Collection task the Dublin City University group carried out three sets of experiments. We carried out standard cross-language informatio...
Gareth J. F. Jones, Declan Groves, Anna Khasin, Ad...