In this paper, we propose a semi-supervised learning approach for distinguishing program-generated (bot) web search traffic from that of genuine human users. The work is motivated by...
Hongwen Kang, Kuansan Wang, David Soukal, Fritz Be...
Analyzing data obtained from web server logs, so-called “clickstreams”, is rapidly becoming one of the most important activities for companies in any sector as most businesses...
Jesper Andersen, Anders Giversen, Allan H. Jensen,...
We present a novel framework for automated extraction and approximation of numerical object attributes such as height and weight from the Web. Given an object-attribute pair, we d...
We present first steps towards intelligent retrieval of music album covers from the web. The continuous growth of electronic music distribution constantly increases the interest in...
Markus Schedl, Peter Knees, Tim Pohle, Gerhard Wid...
One of the effects of the general Internet growth is an immense number of user accesses to WWW resources. These accesses are recorded in the web server log files, which...