Web crawlers are increasingly used for focused tasks such as the extraction of data from Wikipedia or the analysis of social networks like last.fm. In these cases, pages are far m...
Franziska von dem Bussche, Klara A. Weiand, Benedi...
Data delivered today over the web reflects rapid and unpredictable changes in the world around us. We are increasingly relying on content that provides dynamic, interactive, person...
Research on information extraction from Web pages (wrapping) has seen much activity in recent times (particularly systems implementations), but little work has been done on formal...
Large corpora are essential to modern methods of computational linguistics and natural language processing. In this paper, we describe an ongoing project whose aim is to build a l...
Abstract. This paper presents first steps towards building a music information system like last.fm, but with the major difference that the data is automatically retrieved from the ...
Markus Schedl, Peter Knees, Tim Pohle, Gerhard Wid...