Focused crawlers are programs that wander in the Web, using its graph structure, and gather pages that belong to a specific topic. The most critical task in Focused Crawling is the...
Ioannis Partalas, Georgios Paliouras, Ioannis P. V...
We propose a new graph-based semisupervised learning (SSL) algorithm and demonstrate its application to document categorization. Each document is represented by a vertex within a ...
Information extraction from HTML pages has been conventionally treated as plain text documents extended with HTML tags. However, the growing maturity and correct usage of HTML/XHT...
This paper intends to describe an ongoing experiment that provides a collaborative learning method applied to a case study in higher education. With this purpose, a computer-based...
Elena Campo Montalv, Vicente Martinez-Orga, Nelson...
Temporal reasoners for document understanding typically assume that a document’s creation date is known. Algorithms to ground relative time expressions and order events often re...