Web 2.0 is a conceptual framework that aims at enhancing the World Wide Web with semantic and social functionalities. For this framework to fully develop, there is a need for concr...
Extracting data from web pages using wrappers is a fundamental problem arising in a large variety of applications of vast practical interest. In this paper, we propose a...
We present here the current prototype of the text understanding system HELENE. The objective of this system is to achieve a deep understanding of small reports dealing with a rest...
The existence of large image datasets such as the set of photos on the World Wide Web makes it possible to build powerful generic models for low-level image attributes like color u...
Term signal is an existing text representation that depicts a term as a vector of frequencies of occurrences in a number of user-defined partitions of a document. Although term si...
Supphachai Thaicharoen, Tom Altman, Krzysztof J. C...