1 This paper defines a new stacked generalization framework in the context of information extraction (IE) from online sources. The proposed setting removes the constraint of apply...
Large collections of documents containing various types of multimedia, are made available to the WWW. Unfortunately, due to the un-structuredness of Internet environments it is ha...
In order to evaluate the performance of information retrieval and extraction algorithms, we need test collections. A test collection consists of a set of documents, a clearly form...
Abstract. We investigate the application of classification techniques to the problem of information extraction (IE). In particular we use support vector machines and several differ...
Given a set D = {d1, d2, ..., dD} of D strings of total length n, our task is to report the "most relevant" strings for a given query pattern P. This involves somewhat mo...