We develop new algorithms for learning monadic node selection queries in unranked trees from annotated examples, and apply them to visually interactive Web information extraction. ...
While scalable data mining methods are expected to cope with massive Web data, coping with evolving trends in noisy data in a continuous fashion, and without any unnecessary stopp...
Topic Maps and RDF are two independently developed paradigms and standards for the representation, interchange, and exploitation of model-based data on the web. Each para...
As part of a large effort to acquire large repositories of facts from unstructured text on the Web, a seed-based framework for textual information extraction allows for weakly sup...
The rapid growth of the Web has increased the importance of decentralized metadata creation. Resource authors must create their own metadata to enable enhanced information seeking...