We propose a weakly-supervised approach for extracting class attributes from structured text available within Web documents. The overall precision of the extracted attributes is a...
We developed and tested a heuristic technique for extracting the main article from news site Web pages. We construct the DOM tree of the page and score every node based on the amo...
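The abstract above is cut off before it states the scoring criterion, so the following is only a minimal sketch of the general idea of scoring DOM nodes to find the main article. It assumes, purely for illustration, that a container node is scored by the amount of non-link text credited to it. The names `ArticleScorer`, `main_article_node`, and the `CONTAINERS` tag set are hypothetical and are not taken from the paper.

```python
from html.parser import HTMLParser

# Assumption: only these block-level tags are candidate "article" nodes.
CONTAINERS = {"body", "div", "article", "section", "main", "td"}
# Void elements never get a closing tag, so they are not pushed on the stack.
VOID = {"br", "img", "hr", "input", "meta", "link"}
# Text inside these tags is never visible article text.
IGNORED = {"script", "style"}


class Node:
    def __init__(self, tag, attrs):
        self.tag = tag
        self.attrs = dict(attrs)
        self.score = 0  # characters of non-link text credited to this node


class ArticleScorer(HTMLParser):
    """Tracks the open-element stack and scores container nodes while parsing."""

    def __init__(self):
        super().__init__()
        self.stack = []  # currently open elements, outermost first
        self.nodes = []  # every element seen, in document order

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        node = Node(tag, attrs)
        self.stack.append(node)
        self.nodes.append(node)

    def handle_endtag(self, tag):
        # Pop up to and including the nearest matching open tag,
        # which tolerates unclosed inner elements.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        open_tags = {n.tag for n in self.stack}
        # Skip script/style content and link text (assumed to be navigation).
        if open_tags & IGNORED or "a" in open_tags:
            return
        # Credit the text to the nearest enclosing container node.
        for node in reversed(self.stack):
            if node.tag in CONTAINERS:
                node.score += len(data.strip())
                break


def main_article_node(html):
    """Return the highest-scoring container node, or None if there is none."""
    scorer = ArticleScorer()
    scorer.feed(html)
    scorer.close()
    containers = [n for n in scorer.nodes if n.tag in CONTAINERS]
    return max(containers, key=lambda n: n.score, default=None)
```

Calling `main_article_node(page_html)` on a page whose story text sits in one `<div>` and whose navigation consists mostly of links would be expected to return that `<div>`, since link text is not counted under this assumed scoring rule.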
Software developers increasingly rely on information from the Web, such as documents or code examples on Application Programming Interfaces (APIs), to facilitate their development...
Jinhan Kim, Sanghoon Lee, Seung-won Hwang, Sunghun...
This paper presents a real-world application for assisting medical diagnosis and drug prescription, which relies on the exclusive use of machine learning techniques. We have autom...
To carry out ecologically relevant biodiversity research, one must collect chunks of information on species and their habitats from a large number of institutions and correlate them us...