This paper describes how to use the Java Swing HTMLEditorKit to perform multi-threaded web data mining on the EDGAR system (Electronic Data Gathering, Analysis, and Retrieval system)....
A new application of tuple-space-based coordination systems is in knowledge communication and representation. This knowledge is being published on the Web (the so-called “Semant...
Lyndon J. B. Nixon, Robert Tolksdorf, Alan Wood, R...
Building and maintaining thesauri are complex and laborious tasks. PoolParty is a Thesaurus Management Tool (TMT) for the Semantic Web, which aims to support the creation and maint...
The success of Web search is often limited by a variety of factors. Typical queries are vague and imprecise. At the same time, the Web is a dynamic and unmoderated collection and ...
In this paper we describe a new approach to extract element labels from Web form interfaces. Having these labels is a requirement for several techniques that attempt to retrieve a...