This paper describes how use the HTMLEditorKit to perform web data mining on EDGAR (Electronic Data-Gathering, Analysis, and Retrieval system). EDGAR is the SEC's (U.S. Secur...
With the emergence of the World Wide Web, analyzing and improving Web communication has become essential to adapt the Web content to the visitors’ expectations. Web communicatio...
Published data is prone to privacy attacks. Sanitization methods aim to prevent these attacks while maintaining usefulness of the data for legitimate users. Quantifying the trade-...
The paper proposes identifying relevant information sources from the history of combined searching and browsing behavior of many Web users. While it has been previously shown that...
In order to be adopted within corporate environments, Semantic Web applications must provide tangible short-/medium-term gains. Although corporate Semantic Web offers enterprises n...
Robert Tolksdorf, Radoslaw Oldakowski, Thomas Hopp...