The domain of Digital Libraries presents specific challenges for unsupervised information extraction to support both the automatic classification of documents and the enhancement ...
Mikalai Krapivin, Maurizio Marchese, Andrei Yadran...
Digital video databases have become more pervasive, and finding video clips quickly in large databases has become a major challenge. Due to the nature of video, accessing contents of v...
This paper describes how to use the Java Swing HTMLEditorKit to perform multi-threaded web data mining on the EDGAR system (Electronic Data Gathering, Analysis, and Retrieval system)....
The management of reservoir simulation models has been an important need of engineers in the petroleum industry. However, due to data sharing among reservoir simulation models, data r...
Source repositories are a promising database of information about software projects. This paper proposes a tool to extract and summarize information from CVS logs in order to ident...