This paper describes how to use the Java Swing HTMLEditorKit to perform multi-threaded web data mining on the EDGAR system (Electronic Data Gathering, Analysis, and Retrieval system)....
A uniform random sample is often useful in analyzing data. Taking a uniform sample is usually not a problem if all of the data resides in one location. However, if the data is distr...
Content-based dissemination of data using pub/sub systems is an effective means to deliver relevant data to interested data consumers. With the emergence of XML as the standard f...
Identifying patterns in large data sets is a key requirement in data mining. A powerful technique for this purpose is principal component analysis (PCA). PCA-based clusteri...
As Data Grids become more commonplace, large data sets are being replicated and distributed to multiple sites, leading to the problem of determining which replica can be accessed ...
Sudharshan Vazhkudai, Jennifer M. Schopf, Ian T. F...