Sciweavers

» Parallelizing the Data Cube
IPPS
2008
IEEE
16 years 1 month ago
Multi-threaded data mining of EDGAR CIKs (Central Index Keys) from ticker symbols
This paper describes how to use the Java Swing HTMLEditorKit to perform multi-threaded web data mining on the EDGAR system (Electronic Data Gathering, Analysis, and Retrieval system)....
Dougal A. Lyon
ICDCS
2007
IEEE
16 years 1 month ago
Uniform Data Sampling from a Peer-to-Peer Network
A uniform random sample is often useful in analyzing data. Usually, taking a uniform sample is not a problem if all the data resides in one location. However, if the data is distr...
Souptik Datta, Hillol Kargupta
ICDCS
2006
IEEE
16 years 22 days ago
Content-based Dissemination of Fragmented XML Data
Content-based dissemination of data using pub/sub systems is an effective means to deliver relevant data to interested data consumers. With the emergence of XML as the standard f...
Chee Yong Chan, Yuan Ni
APPT
2005
Springer
16 years 7 days ago
Principal Component Analysis for Distributed Data Sets with Updating
Identifying the patterns of large data sets is a key requirement in data mining. A powerful technique for this purpose is the principal component analysis (PCA). PCA-based clusteri...
Zheng-Jian Bai, Raymond H. Chan, Franklin T. Luk
IPPS
2002
IEEE
15 years 11 months ago
Predicting the Performance of Wide Area Data Transfers
As Data Grids become more commonplace, large data sets are being replicated and distributed to multiple sites, leading to the problem of determining which replica can be accessed ...
Sudharshan Vazhkudai, Jennifer M. Schopf, Ian T. F...