A common technique used to minimize I/O in data intensive applications is data declustering over parallel servers. This technique involves distributing data among several disks so...
Hakan Ferhatosmanoglu, Ali Saman Tosun, Guadalupe ...
This paper describes how use the HTMLEditorKit to perform web data mining on EDGAR (Electronic Data-Gathering, Analysis, and Retrieval system). EDGAR is the SEC's (U.S. Secur...
No search engine is perfect. A typical type of imperfection is the preference misalignment between search engines and end users, e.g., from time to time, web users skip higherrank...
A major source of information (often the most crucial and informative part) in scholarly articles from scientific journals, proceedings and books are the figures that directly pro...
Amr Ahmed, Eric P. Xing, William W. Cohen, Robert ...
The success of popular algorithms such as k-means clustering or nearest neighbor searches depend on the assumption that the underlying distance functions reflect domain-specific n...