Massive amounts of raw data are currently being generated by biologists while sequencing organisms. Outside of the largest, high-profile projects such as the Human Genome Project, ...
XML is an emerging standard for data representation and exchange on the World-Wide Web. Due to the nature of information on the Web and the inherent flexibility of XML, we expect...
Several pattern discovery methods proposed in the data mining literature have the drawbacks that they discover too many obvious or irrelevant patterns and that they do not leverag...
This paper describes a new way of implementing an intelligent web caching service, based on an analysis of usage. Since the cache size in software is limited, and the sea...
We present a general framework for the task of extracting specific information “on demand” from a large corpus such as the Web under resource constraints. Given a database wit...