The growth in electronic and digital publishing on the World Wide Web has led to the development of a wide range of tools for generating metadata. As a result, it can be difficult...
Most text analysis is designed to deal with the concept of a “document”, namely a cohesive presentation of thought on a unifying subject. By contrast, individual nodes on the ...
The world wide web has a wealth of information that is related to almost any text classification task. This paper presents a method for mining the web to improve text classificati...
Understanding the source, data, and documentation files associated with legacy systems in preparation for maintenance or reengineering is an increasingly important problem for man...
We consider the problem of compressing graphs of the link structure of the World Wide Web. We provide efficient algorithms for such compression that are motivated by recently prop...