While much of the data on the web is unstructured in nature, there is also a significant amount of embedded structured data, such as product information on e-commerce sites or sto...
When searching large hypertext document collections, it is often possible that there are too many results available for ambiguous queries. Query refinement is an interactive proce...
In this paper, we consider the problem of combining link and content analysis for community detection from networked data, such as paper citation networks and Word Wide Web. Most ...
The explosion of user-generated content on the Web has led to new opportunities and significant challenges for companies, that are increasingly concerned about monitoring the disc...
XML has become a popular method of data representation both on the web and in databases in recent years. One of the reasons for the popularity of XML has been its ability to encod...
Charu C. Aggarwal, Na Ta, Jianyong Wang, Jianhua F...