It is observed that a better approach to Web information understanding is to base it on the document framework, which consists mainly of (i) the title and the URL name of the pag...
Enabling keyword queries over relational databases (KQDB) benefits a large population of users who have difficulty understanding the database schema or using SQL. However, sin...
Integrated access to distributed data is an important problem faced in scientific and commercial applications. A data integration system provides a unified view for users to subm...
As online document collections continue to expand, both on the Web and in proprietary environments, the need for duplicate detection becomes more critical. The goal of this work i...
Social media are becoming increasingly popular and have attracted considerable attention from spammers. Using a sample of more than ninety thousand known spam Web sites, we found ...