We present a generative model for determining the information content of a message without analyzing the message content. Such a tool is useful for automated analysis of the vast ...
Yingjie Zhou, Malik Magdon-Ismail, William A. Wall...
This article introduces a scheme for clustering complex and linearly non-separable datasets, without any prior knowledge of the number of naturally occurring groups in the data. T...
We present SEMANDAQ, a prototype system for improving the quality of relational data. Based on the recently proposed conditional functional dependencies (CFDs), it detects and rep...
Unlike traditional database queries, keyword queries do not adhere to a predefined syntax and are often dirty, containing irrelevant words from natural language. This makes accurate and e...
Previous work on privacy-preserving serial data publishing on dynamic databases has relied on unrealistic assumptions about the nature of dynamic databases. In many applications...
Yingyi Bu, Ada Wai-Chee Fu, Raymond Chi-Wing Wong,...