We consider the problem of finding duplicates in data streams. Duplicate detection in data streams is utilized in various applications including fraud detection. We develop a solu...
Finding all the occurrences of a twig pattern in an XML database is a core operation for efficient evaluation of XML queries. A number of algorithms have been proposed to process ...
With the fast increase in Web activities, Web data mining has recently become an important research topic. However, most previous studies of mining path traversal patterns are bas...
Maximizing only the relevance between queries and documents will not satisfy users if they want the top search results to present a wide coverage of topics by a few representative...
Yi Liu, Benyu Zhang, Zheng Chen, Michael R. Lyu, W...
This paper explains our research and implementations of manual, automatic and deep annotations of provenance logs for e-Science in silico experiments. Compared to annotating gener...