Previous work on text mining has almost exclusively focused on a single stream. However, we often have available multiple text streams indexed by the same set of time points (call...
Xuanhui Wang, ChengXiang Zhai, Xiao Hu, Richard Sp...
Recent years have witnessed a dramatic increase in the quantity of image data collected, due to advances in fields such as medical imaging, reconnaissance, surveillance, astronomy...
This paper considers the framework of the so-called "market basket problem", in which a database of transactions is mined for the occurrence of unusually frequent item s...
An approximate search query on a collection of strings finds those strings in the collection that are similar to a given query string, where similarity is defined using a given si...
Entity Resolution (ER) is the problem of identifying which records in a database refer to the same real-world entity. An exhaustive ER process involves computing the similarities b...
Steven Euijong Whang, David Menestrina, Georgia Ko...