E-mails concerning the development issues of a system constitute an important source of information about high-level design decisions, low-level implementation concerns, and the s...
A recently proposed approach to address privacy concerns in storing web search query logs is bundling logs of multiple users together. In this work we investigate privacy leaks tha...
Statistical topic models provide a general data-driven framework for automated discovery of high-level knowledge from large collections of text documents. While topic models can p...
Chaitanya Chemudugunta, Padhraic Smyth, Mark Steyv...
Search engine logs are an emerging type of data that offers interesting opportunities for data mining. Existing work on mining such data has mostly attempted to discover knowl...
This paper proposes new extensions of the digital book concept together with the required approaches to support their automatic generation. Most best-sellers have often inspired o...