We consider the problem of estimating occurrence rates of rare events for extremely sparse data, using pre-existing hierarchies to perform inference at multiple resolutions. In pa...
Deepak Agarwal, Andrei Z. Broder, Deepayan Chakrab...
Many websites have large collections of pages generated dynamically from an underlying structured source like a database. The data of a category are typically encoded into similar...
Studies have shown that program comprehension takes up to 45% of software development costs. Such high costs are caused by the lack-of documented specification and further aggrava...
The output of a data mining algorithm is only as good as its inputs, and individuals are often unwilling to provide accurate data about sensitive topics such as medical history an...
Time-series of count data are generated in many different contexts, such as web access logging, freeway traffic monitoring, and security logs associated with buildings. Since this...