Taxonomies of the Web typically have hundreds of thousands of categories and a skewed category distribution over documents. It is not clear whether existing text classification tech...
Tie-Yan Liu, Yiming Yang, Hao Wan, Qian Zhou, Bin ...
Government regulations are semi-structured text documents that are often voluminous, heavily cross-referenced between provisions, and even ambiguous. Multiple sources of regulation...
In this paper we describe a novel approach for searching large data sets from a mobile phone. Existing interfaces for mobile search require keyword text entry and are not suited f...
Amy K. Karlson, George G. Robertson, Daniel C. Rob...
This paper introduces Mycrocosm, a microblogging site in which users communicate via statistical graphics, rather than the usual short text statements. Users of Mycrocosm can reco...
We present Luminoso, a tool that helps researchers to visualize and understand a dimensionality-reduced semantic space by exploring it interactively. It also streamlines the proce...
Robert Speer, Catherine Havasi, K. Nichole Treadwa...