We consider the use of Bayesian topic models in the analysis of computer network traffic. Our approach utilizes latent Dirichlet allocation and time-varying dynamic latent Dirich...
Open Information Extraction extracts relations from text without requiring a pre-specified domain or vocabulary. While existing techniques have used only shallow syntactic featur...
Janara Christensen, Mausam, Stephen Soderland, Ore...
We study the fundamental problem of computing distances between nodes in large graphs such as the web graph and social networks. Our objective is to be able to answer distance que...
Atish Das Sarma, Sreenivas Gollapudi, Marc Najork,...
In order to cope with the expected size of the Semantic Web (SW) in the coming years, we need to benchmark existing SW tools (e.g., query language interpreters) in a credible manne...
Yannis Theoharis, George Georgakopoulos, Vassilis ...
The Web is a valuable source of language specific resources but the process of collecting, organizing and utilizing these resources is difficult. We describe CorpusBuilder, an approa...