We describe a method to retrieve images found on web pages with specified object class labels, using an analysis of text around the image and of image appearance. Our method dete...
In order to artificially boost the rank of commercial pages in search engine results, search engine optimizers pay for links to these pages on other websites. Identifying paid lin...
Parallel bit stream algorithms exploit the SWAR (SIMD within a register) capabilities of commodity processors in high-performance text processing applications such as UTF-8 to UTF-...
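To make the idea of parallel bit streams concrete, the following minimal sketch transposes a byte string into eight bit streams and then classifies every byte position at once with bitwise logic. It is an illustration only, not the paper's implementation: the function names (`transpose_to_bit_streams`, `digit_stream`, `positions`) are hypothetical, Python's arbitrary-precision integers stand in for SIMD registers, and ASCII digit detection is an assumed example task rather than one taken from the abstract.

```python
from typing import List


def transpose_to_bit_streams(data: bytes) -> List[int]:
    """Return 8 integers; bit i of streams[k] equals bit k of data[i].

    k = 0 is the least significant bit of each byte. A real SWAR
    implementation would do this transposition with register-wide
    shifts and masks; the nested loop here is only for clarity.
    """
    streams = [0] * 8
    for i, byte in enumerate(data):
        for k in range(8):
            if (byte >> k) & 1:
                streams[k] |= 1 << i
    return streams


def digit_stream(data: bytes) -> int:
    """Return a bit stream whose bit i is 1 iff data[i] is an ASCII digit."""
    b = transpose_to_bit_streams(data)
    mask = (1 << len(data)) - 1  # keep only the bits for real positions
    # High nibble must be 0011 (bytes 0x30..0x3F).
    high_is_3 = ~b[7] & ~b[6] & b[5] & b[4] & mask
    # Low nibble must be <= 9, i.e. not (bit3 and (bit2 or bit1)).
    low_le_9 = ~(b[3] & (b[2] | b[1])) & mask
    return high_is_3 & low_le_9


def positions(stream: int) -> List[int]:
    """List the indices of the set bits in a bit stream."""
    return [i for i in range(stream.bit_length()) if (stream >> i) & 1]


if __name__ == "__main__":
    print(positions(digit_stream(b"a1b22")))  # prints [1, 3, 4]
```

Each bitwise operation in `digit_stream` processes all positions of the input simultaneously, which is the property that SWAR hardware exploits with fixed-width registers.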
This poster presents ongoing research on how discursive and editing behaviors are regulated on Wikipedia by means of documented rules and practices. Our analysis focuses on three ...
Jonathan T. Morgan, Katie Derthick, Toni Ferro, El...
Challenging the implicit reliance on document collections, this paper discusses the pros and cons of using query logs rather than document collections as self-contained sources o...