Recent progress in information extraction technology has enabled a vast array of applications that rely on structured data that is embedded in natural-language text. In particular...
In this paper, we propose a new system extracting potentially copyright infringement texts from the Web, called EPCI. EPCI extracts them in the following way: (1) generating a set...
Takashi Tashiro, Takanori Ueda, Taisuke Hori, Yu H...
Querying the Web today can be a frustrating activity because the results delivered by syntactically oriented search engines often do not match the intentions of the user. The DARP...
Grit Denker, Jerry R. Hobbs, David L. Martin, Srin...
Ranking search results is a fundamental problem in information retrieval. In this paper we explore whether the use of proximity and phrase information can improve web retrieval ac...
Building content-based search tools for feature-rich data has been a challenging problem because feature-rich data such as audio recordings, digital images, and sensor data are in...
Qin Lv, William Josephson, Zhe Wang, Moses Charika...