In this paper, we propose a new system, called EPCI, for extracting potentially copyright-infringing texts from the Web. EPCI extracts them in the following way: (1) generating a set...
Takashi Tashiro, Takanori Ueda, Taisuke Hori, Yu H...
A web page may be relevant to multiple topics; even when nominally on a single topic, the page may attract attention (and thus links) from multiple communities. Instead of indiscr...
This study examines the facets and patterns of multiple Web query reformulations with a focus on reformulation sequences. Based on IR interaction models, it was presumed that quer...
The use of RDF data published on the Web for applications is still a cumbersome and resource-intensive task due to the limited software support and the lack of standard programmin...
Danh Le Phuoc, Axel Polleres, Manfred Hauswirth, G...
There has been a great amount of work on query-independent summarization of documents. However, due to the success of Web search engines, query-specific document summarization (que...