Abstract: Recently a growing demand has arisen for methods for the development of smalland medium scale Web Information Systems (WIS). Web applications are being built in a rapidly...
We propose a novel approach to find aliases of a given name from the web. We exploit a set of known names and their aliases as training data and extract lexical patterns that conv...
Documents in the Web are often organized using category trees by information providers (e.g. CNN, BBC) or search engines (e.g. Google, Yahoo!). Such category trees are commonly kn...
In this paper, we propose a new system extracting potentially copyright infringement texts from the Web, called EPCI. EPCI extracts them in the following way: (1) generating a set...
Takashi Tashiro, Takanori Ueda, Taisuke Hori, Yu H...
In most web sites, web-based applications (such as web portals, emarketplaces, search engines), and in the file systems of personal computers, a wide variety of schemas (such as t...
Paolo Bouquet, Luciano Serafini, Stefano Zanobini,...