Abstract--Domain knowledge for web applications is currently being made available as domain ontology with the advent of the semantic web, in which semantics govern relationships am...
Parallel corpus is a rich linguistic resource for various multilingual text management tasks, including crosslingual text retrieval, multilingual computational linguistics and mul...
Finding a set of web pages relevant to a user’s information goal is difficult due to the enormous size of the Internet. Search engines are able to find a set of pages that mat...
Fishnet is a web browser that always displays web pages in their entirety, independent of their size. Fishnet accomplishes this by using a fisheye view, i.e. by showing a focus re...
We consider the problem of dust: Different URLs with Similar Text. Such duplicate URLs are prevalent in web sites, as web server software often uses aliases and redirections, and...