An important factor for discriminating between webpages is their genre (e.g., blogs, personal homepages, e-shops, online newspapers, etc). Webpage genre identification has a great...
Large collections of documents containing various types of multimedia, are made available to the WWW. Unfortunately, due to the un-structuredness of Internet environments it is ha...
An unsupervised clustering of the webpages on a website is a primary requirement for most wrapper induction and automated data extraction methods. Since page content can vary dras...
Measuring the similarity between implicit semantic relations is an important task in information retrieval and natural language processing. For example, consider the situation whe...
Today’s web is so huge and diverse that it arguably reflects the real world. For this reason, searching the web is a promising approach to find things in the real world. This ...