When search results against digital libraries and web resources have limited metadata, augmenting them with meaningful and stable category information can enable better overviews ...
In this paper we introduce a new data gathering method “Web/URL Citation” and use it and Google Scholar as a basis to compare traditional and Web-based citation patterns acros...
This paper describes a new research proposal of multi-document summarization of dynamic content in web pages. Much information is lost in the Web due to the temporal character of w...
Web spam is behavior that attempts to deceive search engine ranking algorithms. TrustRank is a recent algorithm that can combat web spam. However, TrustRank is vulnerable in the s...
This paper studies the problem of extracting data from a Web page that contains several structured data records. The objective is to segment these data records, extract data items...