Technological advances in the collection, storage and analysis of data have increased the ease with which businesses can make profitable use of information about individuals. Som...
Ram Gopal L., Robert S. Garfinkel, Manuel A. Nunez...
In recent years, many algorithms for the Web have been developed that work with information units distinct from individual web pages. These include segments of web pages or aggreg...
In this poster, we present an information extraction engine for web-based forums. The engine analyzes the HTML files crawled from web forums, deduces the wrapper (template) of the...
Hanny Yulius Limanto, Nguyen Ngoc Giang, Vo Tan Tr...
An important issue arising from P2P applications is how to accurately and efficiently retrieve the required Web services from large-scale repositories. This paper resolves this is...
We present an approach for detecting link spam common in blog comments by comparing the language models used in the blog post, the comment, and pages linked by the comments. In co...
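A minimal sketch of the idea of comparing language models, under stated assumptions: the truncated abstract does not say which models or distance the authors use, so this example assumes smoothed unigram models and KL divergence. The function names (`tokenize`, `unigram_model`, `kl_divergence`, `language_model_disagreement`) and the sample texts are hypothetical and serve only as illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def unigram_model(text, vocab, alpha=1.0):
    """Unigram probabilities over vocab with add-alpha smoothing (an assumption)."""
    counts = Counter(tokenize(text))
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def kl_divergence(p, q):
    """KL(p || q) for two distributions defined over the same vocabulary."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)


def language_model_disagreement(post, comment):
    """Higher values mean the comment's word usage diverges more from the post's."""
    vocab = set(tokenize(post)) | set(tokenize(comment))
    if not vocab:
        return 0.0
    return kl_divergence(unigram_model(post, vocab), unigram_model(comment, vocab))


post = "this recipe uses fresh tomatoes basil and olive oil for a simple pasta sauce"
on_topic = "great pasta sauce recipe I love fresh basil and tomatoes"
spam = "buy cheap pills online casino poker discount pills"

on_topic_score = language_model_disagreement(post, on_topic)
spam_score = language_model_disagreement(post, spam)
print(on_topic_score < spam_score)
```

The text of a page linked from a comment could be scored against the post in the same way. Any threshold separating spam from legitimate comments would have to be tuned on labelled data; none is assumed here.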