We present GoGetIt!, a tool for generating structure-driven crawlers that requires a minimum effort from the users. The tool takes as input a sample page and an entry point to a W...
Altigran Soares da Silva, Edleno Silva de Moura, J...
Web pages contain a combination of unique content and template material, which is present across multiple pages and used primarily for formatting, navigation, and branding. We stu...
This paper examines how spam behavior was impacted by the shutdown of McColo, a service provider known for its lax security enforcement. Since the shutdown, a variety of sources h...
Steve DiBenedetto, Daniel Massey, Christos Papadop...
: Data mining is the process of posing queries and extracting patterns, often previously unknown from large quantities of data using pattern matching or other reasoning techniques....
Rule bases are increasingly being used as repositories of knowledge content on the Semantic Web. As the size and complexity of these rule bases s, developers and end users need met...