We propose an unsupervised method for detecting spam documents from Web page data, based on equivalence relations on strings. We propose 3 measures for quantifying the alienness (...
Creation of the reusable learning content in the process of work is a challenging but promising trend in e-learning and knowledge management. While the main research focus nowadays...
Abstract. Hypertext categorization is the task of automatically assigning category labels to hypertext units. Comparable to text categorization it stays in the area of function lea...
In many occasions would one encounter the task of maintaining the consistency of two pieces of structured data that are related by some transform — synchronising bookmarks in diï...
With the increasing availability of Web-enabled mobile devices, we are facing the problem to effectively adapt Web content for those devices. For adaptation, Web page structures r...
Robbie Schaefer, Andreas Dangberg, Wolfgang Mü...