We develop a novel framework for the page-level template detection problem. Our framework is built on two main ideas. The first is the automatic generation of training data for a ...
It is now common practice for e-commerce Web sites to enable their customers to write reviews of products that they have purchased. Such reviews provide valuable sources of info...
So far, support vector machines (SVMs) achieve state-of-the-art performance on text classification (TC) tasks. Due to the complexity of the TC problems, it becomes a ch...
Focused crawlers are considered a promising way to tackle the scalability problem of topic-oriented or personalized search engines. To design a focused crawler, the choice of s...
Translating XML data into ontologies is the problem of finding an instance of an ontology, given an XML document and a specification of the relationship between the XML...