To solve this problem, we devised the HS-bitmap index, which is composed hierarchically of compressed summary-bit data. A summary bit in an upper matrix is obtained by logical...
Web crawling is often divided into two general areas: full-web crawling and focused crawling. We present netSifter, a crawler system that integrates features from thes...
Determining the similarity of short text snippets, such as search queries, works poorly with traditional document similarity measures (e.g., cosine), since there are often few, if...
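The weakness of cosine similarity on short snippets can be illustrated with a minimal sketch. It assumes a plain bag-of-words representation with lowercase whitespace tokenization; the function name `cosine_similarity` and the example queries are hypothetical and not taken from the paper.

```python
import math
from collections import Counter


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words term-count vectors.

    Assumption: tokens are lowercase, whitespace-separated words.
    """
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    # Only terms present in both snippets contribute to the dot product.
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Two queries with related meaning but no shared terms.
related = cosine_similarity("cheap flights", "low cost airfare")

# Two queries that share terms.
overlap = cosine_similarity("cheap flights", "cheap flights to paris")

print(related, overlap)
```

Because the first pair shares no terms, its dot product is zero and the score cannot reflect that the queries are related. The second pair scores above zero only because of literal word overlap.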
With the fast increase in Web activities, Web data mining has recently become an important research topic. However, most previous studies of mining path traversal patterns are bas...
The Web Ontology Language (OWL) defines three classes of documents: Lite, DL and Full. All RDF/XML documents are OWL Full documents, some OWL Full documents are also OWL DL docume...