Sentence-level novelty detection aims to identify sentences that carry novel information in an ordered list of sentences. In this task, sentences appearing later in the list with no new me...
This paper presents a novel algorithm for document clustering based on a combinatorial framework of the Principal Direction Divisive Partitioning (PDDP) algorithm [1] and a simpli...
In this paper, we present an extension of PHIL, a declarative language for filtering information from XML data. The proposed approach allows us to extract relevant data as well a...
Conventional wisdom and current research suggest that the Internet will lower electronic commerce (EC) product prices by causing intense competition among vendors. However, this d...
Structure analysis of table-form documents is an important issue because printed documents, and even electronic documents, do not provide logical structural information but merely...