We present DiTaBBu, Digital Talking Books Builder, a framework for automatic production of time-based hypermedia for the Web, focusing on the Digital Talking Books domain. Deliver...
This paper addresses the issue of semantic relationship extraction from semi-structured documents. Many research efforts have been made so far on the semantic information ex...
In this paper, we propose an online algorithm, called FQT-Stream (Frequent Query Trees of Streams), to mine the set of all frequent tree patterns over a continuous XML data strea...
An important issue arising from large scale data integration is how to efficiently select the top-K ranking answers from multiple sources while minimizing the transmission cost. T...
An important issue arising from Peer-to-Peer applications is how to accurately and efficiently retrieve a set of K best matching data objects from different sources while minimizi...
In the AllRight project, we are developing an algorithm for unsupervised table detection and segmentation that uses the visual rendition of a Web page rather than the HTML code. O...
Web pages include extraneous material that may be viewed as undesirable by a user. Increasingly many Web sites also require users to register to access either all or portions of t...
Contextual search refers to proactively capturing the information need of a user by automatically augmenting the user query with information extracted from the search context; for...
We present an empirical evaluation and comparison of two content extraction methods in HTML: absolute XPath expressions and relative XPath expressions. We argue that the relative ...
Marek Kowalkiewicz, Maria E. Orlowska, Tomasz Kacz...
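As a rough illustration of the distinction drawn in the abstract above, the sketch below contrasts a positional path from the document root with a path anchored on an attribute. It is not the authors' method or evaluation setup: the sample pages, the `extract` helper, and the two path strings are hypothetical, and Python's `xml.etree.ElementTree` supports only a limited XPath subset, so the root-relative positional path merely stands in for an absolute XPath expression.

```python
import xml.etree.ElementTree as ET

# Hypothetical page before and after a layout change (an extra block is inserted).
ORIGINAL = (
    "<html><body>"
    "<div id='nav'>Menu</div>"
    "<div id='content'><p>Story text</p></div>"
    "</body></html>"
)
CHANGED = (
    "<html><body>"
    "<div id='nav'>Menu</div>"
    "<div id='ad'><p>Buy now</p></div>"
    "<div id='content'><p>Story text</p></div>"
    "</body></html>"
)

# Assumption: positional path from the root, standing in for an absolute XPath.
ABSOLUTE_STYLE = "./body/div[2]/p"
# Assumption: path anchored on a stable attribute, standing in for a relative XPath.
RELATIVE_STYLE = ".//div[@id='content']/p"


def extract(page: str, path: str):
    """Return the text of the first node matching path, or None if nothing matches."""
    root = ET.fromstring(page)
    node = root.find(path)
    return None if node is None else node.text


if __name__ == "__main__":
    for label, page in (("original", ORIGINAL), ("changed", CHANGED)):
        print(label, "absolute-style:", extract(page, ABSOLUTE_STYLE))
        print(label, "relative-style:", extract(page, RELATIVE_STYLE))
```

In this toy case the positional path picks up the inserted block once the layout changes, while the attribute-anchored path still reaches the intended paragraph. Real pages are rarely well-formed XML, so a practical setup would need an HTML-tolerant parser.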
The use of Semantic Web Service (SWS) technologies has been suggested to enable more dynamic B2B integration of heterogeneous systems and partners. We present how we add semantic...