Recent work has shown the feasibility and promise of template-independent Web data extraction. However, existing approaches use decoupled strategies, attempting to do data record ...
Jun Zhu, Zaiqing Nie, Ji-Rong Wen, Bo Zhang, Wei-Y...
Maintaining semantic consistency of data is a significant problem in distributed information systems, particularly those on which a business may depend. Our current wo...
Jeremy W. Bryans, John S. Fitzgerald, Alexander Ro...
Computing the degree of semantic relatedness of words is a key functionality of many language applications such as search, clustering, and disambiguation. Previous approaches to c...
Kira Radinsky, Eugene Agichtein, Evgeniy Gabrilovi...
Web-based applications offer a mainstream channel for businesses to manage their activities. We model such business activity in a grammar-based framework. The Backus-Naur form not...
The requirements imposed on information retrieval systems are increasing steadily. The vast number of documents in today's large databases and especially on the World Wide Web ca...