In the AllRight project, we are developing an algorithm for unsupervised table detection and segmentation that uses the visual rendition of a Web page rather than the HTML code. O...
Abstract. Parallel corpora are a valuable resource for machine translation, but at present their availability and utility are limited by genre- and domain-specificity, licensing restri...
The Web has many sites where users can exchange goods and services. Often, the end-users must write free-text descriptions of the goods and services they have available, or the goo...
Link-translating proxies are widely used for anonymous browsing, policy circumvention and WebVPN functions. These are implemented by encoding the destination URL in the path of th...
Web applications rely heavily on client-side computation to examine and validate form inputs that are supplied by a user (e.g., “credit card expiration date must be valid”). T...