Today's crawler engines cannot reach most of the information on the Web. A great amount of valuable information is "hidden" behind the query forms of onli...
More and more documents on the World Wide Web are based on templates. On a technical level, this gives those documents very similar source code and DOM tree structures. G...
It has frequently been observed that most of the world’s data lies outside database systems. The reason is that database systems focus on structured data, leaving the unstructur...
Alon Y. Halevy, Oren Etzioni, AnHai Doan, Zachary ...
Redocumentation is the recovery and recording of software comprehension. Since software comprehension is the most expensive part of software maintenance, redocumentation is the ke...
The eXtensible Access Control Markup Language (XACML) is the standard access control policy specification language of the World Wide Web. XACML does not provide exclusive accesse...