As part of a broader effort to acquire large repositories of facts from unstructured text on the Web, a seed-based framework for textual information extraction allows for weakly sup...
Modern critical editions of ancient works generally include manually created indices of other sources quoted in the text. Since indices can be considered as a form of domain speci...
This paper describes an attempt to model the method of generating a fact from the opinions of other persons or of institutions as a process which is based on knowledge about these...
We introduce two new index structures based on the q-gram index. The new structures index substrings of variable length instead of q-grams of fixed length. For both of the new ind...
Influenced by the linking model which is implicit in HTML, today’s publishing model on the Web is content-centered, with the emphasis of publishing on content rather than links....