As the World Wide Web in China grows rapidly, mining knowledge in Chinese Web pages becomes more and more important. Mining Web information usually relies on the machine learning ...
Maintaining and extending large thesauri is an important challenge facing digital libraries and IT businesses alike. In this paper we describe a method building on and extending ex...
Robert Meusel, Mathias Niepert, Kai Eckert, Heiner...
The World-Wide Web consists of a huge number of unstructured documents, but it also contains structured data in the form of HTML tables. We extracted 14.1 billion HTML tables from...
Michael J. Cafarella, Alon Y. Halevy, Daisy Zhe Wa...
The Web has been rapidly "deepened" by myriad searchable databases online, where data are hidden behind query interfaces. As an essential task toward integrating these m...
In relation extraction, distant supervision seeks to extract relations between entities from text by using a knowledge base, such as Freebase, as a source of supervision. When a s...