In order for agents to act on behalf of users, they will have to retrieve and integrate vast amounts of textual data on the World Wide Web. However, much of the useful data on the...
In this paper we investigate unsupervised population of a biomedical ontology via information extraction from biomedical literature. Relationships in text seldom connect simple ent...
Cartic Ramakrishnan, Pablo N. Mendes, Shaojun Wang...
In this paper, we report on our experience with the creation of an automated, human-assisted process to extract metadata from documents in a large (>100,000), dynamically growi...
Jianfeng Tang, Kurt Maly, Steven J. Zeil, Mohammad...
Metasearch engine, Comparison-shopping and Deep Web crawling applications need to extract search result records enwrapped in result pages returned from search engines in response ...
How can we cull the facts we need from the overwhelming mass of information and misinformation that is the Web? The TextRunner extraction engine represents one approach, in which ...