A large and growing number of web pages display contextual advertising based on keywords automatically extracted from the text of the page, and this is a substantial source of rev...
With the page explosion of WWW, how to cover more useful information with limited storage and computation resources becomes more and more important in web IR research. Using web p...
The increased importance of XML as a universal data representation format has led to several proposals for enabling the development of applications that operate on XML data. These...
Igor Peshansky, Michael G. Burke, Mukund Raghavach...
This paper explains our research and implementations of manual, automatic and deep annotations of provenance logs for e-Science in silico experiments. Compared to annotating gener...
The research in information extraction (IE) regards the generation of wrappers that can extract particular information from semistructured Web documents. Similar to compiler gener...