Extensible Markup Language (XML) is becoming the de facto standard for exchanging information over the Internet, which results in the proliferation of XML documents. This has led ...
Many private and public organizations reportedly create and monitor targeted Twitter streams to collect and understand users’ opinions about them. Tar...
Recent research in domain-independent information extraction holds the promise of an automatically constructed structured database derived from the Web. A query system based on th...
Background: The indexing of scientific literature and content is a relevant and contemporary requirement within life science information systems. Navigating information available ...
Christopher J. O. Baker, Kanagasabai Rajaraman, We...
In a corpus of jokes, a human might judge two documents to be the "same joke" even if characters, locations, and other details are varied. A given joke could be retold w...