Many of the documents in large text collections are duplicates and versions of each other. In recent research, we developed new methods for finding such duplicates; however, as the...
Several projects have brought rich data semantics to collaborative wikis, but blogging platforms remain primarily limited to text. As blogs comprise a significant portion...
Edward Benson, Adam Marcus, Fabian Howahl, Da...
As the Internet continues to play an important role in many business applications, it becomes vital to increase the competitive edge by offering geographically tailored contents t...
In this paper we describe a Cross Document Summarizer XDoX designed specifically to summarize large document sets (50-500 documents and more). Such sets of documents are typically...
Speech synthesis options on assistive communication devices are very limited and do not reflect the user’s vocal quality or personality. Previous work suggests that speakers wit...