We have developed a web-repository crawler that is used for reconstructing websites when backups are unavailable. Our crawler retrieves web resources from the Internet Archive, Go...
Plagiarism of material from the Internet is a widespread and growing problem. Computer science students, and those in other science and engineering courses, can sometimes get away...
This paper describes the first version of the TextMOLE (Text Mining Operations Library and Environment) system for textual data mining. Currently, TextMOLE acts as an advanced inde...
Current paper-based interfaces such as PapierCraft provide very little feedback, which limits the scope of possible interactions. So far, there has been little systematic explo...
Noun phrases are usually the main information bearers of a document. Thus, the detection of these units is crucial in many applications related to information retrieval, such as co...