This paper presents two methods which automatically produce annotated corpora for text summarisation on the basis of human abstracts. Both methods identify a set of sentences from ...
Text categorization and retrieval tasks are often based on a good representation of textual data. Departing from the classical vector space model, several probabilistic models have...
Bug triage, deciding what to do with an incoming bug report, is taking up increasing amount of developer resources in large open-source projects. In this paper, we propose to appl...
Kernel Methods are a class of algorithms for pattern analysis with a number of convenient features. They can deal in a uniform way with a multitude of data types and can be used to...
Image warping is a common problem when one scans or photocopies a document page from a thick bound volume, resulting in shading and curved text lines in the spine area of the boun...