The Mixed Raster Content (MRC) ITU document compression standard (T.44) specifies a multilayer decomposition model for compound documents into two contone image layers and a binar...
The Mixed Raster Content (MRC) document compression is a well-documented standard. Its efficiency for representing sharp text and graphics over a background has been extensively p...
Text classification categorizes Web documents in large collections into predefined classes based on their contents. Unfortunately, the classification process can be time-consumi...
as a tree which specifies the presentation in an abstract, machine-independent way. This specification is created and edited using an authoring system; it is mapped to a particula...
Guido van Rossum, Jack Jansen, K. Sjoerd Mullender...
Abstract. An increasing and overwhelming amount of biomedical information is available in the research literature mainly in the form of free-text. Biologists need tools that automa...