Sciweavers

2827 search results - page 344 / 566
» Marking Text Documents
JCDL
2006
ACM
16 years 17 days ago
Combining DOM tree and geometric layout analysis for online medical journal article segmentation
We describe an HTML web page segmentation algorithm, which is applied to segment online medical journal articles (regular HTML and PDF-Converted-HTML files). The web page content ...
Jie Zou, Daniel X. Le, George R. Thoma
MM
2000
ACM
15 years 11 months ago
Giving meanings to WWW images
Images are increasingly being embedded in HTML documents on the WWW. Such documents over the WWW essentially provide a rich source of image collection from which users can query....
Heng Tao Shen, Beng Chin Ooi, Kian-Lee Tan
CIKM
2008
Springer
15 years 8 months ago
Learning to link with Wikipedia
This paper describes how to automatically cross-reference documents with Wikipedia: the largest knowledge base ever known. It explains how machine learning can be used to identify...
David N. Milne, Ian H. Witten
DAS
2008
Springer
15 years 8 months ago
Truthing for Pixel-Accurate Segmentation
We discuss problems in developing policies for ground truthing document images for pixel-accurate segmentation. First, we describe ground truthing policies that apply to four diff...
Michael A. Moll, Henry S. Baird, Chang An
LREC
2008
15 years 8 months ago
A Semantically Annotated Swedish Medical Corpus
With the information overload in the life sciences there is an increasing need for annotated corpora, particularly with biological and biomedical entities, which is the driving fo...
Dimitrios Kokkinakis