Sciweavers

1163 search results - page 139 / 233
WEBI
2004
Springer
15 years 11 months ago
Semi-Structured Complex List Extraction
The semi-structured information available in HTML and similar documents provides valuable information that can be used for information extraction applications. This information tog...
Anders Arpteg
WEBI
2005
Springer
15 years 12 months ago
Integrating Element and Term Semantics for Similarity-Based XML Document Clustering
Structured link vector model (SLVM) is a recently proposed document representation that takes into account both structural and semantic information for measuring XML document simi...
Jianwu Yang, William K. Cheung, Xiaoou Chen
USENIX
1993
15 years 7 months ago
Essence: A Resource Discovery System Based on Semantic File Indexing
Discovering different types of file resources (such as documentation, programs, and images) in the vast amount of data contained within network file systems is useful for both u...
Darren R. Hardy, Michael F. Schwartz
DOCENG
2010
ACM
15 years 7 months ago
Picture detection in document page images
We present a method for picture detection in document page images, which can come from scanned or camera images, or rendered from electronic file formats. Our method uses OCR to s...
Patrick Chiu, Francine Chen, Laurent Denoue
WWW
2006
ACM
16 years 7 months ago
Towards practical genre classification of web documents
Classification of documents by genre is typically done using either linguistic analysis or term-frequency-based techniques. The former provides better classification accuracy than...
George Ferizis, Peter Bailey