Sciweavers

680 search results - page 42 / 136
» From Structured Documents to Novel Query Facilities
DOCENG
2009
ACM
16 years 21 days ago
Object-level document analysis of PDF files
The PDF format is commonly used for the exchange of documents on the Web and there is a growing need to understand and extract or repurpose data held in PDF documents. Many system...
Tamir Hassan
IPPS
2008
IEEE
16 years 18 days ago
Multi-threaded data mining of EDGAR CIKs (Central Index Keys) from ticker symbols
This paper describes how to use the Java Swing HTMLEditorKit to perform multi-threaded web data mining on the EDGAR system (Electronic Data Gathering, Analysis, and Retrieval system)....
Dougal A. Lyon
EDOC
2003
IEEE
15 years 11 months ago
MQL: a Powerful Extension to OCL for MOF Queries
The Meta-Object Facility (MOF) provides a standardised framework for object-oriented models. An instance of a MOF model contains objects and links whose interfaces are entirely de...
David Hearnden, Kerry Raymond, Jim Steel
CVPR
2007
IEEE
16 years 8 months ago
Multi-scale Structural Saliency for Signature Detection
Detecting and segmenting free-form objects from cluttered backgrounds is a challenging problem in computer vision. Signature detection in document images is one classic example an...
Guangyu Zhu, Yefeng Zheng, David S. Doermann, Stef...
AAAI
2008
15 years 8 months ago
Extracting Relevant Snippets for Web Navigation
Search engines present fixed-length passages from documents ranked by relevance against the query. In this paper, we present and compare novel, language-model-based methods for extr...
Qing Li, K. Selçuk Candan, Qi Yan