Sciweavers

7495 search results - page 360 / 1499
» Intelligent Document Processing
SIGMOD
2003
ACM
144 views, Database
16 years 6 months ago
Exchanging Intensional XML Data
XML is becoming the universal format for data exchange between applications. Recently, the emergence of Web services as standard means of publishing and accessing data on the Web ...
Tova Milo, Serge Abiteboul, Bernd Amann, Omar Benj...
EDBT
2006
ACM
112 views, Database
16 years 6 months ago
Indexing Shared Content in Information Retrieval Systems
Modern document collections often contain groups of documents with overlapping or shared content. However, most information retrieval systems process each document separa...
Andrei Z. Broder, Nadav Eiron, Marcus Fontoura, Mi...
ICDAR
2009
IEEE
16 years 1 month ago
High Performance Chinese/English Mixed OCR with Character Level Language Identification
Currently, there have been several high performance OCR products for Chinese or for English. However, no one OCR technique can be simultaneously fit for both the English and the C...
Kai Wang, Jianming Jin, Qingren Wang
SIGIR
2006
ACM
16 years 21 days ago
Near-duplicate detection by instance-level constrained clustering
For the task of near-duplicated document detection, both traditional fingerprinting techniques used in database community and bag-of-word comparison approaches used in information...
Hui Yang, James P. Callan
PKDD
1998
Springer
113 views, Data Mining
15 years 11 months ago
Text Mining at the Term Level
Knowledge Discovery in Databases (KDD) focuses on the computerized exploration of large amounts of data and on the discovery of interesting patterns within them. While most work on...
Ronen Feldman, Moshe Fresko, Yakkov Kinar, Yehuda ...