Sciweavers

» Information Entropy Measure for Evaluation of Image Quality
CIKM
2008
Springer
15 years 8 months ago
Achieving both high precision and high recall in near-duplicate detection
To find near-duplicate documents, fingerprint-based paradigms such as Broder's shingling and Charikar's simhash algorithms have been recognized as effective approaches a...
Lian'en Huang, Lei Wang, Xiaoming Li
CIKM
2008
Springer
15 years 8 months ago
A densitometric approach to web page segmentation
Web Page segmentation is a crucial step for many applications in Information Retrieval, such as text classification, de-duplication and full-text search. In this paper we describe...
Christian Kohlschütter, Wolfgang Nejdl
LREC
2010
15 years 7 months ago
Arabic Parsing Using Grammar Transforms
We investigate Arabic Context Free Grammar parsing with dependency annotation comparing lexicalised and unlexicalised parsers. We study how morphosyntactic as well as function tag...
Lamia Tounsi, Josef van Genabith
SDM
2008
SIAM
15 years 7 months ago
A Spamicity Approach to Web Spam Detection
Web spam, which refers to any deliberate actions bringing to selected web pages an unjustifiable favorable relevance or importance, is one of the major obstacles for high quality ...
Bin Zhou 0002, Jian Pei, ZhaoHui Tang
LREC
2008
15 years 7 months ago
Portuguese-English Word Alignment: some Experiments
In this paper we describe some studies of Portuguese-English word alignment, focusing on (i) measuring the importance of the coupling between dictionaries and corpus; (ii) assessi...
Diana Santos, Alberto Simões