Sciweavers

97 search results - page 12 / 20
» Highly Scalable Algorithms for Robust String Barcoding
WSDM
2010
ACM
16 years 20 days ago
Learning URL patterns for webpage de-duplication
Presence of duplicate documents in the World Wide Web adversely affects crawling, indexing and relevance, which are the core building blocks of web search. In this paper, we pres...
Hema Swetha Koppula, Krishna P. Leela, Amit Agarwa...
EISWT
2007
15 years 7 months ago
Schema Based XML Compression
XML has grown into a widely used and highly developed technology, due in part to the subcomponents built around the technology (advanced parsers, frameworks, libraries, etc.). The ...
Naphtali Rishe, Ouri Wolfson, Ben Wongsaroj, Damia...
COLCOM
2005
IEEE
15 years 11 months ago
On-demand overlay networking of collaborative applications
We propose a new overlay network, called Generic Identifier Network (GIN), for collaborative nodes to share objects with transactions across affiliated organizations by merging th...
Cheng-Jia Lai, Richard R. Muntz
CIKM
2009
Springer
16 years 12 days ago
Combining labeled and unlabeled data with word-class distribution learning
We describe a novel, simple, and highly scalable semi-supervised method called Word-Class Distribution Learning (WCDL), and apply it to the task of information extraction (IE) by utili...
Yanjun Qi, Ronan Collobert, Pavel Kuksa, Koray Kav...
ISCC
2006
IEEE
15 years 12 months ago
A Semantic Overlay Network for P2P Schema-Based Data Integration
Today data sources are pervasive and their number is growing tremendously. Current tools are not prepared to exploit this unprecedented amount of information and to cop...
Carmela Comito, Simon Patarin, Domenico Talia