Search Sciweavers | Sciweavers

2190 search results - page 176 / 438

» Unweaving a web of documents

KDD
2006
ACM

185 views, Data Mining

Understanding Content Reuse on the Web: Static and Dynamic Analyses

16 years 7 months ago

Download homepages.dcc.ufmg.br

Abstract. In this paper we present static and dynamic studies of duplicate and near-duplicate documents in the Web. The static and dynamic studies involve the analysis of similar c...

Ricardo A. Baeza-Yates, Álvaro R. Pereira J...

WWW
2011
ACM

201 views, Internet Technology

Two-stream indexing for spoken web search

15 years 29 days ago

Download ebiquity.umbc.edu

This paper presents two-stream processing of audio to index the audio content for Spoken Web search. The first stream indexes the meta-data associated with a particular audio doc...

Jitendra Ajmera, Anupam Joshi, Sougata Mukherjea, ...

CPM
2000
Springer

177 views, Combinatorics

Identifying and Filtering Near-Duplicate Documents

15 years 11 months ago

Download www.cs.princeton.edu

Abstract. The mathematical concept of document resemblance captures well the informal notion of syntactic similarity. The resemblance can be estimated using a fixed size “sketch...

Andrei Z. Broder

DOCENG
2007
ACM

143 views, Document Analysis

Elimination of junk document surrogate candidates through pattern recognition

15 years 10 months ago

Download research.cs.tamu.edu

A surrogate is an object that stands for a document and enables navigation to that document. Hypermedia is often represented with textual surrogates, even though studies have show...

Eunyee Koh, Daniel Caruso, Andruid Kerne, Ricardo ...

WWW
2001
ACM

171 views, Internet Technology

Algorithms and programming models for efficient representation of XML for Internet applications

16 years 7 months ago

Download www10.org

XML is poised to take the World-Wide-Web to the next level of innovation. XML data, large or small, with or without associated schema, will be exchanged between increasing number ...

Neel Sundaresan, Reshad Moussa
