Search Sciweavers | Sciweavers

1647 search results - page 145 / 330

» Radial Structure of the Internet

WWW 2007, ACM, Internet Technology

Detecting near-duplicates for web crawling

16 years 7 months ago

Download infolab.stanford.edu

Near-duplicate web documents are abundant. Two such documents differ from each other in a very small portion that displays advertisements, for example. Such differences are irrele...

Gurmeet Singh Manku, Arvind Jain, Anish Das Sarma

WWW 2007, ACM, Internet Technology

U-REST: an unsupervised record extraction system

16 years 7 months ago

Download people.csail.mit.edu

In this paper, we describe a system that can extract record structures from web pages with no direct human supervision. Records are commonly occurring HTML-embedded data tuples th...

Yuan Kui Shen, David R. Karger

WWW 2007, ACM, Internet Technology

Classifying web sites

16 years 7 months ago

Download www2007.org

In this paper, we present a novel method for the classification of Web sites. This method exploits both structure and content of Web sites in order to discern their functionality....

Christoph Lindemann, Lars Littig

WWW 2006, ACM, Internet Technology

XPath filename expansion in a Unix shell

16 years 7 months ago

Download dret.net

Locating files based on file system structure, file properties, and maybe even file contents is a core task of the user interface of operating systems. By adapting XPath's po...

Kaspar Giger, Erik Wilde

WWW 2006, ACM, Internet Technology

Mining clickthrough data for collaborative web search

16 years 7 months ago

Download www.ra.ethz.ch

This paper investigates the group behavior patterns of search activities based on Web search history data, i.e., clickthrough data, to boost search performance. We propose a ...

Jian-Tao Sun, Xuanhui Wang, Dou Shen, Hua-Jun Zeng...
