Sciweavers

WWW
2008
ACM
16 years 7 months ago
User behavior oriented web spam detection
Combating Web spam has become one of the top challenges for Web search engines. State-of-the-art spam detection techniques are usually designed for specific known types of Web spa...
Yiqun Liu, Min Zhang, Shaoping Ma, Liyun Ru
WWW
2008
ACM
16 years 7 months ago
IRLbot: scaling to 6 billion pages and beyond
This paper shares our experience in designing a web crawler that can download billions of pages using a single-server implementation and models its performance. We show that with ...
Hsin-Tsang Lee, Derek Leonard, Xiaoming Wang, Dmit...
WWW
2008
ACM
16 years 7 months ago
Ranking refinement and its application to information retrieval
We consider the problem of ranking refinement, i.e., to improve the accuracy of an existing ranking function with a small set of labeled instances. We are, particularly, intereste...
Rong Jin, Hamed Valizadegan, Hang Li
WWW
2008
ACM
16 years 7 months ago
Learning to classify short and sparse text & web with hidden topics from large-scale data collections
This paper presents a general framework for building classifiers that deal with short and sparse text & Web segments by making the most of hidden topics discovered from larges...
Xuan Hieu Phan, Minh Le Nguyen, Susumu Horiguchi
WWW
2008
ACM
16 years 7 months ago
A combinatorial allocation mechanism with penalties for banner advertising
Most current banner advertising is sold through negotiation thereby incurring large transaction costs and possibly suboptimal allocations. We propose a new automated system for se...
Uriel Feige, Nicole Immorlica, Vahab S. Mirrokni, ...
WWW
2008
ACM
16 years 7 months ago
Generating hypotheses from the web
Hypothesis generation is a crucial initial step for making scientific discoveries. This paper addresses the problem of automatically discovering interesting hypotheses from the we...
Wei Jin, Rohini K. Srihari, Abhishek Singh
WWW
2008
ACM
16 years 7 months ago
SessionLock: securing web sessions against eavesdropping
Typical web sessions can be hijacked easily by a network eavesdropper in attacks that have come to be designated "sidejacking." The rise of ubiquitous wireless networks,...
Ben Adida
WWW
2008
ACM
16 years 7 months ago
How people use the web on mobile devices
This paper describes a series of user studies on how people use the Web via mobile devices. The data primarily comes from contextual inquiries with 47 participants between 2004 an...
Yanqing Cui, Virpi Roto
WWW
2008
ACM
16 years 7 months ago
Automatically refining the Wikipedia infobox ontology
The combined efforts of human volunteers have recently extracted numerous facts from Wikipedia, storing them as machine-harvestable object-attribute-value triples in Wikipedia inf...
Fei Wu, Daniel S. Weld
WWW
2008
ACM
16 years 7 months ago
DTWiki: a disconnection and intermittency tolerant wiki
Wikis have proven to be a valuable tool for collaboration and content generation on the web. Simple semantics and ease-of-use make wiki systems well suited for meeting many emergi...
Bowei Du, Eric A. Brewer