Search Sciweavers | Sciweavers

1486 search results - page 98 / 298

» A Document as a Small World

CEAS
2007
Springer

Hardening Fingerprinting by Context

16 years 19 days ago

Download www.ceas.cc

Near-duplicate detection is not only an important pre- and post-processing task in Information Retrieval but also an effective spam-detection technique. Among different approache...

Aleksander Kolcz, Abdur Chowdhury

SIGIR
2010
ACM

Adaptive near-duplicate detection via similarity learning

15 years 10 months ago

Download research.microsoft.com

In this paper, we present a novel near-duplicate document detection method that can easily be tuned for a particular domain. Our method represents each document as a real-valued s...

Hannaneh Hajishirzi, Wen-tau Yih, Aleksander Kolcz

CLIN
2001

Applying Monte Carlo Techniques to Language Identification

15 years 7 months ago

Download www.xs4all.nl

Two major stages in language identification systems can be identified: the language modeling stage, where the distinctive features of languages are determined and stored in...

Arjen Poutsma

CORR
2006
Springer

Automatic annotation of multilingual text collections with a conceptual thesaurus

15 years 6 months ago

Download langtech.jrc.it

Automatic annotation of documents with controlled vocabulary terms (descriptors) from a conceptual thesaurus is not only useful for document indexing and retrieval. The mapping of...

Bruno Pouliquen, Ralf Steinberger, Camelia Ignat

JAIR
2010

Which Clustering Do You Want? Inducing Your Ideal Clustering with Minimal Feedback

15 years 4 months ago

Download www.jair.org

While traditional research on text clustering has largely focused on grouping documents by topic, it is conceivable that a user may want to cluster documents along other dimension...

Sajib Dasgupta, Vincent Ng
