Sciweavers

» A Method for Web Information Extraction
WWW
2005
ACM
16 years 7 months ago
Scaling link-based similarity search
To exploit the similarity information hidden in the hyperlink structure of the web, this paper introduces algorithms scalable to graphs with billions of vertices on a distributed ...
Balázs Rácz, Dániel Fogaras
CIKM
2008
Springer
15 years 8 months ago
Information shared by many objects
If Kolmogorov complexity [25] measures information in one object and Information Distance [4, 23, 24, 42] measures information shared by two objects, how do we measure information...
Chong Long, Xiaoyan Zhu, Ming Li, Bin Ma
ACL
2006
15 years 8 months ago
Selection of Effective Contextual Information for Automatic Synonym Acquisition
Various methods have been proposed for automatic synonym acquisition, as synonyms are one of the most fundamental kinds of lexical knowledge. Whereas many methods are based on contextual c...
Masato Hagiwara, Yasuhiro Ogawa, Katsuhiko Toyama
SIGIR
2002
ACM
15 years 6 months ago
Unsupervised document classification using sequential information maximization
We present a novel sequential clustering algorithm which is motivated by the Information Bottleneck (IB) method. In contrast to the agglomerative IB algorithm, the new sequential ...
Noam Slonim, Nir Friedman, Naftali Tishby
WWW
2008
ACM
16 years 7 months ago
As we may perceive: finding the boundaries of compound documents on the web
This paper considers the problem of identifying on the Web compound documents (cDocs), groups of web pages that in aggregate constitute semantically coherent information entities...
Pavel Dmitriev