We investigate the connection between part-of-speech (POS) distribution and content in language. We define POS blocks to be groups of parts of speech. We hypothesise that there ex...
This paper describes an adapted information bottleneck approach for constructing a domain-oriented sentiment lexicon. The basic idea is to use three kinds of relationships (WWint...
In our participation in the 2010 LogCLEF track, we focused on the analysis of the European Library (TEL) logs and, in particular, experimented with the identification of the natura...
We propose a technique for measuring the structural similarity of semistructured documents based on entropy. After extracting the structural information from two documents, we use ...
The effectiveness of information retrieval systems is measured by comparing performance on a common set of queries and documents. Significance tests are often used to evaluate the...