We present V-measure, an external entropybased cluster evaluation measure. Vmeasure provides an elegant solution to many problems that affect previously defined cluster evaluatio...
Abstract. We extend an automatically generated bilingual JapaneseSwedish dictionary with new translations, automatically discovered from the multi-lingual online encyclopedia Wikip...
This paper presents a method for compiling a large-scale bilingual corpus from a database of movie subtitles. To create the corpus, we propose an algorithm based on Gale and Churc...
We illustrate that Web searches can often be utilized to generate background text for use with text classification. This is the case because there are frequently many pages on the...
Web search engines compete to offer the fastest responses with highest relevance. However, as Web collections grow, it becomes more difficult to achieve this purpose. As most user...