Sciweavers

6328 search results - page 969 / 1266
» From Quantity to Quality
LREC
2010
15 years 8 months ago
Building a Web Corpus of Czech
Large corpora are essential to modern methods of computational linguistics and natural language processing. In this paper, we describe an ongoing project whose aim is to build a l...
Drahomíra "johanka" Spoustová, Miros...
LREC
2010
15 years 8 months ago
Data Issues in English-to-Hindi Machine Translation
Statistical machine translation to morphologically richer languages is a challenging task and more so if the source and target languages differ in word order. Current state-of-the...
Ondřej Bojar, Pavel Straňák, Daniel Zeman
LREC
2010
15 years 8 months ago
mwetoolkit: a Framework for Multiword Expression Identification
This paper presents the Multiword Expression Toolkit (mwetoolkit), an environment for type and language-independent MWE identification from corpora. The mwetoolkit provides a targ...
Carlos Ramisch, Aline Villavicencio, Christian Boi...
LREC
2010
15 years 8 months ago
Building a Bilingual ValLex Using Treebank Token Alignment: First Observations
In this paper we explore the potential and limitations of a concept of building a bilingual valency lexicon based on the alignment of nodes in a parallel treebank. Our aim is to b...
Jana Šindlerová, Ondřej Bojar
ECIR
2008
Springer
15 years 8 months ago
A Novel Implementation of the FITE-TRT Translation Method
Cross-language Information Retrieval requires good methods for translating cross-lingual spelling variants which are not covered by the available dictionary resources. FITE-TRT is ...
Aki Loponen, Ari Pirkola, Kalervo Järvelin, H...