Abstract. A large volume of data with complex structures is currently represented in GML (Geography Markup Language) for storing and exchanging geographic information. As the size ...
This paper describes the process and the resources used to automatically annotate a French corpus of spontaneous speech transcriptions in super-chunks. Super-chunks are enhanced c...
Olivier Blanc, Matthieu Constant, Anne Dister, Pat...
Identification of transliterations is aimed at enriching multilingual lexicons and improving performance in various Natural Language Processing (NLP) applications including Cross ...
The JOS language resources are meant to facilitate developments of HLT and corpus linguistics for the Slovene language and consist of the morphosyntactic specifications, defining ...
Tomaz Erjavec, Darja Fiser, Simon Krek, Nina Ledin...
This paper deals with the task of large vocabulary proper name recognition. In order to accomodate a wide diversity of possible name pronunciations (due to non-native name origins...