In this paper we discuss the problem of building the Polish lexicon for the Cyc ontology. As the ontology is very large and complex we describe semi-automatic translation of part o...
Language identification is the task of identifying the language a given document is written in. This paper describes a detailed examination of what models perform best under diffe...
A challenge for programming language research is to design and implement multi-threaded low-level languages providing static guarantees for memory safety and freedom from data rac...
This paper presents an investigation of the relation between words and their gender in two gendered languages: German and Romanian. Gender is an issue that has long preoccupied li...
We investigate the potential of Tree Substitution Grammars as a source of features for native language detection, the task of inferring an author’s native language from text in ...