The difficulty with information retrieval for OCR documents lies in the fact that OCR documents comprise of a significant amount of erroneous words and unfortunately most informat...
— This paper deals with automatic dialogue acts (DAs) recognition in Czech. Dialogue acts are sentence-level labels that represent different states of a dialogue, such as questio...
We propose a strategy to reduce the impact of the sparse data problem in the tasks of lexical information acquisition based on the observation of linguistic cues. It justifies tha...
Previous research has shown that the plausibility of an adjective-noun combination is correlated with its corpus co-occurrence frequency. In this paper, we estimate the co-occurre...
Group lasso is a natural extension of lasso and selects variables in a grouped manner. However, group lasso suffers from estimation inefficiency and selection inconsistency. To re...