Abstract. The identification of reliable and interesting items on Internet becomes more and more difficult and time consuming. This paper is a position paper describing our intend...
Parallel text is one of the most valuable resources for development of statistical machine translation systems and other NLP applications. The Linguistic Data Consortium (LDC) has...
Multi-word terms are traditionally identified using statistical techniques or, more recently, using hybrid techniques combining statistics with shallow linguistic information. Al)...
One use of simulation is to inform decision makers that seek to select the best of several alternative systems. The system with the highest (or lowest) mean value for simulation o...
This article advocates a new model for inductive learning. Called sequential induction, it helps bridge classical fixed-sample learning techniques (which are efficient but difficu...