How should a reinforcement learning agent act if its sole purpose is to efficiently learn an optimal policy for later use? In other words, how should it explore, to be able to exp...
Most reinforcement learning models of animal conditioning operate under the convenient, though fictive, assumption that Pavlovian conditioning concerns prediction learning whereas...
Peter Dayan, Yael Niv, Ben Seymour, Nathaniel D. D...
This paper presents two pivot strategies for statistical machine transliteration, namely a system-based pivot strategy and a model-based pivot strategy. Given two independent source-p...
Min Zhang, Xiangyu Duan, Vladimir Pervouchine, Hai...
This paper addresses a new ranking task, referred to as "supplementary data assisted ranking", or "supplementary ranking" for short. Different from c...
We propose an unbounded-depth, hierarchical, Bayesian nonparametric model for discrete sequence data. This model can be estimated from a single training sequence, yet shares stati...