We experimented on task-level robot learning based on bi-directional theory. The via-point representation was used for ‘learning by watching’. In our previous work, we had a r...
We propose a method, which we call auto-adaptive convolution, that extends the classical notion of convolution in picture analysis to function analysis on a discrete set. We define...
We present a reinforcement learning architecture, Dyna-2, that encompasses both sample-based learning and sample-based search, and that generalises across states during both learni...
In this paper, we discuss an approach to an operator scheduling problem in a large organization over time with the aim of maintaining service quality and reducing total l...
We consider model-based reinforcement learning in finite Markov Decision Processes (MDPs), focussing on so-called optimistic strategies. Optimism is usually implemented by carryin...