In this paper, we describe an unsupervised contour segmentation method that proves well suited to images obtained by electronic acquisition. We present two statistica...
Unsupervised segmentation of weather images into features that correspond to physical storms is a fundamental and difficult problem. Treating an infrared satellite image as a Mark...
When the transition probabilities and rewards of a Markov Decision Process are specified exactly, the problem can be solved without any interaction with the environment. When no s...
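As an illustration of the first case in the abstract above (a fully specified model solved by planning alone, with no interaction with the environment), the following is a minimal sketch using value iteration. This is a standard planning technique chosen here for illustration, not necessarily the method of the paper. The function name `value_iteration`, the array layout of `P` and `R`, the discount factor, and the two-state toy MDP are all assumptions.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-8):
    """Plan in a fully known MDP (illustrative sketch).

    P: array of shape (S, A, S), P[s, a, t] = probability of moving s -> t under a.
    R: array of shape (S, A), expected immediate reward for taking a in s.
    Returns the (approximately) optimal values and a greedy policy.
    """
    n_states = P.shape[0]
    V = np.zeros(n_states)
    while True:
        # Bellman optimality backup: Q[s, a] = R[s, a] + gamma * sum_t P[s, a, t] * V[t]
        Q = R + gamma * (P @ V)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=1)
        V = V_new

# Hypothetical two-state MDP: state 1 is absorbing and pays reward 1 per step;
# from state 0, action 0 stays put (reward 0) and action 1 moves to state 1 (reward 0).
P = np.zeros((2, 2, 2))
P[0, 0, 0] = 1.0
P[0, 1, 1] = 1.0
P[1, 0, 1] = 1.0
P[1, 1, 1] = 1.0
R = np.array([[0.0, 0.0],
              [1.0, 1.0]])

V, policy = value_iteration(P, R, gamma=0.9)
```

In this toy problem the values converge toward 10 for the absorbing state (1 / (1 - 0.9)) and 9 for state 0, and the greedy policy in state 0 picks the action that moves to state 1. No samples from the environment are used at any point; everything is computed from `P` and `R`.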
For a Markov Decision Process with finite state (size S) and action spaces (size A per state), we propose a new algorithm, Delayed Q-Learning. We prove it is PAC, achieving near o...
Alexander L. Strehl, Lihong Li, Eric Wiewiora, Joh...
How should a reinforcement learning agent act if its sole purpose is to efficiently learn an optimal policy for later use? In other words, how should it explore, to be able to exp...