Sciweavers

2005 search results - page 196 / 401
» Decisive Markov Chains
Sort
View
ICALP
2009
Springer
16 years 7 months ago
Reachability in Stochastic Timed Games
We define stochastic timed games, which extend two-player timed games with probabilities (following a recent approach by Baier et al), and which extend in a natural way continuous-...
Patricia Bouyer, Vojtech Forejt
ICRA
2007
IEEE
126views Robotics» more  ICRA 2007»
16 years 29 days ago
A formal framework for robot learning and control under model uncertainty
— While the Partially Observable Markov Decision Process (POMDP) provides a formal framework for the problem of robot control under uncertainty, it typically assumes a known and ...
Robin Jaulmes, Joelle Pineau, Doina Precup
ECML
2007
Springer
16 years 24 days ago
Safe Q-Learning on Complete History Spaces
In this article, we present an idea for solving deterministic partially observable markov decision processes (POMDPs) based on a history space containing sequences of past observat...
Stephan Timmer, Martin Riedmiller
ICANN
2007
Springer
16 years 24 days ago
Solving Deep Memory POMDPs with Recurrent Policy Gradients
Abstract. This paper presents Recurrent Policy Gradients, a modelfree reinforcement learning (RL) method creating limited-memory stochastic policies for partially observable Markov...
Daan Wierstra, Alexander Förster, Jan Peters,...
ICML
2006
IEEE
16 years 18 days ago
Automatic basis function construction for approximate dynamic programming and reinforcement learning
We address the problem of automatically constructing basis functions for linear approximation of the value function of a Markov Decision Process (MDP). Our work builds on results ...
Philipp W. Keller, Shie Mannor, Doina Precup