Search Sciweavers | Sciweavers

2005 search results - page 196 / 401

» Decisive Markov Chains

ICALP
2009
Springer

Reachability in Stochastic Timed Games

16 years 7 months ago

Download www.lsv.ens-cachan.fr

We define stochastic timed games, which extend two-player timed games with probabilities (following a recent approach by Baier et al.), and which extend in a natural way continuous-...

Patricia Bouyer, Vojtech Forejt

ICRA
2007
IEEE

A formal framework for robot learning and control under model uncertainty

16 years 29 days ago

Download www.cs.mcgill.ca

While the Partially Observable Markov Decision Process (POMDP) provides a formal framework for the problem of robot control under uncertainty, it typically assumes a known and ...

Robin Jaulmes, Joelle Pineau, Doina Precup

ECML
2007
Springer

Safe Q-Learning on Complete History Spaces

16 years 24 days ago

Download www.ni.uos.de

In this article, we present an idea for solving deterministic partially observable Markov decision processes (POMDPs) based on a history space containing sequences of past observat...

Stephan Timmer, Martin Riedmiller

ICANN
2007
Springer

Solving Deep Memory POMDPs with Recurrent Policy Gradients

16 years 24 days ago

Download www.idsia.ch

This paper presents Recurrent Policy Gradients, a model-free reinforcement learning (RL) method creating limited-memory stochastic policies for partially observable Markov...

Daan Wierstra, Alexander Förster, Jan Peters,...

ICML
2006
ACM

Automatic basis function construction for approximate dynamic programming and reinforcement learning

16 years 18 days ago

Download www.ece.mcgill.ca

We address the problem of automatically constructing basis functions for linear approximation of the value function of a Markov Decision Process (MDP). Our work builds on results ...

Philipp W. Keller, Shie Mannor, Doina Precup
