Sciweavers

163 search results - page 5 / 33
» Policy Gradient Methods for Robotics
Sort
View
130
Voted
ICANN
2007
Springer
15 years 12 months ago
Solving Deep Memory POMDPs with Recurrent Policy Gradients
Abstract. This paper presents Recurrent Policy Gradients, a modelfree reinforcement learning (RL) method creating limited-memory stochastic policies for partially observable Markov...
Daan Wierstra, Alexander Förster, Jan Peters,...
ICMLA
2010
15 years 3 months ago
Multimodal Parameter-exploring Policy Gradients
Abstract-- Policy Gradients with Parameter-based Exploration (PGPE) is a novel model-free reinforcement learning method that alleviates the problem of high-variance gradient estima...
Frank Sehnke, Alex Graves, Christian Osendorfer, J...
NIPS
2008
15 years 7 months ago
Particle Filter-based Policy Gradient in POMDPs
Our setting is a Partially Observable Markov Decision Process with continuous state, observation and action spaces. Decisions are based on a Particle Filter for estimating the bel...
Pierre-Arnaud Coquelin, Romain Deguest, Rém...
JMLR
2006
124views more  JMLR 2006»
15 years 5 months ago
Policy Gradient in Continuous Time
Policy search is a method for approximately solving an optimal control problem by performing a parametric optimization search in a given class of parameterized policies. In order ...
Rémi Munos
AIPS
2007
15 years 8 months ago
Concurrent Probabilistic Temporal Planning with Policy-Gradients
We present an any-time concurrent probabilistic temporal planner that includes continuous and discrete uncertainties and metric functions. Our approach is a direct policy search t...
Douglas Aberdeen, Olivier Buffet