Sciweavers

829 search results - page 114 / 166
» A time aggregation approach to Markov decision processes
ICML
2007
IEEE
16 years 7 months ago
Learning state-action basis functions for hierarchical MDPs
This paper introduces a new approach to action-value function approximation by learning basis functions from a spectral decomposition of the state-action manifold. This paper exten...
Sarah Osentoski, Sridhar Mahadevan
ECML
2007
Springer
16 years 10 days ago
Policy Gradient Critics
We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...
Daan Wierstra, Jürgen Schmidhuber
AI
2006
Springer
15 years 10 months ago
Belief Selection in Point-Based Planning Algorithms for POMDPs
Current point-based planning algorithms for solving partially observable Markov decision processes (POMDPs) have demonstrated that a good approximation of the value funct...
Masoumeh T. Izadi, Doina Precup, Danielle Azar
AAAI
2007
15 years 8 months ago
Compact Spectral Bases for Value Function Approximation Using Kronecker Factorization
A new spectral approach to value function approximation has recently been proposed to automatically construct basis functions from samples. Global basis functions called proto-val...
Jeffrey Johns, Sridhar Mahadevan, Chang Wang
NIPS
2001
15 years 7 months ago
Multiagent Planning with Factored MDPs
We present a principled and efficient planning algorithm for cooperative multiagent dynamic systems. A striking feature of our method is that the coordination and communication be...
Carlos Guestrin, Daphne Koller, Ronald Parr