Sciweavers

» A time aggregation approach to Markov decision processes
ICANN
2007
Springer
16 years 9 days ago
Solving Deep Memory POMDPs with Recurrent Policy Gradients
Abstract. This paper presents Recurrent Policy Gradients, a model-free reinforcement learning (RL) method creating limited-memory stochastic policies for partially observable Markov...
Daan Wierstra, Alexander Förster, Jan Peters,...
ICIP
2002
IEEE
16 years 7 months ago
Extract highlights from baseball game video with hidden Markov models
In this paper, we describe a statistical method to detect highlights in a baseball game video. The input video is first segmented into scene shots, within which the camera motion ...
Peng Chang, Mei Han, Yihong Gong
AAAI
2012
13 years 8 months ago
Tree-Based Solution Methods for Multiagent POMDPs with Delayed Communication
Planning under uncertainty is an important and challenging problem in multiagent systems. Multiagent Partially Observable Markov Decision Processes (MPOMDPs) provide a powerful fr...
Frans Adriaan Oliehoek, Matthijs T. J. Spaan
ECML
2006
Springer
15 years 8 months ago
Reinforcement Learning for MDPs with Constraints
In this article, I will consider Markov Decision Processes with two criteria, each defined as the expected value of an infinite horizon cumulative return. The second criterion is e...
Peter Geibel
NETWORKING
2000
15 years 7 months ago
Fairness and Aggregation: A Primal Decomposition Study
Abstract. We examine the fair allocation of capacity to a large population of best-effort connections in a typical multiple access communication system supporting some bandwidth on...
André Girard, Catherine Rosenberg, Mohammed...