Sciweavers

» A time aggregation approach to Markov decision processes

Publication
15 years 1 month ago
Monte Carlo Value Iteration for Continuous-State POMDPs
Partially observable Markov decision processes (POMDPs) have been successfully applied to various robot motion planning tasks under uncertainty. However, most existing POMDP algo...
Haoyu Bai, David Hsu, Wee Sun Lee, and Vien A. Ngo
JMLR
2010
15 years 1 month ago
Adaptive Step-size Policy Gradients with Average Reward Metric
In this paper, we propose a novel adaptive step-size approach for policy gradient reinforcement learning. A new metric is defined for policy gradients that measures the effect of ...
Takamitsu Matsubara, Tetsuro Morimura, Jun Morimot...
AAAI
2011
14 years 6 months ago
Linear Dynamic Programs for Resource Management
Sustainable resource management in many domains presents large continuous stochastic optimization problems, which can often be modeled as Markov decision processes (MDPs). To solv...
Marek Petrik, Shlomo Zilberstein
ICIP
2005
IEEE
16 years 7 months ago
Content-based video copy detection in large databases: a local fingerprints statistical similarity search approach
Recent methods based on interest points and local fingerprints have been proposed to perform robust CBCD (content-based copy detection) of images and video. They include two steps...
Alexis Joly, Carl Frélicot, Olivier Buisson
ICRA
2010
IEEE
15 years 4 months ago
Exploiting domain knowledge in planning for uncertain robot systems modeled as POMDPs
Abstract: We propose a planning algorithm that allows user-supplied domain knowledge to be exploited in the synthesis of information feedback policies for systems modeled as parti...
Salvatore Candido, James C. Davidson, Seth Hutchin...