Sciweavers

2422 search results - page 311 / 485
» Security Policy Consistency
CDC
2009
IEEE
Control Systems
15 years 11 months ago
Q-learning and Pontryagin's Minimum Principle
Abstract— Q-learning is a technique used to compute an optimal policy for a controlled Markov chain based on observations of the system controlled using a non-optimal policy. It ...
Prashant G. Mehta, Sean P. Meyn
GECCO
2000
Springer
Optimization
15 years 10 months ago
A Genetic Algorithm for Automatically Designing Modular Reinforcement Learning Agents
Reinforcement learning (RL) is a machine learning technique that has received much attention as a new self-adaptive controller for various systems. The RL agent auto...
Isao Ono, Tetsuo Nijo, Norihiko Ono
AAAI
2010
15 years 8 months ago
Towards Multiagent Meta-level Control
Embedded systems consisting of collaborating agents capable of interacting with their environment are becoming ubiquitous. It is crucial for these systems to be able to adapt to t...
Shanjun Cheng, Anita Raja, Victor R. Lesser
NIPS
2007
15 years 8 months ago
Reinforcement Learning in Continuous Action Spaces through Sequential Monte Carlo Methods
Learning in real-world domains often requires dealing with continuous state and action spaces. Although many solutions have been proposed to apply Reinforcement Learning algorithm...
Alessandro Lazaric, Marcello Restelli, Andrea Bona...
ECAI
2010
Springer
15 years 6 months ago
Knowledge Compilation Using Interval Automata and Applications to Planning
Knowledge compilation [6, 5, 14, 8] consists in transforming a problem offline into a form which is tractable online. In this paper, we introduce new structures, based on the notio...
Alexandre Niveau, Hélène Fargier, C...