Sciweavers

» Optimal suffix selection
ICML
2003
IEEE
16 years 7 months ago
Q-Decomposition for Reinforcement Learning Agents
The paper explores a very simple agent design method called Q-decomposition, wherein a complex agent is built from simpler subagents. Each subagent has its own reward function and...
Stuart J. Russell, Andrew Zimdars
ICML
2002
IEEE
16 years 7 months ago
Coordinated Reinforcement Learning
We present several new algorithms for multiagent reinforcement learning. A common feature of these algorithms is a parameterized, structured representation of a policy or value fu...
Carlos Guestrin, Michail G. Lagoudakis, Ronald Par...
ICML
1995
IEEE
16 years 7 months ago
Learning Policies for Partially Observable Environments: Scaling Up
Partially observable Markov decision processes (POMDPs) model decision problems in which an agent tries to maximize its reward in the face of limited and/or noisy sensor fee...
Michael L. Littman, Anthony R. Cassandra, Leslie P...
ICSE
2009
IEEE-ACM
16 years 7 months ago
The road not taken: Estimating path execution frequency statically
A variety of compilers, static analyses, and testing frameworks rely heavily on path frequency information. Uses for such information range from optimizing transformations to bug ...
Raymond P. L. Buse, Westley Weimer
SIGSOFT
2009
ACM
16 years 7 months ago
Refactoring for reentrancy
A program is reentrant if distinct executions of that program on distinct inputs cannot affect each other. Reentrant programs have the desirable property that they can be deployed...
Jan Wloka, Manu Sridharan, Frank Tip