Search Sciweavers | Sciweavers

3928 search results - page 602 / 786

» Optimal suffix selection

ICML
2003
IEEE

121 views | Machine Learning

Q-Decomposition for Reinforcement Learning Agents

16 years 7 months ago

Download www.hpl.hp.com

The paper explores a very simple agent design method called Q-decomposition, wherein a complex agent is built from simpler subagents. Each subagent has its own reward function and...

Stuart J. Russell, Andrew Zimdars

ICML
2002
IEEE

133 views | Machine Learning

Coordinated Reinforcement Learning

16 years 7 months ago

Download select.cs.cmu.edu

We present several new algorithms for multiagent reinforcement learning. A common feature of these algorithms is a parameterized, structured representation of a policy or value fu...

Carlos Guestrin, Michail G. Lagoudakis, Ronald Par...

ICML
1995
IEEE

213 views | Machine Learning

Learning Policies for Partially Observable Environments: Scaling Up

16 years 7 months ago

Download reference.kfupm.edu.sa

Partially observable Markov decision processes (POMDPs) model decision problems in which an agent tries to maximize its reward in the face of limited and/or noisy sensor fee...

Michael L. Littman, Anthony R. Cassandra, Leslie P...

ICSE
2009
IEEE-ACM

105 views | Software Engineering

The road not taken: Estimating path execution frequency statically

16 years 7 months ago

Download www.cs.virginia.edu

A variety of compilers, static analyses, and testing frameworks rely heavily on path frequency information. Uses for such information range from optimizing transformations to bug ...

Raymond P. L. Buse, Westley Weimer

SIGSOFT
2009
ACM

121 views | Software Engineering

Refactoring for reentrancy

16 years 7 months ago

Download domino.watson.ibm.com

A program is reentrant if distinct executions of that program on distinct inputs cannot affect each other. Reentrant programs have the desirable property that they can be deployed...

Jan Wloka, Manu Sridharan, Frank Tip
