Search Sciweavers | Sciweavers

4446 search results - page 492 / 890

» Learning Observer Agents

SMC
2007
IEEE

102 views, Control Systems

An improved immune Q-learning algorithm

Reinforcement learning is a framework in which an agent can learn behavior without knowledge of a task or an environment by exploration and exploitation. Striking a balance betw...

Zhengqiao Ji, Q. M. Jonathan Wu, Maher A. Sid-Ahme...

AAAI
2008

98 views, Intelligent Agents

Transferring Localization Models across Space

Machine learning approaches to indoor WiFi localization involve an offline phase and an online phase. In the offline phase, data are collected from an environment to build a local...

Sinno Jialin Pan, Dou Shen, Qiang Yang, James T. K...

ATAL
2008
Springer

123 views, Intelligent Agents

Sigma point policy iteration

In reinforcement learning, least-squares temporal difference methods (e.g., LSTD and LSPI) are effective, data-efficient techniques for policy evaluation and control with linear v...

Michael H. Bowling, Alborz Geramifard, David Winga...

ATAL
2008
Springer

155 views, Intelligent Agents

Approximate predictive state representations

Predictive state representations (PSRs) are models that represent the state of a dynamical system as a set of predictions about future events. The existing work with PSRs focuses ...

Britton Wolfe, Michael R. James, Satinder P. Singh

ATAL
2008
Springer

104 views, Intelligent Agents

Expediting RL by using graphical structures

The goal of reinforcement learning (RL) is to maximize reward (minimize cost) in a Markov decision process (MDP) without knowing the underlying model a priori. RL algorithms tend ...

Peng Dai, Alexander L. Strehl, Judy Goldsmith
