Search Sciweavers | Sciweavers

2990 search results - page 484 / 598

» Hidden Markov processes

JMLR
2010

Adaptive Step-size Policy Gradients with Average Reward Metric

15 years 1 month ago

Download jmlr.csail.mit.edu

In this paper, we propose a novel adaptive step-size approach for policy gradient reinforcement learning. A new metric is defined for policy gradients that measures the effect of ...

Takamitsu Matsubara, Tetsuro Morimura, Jun Morimot...

NN
2010
Springer

Category: Neural Networks

Efficient exploration through active learning for value function approximation in reinforcement learning

15 years 1 month ago

Download sugiyama-www.cs.titech.ac.jp

Appropriately designing sampling policies is highly important for obtaining better control policies in reinforcement learning. In this paper, we first show that the least-squares ...

Takayuki Akiyama, Hirotaka Hachiya, Masashi Sugiya...

AAAI
2011

Category: Intelligent Agents

Linear Dynamic Programs for Resource Management

14 years 6 months ago

Download www.cs.umass.edu

Sustainable resource management in many domains presents large continuous stochastic optimization problems, which can often be modeled as Markov decision processes (MDPs). To solv...

Marek Petrik, Shlomo Zilberstein

Publication

Robust Bayesian reinforcement learning through tight lower bounds

14 years 5 months ago

Download arxiv.org

In the Bayesian approach to sequential decision making, exact calculation of the (subjective) utility is intractable. This extends to most special cases of interest, such as reinfo...

Christos Dimitrakakis

posted by olethros

MOBICOM
2009
ACM

Category: Communications

Interference management via rate splitting and HARQ over time-varying fading channels

16 years 27 days ago

Download web.njit.edu

The coexistence of two unlicensed links is considered, where one link interferes with the transmission of the other, over a time-varying, block-fading channel. In the absence of fa...

Marco Levorato, Osvaldo Simeone, Urbashi Mitra
