Search Sciweavers | Sciweavers

ICML
2005
IEEE

Learning as search optimization: approximate large margin methods for structured prediction

16 years 7 months ago

Mappings to structured output spaces (strings, trees, partitions, etc.) are typically learned using extensions of classification algorithms to simple graphical structures (e.g., li...

Daniel Marcu, Hal Daumé III

ICML
2004
IEEE

A multiplicative up-propagation algorithm

16 years 7 months ago

We present a generalization of the nonnegative matrix factorization (NMF), where a multilayer generative network with nonnegative weights is used to approximate the observed nonne...

Jong-Hoon Ahn, Seungjin Choi, Jong-Hoon Oh

ICML
2000
IEEE

Reinforcement Learning in POMDP's via Direct Gradient Ascent

16 years 7 months ago

This paper discusses theoretical and experimental aspects of gradient-based approaches to the direct optimization of policy performance in controlled POMDPs. We introduce GPOMDP, a...

Jonathan Baxter, Peter L. Bartlett

ICML
1998
IEEE

Multiagent Reinforcement Learning: Theoretical Framework and an Algorithm

16 years 7 months ago

In this paper, we adopt general-sum stochastic games as a framework for multiagent reinforcement learning. Our work extends previous work by Littman on zero-sum stochastic games t...

Junling Hu, Michael P. Wellman

ICML
1998
IEEE

The MAXQ Method for Hierarchical Reinforcement Learning

16 years 7 months ago

This paper presents a new approach to hierarchical reinforcement learning based on the MAXQ decomposition of the value function. The MAXQ decomposition has both a procedural seman...

Thomas G. Dietterich
