Sciweavers

5075 search results - page 392 / 1015
» Convergence
ICML
2005
IEEE
16 years 7 months ago
Learning as search optimization: approximate large margin methods for structured prediction
Mappings to structured output spaces (strings, trees, partitions, etc.) are typically learned using extensions of classification algorithms to simple graphical structures (e.g., li...
Daniel Marcu, Hal Daumé III
ICML
2004
IEEE
16 years 7 months ago
A multiplicative up-propagation algorithm
We present a generalization of the nonnegative matrix factorization (NMF), where a multilayer generative network with nonnegative weights is used to approximate the observed nonne...
Jong-Hoon Ahn, Seungjin Choi, Jong-Hoon Oh
ICML
2000
IEEE
16 years 7 months ago
Reinforcement Learning in POMDP's via Direct Gradient Ascent
This paper discusses theoretical and experimental aspects of gradient-based approaches to the direct optimization of policy performance in controlled POMDPs. We introduce GPOMDP, a...
Jonathan Baxter, Peter L. Bartlett
ICML
1998
IEEE
16 years 7 months ago
Multiagent Reinforcement Learning: Theoretical Framework and an Algorithm
In this paper, we adopt general-sum stochastic games as a framework for multiagent reinforcement learning. Our work extends previous work by Littman on zero-sum stochastic games t...
Junling Hu, Michael P. Wellman
ICML
1998
IEEE
16 years 7 months ago
The MAXQ Method for Hierarchical Reinforcement Learning
This paper presents a new approach to hierarchical reinforcement learning based on the MAXQ decomposition of the value function. The MAXQ decomposition has both a procedural seman...
Thomas G. Dietterich