In this paper we consider approximate policy-iteration-based reinforcement learning algorithms. In order to implement a flexible function approximation scheme we propose the use o...
Amir Massoud Farahmand, Mohammad Ghavamzadeh, Csab...
Real-time dynamic programming (RTDP) is a heuristic search algorithm for solving MDPs. We present a modified algorithm called Focused RTDP with several improvements. While RTDP ma...
We present a simple and scalable algorithm for maximum-margin estimation of structured output models, including an important class of Markov networks and combinatorial models. We ...
Benjamin Taskar, Simon Lacoste-Julien, Michael I. ...
We present an optimization algorithm based on a model of bacterial chemotaxis. The original biological model is used to formulate a simple optimization algorithm, which is evaluate...
In this paper, we explore the use of a particular multistage adaptation algorithm for a variety of adaptive filtering applications where the structure of the underlying process to...