Two notions of optimality have been explored in previous work on hierarchical reinforcement learning (HRL): hierarchical optimality, or the optimal policy in the space defined by ...
We propose a sequential randomized algorithm, which at each step concentrates on functions having both low risk and low variance with respect to the previous step prediction functi...
Abstract. We formulate the problem of least squares temporal difference learning (LSTD) in the framework of least squares SVM (LS-SVM). To cope with the large amount (and possible ...
Recent advances in technology allow multi-agent systems to be deployed in cooperation with or as a service for humans. Typically, those systems are designed assuming individually ...
In this paper we propose an approximated structured prediction framework for large scale graphical models and derive message-passing algorithms for learning their parameters effic...