Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...
Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...
ATM switches have to provide traffic management functions to meet the QoS requirements of different service categories. Among the traffic management functions, we will focus on con...
There are many applications in which it is desirable to order rather than classify instances. Here we consider the problem of learning how to order, given feedback in the form of ...
William W. Cohen, Robert E. Schapire, Yoram Singer
This paper presents an extension of the capacitated facility location problem (CFLP) that considers general setup cost functions and multiple facilities at one site....
In linear image restoration, the point spread function of the degrading system is assumed known even though this information is usually not available in real applications. As a re...