The Partially Observable Markov Decision Process has long been recognized as a rich framework for real-world planning and control problems, especially in robotics. However, exact s...
Joelle Pineau, Geoffrey J. Gordon, Sebastian Thrun
Fractional hypertree width is a hypergraph measure similar to tree width and hypertree width. Its algorithmic importance comes from the fact that, as shown in previous work [14], ...
Mixed integer programming (MIP) formulations are typically tightened through the use of a separation algorithm and the addition of violated cuts. Using extended formulations involv...
Consider the problem of pricing n items under an unlimited supply with m buyers. Each buyer is interested in a bundle of at most k of the items. These buyers are single-minded, wh...
We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms...