We define the class of games called simulation-based games, in which the payoffs are available as an output of an oracle (simulator), rather than specified analytically or using a...
We present an algorithm for large-scale equality constrained optimization. The method is based on a characterization of inexact sequential quadratic programming (SQP) steps that ca...
Learning to act in a multiagent environment is a difficult problem since the normal definition of an optimal policy no longer applies. The optimal policy at any moment depends on ...
We describe a general-purpose method for the accurate and robust interpretation of a data set of p-dimensional points by several deformable prototypes. This method is based on the ...
Temporal difference (TD) algorithms are attractive for reinforcement learning due to their ease-of-implementation and use of "bootstrapped" return estimates to make effi...