Recent multi-agent extensions of Q-Learning require knowledge of other agents’ payoffs and Q-functions, and assume game-theoretic play at all times by all other agents. This pap...
While exploring to find better solutions, an agent performing online reinforcement learning (RL) can perform worse than is acceptable. In some cases, exploration might have unsafe, ...
Satinder P. Singh, Andrew G. Barto, Roderic A. Gru...
Closed-loop control relies on sensory feedback that is usually assumed to be free. But if sensing incurs a cost, it may be cost-effective to take sequences of actions in open-loop m...
Eric A. Hansen, Andrew G. Barto, Shlomo Zilberstei...
Restricting the power of the schedulers that resolve the nondeterminism in probabilistic concurrent systems has recently drawn the attention of the research community. Th...
While it is clear that the full emotional effect of a movie scene is carried through the successful interpretation of audio and visual information, music still carries a significa...
Aida Austin, Elliot Moore II, Udit Gupta, Parag Ch...