Reinforcement learning is a popular and successful framework for many agent-related problems because only limited environmental feedback is necessary for learning. While many algo...
In this paper, we discuss the use of Targeted Trajectory Distribution Markov Decision Processes (TTD-MDPs)—a variant of MDPs in which the goal is to realize a specified distrib...
Sooraj Bhat, David L. Roberts, Mark J. Nelson, Cha...
This paper proposes the β-WoLF algorithm for multiagent reinforcement learning (MARL) in the stochastic games framework that uses an additional “advice” signal to inform agen...
The TAC Supply Chain Management (TAC/SCM) game presents a challenging dynamic environment for autonomous decision-making in a salient application domain. Strategic interactions co...
Patrick R. Jordan, Christopher Kiekintveld, Michae...
A negotiation chain is formed when multiple related negotiations are spread over multiple agents. In order to appropriately order and structure the negotiations occurring in the c...