We propose a new approach to the problem of searching a space of policies for a Markov decision process (MDP) or a partially observable Markov decision process (POMDP), given a mo...
While exploring to find better solutions, an agent performing online reinforcement learning (RL) can perform worse than is acceptable. In some cases, exploration might have unsafe, ...
Satinder P. Singh, Andrew G. Barto, Roderic A. Gru...
Ant robots have very low computational power and limited memory. They communicate by leaving pheromones in the environment. In order to create a cooperative intelligent behavior, ...
Asaf Shiloni, Alon Levy, Ariel Felner, Meir Kalech
There is a chronic lack of shared application domains to test the research models and agent architectures on areas like negotiation, argumentation, trust and reputation. In this d...
Angela Fabregues, David Navarro, Alejandro Serrano...
In open, dynamic multi-agent systems, agents may form short-term ad-hoc groups, such as coalitions, in order to meet their goals. Trust and reputation are crucial concepts in thes...