Optimal resource scheduling in multiagent systems is a computationally challenging task, particularly when the values of resources are not additive. We consider the combinatorial ...
Dmitri A. Dolgov, Michael R. James, Michael E. Sam...
Planning in single-agent models like MDPs and POMDPs can be carried out by resorting to Q-value functions: a (near-)optimal Q-value function is computed in a recursive manner by ...
In this paper we present an advanced bidding agent that participates in first-price sealed bid auctions to allocate advertising space on BluScreen – an experimental public adve...
Alex Rogers, Esther David, Terry R. Payne, Nichola...
We previously presented a model for some wireless sensor and actuator network (WSAN) applications based on the vector space tools of frame theory. In this WSAN model there is a we...
Q-learning is a technique used to compute an optimal policy for a controlled Markov chain based on observations of the system controlled using a non-optimal policy. It ...