We introduce relational temporal difference learning as an effective approach to solving multi-agent Markov decision problems with large state spaces. Our algorithm uses temporal ...
In multiple criteria Markov Decision Processes (MDP) where multiple costs are incurred at every decision point, current methods solve them by minimising the expected primary cost ...
Abstract. Configuration is the process of composing a system from a set of components such that the system fulfills a set of desired demands. The configuration process relies on a ...
Assume a network (V, E) where a subset of the nodes in V are active. We consider the problem of selecting a set of k active nodes that best explain the observed activation state, ...
—Learning control is a concept for controlling dynamic systems in an iterative manner. It arises from the recognition that robotic manipulators are usually used to perform repeti...