The reinforcement learning problem can be decomposed into two parallel types of inference: (i) estimating the parameters of a model for the underlying process; (ii) determining be...
In this paper, we present a mandatory access control system that uses input from multiple stakeholders to compose policies based on runtime information. In the emerging ubiquitous...
A Markov Decision Process (MDP) is a general model for solving planning problems under uncertainty. It has been extended to multiobjective MDP to address multicriteria or multiagen...
Abstract--A delay-constrained scheduling problem for pointto-point communication is considered: a packet of B bits must be transmitted by a hard deadline of T slots over a timevary...
This paper discusses theoretical and experimental aspects of gradient-based approaches to the direct optimization of policy performance in controlled ??? ?s. We introduce ??? ?, a...