Sequential algorithms given by Angluin (1987) and Schapire (1992) learn deterministic finite automata (DFA) exactly from Membership and Equivalence queries. These algorithms are feasible...
Abstract. Approximate Policy Iteration (API) is a reinforcement learning paradigm that is able to solve high-dimensional, continuous control problems. We propose to exploit API for...
Information security is an issue of global concern. As the Internet delivers great convenience and benefits to modern society, the rapidly increasing connectivity and acc...
Improving the sample efficiency of reinforcement learning algorithms to scale up to larger and more realistic domains is a current research challenge in machine learning. Model-ba...
A new "herding" algorithm is proposed that directly converts observed moments into a sequence of pseudo-samples. The pseudo-samples respect the moment constraints and ma...