We propose a new approach to value function approximation which combines linear temporal difference reinforcement learning with subspace identification. In practical applications...
The well-studied task of learning a linear function with errors is a seemingly hard problem and the basis for several cryptographic schemes. Here we demonstrate additional applicat...
Benny Applebaum, David Cash, Chris Peikert, Amit S...
We show that under reasonable conditions, online learning for a nonlinear function near a local minimum is similar to a multivariate Ornstein Uhlenbeck process. This implies that ...
Reinforcement Learning (RL) is the study of programs that improve their performance by receiving rewards and punishments from the environment. Most RL methods optimize the discoun...
Abstract— The optimal model parameters of a kernel machine are typically given by the solution of a convex optimisation problem with a single global optimum. Obtaining the best p...