— We describe a simple and intuitive policy gradient method for improving parametrized quadrocopter multi-flips by combining iterative experiments with information from a first...
The online learning problem requires a player to iteratively choose an action in an unknown and changing environment. In the standard setting of this problem, the player has to ch...
We present a generalization of the nonnegative matrix factorization (NMF), where a multilayer generative network with nonnegative weights is used to approximate the observed nonne...
— Optimal component analysis (OCA) uses a stochastic gradient optimization process to find optimal representations for general criteria and shows good performance in object reco...
Most studies modeling inaccurate data in Gold style learning consider cases in which the number of inaccuracies is finite. The present paper argues that this approach is not reaso...