We propose a new neural network architecture, called Simple Recurrent Temporal-Difference Networks (SR-TDNs), that learns to predict future observations in partially observable en...
We consider the problem of embedding arbitrary objects (e.g., images, audio, documents) into Euclidean space subject to a partial order over pairwise distances. Partial order cons...
This paper gives an efficient Bayesian method for inferring the parameters of a Plackett-Luce ranking model. Such models are parameterised distributions over rankings of a finite s...
We cast model-free reinforcement learning as the problem of maximizing the likelihood of a probabilistic mixture model via sampling, addressing both the infinite and finite horizo...
In this paper, we extend the Hilbert space embedding approach to handle conditional distributions. We derive a kernel estimate for the conditional embedding, and show its connecti...
Le Song, Jonathan Huang, Alexander J. Smola, Kenji...