Abstract-- Policy Gradients with Parameter-based Exploration (PGPE) is a novel model-free reinforcement learning method that alleviates the problem of high-variance gradient estima...
Frank Sehnke, Alex Graves, Christian Osendorfer, J...
Web-based learning environments are extensively used nowadays. These environments maintain and produce vast amounts of data. Such vastness lead to the application of data mining t...
Ioannis Kazanidis, Stavros Valsamidis, Theodosios ...
Conversations provide rich opportunities for interactive, continuous learning. When something goes wrong, a system can ask for clarification, rewording, or otherwise redirect the...
We present a trainable sequential-inference technique for processes with large state and observation spaces and relational structure. Our method assumes "reliable observation...
We present a mixture model whose components are Restricted Boltzmann Machines (RBMs). This possibility has not been considered before because computing the partition function of a...