Learning curves for Gaussian process (GP) regression can be strongly affected by a mismatch between the ‘student’ model and the ‘teacher’ (true data generation process), e...
We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learning m...
Shalabh Bhatnagar, Richard S. Sutton, Mohammad Gha...
A key feature of population-based optimization algorithms is the ability to explore a search space and make a decision based on multiple solutions. In this paper, an incremental le...
Reminder systems support people with impaired prospective memory and/or executive function by providing them with reminders of their functional daily activities. We integrate tem...
Matthew R. Rudary, Satinder P. Singh, Martha E. Po...
In active-learning-based music retrieval systems, providing multiple samples to the user for feedback is essential. In this paper, we present a new multi-samples selection st...