We present a data-driven approach to learning user-adaptive referring expression generation (REG) policies for spoken dialogue systems. Referring expressions can be difficult to unde...
We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learning m...
Shalabh Bhatnagar, Richard S. Sutton, Mohammad Gha...
In standard online learning, the goal of the learner is to maintain an average loss that is "not too big" compared to the loss of the best-performing function in a fixed...
Our paper addresses the problem of enforcing constraints in human body tracking. A projection technique is derived to impose kinematic constraints on independent multi-body motion...
Text clustering is most commonly treated as a fully automated task without user supervision. However, we can improve clustering performance using supervision in the form of pairwi...