Recent research has demonstrated that useful POMDP solutions do not require consideration of the entire belief space. We extend this idea with the notion of temporal abstraction. ...
Abstract We study the version of the asymmetric prize collecting traveling salesman problem, where the objective is to find a directed tour that visits a subset of vertices such th...
Spoken dialogue management strategy optimization by means of Reinforcement Learning (RL) is now part of the state of the art. Yet, there is still a clear mismatch between the comp...
We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms...
In 1876 Charles Lutwidge Dodgson suggested the intriguing voting rule that today bears his name. Although Dodgson’s rule is one of the most well-studied voting rules, it suffers...
Ioannis Caragiannis, Christos Kaklamanis, Nikos Ka...