The importance of the problems of contingent planning with actions that have non-deterministic effects and of planning with goal preferences has been widely recognized, and severa...
In this work, a generalized method for learning from sequence of unlabelled data points based on unsupervised order-preserving regression is proposed. Sequence learning is a funda...
We present an algorithm that derives actions' effects and preconditions in partially observable, relational domains. Our algorithm has two unique features: an expressive rela...
We examine the problem of Transfer in Reinforcement Learning and present a method to utilize knowledge acquired in one Markov Decision Process (MDP) to bootstrap learning in a mor...
We present an asymptotically optimal algorithm for the max variant of the k-armed bandit problem. Given a set of k slot machines, each yielding payoff from a fixed (but unknown) d...