In the last ve years, there has been numerous applications of wavelets and multiresolution analysis in many elds of computer graphics as di erent as geometric modelling, volume vi...
We study the problem of an apprentice learning to behave in an environment with an unknown reward function by observing the behavior of an expert. We follow on the work of Abbeel ...
Semantic theories of natural language associate meanings with utterances by providing meanings for lexical items and rules for determining the meaning of larger units given the me...
Partially Observable Markov Decision Process (POMDP) is a popular framework for planning under uncertainty in partially observable domains. Yet, the POMDP model is riskneutral in ...
In this paper, we consider the problem of planning and learning in the infinite-horizon discounted-reward Markov decision problems. We propose a novel iterative direct policysearc...