Abstract. Recent theories of universal algorithmic intelligence, combined with the view that the world can be completely specified in mathematical terms, have led to claims about ...
Reward shaping is a well-known technique applied to help reinforcement-learning agents converge more quickly to nearoptimal behavior. In this paper, we introduce social reward sha...
Monica Babes, Enrique Munoz de Cote, Michael L. Li...
Partially Observable Markov Decision Process (POMDP) is a popular framework for planning under uncertainty in partially observable domains. Yet, the POMDP model is riskneutral in ...
Software agents offer a promise to change electronic commerce trading by helping traders to purchase products based on their interests and preferences. E-commerce systems are incr...
For half a century, artificial intelligence researchers have focused on giving machines linguistic and mathematical-logical reasoning abilities, modeled after the classic linguist...