Sciweavers

» policy 2007
NIPS
2007
15 years 7 months ago
Optimistic Linear Programming gives Logarithmic Regret for Irreducible MDPs
We present an algorithm called Optimistic Linear Programming (OLP) for learning to optimize average reward in an irreducible but otherwise unknown Markov decision process (MDP). O...
Ambuj Tewari, Peter L. Bartlett
NIPS
2007
15 years 7 months ago
Hierarchical Apprenticeship Learning with Application to Quadruped Locomotion
We consider apprenticeship learning—learning from expert demonstrations—in the setting of large, complex domains. Past work in apprenticeship learning requires that the expert...
J. Zico Kolter, Pieter Abbeel, Andrew Y. Ng
NIPS
2007
15 years 7 months ago
What makes some POMDP problems easy to approximate?
Point-based algorithms have been surprisingly successful in computing approximately optimal solutions for partially observable Markov decision processes (POMDPs) in high dimension...
David Hsu, Wee Sun Lee, Nan Rong
SEC
2007
15 years 7 months ago
Building a Distributed Semantic-aware Security Architecture
Enhancing the service-oriented architecture paradigm with semantic components is a new field of research and goal of many ongoing projects. The results lead to more powerful web a...
Jan Kolter, Rolf Schillinger, Günther Pernul
ACSW
2004
15 years 7 months ago
Learning Dynamics of Pesticide Abuse through Data Mining
Recent studies by agriculture researchers in Pakistan have shown that attempts of crop yield maximization through pro-pesticide state policies have led to a dangerously high pesti...
Ahsan Abdullah, Stephen Brobst, Ijaz Pervaiz