Sciweavers

» Useless Actions Are Useful
ECML
2007
Springer
16 years 20 days ago
Policy Gradient Critics
We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...
Daan Wierstra, Jürgen Schmidhuber
CIG
2006
IEEE
16 years 16 days ago
Capturing The Information Conveyed By Opponents' Betting Behavior in Poker
This paper develops an approach to the capture and measurement of the information contained in opponents’ bet actions in seven card stud poker. We develop a causal model link...
Eric Saund
ROBIO
2006
IEEE
16 years 15 days ago
Behaviour Cooperation by Negation for Mobile Robots
This article presents a behavioural architecture, the Survival Kit (SK), which allows behaviours to cast their multivalued output by means of constraints over an ’action feat...
Pedro Santana, Luís Correia
ICPP
1993
IEEE
15 years 10 months ago
A Unified Model for Concurrent Debugging
Events are occurrence instances of actions. The thesis of this paper is that the use of “actions”, instead of events, greatly simplifies the problem of concurrent debugging....
S. I. Hyder, John Werth, James C. Browne
EDM
2008
15 years 8 months ago
An Open Repository and analysis tools for fine-grained, longitudinal learner data
We introduce an open data repository and set of associated visualization and analysis tools. The Pittsburgh Science of Learning Center's "DataShop" has data from tho...
Kenneth R. Koedinger, Kyle Cunningham, Alida Skogs...