Reward-modulated spike-timing-dependent plasticity (STDP) has recently emerged as a candidate for a learning rule that could explain how local learning rules at single synapses su...
Robert A. Legenstein, Dejan Pecevski, Wolfgang Maa...
In some environments, a learning agent must learn to balance competing objectives. For example, a Q-learner agent may need to learn which choices expose the agent to risk and whic...
This paper presents a search engine architecture, RETIN, that aims to retrieve complex categories in large image databases. For indexing, a scheme based on a two-step quantization ...
Philippe Henri Gosselin, Matthieu Cord, Sylvie Phi...
Several researchers have recently investigated the connection between reinforcement learning and classification. We are motivated by proposals of approximate policy iteration schem...
Procedural representations of control policies have two advantages when facing the scale-up problem in learning tasks. First, they are implicit, with potential for inductive genera...