We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...
A combinatorial random variable is a discrete random variable defined over a combinatorial set (e.g., the power set of a given set). In this paper we introduce combinatoria...
Ron Bekkerman, Mehran Sahami, Erik G. Learned-Mill...
In recent years, stability has been used to study the performance of learning algorithms, and it has been shown that stability is sufficient for generalization and is sufficient ...
This paper proposes an incremental multiple-object recognition and localization (IMORL) method. The objective of IMORL is to adaptively learn multiple interesting objects in an ima...
Feature subset selection is an important area of machine learning and pattern recognition, for which many approaches have been proposed. In this paper, we choose some fe...