High dimensionality of POMDP's belief state space is one major cause that makes the underlying optimal policy computation intractable. Belief compression refers to the method...
Xin Li, William Kwok-Wai Cheung, Jiming Liu, Zhili...
Clusters are mostly used through Resources Management Systems (RMS) with a static allocation of resources for a bounded amount of time. Those approaches are known to be insufficie...
Reinforcement learning (RL) is one of the machine learning techniques and has been received much attention as a new self-adaptive controller for various systems. The RL agent auto...
Abstract--The now commonplace multi-core chips have introduced, by design, a deep hierarchy of memory and cache banks within parallel computers as a tradeoff between the user frien...
This paper presents a new approach to hierarchical reinforcement learning based on the MAXQ decomposition of the value function. The MAXQ decomposition has both a procedural seman...