A new training algorithm is presented for delayed reinforcement learning problems that does not assume the existence of a critic model and employs the polytope optimization algorit...
When we try to implement a multi-objective genetic algorithm (MOGA) with variable weights for finding a set of Pareto optimal solutions, one difficulty lies in determining appropri...
In this paper we present a method of parameter optimization, relative trust-region learning, where the trust-region method and the relative optimization [21] are jointly exploited...
—The Dynamic Vehicle Routing Problem (DVRP) is an optimization problem in which agents deliver orders that are not known in advance to the routing. Partial solutions need to be a...
One crucial issue in genetic programming (GP) is how to acquire promising building blocks efficiently. In this paper, we propose a GP method (called GPTM, GP with Tree Mining) whi...