Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...
Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...
The development of natural language proccssing (NLP) systems that perform machine translation (MT) and information retrieval (IR) has highlighted the need for the automatic recogn...
In recent years, interest in studying evolutionary algorithms (EAs) for dynamic optimization problems (DOPs) has grown due to its importance in real-world applications. Several app...
We introduce a new plan repair method for problems cast as Mixed Integer Programs. In order to tackle the inherent complexity of these NP-hard problems, our approach relies on the ...
Emmanuel Rachelson, Ala Ben Abbes, Sebastien Dieme...
The routing in communication networks is typically a multicriteria decision making (MCDM) problem. However, setting the parameters of most used MCDM methods to fit the preferences ...