— We describe a simple and intuitive policy gradient method for improving parametrized quadrocopter multi-flips by combining iterative experiments with information from a first...
Analyzing accidents is a vital exercise in the development of safety-critical software systems to prevent past accidents from reoccurring in the future. Current practices such as ...
Tariq Mahmood, Edmund Kazmierczak, Tim Kelly, Denn...
In this paper we report on using a relational state space in multi-agent reinforcement learning. There is growing evidence in the Reinforcement Learning research community that a r...
Tom Croonenborghs, Karl Tuyls, Jan Ramon, Maurice ...
- The effectiveness of stochastic power management relies on the accurate system and workload model and effective policy optimization. Workload modeling is a machine learning proce...
Recently, instead of selecting a single kernel, multiple kernel learning (MKL) has been proposed which uses a convex combination of kernels, where the weight of each kernel is opt...