In this paper we use the Proximal Policy Optimization (PPO) deep reinforcement learning algorithm to train a Neural Network to control a four-legged robot in simulation. Reinforcement learning in general can learn complex behavior policies from simple state-reward tuples datasets and PPO in particular has proved its effectiveness in solving complex tasks with continuous states and actions. Moreover, since it is model-free, it is general and can adapt to changes in the environment or in the robot itself. The virtual environment used to train the agent was modeled using our physics engine Project Chrono. Chrono can handle non smooth dynamics simulation allowing us to introduce stiff leg-ground contacts and using its Python interface Pychrono it can be interfaced with the Machine Leaning framework TensorFlow with ease. We trained the Neural Network until it learned to control the motor torques, then various policy Neural Network input state choices have been compared.
|Titolo:||Training a Four Legged Robot via Deep Reinforcement Learning and Multibody Simulation|
|Data di pubblicazione:||2020|
|Appare nelle tipologie:||4.1b Atto convegno Volume|