Training and Validation

Train and simulate reinforcement learning agents

To learn an optimal policy, a reinforcement learning agent interacts with the environment through a repeated trial-and-error process. During training, the agent tunes the parameters of its policy representation to maximize the long-term reward. Reinforcement Learning Toolbox™ software provides functions for training agents and validating the training results through simulation. For more information, see Train Reinforcement Learning Agents.

Functions

`train`	Train reinforcement learning agents within a specified environment
`rlTrainingOptions`	Options for training reinforcement learning agents
`sim`	Simulate trained reinforcement learning agents within a specified environment
`rlSimulationOptions`	Options for simulating a reinforcement learning agent within an environment
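
As a rough illustration of how these four functions fit together, the following sketch trains an agent and then validates it with a single simulated episode. The predefined cart-pole environment, the default DQN agent, and every numeric value are assumptions chosen for the example, not recommendations. Creating a default agent from the specifications with `rlDQNAgent` needs a release that supports it and also requires Deep Learning Toolbox.

```matlab
% Sketch of a train-then-validate workflow.
% Assumes Reinforcement Learning Toolbox and Deep Learning Toolbox.

% Predefined discrete-action cart-pole environment (illustrative choice).
env = rlPredefinedEnv("CartPole-Discrete");
obsInfo = getObservationInfo(env);
actInfo = getActionInfo(env);

% Default DQN agent built from the observation and action specifications.
agent = rlDQNAgent(obsInfo, actInfo);

% Training options. All numeric values are placeholders.
trainOpts = rlTrainingOptions( ...
    'MaxEpisodes', 500, ...
    'MaxStepsPerEpisode', 500, ...
    'ScoreAveragingWindowLength', 20, ...
    'StopTrainingCriteria', 'AverageReward', ...
    'StopTrainingValue', 480, ...
    'Verbose', false, ...
    'Plots', 'none');

% Train the agent. The returned result holds per-episode statistics.
trainingStats = train(agent, env, trainOpts);

% Validate the trained agent by simulating one episode.
simOpts = rlSimulationOptions('MaxSteps', 500);
experience = sim(env, agent, simOpts);

% Total reward collected during the validation episode.
totalReward = sum(experience.Reward);
```

Training is stochastic, so the number of episodes needed to reach the stopping value, and the reward obtained in simulation, can differ from run to run.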

Blocks

RL Agent

Reinforcement learning agent

Topics

Training and Simulation Basics

Train Reinforcement Learning Agents

Find the optimal policy by training your agent within a specified environment.

Train Reinforcement Learning Agent in Basic Grid World

Train Q-learning and SARSA agents to solve a grid world in MATLAB®.

Train Reinforcement Learning Agent in MDP Environment

Train a reinforcement learning agent in a generic Markov decision process environment.

Create Simulink Environment and Train Agent

Train a controller using reinforcement learning with a plant modeled in Simulink® as the training environment.

Parallel Computing

Train AC Agent to Balance Cart-Pole System Using Parallel Computing

Train an actor-critic agent using asynchronous parallel computing.

Train DQN Agent for Lane Keeping Assist Using Parallel Computing

Train a reinforcement learning agent for an automated driving application using parallel computing.
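
For orientation, parallel training is typically switched on through the training options rather than through a separate function. The sketch below assumes Parallel Computing Toolbox is installed and that `agent` and `env` already exist in the workspace. The episode limits are placeholders, and the options available under `ParallelizationOptions` vary between releases.

```matlab
% Sketch: enable asynchronous parallel training through the training options.
% Assumes Parallel Computing Toolbox. Numeric values are placeholders.
trainOpts = rlTrainingOptions( ...
    'MaxEpisodes', 1000, ...
    'MaxStepsPerEpisode', 500, ...
    'UseParallel', true);

% Workers send data back without waiting for each other.
trainOpts.ParallelizationOptions.Mode = 'async';

% With an existing agent and environment, training is then started as usual:
% trainingStats = train(agent, env, trainOpts);
```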

Train Agents in MATLAB Environments

Train DDPG Agent to Control Double Integrator System

Train a deep deterministic policy gradient agent to control a second-order dynamic system modeled in MATLAB.

Train PG Agent with Baseline to Control Double Integrator System

Train a policy gradient agent with a baseline to control a double integrator system modeled in MATLAB.

Train DQN Agent to Balance Cart-Pole System

Train a deep Q-learning network agent to balance a cart-pole system modeled in MATLAB.

Train PG Agent to Balance Cart-Pole System

Train a policy gradient agent to balance a cart-pole system modeled in MATLAB.

Train AC Agent to Balance Cart-Pole System

Train an actor-critic agent to balance a cart-pole system modeled in MATLAB.

Train DDPG Agent to Swing Up and Balance Pendulum with Image Observation

Train a reinforcement learning agent using an image-based observation signal.

Create Agent Using Deep Network Designer and Train Using Image Observations

Create a reinforcement learning agent using the Deep Network Designer app from the Deep Learning Toolbox™.

Train Agents in Simulink Environments

Train DQN Agent to Swing Up and Balance Pendulum

Train a deep Q-network agent to balance a pendulum modeled in Simulink.

Train DDPG Agent to Swing Up and Balance Pendulum

Train a deep deterministic policy gradient agent to balance a pendulum modeled in Simulink.

Train DDPG Agent to Swing Up and Balance Pendulum with Bus Signal

Train a reinforcement learning agent to balance a pendulum Simulink model that contains observations in a bus signal.

Train DDPG Agent to Swing Up and Balance Cart-Pole System

Train a deep deterministic policy gradient agent to swing up and balance a cart-pole system modeled in Simscape™ Multibody™.

Multi-Agent Training

Train Multiple Agents to Perform Collaborative Task

Train two PPO agents to collaboratively move an object.

Train Multiple Agents for Area Coverage

Train three PPO agents to explore a grid-world environment in a collaborative-competitive manner.

Train Multiple Agents for Path Following Control

Train a DQN and a DDPG agent to collaboratively perform adaptive cruise control and lane keeping assist to follow a path.

Imitation Learning

Imitate MPC Controller for Lane Keeping Assist

Train a deep neural network to imitate the behavior of a model predictive controller.

Imitate Nonlinear MPC Controller for Flying Robot

Train a deep neural network to imitate the behavior of a nonlinear model predictive controller.

Train DDPG Agent with Pretrained Actor Network

Train a reinforcement learning agent using an actor network that has been previously trained using supervised learning.

Custom Agents and Training Algorithms

Train Custom LQR Agent

Train a custom LQR agent.

Train Reinforcement Learning Policy Using Custom Training Loop

Train a reinforcement learning policy using your own custom training algorithm.

Create Agent for Custom Reinforcement Learning Algorithm

Create an agent for a custom reinforcement learning algorithm.

Featured Examples

Tune PI Controller using Reinforcement Learning

Tune the gains of a PI controller using a reinforcement learning agent.

Train DDPG Agent for PMSM Control

Train a DDPG agent to control the currents in a permanent magnet synchronous motor.

Train DDPG Agent to Control Flying Robot

Train a reinforcement learning agent to control a flying robot model.

Train PPO Agent to Land Rocket

Train a reinforcement learning agent to land a rocket.

Train Biped Robot to Walk Using Reinforcement Learning Agents

Train a reinforcement learning agent to control a biped walking robot modeled in Simscape Multibody.

Quadruped Robot Locomotion Using DDPG Agent

Train a reinforcement learning agent to control a quadruped walking robot modeled in Simscape Multibody.

Train DQN Agent for Lane Keeping Assist

Train a reinforcement learning agent for a lane keeping assist application.

Train DDPG Agent for Adaptive Cruise Control

Train a reinforcement learning agent for an adaptive cruise control application.

Train DDPG Agent for Path-Following Control

Train a reinforcement learning agent for a lane-following application.

Train PPO Agent for Automatic Parking Valet
