Training and Validation

Train and simulate reinforcement learning agents

To learn an optimal policy, a reinforcement learning agent interacts with the environment through a repeated trial-and-error process. During training, the agent tunes the parameters of its policy representation to maximize the long-term reward. Reinforcement Learning Toolbox™ software provides functions for training agents and validating the training results through simulation. For more information, see Train Reinforcement Learning Agents.

Functions

train - Train reinforcement learning agents within a specified environment
rlTrainingOptions - Options for training reinforcement learning agents
sim - Simulate trained reinforcement learning agents within specified environment
rlSimulationOptions - Options for simulating a reinforcement learning agent within an environment
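
The following is a minimal sketch of how these four functions typically fit together, not a definitive workflow. It assumes the predefined "BasicGridWorld" environment and a tabular Q-learning agent. The option values are illustrative assumptions, and the critic is built with rlQValueRepresentation, which newer releases replace with rlQValueFunction, so adjust for your release.

```matlab
% Sketch only: train a tabular Q-learning agent, then validate it by simulation.
% Requires Reinforcement Learning Toolbox. All numeric values are illustrative.

rng(0)  % fix the random seed so repeated runs are comparable

% Create a predefined grid world environment and query its specifications.
env = rlPredefinedEnv("BasicGridWorld");
obsInfo = getObservationInfo(env);
actInfo = getActionInfo(env);

% Build a table-based Q-value critic and a Q-learning agent that uses it.
% Assumption: a release where rlQValueRepresentation is available; newer
% releases recommend rlQValueFunction instead.
qTable = rlTable(obsInfo,actInfo);
critic = rlQValueRepresentation(qTable,obsInfo,actInfo);
agent = rlQAgent(critic);

% Configure training: cap the number of episodes and steps, and stop early
% once the average reward over the averaging window reaches the target.
trainOpts = rlTrainingOptions( ...
    'MaxEpisodes',200, ...
    'MaxStepsPerEpisode',50, ...
    'StopTrainingCriteria',"AverageReward", ...
    'StopTrainingValue',11, ...
    'ScoreAveragingWindowLength',30);

% Train the agent. The agent is updated in place, and the returned value
% holds per-episode training statistics.
trainingStats = train(agent,env,trainOpts);

% Validate the trained agent by simulating one episode.
simOpts = rlSimulationOptions('MaxSteps',50);
experience = sim(env,agent,simOpts);

% The reward signal is logged as a timeseries; sum it for the episode return.
totalReward = sum(experience.Reward.Data)
```

Comparing totalReward from the simulation with the average reward reported during training is one simple way to check that the trained policy behaves as expected outside the training loop.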

Blocks

RL Agent - Reinforcement learning agent

Topics

Training and Simulation Basics

Train Reinforcement Learning Agents

Find the optimal policy by training your agent within a specified environment.

Train Reinforcement Learning Agent in Basic Grid World

Train Q-learning and SARSA agents to solve a grid world in MATLAB®.

Train Reinforcement Learning Agent in MDP Environment

Train a reinforcement learning agent in a generic Markov decision process environment.

Create Simulink Environment and Train Agent

Train a controller using reinforcement learning with a plant modeled in Simulink® as the training environment.

Parallel Computing

Train AC Agent to Balance Cart-Pole System Using Parallel Computing

Train an actor-critic agent using asynchronous parallel computing.

Train DQN Agent for Lane Keeping Assist Using Parallel Computing

Train a reinforcement learning agent for an automated driving application using parallel computing.

Train Agents in MATLAB Environments

Train DDPG Agent to Control Double Integrator System

Train a deep deterministic policy gradient agent to control a second-order dynamic system modeled in MATLAB.

Train PG Agent with Baseline to Control Double Integrator System

Train a policy gradient agent with a baseline to control a double integrator system modeled in MATLAB.

Train DQN Agent to Balance Cart-Pole System

Train a deep Q-learning network agent to balance a cart-pole system modeled in MATLAB.

Train PG Agent to Balance Cart-Pole System

Train a policy gradient agent to balance a cart-pole system modeled in MATLAB.

Train AC Agent to Balance Cart-Pole System

Train an actor-critic agent to balance a cart-pole system modeled in MATLAB.

Train DDPG Agent to Swing Up and Balance Pendulum with Image Observation

Train a reinforcement learning agent using an image-based observation signal.

Create Agent Using Deep Network Designer and Train Using Image Observations

Create a reinforcement learning agent using the Deep Network Designer app from the Deep Learning Toolbox™.

Train Agents in Simulink Environments

Train DQN Agent to Swing Up and Balance Pendulum

Train a deep Q-network agent to balance a pendulum modeled in Simulink.

Train DDPG Agent to Swing Up and Balance Pendulum

Train a deep deterministic policy gradient agent to balance a pendulum modeled in Simulink.

Train DDPG Agent to Swing Up and Balance Pendulum with Bus Signal

Train a reinforcement learning agent to balance a pendulum in a Simulink model whose observations are carried in a bus signal.

Train DDPG Agent to Swing Up and Balance Cart-Pole System

Train a deep deterministic policy gradient agent to swing up and balance a cart-pole system modeled in Simscape™ Multibody™.

Multi-Agent Training

Train Multiple Agents to Perform Collaborative Task

Train two PPO agents to collaboratively move an object.

Train Multiple Agents for Area Coverage

Train three PPO agents to explore a grid-world environment in a collaborative-competitive manner.

Train Multiple Agents for Path Following Control

Train a DQN and a DDPG agent to collaboratively perform adaptive cruise control and lane keeping assist to follow a path.

Imitation Learning

Imitate MPC Controller for Lane Keeping Assist

Train a deep neural network to imitate the behavior of a model predictive controller.

Imitate Nonlinear MPC Controller for Flying Robot

Train a deep neural network to imitate the behavior of a nonlinear model predictive controller.

Train DDPG Agent with Pretrained Actor Network

Train a reinforcement learning agent using an actor network that has been previously trained using supervised learning.

Custom Agents and Training Algorithms

Train Custom LQR Agent

Train a custom LQR agent.

Train Reinforcement Learning Policy Using Custom Training Loop

Train a reinforcement learning policy using your own custom training algorithm.

Create Agent for Custom Reinforcement Learning Algorithm

Create an agent for a custom reinforcement learning algorithm.
