Twin-delayed deep deterministic policy gradient reinforcement learning agent
The twin-delayed deep deterministic policy gradient (TD3) algorithm is an actor-critic, model-free, online, off-policy reinforcement learning method that computes an optimal policy that maximizes the long-term reward.
Use rlTD3Agent to create one of the following types of agents.
Twin-delayed deep deterministic policy gradient (TD3) agent with two Q-value functions. This agent reduces overestimation of the value function by learning two Q-value functions and using the smaller of the two estimates for policy updates.
Delayed deep deterministic policy gradient (delayed DDPG) agent with a single Q-value function. This agent is a DDPG agent with target policy smoothing and delayed policy and target updates.
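The difference between the two agent types can be sketched outside the toolbox. The following Python snippet is a minimal illustration, not the rlTD3Agent implementation: it computes a value target with target policy smoothing, takes the minimum over however many target critics are supplied (two for TD3, one for delayed DDPG), and shows a simple delayed-update check. All names (td_target, smoothed_target_action, should_update_policy) and all default values are hypothetical, and actions are scalars to keep the example short.

```python
import random


def clip(value, low, high):
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))


def smoothed_target_action(target_actor, next_obs, noise_std, noise_clip,
                           act_low, act_high, rng):
    """Target policy smoothing: perturb the target action with clipped noise."""
    noise = clip(rng.gauss(0.0, noise_std), -noise_clip, noise_clip)
    return clip(target_actor(next_obs) + noise, act_low, act_high)


def td_target(reward, next_obs, done, target_actor, target_critics,
              gamma=0.99, noise_std=0.2, noise_clip=0.5,
              act_low=-1.0, act_high=1.0, rng=None):
    """Value target shared by all critics.

    target_critics is a list of callables q(obs, action):
    two entries gives the TD3 target (minimum of both estimates),
    one entry gives the delayed DDPG target.
    """
    rng = rng or random.Random(0)
    next_action = smoothed_target_action(
        target_actor, next_obs, noise_std, noise_clip, act_low, act_high, rng)
    # Taking the minimum counteracts overestimation by any single critic.
    q_next = min(q(next_obs, next_action) for q in target_critics)
    return reward if done else reward + gamma * q_next


def should_update_policy(step, policy_delay=2):
    """Delayed updates: refresh the actor and targets every policy_delay steps."""
    return step % policy_delay == 0


# Toy stand-ins for the target networks (assumptions for illustration only).
toy_actor = lambda obs: 0.5 * obs
toy_q1 = lambda obs, act: obs + act
toy_q2 = lambda obs, act: obs + act - 0.3

td3_target = td_target(1.0, 1.0, False, toy_actor, [toy_q1, toy_q2],
                       gamma=0.9, noise_std=0.0)
delayed_ddpg_target = td_target(1.0, 1.0, False, toy_actor, [toy_q1],
                                gamma=0.9, noise_std=0.0)
```

With smoothing noise disabled, the two-critic target is never larger than the single-critic target, which is the effect the TD3 variant relies on.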
For more information, see Twin-Delayed Deep Deterministic Policy Gradient Agents.
For more information on the different types of reinforcement learning agents, see Reinforcement Learning Agents.
agent = rlTD3Agent(actor,critics,agentOptions) creates an agent with the specified actor and critic representations and sets the AgentOptions property. To create a:
TD3 agent, specify a two-element row vector of critic representations.
Delayed DDPG agent, specify a single critic representation.
train | Train a reinforcement learning agent within a specified environment |
sim | Simulate a trained reinforcement learning agent within a specified environment |
getActor | Get actor representation from reinforcement learning agent |
setActor | Set actor representation of reinforcement learning agent |
getCritic | Get critic representation from reinforcement learning agent |
setCritic | Set critic representation of reinforcement learning agent |
generatePolicyFunction | Create function that evaluates trained policy of reinforcement learning agent |