Proximal policy optimization reinforcement learning agent
Proximal policy optimization (PPO) is a model-free, online, on-policy, policy gradient reinforcement learning method. The algorithm alternates between sampling data through interaction with the environment and optimizing a clipped surrogate objective function using stochastic gradient descent.
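The clipped surrogate objective can be illustrated with a few lines of base MATLAB. This is a minimal sketch, not the toolbox's internal implementation: the probability ratios, advantage estimates, and clip factor `epsilon` below are made-up illustration values.

```matlab
% Illustrative values only (assumptions, not produced by a real agent).
ratio     = [0.8 1.0 1.3];   % new-policy / old-policy action probabilities
advantage = [1.0 -0.5 2.0];  % advantage estimates for the same samples
epsilon   = 0.2;             % clip factor

% Restrict the ratio to the interval [1 - epsilon, 1 + epsilon].
clippedRatio = min(max(ratio, 1 - epsilon), 1 + epsilon);

% Per-sample surrogate: the more pessimistic of the unclipped and clipped terms.
surrogate = min(ratio .* advantage, clippedRatio .* advantage);

% The objective to maximize is the sample mean; a gradient-descent
% optimizer would minimize its negative.
objective = mean(surrogate);
loss = -objective;
```

In this sketch the third sample has a ratio above `1 + epsilon`, so its contribution is capped at the clipped value, which removes the incentive to move the policy further in that direction.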
For more information on PPO agents, see Proximal Policy Optimization Agents.
For more information on the different types of reinforcement learning agents, see Reinforcement Learning Agents.
agent = rlPPOAgent(actor,critic,agentOptions) creates a proximal policy optimization (PPO) agent with the specified actor and critic networks and sets the AgentOptions property.
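The following sketch shows one way the syntax might be used. It assumes the representation-based API of the toolbox release this page describes (rlValueRepresentation and rlStochasticActorRepresentation) and the predefined "CartPole-Discrete" environment; the layer sizes, layer names, and option values are illustrative choices, not recommendations.

```matlab
% Assumption: Reinforcement Learning Toolbox and Deep Learning Toolbox are installed.
env = rlPredefinedEnv("CartPole-Discrete");
obsInfo = getObservationInfo(env);   % 4-element observation
actInfo = getActionInfo(env);        % 2 discrete actions

% Critic: maps an observation to a scalar state value.
criticNetwork = [
    imageInputLayer([4 1 1],'Normalization','none','Name','state')
    fullyConnectedLayer(1,'Name','CriticFC')];
critic = rlValueRepresentation(criticNetwork,obsInfo, ...
    'Observation',{'state'});

% Actor: one output per discrete action.
actorNetwork = [
    imageInputLayer([4 1 1],'Normalization','none','Name','state')
    fullyConnectedLayer(2,'Name','action')];
actor = rlStochasticActorRepresentation(actorNetwork,obsInfo,actInfo, ...
    'Observation',{'state'});

% Agent options (illustrative values).
agentOptions = rlPPOAgentOptions( ...
    'ExperienceHorizon',512, ...
    'ClipFactor',0.2, ...
    'DiscountFactor',0.99);

agent = rlPPOAgent(actor,critic,agentOptions);
```

The options passed as the third argument are stored in the AgentOptions property of the returned agent, so `agent.AgentOptions.ClipFactor` can be inspected after creation.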
train | Train a reinforcement learning agent within a specified environment |
sim | Simulate a trained reinforcement learning agent within a specified environment |
getActor | Get actor representation from reinforcement learning agent |
setActor | Set actor representation of reinforcement learning agent |
getCritic | Get critic representation from reinforcement learning agent |
setCritic | Set critic representation of reinforcement learning agent |
generatePolicyFunction | Create function that evaluates trained policy of reinforcement learning agent |
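As a hedged illustration of how the functions in this table fit together, the sketch below rebuilds a small agent so that it runs by itself, then calls each function in turn. It assumes the same representation-based API and predefined environment as the toolbox release this page describes; the episode counts and step limits are arbitrary small values chosen to keep the run short, not settings that produce a well-trained policy.

```matlab
% Minimal setup (illustrative network sizes and names).
env = rlPredefinedEnv("CartPole-Discrete");
obsInfo = getObservationInfo(env);
actInfo = getActionInfo(env);
criticNet = [
    imageInputLayer([4 1 1],'Normalization','none','Name','state')
    fullyConnectedLayer(1,'Name','value')];
actorNet = [
    imageInputLayer([4 1 1],'Normalization','none','Name','state')
    fullyConnectedLayer(2,'Name','action')];
critic = rlValueRepresentation(criticNet,obsInfo,'Observation',{'state'});
actor = rlStochasticActorRepresentation(actorNet,obsInfo,actInfo, ...
    'Observation',{'state'});
agent = rlPPOAgent(actor,critic,rlPPOAgentOptions);

% Read the representations back from the agent and write them again.
critic = getCritic(agent);
actor = getActor(agent);
agent = setCritic(agent,critic);
agent = setActor(agent,actor);

% Train briefly without progress plots.
trainOpts = rlTrainingOptions('MaxEpisodes',5, ...
    'MaxStepsPerEpisode',200,'Verbose',false,'Plots','none');
trainingStats = train(agent,env,trainOpts);

% Simulate the agent in the environment.
simOpts = rlSimulationOptions('MaxSteps',200);
experience = sim(env,agent,simOpts);

% Writes a policy evaluation function and a data file to the current folder.
generatePolicyFunction(agent);
```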