rlACAgent

Actor-critic reinforcement learning agent

Description

Actor-critic (AC) agents implement actor-critic algorithms such as A2C and A3C, which are model-free, online, on-policy reinforcement learning methods. The agent optimizes the policy (actor) directly and trains a critic to estimate the expected return (future rewards), which it uses to evaluate the actor's policy updates.
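As a rough sketch (not the toolbox implementation), the quantities involved in a one-step advantage actor-critic update can be illustrated on toy numbers:

% illustrative one-step A2C update (all values assumed)
gamma = 0.99;    % discount factor
r = 1;           % reward observed for the transition
Vs = 0.5;        % critic estimate of the current state value V(s)
VsNext = 0.6;    % critic estimate of the next state value V(s')
piA = 0.4;       % actor probability of the action taken

advantage = r + gamma*VsNext - Vs;    % temporal-difference advantage estimate
actorLoss = -log(piA)*advantage;      % policy-gradient (actor) loss term
criticLoss = advantage^2;             % value-regression (critic) loss term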

For more information, see Actor-Critic Agents.

For more information on the different types of reinforcement learning agents, see Reinforcement Learning Agents.

Creation

Description


agent = rlACAgent(actor,critic,agentOptions) creates an actor-critic agent with the specified actor and critic representations and sets the AgentOptions property.

Input Arguments


actor - Actor network representation for the policy, specified as an rlStochasticActorRepresentation object. For more information on creating actor representations, see Create Policy and Value Function Representations.

critic - Critic network representation for estimating the discounted long-term reward, specified as an rlValueRepresentation object. For more information on creating critic representations, see Create Policy and Value Function Representations.

Properties


AgentOptions - Agent options, specified as an rlACAgentOptions object.

Object Functions

train - Train a reinforcement learning agent within a specified environment
sim - Simulate a trained reinforcement learning agent within a specified environment
getActor - Get actor representation from reinforcement learning agent
setActor - Set actor representation of reinforcement learning agent
getCritic - Get critic representation from reinforcement learning agent
setCritic - Set critic representation of reinforcement learning agent
generatePolicyFunction - Create function that evaluates trained policy of reinforcement learning agent
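For example, a minimal round trip with the actor accessor functions looks like the following (illustrative; agent is an rlACAgent such as the one created in the example below):

% get the actor representation from the agent, then set it back
actorRep = getActor(agent);
agent = setActor(agent,actorRep);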

Examples


Create an environment interface and obtain its observation and action specifications.

env = rlPredefinedEnv("CartPole-Discrete");
obsInfo = getObservationInfo(env);
actInfo = getActionInfo(env);
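Optionally, inspect the specifications. For the discrete cart-pole environment, the observation is a 4-element vector and the action takes one of two force values (this check is illustrative):

% inspect the observation and action specifications
obsInfo.Dimension   % dimensions of the observation space
actInfo.Elements    % discrete set of valid actions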

Create a critic representation.

% create the network to be used as approximator in the critic
criticNetwork = [
    imageInputLayer([4 1 1],'Normalization','none','Name','state')
    fullyConnectedLayer(1,'Name','CriticFC')];

% set some options for the critic
criticOpts = rlRepresentationOptions('LearnRate',8e-3,'GradientThreshold',1);

% create the critic
critic = rlValueRepresentation(criticNetwork,obsInfo,'Observation',{'state'},criticOpts);
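To sanity-check the critic, you can evaluate it for a random observation with getValue (an illustrative check, assuming your release supports getValue on value representations):

% evaluate the critic for a random 4-element observation
getValue(critic,{rand(4,1)})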

Create an actor representation.

% create the network to be used as approximator in the actor
actorNetwork = [
    imageInputLayer([4 1 1],'Normalization','none','Name','state')
    fullyConnectedLayer(2,'Name','action')];

% set some options for the actor
actorOpts = rlRepresentationOptions('LearnRate',8e-3,'GradientThreshold',1);

% create the actor
actor = rlStochasticActorRepresentation(actorNetwork,obsInfo,actInfo,...
    'Observation',{'state'},actorOpts);
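You can also query the actor directly with getAction, which accepts a representation as well as an agent (illustrative check):

% sample an action from the stochastic actor for a random observation
getAction(actor,{rand(4,1)})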

Specify agent options, and create an AC agent using the actor, the critic, and the agent options.

agentOpts = rlACAgentOptions('NumStepsToLookAhead',32,'DiscountFactor',0.99);
agent = rlACAgent(actor,critic,agentOpts)
agent = 
  rlACAgent with properties:

    AgentOptions: [1x1 rl.option.rlACAgentOptions]

To check your agent, use getAction to return the action from a random observation.

getAction(agent,{rand(4,1)})
ans = -10

You can now test and train the agent against the environment.
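For example, training and simulation calls might look like the following (the option values are illustrative, not tuned):

% train the agent, then simulate one episode with the trained agent
trainOpts = rlTrainingOptions('MaxEpisodes',1000,'MaxStepsPerEpisode',500);
trainStats = train(agent,env,trainOpts);

simOpts = rlSimulationOptions('MaxSteps',500);
experience = sim(env,agent,simOpts);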

Introduced in R2019a