Obtain action from agent or actor representation given environment observations

agentAction = getAction(agent,obs) returns the action derived from the policy of a reinforcement learning agent given environment observations.

actorAction = getAction(actorRep,obs) returns the action derived from the policy representation actorRep given environment observations obs.

[actorAction,nextState] = getAction(actorRep,obs) also returns the updated state of the actor representation when the actor uses a recurrent neural network as a function approximator.
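The typical usage pattern is to call getAction inside a simulation loop, passing the current observation wrapped in a cell array. The following is a minimal sketch (not one of the documented examples); it assumes env and agent objects like those created in the first example below, and that reset and step are available on the environment object.

% Minimal interaction-loop sketch (assumes env and agent as in the first example below)
obs = reset(env);                               % initial observation (numeric array)
for k = 1:100
    action = getAction(agent,{obs});            % query the agent policy for one action
    [obs,reward,isDone] = step(env,action);     % apply the action and advance the environment
    if isDone
        break
    end
end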
Create an environment interface and obtain its observation and action specifications. For this example, load the predefined environment used for the discrete cart-pole system.
env = rlPredefinedEnv("CartPole-Discrete");
obsInfo = getObservationInfo(env);
actInfo = getActionInfo(env);
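As a quick illustrative check (not part of the original example), you can inspect the returned specification objects. For this predefined environment, obsInfo is a numeric specification describing a four-element observation vector, and actInfo is a finite-set specification listing the allowed force values.

obsInfo.Dimension   % dimensions of the observation channel
actInfo.Elements    % finite set of allowed discrete actions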
Create a deep neural network to use as the critic approximator.
statePath = [
    featureInputLayer(4,'Normalization','none','Name','state')
    fullyConnectedLayer(24,'Name','CriticStateFC1')
    reluLayer('Name','CriticRelu1')
    fullyConnectedLayer(24,'Name','CriticStateFC2')];
actionPath = [
    featureInputLayer(1,'Normalization','none','Name','action')
    fullyConnectedLayer(24,'Name','CriticActionFC1')];
commonPath = [
    additionLayer(2,'Name','add')
    reluLayer('Name','CriticCommonRelu')
    fullyConnectedLayer(1,'Name','output')];
criticNetwork = layerGraph(statePath);
criticNetwork = addLayers(criticNetwork,actionPath);
criticNetwork = addLayers(criticNetwork,commonPath);
criticNetwork = connectLayers(criticNetwork,'CriticStateFC2','add/in1');
criticNetwork = connectLayers(criticNetwork,'CriticActionFC1','add/in2');
Create a representation for the critic.
criticOpts = rlRepresentationOptions('LearnRate',0.01,'GradientThreshold',1);
critic = rlQValueRepresentation(criticNetwork,obsInfo,actInfo,...
    'Observation',{'state'},'Action',{'action'},criticOpts);
Specify agent options, and create a DQN agent using the critic representation and the agent options.
agentOpts = rlDQNAgentOptions(...
    'UseDoubleDQN',false, ...
    'TargetUpdateMethod',"periodic", ...
    'TargetUpdateFrequency',4, ...
    'ExperienceBufferLength',100000, ...
    'DiscountFactor',0.99, ...
    'MiniBatchSize',256);
agent = rlDQNAgent(critic,agentOpts);
Obtain a discrete action from the agent for a single observation. For this example, use a random observation array.
act = getAction(agent,{rand(4,1)})
act = 10
You can also obtain actions for a batch of observations. For example, obtain actions for a batch of 10 observations.
actBatch = getAction(agent,{rand(4,1,10)});
size(actBatch)
ans = 1×2
1 10
actBatch contains one action for each observation in the batch, with each action being one of the possible discrete actions.
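As an optional sanity check (not part of the original example), you can verify that every action in the batch belongs to the finite action set defined by actInfo.

all(ismember(actBatch(:),actInfo.Elements))   % returns logical 1 (true) if every action is valid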
Create observation and action information. You can also obtain these specifications from an environment.
obsinfo = rlNumericSpec([4 1]);
actinfo = rlNumericSpec([2 1]);
numObs = obsinfo.Dimension(1);
numAct = actinfo.Dimension(1);
Create a deep neural network for the actor.
net = [featureInputLayer(4,'Normalization','none','Name','state')
    fullyConnectedLayer(10,'Name','fc1')
    reluLayer('Name','relu1')
    fullyConnectedLayer(20,'Name','CriticStateFC2')
    fullyConnectedLayer(numAct,'Name','action')
    tanhLayer('Name','tanh1')];
Create a deterministic actor representation for the network.
actorOptions = rlRepresentationOptions('LearnRate',1e-3,'GradientThreshold',1);
actor = rlDeterministicActorRepresentation(net,obsinfo,actinfo,...
    'Observation',{'state'},'Action',{'tanh1'},actorOptions);
Obtain an action from this actor for a random batch of 10 observations.
act = getAction(actor,{rand(4,1,10)})
act = 1x1 cell array
{2x1x10 single}
act contains the two computed actions for all 10 observations in the batch.
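If you need the actions as a plain numeric array (an illustrative follow-up, not part of the original example), extract the contents of the cell and remove the singleton dimension.

actArray = act{1};              % 2-by-1-by-10 single array of actions
actMatrix = squeeze(actArray)   % 2-by-10 matrix, one column of two action values per observation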
agent — Reinforcement learning agent
rlQAgent | rlSARSAAgent | rlDQNAgent | rlPGAgent | rlDDPGAgent | rlTD3Agent | rlACAgent | rlPPOAgent

Reinforcement learning agent, specified as one of the following objects:
rlQAgent
rlSARSAAgent
rlDQNAgent
rlPGAgent
rlDDPGAgent
rlTD3Agent
rlACAgent
rlPPOAgent
actorRep — Actor representation
rlDeterministicActorRepresentation object | rlStochasticActorRepresentation object

Actor representation, specified as either an rlDeterministicActorRepresentation or rlStochasticActorRepresentation object.
obs — Environment observations

Environment observations, specified as a cell array with as many elements as there are observation input channels. Each element of obs contains an array of observations for a single observation input channel.

The dimensions of each element in obs are MO-by-LB-by-LS, where:

MO corresponds to the dimensions of the associated observation input channel.

LB is the batch size. To specify a single observation, set LB = 1. To specify a batch of observations, specify LB > 1. If agent or actorRep has multiple observation input channels, then LB must be the same for all elements of obs.

LS specifies the sequence length for a recurrent neural network. If agent or actorRep does not use a recurrent neural network, then LS = 1. If agent or actorRep has multiple observation input channels, then LS must be the same for all elements of obs.
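For illustration (a sketch assuming a single observation channel with a [4 1] specification and an agent such as the DQN agent from the first example), the observation cell array can be assembled as follows.

obsSingle = {rand(4,1)};               % single observation: LB = 1, LS = 1
obsBatch  = {rand(4,1,32)};            % batch of 32 independent observations: LB = 32
actBatch  = getAction(agent,obsBatch); % one action per observation in the batch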
agentAction — Action value from agent

Action value from agent, returned as an array with dimensions MA-by-LB-by-LS, where:

MA corresponds to the dimensions of the associated action specification.

LB is the batch size.

LS is the sequence length for recurrent neural networks. If the actor and critic in agent do not use recurrent neural networks, then LS = 1.
Note

When agents such as rlACAgent, rlPGAgent, or rlPPOAgent use an rlStochasticActorRepresentation actor with a continuous action space, the constraints set by the action specification are not enforced by the agent. In these cases, you must enforce action space constraints within the environment.
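One common way to enforce such constraints is to saturate the incoming action inside a custom environment step function. The line below is a sketch under the assumption that actInfo is a continuous rlNumericSpec with finite LowerLimit and UpperLimit properties.

% Sketch: clip each action element to the limits defined by the action specification
action = min(max(action,actInfo.LowerLimit),actInfo.UpperLimit);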
actorAction — Action value from actor representation

Action value from actor representation, returned as a single-element cell array that contains an array of dimensions MA-by-LB-by-LS, where:

MA corresponds to the dimensions of the action specification.

LB is the batch size.

LS is the sequence length for a recurrent neural network. If actorRep does not use a recurrent neural network, then LS = 1.
Note

rlStochasticActorRepresentation actors with continuous action spaces do not enforce constraints set by the action specification. In these cases, you must enforce action space constraints within the environment.
nextState — Actor representation updated state

Actor representation updated state, returned as a cell array. If actorRep does not use a recurrent neural network, then nextState is an empty cell array.

You can set the state of the representation to nextState using the setState function. For example:

actorRep = setState(actorRep,nextState);
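For example, a sketch of stepping a recurrent actor through a sequence one observation at a time might look like the following. Here obsSequence is a hypothetical 4-by-T numeric array of observations, and actorRep is assumed to use a recurrent network.

% Sketch: propagate the recurrent state manually across time steps
for t = 1:size(obsSequence,2)
    obs = {obsSequence(:,t)};                        % one 4-by-1 observation in a cell array
    [action,nextState] = getAction(actorRep,obs);    % action plus updated hidden state
    actorRep = setState(actorRep,nextState);         % carry the state to the next time step
end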