This example shows how to create a policy evaluation function for a PG agent.
First, create and train a reinforcement learning agent. For this example, load the PG agent trained in Train PG Agent to Balance Cart-Pole System:
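The loading step might look like the following minimal sketch. The MAT-file name `MATLABCartpolePG.mat` and the variable name `agent` are assumptions; substitute the file and variable under which your trained agent was saved.

```matlab
% Assumption: the trained PG agent was saved in MATLABCartpolePG.mat
% under the variable name "agent". Adjust both names to your setup.
load('MATLABCartpolePG.mat','agent')
```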
Then, create a policy evaluation function for this agent using default names:
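A minimal sketch of this step, assuming a trained agent object named `agent` already exists in the workspace (the variable name is an assumption). Calling `generatePolicyFunction` with only the agent uses the default function and data file names.

```matlab
% Assumption: "agent" is a trained PG agent in the workspace.
% With no extra arguments, the default output names are used.
generatePolicyFunction(agent)
```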
This command creates the evaluatePolicy.m file, which contains the policy function, and the agentData.mat file, which contains the trained deep neural network actor.
View the generated function.
function action1 = evaluatePolicy(observation1)
%#codegen
% Reinforcement Learning Toolbox
% Generated on: 20-Aug-2020 17:00:53
actionSet = [-10 10];
% Select action from sampled probabilities
probabilities = localEvaluate(observation1);
% Normalize the probabilities
p = probabilities(:)'/sum(probabilities);
% Determine which action to take
edges = min([0 cumsum(p)],1);
edges(end) = 1;
[~,actionIndex] = histc(rand(1,1),edges); %#ok<HISTC>
action1 = actionSet(actionIndex);
end
%% Local Functions
function probabilities = localEvaluate(observation1)
persistent policy
if isempty(policy)
    policy = coder.loadDeepLearningNetwork('agentData.mat','policy');
end
observation1 = observation1(:);
probabilities = predict(policy,observation1);
end
For a given observation, the policy function evaluates a probability for each potential action using the actor network. Then, the policy function randomly selects an action based on these probabilities.
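The sampling step can be tried in isolation, without the trained network. The sketch below replaces the actor output with a hypothetical probability vector `[0.3; 0.7]` (an assumption for illustration only) and uses `discretize` in place of `histc` to pick the bin; the normalization and edge construction mirror the generated function.

```matlab
% Hypothetical actor output standing in for localEvaluate(observation1)
actionSet = [-10 10];
probabilities = [0.3; 0.7];

% Normalize and build cumulative bin edges, as in the generated function
p = probabilities(:)'/sum(probabilities);
edges = min([0 cumsum(p)],1);
edges(end) = 1;

% Draw many samples to check that actions follow the probabilities
rng(0)                                   % fixed seed for repeatability
N = 10000;
actionIndex = discretize(rand(1,N),edges);
actions = actionSet(actionIndex);

% Fraction of samples that chose the second action (expected near 0.7)
fractionSecond = mean(actions == actionSet(2));
```

Each uniform random draw falls into the bin whose width equals the probability of the corresponding action, so over many draws the fraction of each action approaches its probability.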
Because the actor network for this PG agent has a single input layer and a single output layer, you can generate code for this network using the Deep Learning Toolbox™ code generation functionality. For more information, see Deploy Trained Reinforcement Learning Policies.
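As a rough sketch of what code generation could look like, the commands below build a MEX function from the generated policy function. This assumes that MATLAB Coder and the MKL-DNN deep learning support are installed, that evaluatePolicy.m and agentData.mat are on the path, and that the observation is a 4-by-1 double vector. The observation size and the choice of the `mkldnn` target are assumptions; adapt them to your environment and hardware.

```matlab
% Assumptions: MATLAB Coder is available, the generated files are on
% the path, and the observation is a 4-by-1 double vector.
cfg = coder.config('mex');                                    % MEX build target
cfg.TargetLang = 'C++';                                       % deep learning codegen requires C++
cfg.DeepLearningConfig = coder.DeepLearningConfig('mkldnn');  % assumed CPU library target

% Generate evaluatePolicy_mex for the assumed input size
codegen -config cfg evaluatePolicy -args {ones(4,1)} -report
```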