Create Simulink model for reinforcement learning, using reference model as environment
env = createIntegratedEnv(refModel,newModel) creates a Simulink® model with the name specified by newModel and returns a reinforcement learning environment object, env, for this model. The new model contains an RL Agent block and uses the reference model refModel as a reinforcement learning environment for training the agent specified by this block.
[env,agentBlock,obsInfo,actInfo] = createIntegratedEnv(___) returns the block path to the RL Agent block in the new model and the observation and action data specifications for the reference model, obsInfo and actInfo, respectively.
[___] = createIntegratedEnv(___,Name,Value) creates a model and environment interface using port, observation, and action information specified using one or more Name,Value pair arguments.
This example shows how to use createIntegratedEnv to create an environment object starting from a Simulink model that implements the system the agent needs to interact with. Such a system is often referred to as the plant, open loop system, or reference system, while the whole (integrated) system including the agent is often referred to as the closed loop system.
For this example, use the flying robot model described in Train DDPG Agent to Control Flying Robot as the reference (open loop) system.
Open the flying robot model.
open_system('rlFlyingRobotEnv');
Initialize state variables and sample time.
% initial model state variables
theta0 = 0;
x0 = -15;
y0 = 0;

% sample time
Ts = 0.4;
Create the Simulink model IntegratedEnv containing the flying robot model connected in a closed loop to the agent block. The function also returns the reinforcement learning environment object env, to be used for training.
env=createIntegratedEnv('rlFlyingRobotEnv','IntegratedEnv')
env = 
  SimulinkEnvWithAgent with properties:

             Model : IntegratedEnv
        AgentBlock : IntegratedEnv/RL Agent
          ResetFcn : []
    UseFastRestart : on
The function can also return the block path to the RL Agent block in the new integrated model, as well as the observation and action data specifications for the reference model.
[~,agentBlk,observationInfo,actionInfo]=createIntegratedEnv('rlFlyingRobotEnv','IntegratedEnv')
agentBlk = 'IntegratedEnv/RL Agent'
observationInfo = 
  rlNumericSpec with properties:

     LowerLimit: -Inf
     UpperLimit: Inf
           Name: "observation"
    Description: [0x0 string]
      Dimension: [7 1]
       DataType: "double"

actionInfo = 
  rlNumericSpec with properties:

     LowerLimit: -Inf
     UpperLimit: Inf
           Name: "action"
    Description: [0x0 string]
      Dimension: [2 1]
       DataType: "double"
This is useful when you need to modify descriptions, limits, or names in observationInfo and actionInfo and later create an environment from the integrated model IntegratedEnv using the function rlSimulinkEnv.
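As a minimal sketch of that workflow, assuming the variables agentBlk, observationInfo, and actionInfo from the call above are still in the workspace and the IntegratedEnv model is still loaded, you could edit the specifications and then build the environment with rlSimulinkEnv. The names, description, and limit values below are illustrative assumptions, not values taken from the flying robot example.

```matlab
% Illustrative edits to the specifications (all values are assumptions).
observationInfo.Name = "robot observations";
observationInfo.Description = "flying robot observation vector";
actionInfo.Name = "thrusts";
actionInfo.LowerLimit = -1;
actionInfo.UpperLimit = 1;

% Create the environment from the integrated model and the edited specifications.
env = rlSimulinkEnv('IntegratedEnv',agentBlk,observationInfo,actionInfo);
```

Editing the specifications before calling rlSimulinkEnv keeps the limits and names used by the environment consistent with those used later to create the agent.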
This example shows how to call the function createIntegratedEnv using name-value pairs to create an integrated (closed loop) Simulink environment and the corresponding environment object.
The first argument of the createIntegratedEnv function is the name of the reference Simulink model, which contains the system that the agent needs to interact with. Such a system is often referred to as the plant or open loop system. For this example, the reference system is the model of a water tank.
Open the open loop water tank model.
open_system('rlWatertankOpenloop.slx');
Set the sampling time of the discrete integrator block used to generate the observation, so the simulation can run.
Ts=1;
Since the input port is called u (instead of action), and the first and third output ports are called y and stop (instead of observation and isdone), use name-value pairs to specify the correct port names when calling the function createIntegratedEnv.
env=createIntegratedEnv('rlWatertankOpenloop','IntegratedWatertank','ActionPortName','u','ObservationPortName','y','IsDonePortName','stop')
env = 
  SimulinkEnvWithAgent with properties:

             Model : IntegratedWatertank
        AgentBlock : IntegratedWatertank/RL Agent
          ResetFcn : []
    UseFastRestart : on
This creates the new model IntegratedWatertank, which contains the reference model connected in a closed loop to the agent block. The function also returns the reinforcement learning environment object env, to be used for training.
refModel — Reference model name
Reference model name, specified as a string or character vector. This is the Simulink model implementing the system that the agent needs to interact with. Such a system is often referred to as the plant, open loop system, or reference system, while the whole (integrated) system including the agent is often referred to as the closed loop system. The new Simulink model uses this reference model as the dynamic model of the environment for reinforcement learning.
newModel — New model name
New model name, specified as a string or character vector. createIntegratedEnv creates a Simulink model with this name, but does not save the model.
Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.
Example: 'IsDonePortName',"stopSim" sets the stopSim port of the reference model as the source of the isdone signal.
'ObservationPortName' — Reference model observation output port name
"observation" (default) | string | character vector
Reference model observation output port name, specified as the comma-separated pair consisting of 'ObservationPortName' and a string or character vector. Specify ObservationPortName when the name of the observation output port of the reference model is not "observation".
'ActionPortName' — Reference model action input port name
"action" (default) | string | character vector
Reference model action input port name, specified as the comma-separated pair consisting of 'ActionPortName' and a string or character vector. Specify ActionPortName when the name of the action input port of the reference model is not "action".
'RewardPortName' — Reference model reward output port name
"reward" (default) | string | character vector
Reference model reward output port name, specified as the comma-separated pair consisting of 'RewardPortName' and a string or character vector. Specify RewardPortName when the name of the reward output port of the reference model is not "reward".
'IsDonePortName' — Reference model done flag output port name
"isdone" (default) | string | character vector
Reference model done flag output port name, specified as the comma-separated pair consisting of 'IsDonePortName' and a string or character vector. Specify IsDonePortName when the name of the done flag output port of the reference model is not "isdone".
'ObservationBusElementNames' — Names of observation bus leaf elements
Names of observation bus leaf elements for which to create specifications, specified as a string array. To create observation specifications for a subset of the elements in a Simulink bus object, specify ObservationBusElementNames. If you do not specify ObservationBusElementNames, a data specification is created for each leaf element in the bus.
ObservationBusElementNames is applicable only when the observation output port is a bus signal.
Example: 'ObservationBusElementNames',["sin" "cos"] creates specifications for the observation bus elements with the names "sin" and "cos".
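A call that selects a subset of bus elements might look like the following sketch. The model names myPlantWithBus and myIntegratedBusEnv and the port name obsBus are hypothetical; they stand for a reference model whose observation output port is a bus that contains leaf elements named "sin" and "cos".

```matlab
% Hypothetical reference model: its observation output port "obsBus" is a bus
% with (among others) leaf elements named "sin" and "cos".
[env,agentBlk,obsInfo,actInfo] = createIntegratedEnv( ...
    'myPlantWithBus','myIntegratedBusEnv', ...
    'ObservationPortName','obsBus', ...
    'ObservationBusElementNames',["sin" "cos"]);

% Inspect the returned observation specifications; under these assumptions
% there should be one specification for each selected bus element.
disp(obsInfo)
```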
'ObservationDiscreteElements' — Finite values for observation specifications
Finite values for discrete observation specification elements, specified as the comma-separated pair consisting of 'ObservationDiscreteElements' and a cell array of name-value pairs. Each name-value pair consists of an element name and an array of discrete values.
If the observation output port of the reference model is:
A bus signal, specify the name of one of the leaf elements of the bus specified by ObservationBusElementNames
A nonbus signal, specify the name of the observation port, as specified by ObservationPortName
The specified discrete values must be castable to the data type of the specified observation signal.
If you do not specify discrete values for an observation specification element, the element is continuous.
Example: 'ObservationDiscreteElements',{'observation',[-1 0 1]} specifies discrete values for a nonbus observation signal with default port name observation.
Example: 'ObservationDiscreteElements',{'gear',[-1 0 1 2],'direction',[1 2 3 4]} specifies discrete values for the 'gear' and 'direction' leaf elements of a bus observation signal.
'ActionDiscreteElements' — Finite values for action specifications
Finite values for discrete action specification elements, specified as the comma-separated pair consisting of 'ActionDiscreteElements' and a cell array of name-value pairs. Each name-value pair consists of an element name and an array of discrete values.
If the action input port of the reference model is:
A bus signal, specify the name of a leaf element of the bus
A nonbus signal, specify the name of the action port, as specified by ActionPortName
The specified discrete values must be castable to the data type of the specified action signal.
If you do not specify discrete values for an action specification element, the element is continuous.
Example: 'ActionDiscreteElements',{'action',[-1 0 1]} specifies discrete values for a nonbus action signal with default port name 'action'.
Example: 'ActionDiscreteElements',{'force',[-10 0 10],'torque',[-5 0 5]} specifies discrete values for the 'force' and 'torque' leaf elements of a bus action signal.
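As a sketch of how this argument combines with the port-name arguments, the water tank example above could be given a discrete action set. The new model name IntegratedWatertankDiscrete and the action values [0 5 10] are illustrative assumptions. Because the action port is renamed through ActionPortName, the element name passed to ActionDiscreteElements is 'u'.

```matlab
% Sample time required by the water tank reference model (from the example above).
Ts = 1;

% Restrict the action on port "u" to three illustrative values.
[env,agentBlk,obsInfo,actInfo] = createIntegratedEnv( ...
    'rlWatertankOpenloop','IntegratedWatertankDiscrete', ...
    'ActionPortName','u', ...
    'ObservationPortName','y', ...
    'IsDonePortName','stop', ...
    'ActionDiscreteElements',{'u',[0 5 10]});

% Inspect the returned action specification.
disp(actInfo)
```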
env — Reinforcement learning environment
SimulinkEnvWithAgent object
Reinforcement learning environment interface, returned as a SimulinkEnvWithAgent object.
agentBlock — Block path to the agent block
Block path to the agent block in the new model, returned as a character vector. To train an agent in the new Simulink model, you must create an agent and specify the agent name in the RL Agent block indicated by agentBlock.
For more information on creating agents, see Reinforcement Learning Agents.
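The following sketch shows one way this step might look, assuming the variables agentBlk, observationInfo, and actionInfo returned in the flying robot example are still in the workspace. It also assumes a release that supports creating a default agent directly from the specifications, and that the agent variable name is set through the 'Agent' parameter of the RL Agent block. The variable name agent is arbitrary.

```matlab
% Create a default DDPG agent from the specifications returned by
% createIntegratedEnv (assumes default agent creation is available).
agent = rlDDPGAgent(observationInfo,actionInfo);

% Point the RL Agent block at the workspace variable named "agent"
% (assumes the block parameter is named 'Agent').
set_param(agentBlk,'Agent','agent');
```

DDPG is used here only because the flying robot action specification is continuous. Any agent type compatible with the returned specifications could take its place.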
obsInfo — Observation data specifications
rlNumericSpec object | rlFiniteSetSpec object | array of data specification objects
Observation data specifications, returned as one of the following:
rlNumericSpec object for a single continuous observation specification
rlFiniteSetSpec object for a single discrete observation specification
Array of data specification objects for multiple observation specifications
actInfo — Action data specifications
rlNumericSpec object | rlFiniteSetSpec object | array of data specification objects
Action data specifications, returned as one of the following:
rlNumericSpec object for a single continuous action specification
rlFiniteSetSpec object for a single discrete action specification
Array of data specification objects for multiple action specifications