rlFunctionEnv

Specify custom reinforcement learning environment dynamics using functions

Description

Use rlFunctionEnv to define a custom reinforcement learning environment. You provide MATLAB® functions that define the step and reset behavior for the environment. This object is useful when you want to customize your environment beyond the predefined environments available with rlPredefinedEnv.

Creation

Description


env = rlFunctionEnv(obsInfo,actInfo,stepfcn,resetfcn) creates a reinforcement learning environment using the observation specification and action specification you provide. You also provide your own MATLAB functions that define step and reset behavior for the environment.

Input Arguments


Observation specification, specified as a reinforcement learning spec object created with a spec command such as rlFiniteSetSpec or rlNumericSpec. This specification defines such information about the observations as the dimensions and names of the observation signals.

Action specification, specified as a reinforcement learning spec object created with a spec command such as rlFiniteSetSpec or rlNumericSpec. The specification defines such information about the actions as the dimensions and names of the action signals.

Step behavior for the environment, specified as a function name, function handle, or anonymous function. stepfcn sets the value of the StepFcn property.

Reset behavior for the environment, specified as a function name, function handle, or anonymous function. resetfcn sets the value of the ResetFcn property.

Properties


Step behavior for the environment, specified as a function name, function handle, or anonymous function. When you create an rlFunctionEnv object, the stepfcn input argument sets the value of this property.

StepFcn is a function that you provide which describes how the environment advances to the next state from a given action. This function must have two inputs and four outputs, as illustrated by the following signature:

[Observation,Reward,IsDone,LoggedSignals] = myStepFunction(Action,LoggedSignals)

Thus, the step function computes the values of the observation and reward for the given action in the environment. The required input and output arguments are:

  • Action and Observation — The current action and the returned observation. These values must match the dimensions and data types specified in actInfo and obsInfo, respectively.

  • Reward — Reward for the current step, returned as a scalar value.

  • IsDone — Logical value indicating whether to end the simulation episode. The step function that you define can include logic to decide whether to end the simulation based on the observation, reward, or any other values.

  • LoggedSignals — Any data that you want to pass from one step to the next, specified as a structure.

To use additional input arguments beyond this required set, specify StepFcn using a function handle or an anonymous function. For an example showing multiple ways to define a step function, see Create MATLAB Environment Using Custom Functions.
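As a sketch of this contract, the following step function implements a simple point mass on a line. The dynamics, state names, and reward here are illustrative assumptions, not the supplied myStepFunction.m file.

```matlab
function [Observation,Reward,IsDone,LoggedSignals] = myStepFunction(Action,LoggedSignals)
% Illustrative step function for a point mass on a line (hypothetical example).
% The environment state is carried between steps in LoggedSignals.State,
% stored as [position; velocity].
Ts = 0.1;                            % sample time (assumed)
state = LoggedSignals.State;
velocity = state(2) + Ts*Action;     % treat the action as an acceleration
position = state(1) + Ts*velocity;
LoggedSignals.State = [position; velocity];

Observation = LoggedSignals.State;   % must match the dimensions in obsInfo
Reward = -abs(position);             % penalize distance from the origin
IsDone = abs(position) > 10;         % end the episode if the mass leaves bounds
end
```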

Reset behavior for the environment, specified as a function name, function handle, or anonymous function. When you create an rlFunctionEnv object, the resetfcn input argument sets the value of this property.

The reset function that you provide must have no inputs and two outputs, as illustrated by the following signature.

[InitialObservation,LoggedSignals] = myResetFunction

Thus, the reset function computes the initial values of the observation signals. For instance, sim calls the reset function to reset the environment at the start of each simulation, and train calls it at the start of each training episode. Therefore, you might create a reset function that randomizes certain state values, such that each training episode begins from different initial conditions.

The InitialObservation output must match the dimensions and data type of obsInfo.

To pass information from the reset condition into the first step, specify that information in the reset function as the output structure LoggedSignals.

To use input arguments with your reset function, specify ResetFcn using a function handle or an anonymous function. For an example showing multiple ways to define a reset function, see Create MATLAB Environment Using Custom Functions.
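A matching reset function for the illustrative point-mass environment might look like the following. The randomized initial position is an assumption for illustration; this is not the supplied myResetFunction.m file.

```matlab
function [InitialObservation,LoggedSignals] = myResetFunction
% Illustrative reset function (hypothetical example). Start each episode
% from a random position near the origin with zero velocity.
position = 0.5*(rand - 0.5);
LoggedSignals.State = [position; 0];       % passed to the first call of the step function
InitialObservation = LoggedSignals.State;  % must match the dimensions in obsInfo
end
```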

Information to pass to the next step, specified as a structure. When you create the environment, whatever you define as the LoggedSignals output of resetfcn initializes this property. When a step occurs, the software populates this property with data to pass to the next step, as defined in stepfcn.

Object Functions

getActionInfo - Obtain action data specifications from reinforcement learning environment or agent
getObservationInfo - Obtain observation data specifications from reinforcement learning environment or agent
sim - Simulate trained reinforcement learning agents within specified environment
validateEnvironment - Validate custom reinforcement learning environment

Examples


Create a reinforcement learning environment by supplying custom dynamic functions in MATLAB®. Using rlFunctionEnv, you can create a MATLAB reinforcement learning environment from an observation specification, action specification, and step and reset functions that you define.

For this example, create an environment that represents a system for balancing a cart on a pole. The observations from the environment are the cart position, cart velocity, pendulum angle, and pendulum angle derivative. (For additional details about this environment, see Create MATLAB Environment Using Custom Functions.) Create an observation specification for those signals.

oinfo = rlNumericSpec([4 1]);
oinfo.Name = 'CartPole States';
oinfo.Description = 'x, dx, theta, dtheta';

The environment has a discrete action space where the agent can apply one of two possible force values to the cart, –10 N or 10 N. Create the action specification for those actions.

ActionInfo = rlFiniteSetSpec([-10 10]);
ActionInfo.Name = 'CartPole Action';

Next, specify the custom step and reset functions. For this example, use the supplied functions myResetFunction.m and myStepFunction.m. For details about these functions and how they are constructed, see Create MATLAB Environment Using Custom Functions.

Construct the custom environment using the defined observation specification, action specification, and function names.

env = rlFunctionEnv(oinfo,ActionInfo,'myStepFunction','myResetFunction');

You can create agents for env and train them within the environment as you would for any other reinforcement learning environment.
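Before training, you can optionally confirm that your step and reset functions are consistent with the observation and action specifications by calling validateEnvironment. This check is not part of the original example.

```matlab
validateEnvironment(env)
```

If validation passes, the function returns silently; otherwise it issues an error describing the mismatch.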

As an alternative to using function names, you can specify the functions as function handles. For more details and an example, see Create MATLAB Environment Using Custom Functions.
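For instance, the same environment could be constructed with function handles, equivalent to the name-based call above:

```matlab
env = rlFunctionEnv(oinfo,ActionInfo,@myStepFunction,@myResetFunction);
```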

Introduced in R2019a