rlQAgentOptions

Options for Q-learning agent

Description

Use an rlQAgentOptions object to specify options for creating Q-learning agents. To create a Q-learning agent, use rlQAgent.

For more information on Q-learning agents, see Q-Learning Agents.

For more information on the different types of reinforcement learning agents, see Reinforcement Learning Agents.

Creation

Syntax

opt = rlQAgentOptions

opt = rlQAgentOptions(Name,Value)

Description

opt = rlQAgentOptions creates an rlQAgentOptions object with all default settings, for use as an argument when creating a Q-learning agent. You can modify the object properties using dot notation.

opt = rlQAgentOptions(Name,Value) sets option properties using name-value pairs. For example, rlQAgentOptions('DiscountFactor',0.95) creates an option set with a discount factor of 0.95. You can specify multiple name-value pairs. Enclose each property name in quotes.
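As a brief sketch of the multiple name-value syntax, the following call sets two of the properties documented on this page in one step. It assumes Reinforcement Learning Toolbox is installed; the specific values are illustrative only.

```matlab
% Set two documented properties in a single call (values are illustrative).
opt = rlQAgentOptions('DiscountFactor',0.95,'SampleTime',0.5);

% Read the values back using dot notation.
gamma = opt.DiscountFactor;
ts    = opt.SampleTime;
```

Properties that are not named in the call keep their default values.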

Properties

`EpsilonGreedyExploration` — Options for epsilon greedy exploration
`EpsilonGreedyExploration` object

Options for epsilon greedy exploration, specified as an EpsilonGreedyExploration object with the following numeric value properties.

Property	Description
`Epsilon`	Probability threshold to either randomly select an action or select the action that maximizes the state-action value function. A larger value of `Epsilon` means that the agent randomly explores the action space at a higher rate.
`EpsilonMin`	Minimum value of `Epsilon`
`EpsilonDecay`	Decay rate applied to `Epsilon`

Epsilon is updated using the following formula when it is greater than EpsilonMin:

Epsilon = Epsilon*(1-EpsilonDecay)
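To make the update rule concrete, here is a minimal sketch in plain MATLAB that applies the formula repeatedly. The starting values and the number of steps are assumptions chosen for illustration, not toolbox defaults, and the variable names are hypothetical. The loop follows the rule exactly as stated on this page, so it does not clamp the result to EpsilonMin; whether the toolbox does so internally is not stated here.

```matlab
% Illustrative values only (assumptions, not taken from the toolbox).
epsilon      = 0.9;
epsilonMin   = 0.1;
epsilonDecay = 0.05;
numSteps     = 100;

history = zeros(1,numSteps);
for k = 1:numSteps
    % Apply the documented rule only while Epsilon exceeds EpsilonMin.
    if epsilon > epsilonMin
        epsilon = epsilon*(1-epsilonDecay);
    end
    history(k) = epsilon;   % record the value after step k
end
```

After the first update, history(1) is 0.9*(1-0.05) = 0.855. The sequence then shrinks geometrically and stops changing once it is no longer greater than epsilonMin.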

To specify exploration options, use dot notation after creating the rlQAgentOptions object. For example, set the probability threshold to 0.9.

opt = rlQAgentOptions;
opt.EpsilonGreedyExploration.Epsilon = 0.9;
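The other two exploration properties listed in the table can be set the same way. This is a sketch that assumes Reinforcement Learning Toolbox is installed; the values are illustrative, not recommendations.

```matlab
opt = rlQAgentOptions;

% Configure all three exploration properties using dot notation.
opt.EpsilonGreedyExploration.Epsilon      = 0.9;    % initial probability threshold
opt.EpsilonGreedyExploration.EpsilonMin   = 0.05;   % lower bound for Epsilon
opt.EpsilonGreedyExploration.EpsilonDecay = 0.01;   % decay rate per update
```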

`SampleTime` — Sample time of agent
`1` (default) | positive scalar

Sample time of agent, specified as a positive scalar.

Within a Simulink environment, the agent is executed every SampleTime seconds of simulation time.

Within a MATLAB environment, the agent is executed every time the environment advances. In this case, SampleTime is the time interval between consecutive elements in the output experience returned by sim or train.

`DiscountFactor` — Discount factor
`0.99` (default) | positive scalar less than or equal to 1

Discount factor applied to future rewards during training, specified as a positive scalar less than or equal to 1.
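To illustrate what discounting means, the sketch below computes a discounted sum of rewards in plain MATLAB, using the standard definition in which the reward k steps ahead is weighted by the discount factor raised to the power k. The reward sequence and variable names are hypothetical, and the code is not a description of the agent's internal training computation.

```matlab
% Hypothetical reward sequence, for illustration only.
rewards = [1 1 1 1];
gamma   = 0.95;   % example discount factor, within the documented range (0,1]

% Weight the reward at step k (k = 0,1,2,...) by gamma^k and sum.
weights = gamma.^(0:numel(rewards)-1);
discountedReturn = sum(weights.*rewards);   % 1 + 0.95 + 0.9025 + 0.857375 = 3.709875
```

A discount factor closer to 1 gives later rewards nearly the same weight as immediate ones, while a smaller value emphasizes short-term rewards.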

Object Functions

rlQAgent	Q-learning reinforcement learning agent

Examples

Create Q-Learning Agent Options Object

This example shows how to create an options object for a Q-learning agent.

Create an rlQAgentOptions object that specifies the agent sample time.

opt = rlQAgentOptions('SampleTime',0.5)

opt = 
  rlQAgentOptions with properties:

    EpsilonGreedyExploration: [1x1 rl.option.EpsilonGreedyExploration]
                  SampleTime: 0.5000
              DiscountFactor: 0.9900

You can modify options using dot notation. For example, set the agent discount factor to 0.95.

opt.DiscountFactor = 0.95;
