Note: The dataset data type is not recommended. To work with heterogeneous data, use the MATLAB® table data type instead. See MATLAB table documentation for more information.
The MATLAB Variables editor provides a convenient interface for viewing, modifying, and plotting dataset arrays.
First, load the sample data set, hospital.
load hospital
The dataset array, hospital, is created in the MATLAB workspace. It has 100 observations and 7 variables.
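As a quick check from the command line, the following minimal sketch confirms the dimensions stated above and lists the variable names. It assumes Statistics and Machine Learning Toolbox is installed, since that toolbox provides both the dataset class and the hospital sample data.

```matlab
% Load the sample dataset array (requires Statistics and Machine Learning Toolbox).
load hospital

% size returns the number of observations (rows) and variables (columns).
[nObs, nVars] = size(hospital);
fprintf('%d observations, %d variables\n', nObs, nVars);

% Variable names are stored in the Properties of the dataset array.
disp(hospital.Properties.VarNames);
```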
To open hospital in the Variables editor, click Open Variable, and select hospital.
The Variables editor opens, displaying the contents of the dataset array (only the first 10 observations are shown here).
In the Variables editor, you can see the names of the seven variables along the top row, and the observation names down the first column.
You can modify variable and observation names by double-clicking a name, and then typing new text.
All changes made in the Variables editor are also sent to the command line.
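Renaming can also be done programmatically through the Properties of the dataset array. The sketch below is one way to do it, not necessarily the exact command the editor echoes; the new names AgeYears and Patient001 are hypothetical and chosen only for illustration.

```matlab
load hospital

% Rename the third variable. 'AgeYears' is a hypothetical name.
varNames = hospital.Properties.VarNames;
oldVarName = varNames{3};
varNames{3} = 'AgeYears';
hospital.Properties.VarNames = varNames;

% Rename the first observation. 'Patient001' is a hypothetical name.
obsNames = hospital.Properties.ObsNames;
obsNames{1} = 'Patient001';
hospital.Properties.ObsNames = obsNames;

fprintf('Renamed variable %s to %s\n', oldVarName, hospital.Properties.VarNames{3});
```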
The sixth variable in the data set, BloodPressure, is a numeric array with two columns. The first column shows systolic blood pressure, and the second column shows diastolic blood pressure. Click the arrow that appears on the right side of the variable name cell to see the units and description of the variable. You can type directly in the units and description fields to modify the text. The variable data type and size are shown under the variable description.
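The same metadata can be read at the command line. This sketch looks up BloodPressure by name and prints its units and description if they are set; the guards are there because it is only an assumption that both properties are populated in your copy of the sample data.

```matlab
load hospital

% Locate BloodPressure among the variable names.
idx = find(strcmp(hospital.Properties.VarNames, 'BloodPressure'));

% Units and VarDescription are cell arrays with one entry per variable
% (or empty if they were never set).
units = hospital.Properties.Units;
descr = hospital.Properties.VarDescription;
if ~isempty(units)
    fprintf('Units: %s\n', units{idx});
end
if ~isempty(descr)
    fprintf('Description: %s\n', descr{idx});
end

% Data type and size, as shown under the description in the editor.
bpClass = class(hospital.BloodPressure);
bpSize = size(hospital.BloodPressure);
fprintf('%s array, %d-by-%d\n', bpClass, bpSize(1), bpSize(2));
```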
You can reorder variables in a dataset array using the Variables editor. Hover over the left side of a variable name cell until a four-headed arrow appears.
After the arrow appears, click and drag the variable column to a new location.
The command for the variable reordering appears in the command line.
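A reordering of this kind amounts to indexing the columns in a new order. The following is a minimal sketch that swaps the second and third variables; the particular order is arbitrary, and the command echoed by the editor may be written differently.

```matlab
load hospital

names = hospital.Properties.VarNames;

% New column order: swap the second and third variables, keep the rest.
newOrder = [1 3 2 4:numel(names)];
reordered = hospital(:, newOrder);

disp(reordered.Properties.VarNames);
```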
You can delete a variable in the Variables editor by selecting the variable column, right-clicking, and selecting Delete Column Variable(s).
The command for the variable deletion appears in the command line.
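At the command line, one way to delete a variable is to assign an empty array to its column. This sketch removes the last variable from a copy of the dataset array so that the original stays intact; which variable to drop is an arbitrary choice for illustration.

```matlab
load hospital

names = hospital.Properties.VarNames;
lastVar = names{end};          % name of the last variable

% Work on a copy and delete the column by name.
trimmed = hospital;
trimmed(:, lastVar) = [];

fprintf('Removed %s; %d variables remain\n', lastVar, size(trimmed, 2));
```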
You can enter new data values directly into the Variables editor. For example, you can add a new patient observation to the hospital data set. To enter a new last name, add a character vector to the end of the variable LastName.
The variable Gender is a nominal array. The levels of the categorical variable appear in a drop-down list when you double-click a cell in the Gender column. You can choose one of the levels previously used, or create a new level by selecting New Item.
You can continue to add data for the remaining variables.
To change the observation name, click the observation name and type the new name.
The commands for entering the new data appear at the command line.
Notice the warning that appears after the first assignment. When you enter the first piece of data in the new observation row—here, the last name—default values are assigned to all other variables. Default assignments are:
0 for numeric variables
<undefined> for categorical variables
[] for cell arrays
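The behavior described above can be reproduced at the command line by assigning past the last row of one variable. This is a sketch under assumptions: the name NEWMAN and the age 52 are made-up values, and the exact commands generated by the editor may differ.

```matlab
load hospital

n = size(hospital, 1);              % current number of observations

% Assigning to row n+1 of one variable grows the whole dataset array.
% MATLAB warns that default values were added to the other variables.
hospital.LastName{n+1} = 'NEWMAN';  % hypothetical last name

% Numeric variables in the new row start at the default value 0.
defaultAge = hospital.Age(n+1);

% Overwrite the default with an actual value (hypothetical).
hospital.Age(n+1) = 52;

% Display the new observation, including the remaining defaults.
disp(hospital(n+1, :));
```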
You can also copy and paste data from one dataset array to another using the Variables editor.
You can use the Variables editor to sort dataset array observations by the values of one or more variables. To sort by gender, for example, select the variable Gender. Then click Sort, and choose to sort rows by ascending or descending values of the selected variable.
When sorting by variables that are cell arrays of character vectors or of nominal data type, observations are sorted alphabetically. For ordinal variables, rows are sorted by the ordering of the levels. For example, when the observations of hospital are sorted by the values in Gender, the females are grouped together, followed by the males.
To sort by the values of multiple variables, press Ctrl while you select multiple variables.
When you use the Variables editor to sort rows, it is the same as calling sortrows. You can see this at the command line after executing the sorting.
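The equivalent sortrows calls might look like the sketch below. One assumption to note: this text calls the categorical variable Gender, but in some releases of the sample data it may be stored under the name Sex, so the code looks up whichever name is present rather than hard-coding one.

```matlab
load hospital

% Resolve the name of the gender variable (assumed to be 'Gender' or 'Sex').
names = hospital.Properties.VarNames;
g = names{find(ismember(names, {'Gender', 'Sex'}), 1)};

% Sort by a single variable, ascending (the default).
byGender = sortrows(hospital, g);

% Sort by two variables, both in descending order.
byGenderAge = sortrows(hospital, {g, 'Age'}, 'descend');

% Show the first few sorted observations.
disp(byGenderAge(1:5, {'LastName', g, 'Age'}));
```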
You can select a subset of data from a dataset array in the Variables editor, and create a new dataset array from the selection. For example, to create a dataset array containing only the variables LastName and Age:
Hold Ctrl while you click the variables LastName and Age.
Right-click, and select New Workspace Variable from Selection > New Dataset Array.
The new dataset array appears in the Workspace window with the name hospital1. The Command Window shows the commands that execute the selection.
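Selecting variables by name at the command line gives the same result. The following sketch indexes the columns with a cell array of variable names; the command echoed by the editor may use a different but equivalent form.

```matlab
load hospital

% Keep all observations, but only the LastName and Age variables.
hospital1 = hospital(:, {'LastName', 'Age'});

disp(hospital1(1:5, :));   % preview the first five observations
```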
You can use the same steps to select any subset of data. To select observations according to some logical condition, you can use a combination of sorting and selecting. For example, to create a new dataset array containing only males aged 45 and older:
Sort the observations of hospital by the values in Gender and Age, descending.
Select the male observations with age 45 and older.
Right-click, and select New Workspace Variable from Selection > New Dataset Array. The new dataset array, hospital2, is created in the Workspace window.
You can rename the dataset array in the Workspace window.
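At the command line, the sort-and-select steps can be replaced by logical indexing. This is an alternative to the editor workflow, not the command it generates. The sketch assumes the gender variable is named Gender or Sex (depending on the release of the sample data) and that Male is one of its levels.

```matlab
load hospital

% Resolve the name of the gender variable (assumed to be 'Gender' or 'Sex').
names = hospital.Properties.VarNames;
g = names{find(ismember(names, {'Gender', 'Sex'}), 1)};

% Logical condition: male patients aged 45 and older.
isMale = hospital.(g) == 'Male';
isOlder = hospital.Age >= 45;

% Keep only the observations that satisfy both conditions.
hospital2 = hospital(isMale & isOlder, :);

fprintf('%d observations selected\n', size(hospital2, 1));
```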
You can plot data from a dataset array using plotting options in the Variables editor. Available plot choices depend on the data types of variables to be plotted.
For example, if you select the variable Age, you can see in the Plots tab some plotting options that are appropriate for a univariate, numeric variable.
Sometimes, there are plot options for multiple variables, depending on their data types. For example, if you select both Age and Gender, you can draw box plots of age, grouped by gender.
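Comparable plots can be drawn from the command line. The sketch below uses histogram for the univariate case and boxplot for age grouped by gender; these are illustrative choices, and the Plots tab may offer other options. As before, the gender variable is assumed to be named Gender or Sex.

```matlab
load hospital

% Resolve the name of the gender variable (assumed to be 'Gender' or 'Sex').
names = hospital.Properties.VarNames;
g = names{find(ismember(names, {'Gender', 'Sex'}), 1)};

% Univariate plot of a numeric variable.
figure
histogram(hospital.Age)
xlabel('Age')
ylabel('Number of patients')

% Box plots of age, grouped by the levels of the gender variable.
figure
boxplot(hospital.Age, hospital.(g))
ylabel('Age')
```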