This example shows how to add and delete variables in a dataset array. You can also edit dataset arrays using the Variables editor.
Import the data from the first worksheet in hospitalSmall.xlsx into a dataset array.
ds = dataset('XLSFile',fullfile(matlabroot,'help/toolbox/stats/examples','hospitalSmall.xlsx'));
size(ds)
ans =
    14     6
The dataset array, ds, has 14 observations (rows) and 6 variables (columns).
The worksheet Heights in hospitalSmall.xlsx has heights for the patients on the first worksheet. Concatenate the data in this spreadsheet with ds.
ds2 = dataset('XLSFile',fullfile(matlabroot,'help/toolbox/stats/examples','hospitalSmall.xlsx'),'Sheet','Heights');
ds = [ds ds2];
size(ds)
ans =
    14     7
The dataset array now has seven variables. You can only horizontally concatenate dataset arrays with observations in the same position, or with the same observation names.
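The concatenation rule can be illustrated with two small dataset arrays. This is a minimal sketch using made-up values and hypothetical observation names (P1, P2), not data from hospitalSmall.xlsx. It assumes that, when both arrays have observation names, horizontal concatenation matches rows by name rather than by position.

```matlab
% Hypothetical toy data, not taken from hospitalSmall.xlsx.
a = dataset({[38; 43], 'age'}, 'ObsNames', {'P1', 'P2'});
b = dataset({[69; 71], 'hgt'}, 'ObsNames', {'P2', 'P1'});  % same names, different order

% Both arrays carry observation names, so the rows of b are expected
% to be matched to the rows of a by name before the variables are joined.
c = [a b];

size(c)   % two observations, two variables
c.hgt     % heights, expected in the row order of a (P1 first)
```

If the two arrays had different numbers of observations, or observation names that did not match, the concatenation would fail with an error.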
ds.Properties.VarNames{end}
ans = hgt
The name of the last variable in ds is hgt, which dataset read from the first row of the imported spreadsheet.
First, specify the unique identifiers in the variable id as observation names. Then, delete the variable id from the dataset array.
ds.Properties.ObsNames = ds.id;
ds.id = [];
size(ds)
ans =
    14     6
The dataset array now has six variables. List the variable names.
ds.Properties.VarNames(:)
ans =
    'name'
    'sex'
    'age'
    'wgt'
    'smoke'
    'hgt'
There is no longer a variable called id.
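One practical reason to promote identifiers to observation names is that rows can then be selected by name. The sketch below repeats the two steps above on a small made-up dataset array; the identifiers ID-1 and ID-2 and the ages are hypothetical.

```matlab
% Hypothetical toy data standing in for the hospital records.
t = dataset({{'ID-1'; 'ID-2'}, 'id'}, {[38; 43], 'age'});

t.Properties.ObsNames = t.id;   % promote the identifiers to observation names
t.id = [];                      % drop the now-redundant variable

row = t('ID-2', :);             % select one observation by its name
size(row)                       % one observation, one remaining variable
```

Because the identifiers now live in the ObsNames property, they no longer count as a variable, which is why the variable count drops by one.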
Add a new variable, bmi, to the dataset array. It contains the body mass index (BMI) for each patient. BMI is a function of height and weight. Display the last name, gender, and BMI for each patient.
ds.bmi = ds.wgt*703./ds.hgt.^2;
ds(:,{'name','sex','bmi'})
ans =
               name          sex    bmi
    YPL-320    'SMITH'       'm'    24.544
    GLI-532    'JOHNSON'     'm'    24.068
    PNI-258    'WILLIAMS'    'f'    23.958
    MIJ-579    'JONES'       'f'    25.127
    XLK-030    'BROWN'       'f'    21.078
    TFP-518    'DAVIS'       'f'    27.729
    LPD-746    'MILLER'      'f'    26.828
    ATA-945    'WILSON'      'm'     24.41
    VNL-702    'MOORE'       'm'    27.822
    LQW-768    'TAYLOR'      'f'    22.655
    QFY-472    'ANDERSON'    'f'    23.409
    UJG-627    'THOMAS'      'f'    25.883
    XUE-826    'JACKSON'     'm'    24.265
    TRW-072    'WHITE'       'm'    29.827
The operators ./ and .^ in the calculation of BMI indicate element-wise division and exponentiation, respectively.
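The element-wise behavior can be checked on plain column vectors. The following sketch uses illustrative weights and heights, and assumes pounds and inches, which is what the conversion factor 703 implies.

```matlab
% Illustrative values (weight in pounds, height in inches).
wgt = [176; 163];
hgt = [71; 69];

% ./ and .^ act on each element, so the result has one BMI per row.
bmi = wgt*703 ./ hgt.^2;

% The same computation written out one element at a time.
bmiLoop = zeros(size(wgt));
for k = 1:numel(wgt)
    bmiLoop(k) = wgt(k)*703 / hgt(k)^2;
end
```

Without the dots, hgt^2 would be interpreted as a matrix power, which is an error for a non-square input such as a column vector.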
Delete the variable wgt, the fourth variable in the dataset array.
ds(:,4) = [];
ds.Properties.VarNames(:)
ans =
    'name'
    'sex'
    'age'
    'smoke'
    'hgt'
    'bmi'
The variable wgt is deleted from the dataset array.
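Deleting by position depends on the current column order, which shifts whenever an earlier variable is removed (wgt became the fourth variable only after id was deleted). As a sketch of the alternative, the same deletion can be written by name, using the dot syntax already shown for id. The data below are made up for illustration.

```matlab
% Hypothetical toy data.
t = dataset({[38; 43], 'age'}, {[176; 163], 'wgt'}, {[71; 69], 'hgt'});

% Delete by name only if the variable is present, so the snippet is
% safe to run more than once.
if ismember('wgt', t.Properties.VarNames)
    t.wgt = [];
end

t.Properties.VarNames(:)   % remaining variable names
```

Name-based deletion keeps working if variables are later reordered, whereas ds(:,4) = [] removes whichever variable happens to be fourth at that moment.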