Add and Delete Observations

This example shows how to add and delete observations in a dataset array. You can also edit dataset arrays using the Variables editor.

Load sample data.

Import the data from the first worksheet in hospitalSmall.xlsx into a dataset array.

ds = dataset('XLSFile',fullfile(matlabroot,'help/toolbox/stats/examples','hospitalSmall.xlsx'));
size(ds)
ans =

    14     6

The dataset array, ds, has 14 observations (rows) and 6 variables (columns).

Add observations by concatenation.

The second worksheet in hospitalSmall.xlsx has additional patient data. Append the observations in this spreadsheet to the end of ds.

ds2 = dataset('XLSFile',fullfile(matlabroot,'help/toolbox/stats/examples','hospitalSmall.xlsx'),'Sheet',2);
dsNew = [ds;ds2];
size(dsNew)
ans =

    22     6

The dataset array dsNew has 22 observations. To vertically concatenate two dataset arrays, both arrays must have the same number of variables and the same variable names.
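The variable-name requirement can be checked before concatenating. The following is a minimal sketch, assuming two small in-memory dataset arrays, a and b, with made-up values; these arrays are not part of hospitalSmall.xlsx. The check demands identical names in identical order, which may be stricter than concatenation itself needs.

```matlab
% Two small dataset arrays with hypothetical values and matching variable names.
a = dataset({[36;28],'age'},{[160;125],'wgt'});
b = dataset({[32;45],'age'},{[187;182],'wgt'});

% Compare the variable names before concatenating (conservative check:
% same names in the same order).
sameVars = isequal(a.Properties.VarNames,b.Properties.VarNames);
if sameVars
    c = [a;b];   % stack the observations of b below those of a
else
    error('Variable names differ; rename variables before concatenating.');
end
size(c)
```

If the names differ, setting the VarNames property of one array to match the other before concatenating is one possible remedy.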

Add observations from a cell array.

If you want to append new observations stored in a cell array, first convert the cell array to a dataset array, and then concatenate the dataset arrays.

cellObs = {'id','name','sex','age','wgt','smoke';
               'YQR-965','BAKER','M',36,160,0;
               'LFG-497','WALL' ,'F',28,125,1;
               'KSD-003','REED' ,'M',32,187,0};
dsNew = [dsNew;cell2dataset(cellObs)];
size(dsNew)
ans =

    25     6
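In the example above, the first row of cellObs supplies the variable names. When a cell array holds only data rows, the names can be passed separately. This is a minimal sketch with hypothetical values, using the 'ReadVarNames' and 'VarNames' options of cell2dataset; the names cellData and dsCell are illustrative.

```matlab
% Hypothetical observations stored without a header row.
cellData = {'YQR-965','BAKER',36;
            'LFG-497','WALL' ,28};

% Treat every row as data and supply the variable names explicitly.
dsCell = cell2dataset(cellData,'ReadVarNames',false, ...
                      'VarNames',{'id','name','age'});
size(dsCell)
```

The resulting array can be concatenated with another dataset array only if that array uses the same variable names.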

Add observations from a structure.

You can also append new observations stored in a structure. Convert the structure to a dataset array, and then concatenate the dataset arrays.

structObs(1,1).id = 'GHK-842';
structObs(1,1).name = 'GEORGE';
structObs(1,1).sex = 'M';
structObs(1,1).age = 45;
structObs(1,1).wgt = 182;
structObs(1,1).smoke = 1;

structObs(2,1).id = 'QRH-308';
structObs(2,1).name = 'BAILEY';
structObs(2,1).sex = 'F';
structObs(2,1).age = 29;
structObs(2,1).wgt = 120;
structObs(2,1).smoke = 0;

dsNew = [dsNew;struct2dataset(structObs)];
size(dsNew)
ans =

    27     6

Delete duplicate observations.

Use unique to delete any observations in a dataset array that are duplicated.

dsNew = unique(dsNew);
size(dsNew)
ans =

    21     6

The number of observations drops from 27 to 21, so unique deleted six duplicated observations.
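To see how many observations unique removes, compare the number of rows before and after the call. The sketch below uses a small dataset array with made-up values in which two rows are identical; the names d, dUnique, and nRemoved are illustrative.

```matlab
% Hypothetical dataset array whose second and third rows are identical.
d = dataset({{'BAKER';'WALL';'WALL'},'name'},{[36;28;28],'age'});

nBefore = size(d,1);                    % number of observations before
dUnique = unique(d);                    % drop duplicated observations
nRemoved = nBefore - size(dUnique,1)    % number of duplicated rows dropped
```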

Delete observations by observation number.

Delete observations 18, 20, and 21 from the dataset array.

dsNew([18,20,21],:) = [];
size(dsNew)
ans =

    18     6

The dataset array has only 18 observations now.

Delete observations by observation name.

First, specify the variable of identifiers, id, as observation names. Then, delete the variable id from dsNew. You can use the observation name to index observations.

dsNew.Properties.ObsNames = dsNew.id;
dsNew.id = [];
dsNew('KOQ-996',:) = [];
size(dsNew)
ans =

    17     5

The dataset array now has one fewer observation and one fewer variable.
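Before deleting by observation name, it can help to confirm that the name exists, which guards against a misspelled identifier. This is a minimal sketch using a small dataset array with hypothetical values and observation names; the variable target is illustrative.

```matlab
% Hypothetical dataset array with observation names.
d = dataset({[36;28;32],'age'}, ...
            'ObsNames',{'YQR-965','LFG-497','KSD-003'});

target = 'LFG-497';
% Delete the observation only if the name is actually present.
if ismember(target,d.Properties.ObsNames)
    d(target,:) = [];
end
size(d)
```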

Search for observations to delete.

You can also search for observations in the dataset array. For example, delete observations for any patients with the last name WILLIAMS.

toDelete = strcmp(dsNew.name,'WILLIAMS');
dsNew(toDelete,:) = [];
size(dsNew)
ans =

    16     5

The dataset array now has one fewer observation.
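Search conditions can be combined with logical operators. The following sketch, which uses a small dataset array with made-up values, deletes only the observations that match a name and also exceed an age threshold; the threshold of 40 is an arbitrary choice for illustration.

```matlab
% Hypothetical dataset array.
d = dataset({{'REED';'WALL';'REED'},'name'},{[32;28;51],'age'});

% Logical index: last name REED and age over 40.
toDelete = strcmp(d.name,'REED') & d.age > 40;
d(toDelete,:) = [];
size(d)
```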
