Note
The dataset data type is not recommended. To work with heterogeneous data, use the MATLAB® table data type instead. See MATLAB table documentation for more information.
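As a rough illustration of the recommendation above, the following sketch converts the sample dataset array used later on this page into a table. It assumes that Statistics and Machine Learning Toolbox is installed (it provides dataset2table and hospital.mat); the variable name T is introduced here only for illustration.

```matlab
% Sketch: convert a dataset array to a table (assumes Statistics and
% Machine Learning Toolbox, which provides dataset2table and hospital.mat).
load hospital                    % loads the dataset array named hospital
T = dataset2table(hospital);     % observation names become table row names
class(T)                         % confirm that T is a table

% Row-name and variable-name indexing with () works on the table as well.
T('PUE-347', {'Age','Weight'})
```

After the conversion, the same kinds of indexing described below (parentheses, dot notation, names, numbers, and logical conditions) are also available on the table.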
There are many ways to index into dataset arrays. For example, for a dataset array, ds, you can:
Use () to create a new dataset array from a subset of ds. For example, ds1 = ds(1:5,:) creates a new dataset array, ds1, consisting of the first five rows of ds. Metadata, including variable and observation names, transfers to the new dataset array.
Use variable names with dot notation to index individual variables in a dataset array. For example, ds.Height indexes the variable named Height.
Use observation names to index individual observations in a dataset array. For example, ds('Obs1',:) gives data for the observation named Obs1.
Use observation or variable numbers. For example, ds(:,[1,3,5]) gives the data in the first, third, and fifth variables (columns) of ds.
Use logical indexing to search for observations in ds that satisfy a logical condition. For example, ds(ds.Gender=='Male',:) gives the observations in ds where the variable named Gender, a nominal array, has the value Male.
Use ismissing to find missing data in the dataset array.
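The indexing forms listed above can be sketched on a small hand-built dataset array. The data, the observation names, and every variable name on the left-hand side (ds1, h, obs1, firstVar, males, tf) are hypothetical and chosen only for illustration; the sketch assumes Statistics and Machine Learning Toolbox for dataset and nominal.

```matlab
% Hypothetical three-observation dataset array; all values are illustrative.
Height = [170; 165; NaN];
Gender = nominal({'Male'; 'Female'; 'Male'});
ds = dataset({Height,'Height'}, {Gender,'Gender'}, ...
    'ObsNames', {'Obs1'; 'Obs2'; 'Obs3'});

ds1      = ds(1:2,:);                 % (): new dataset array, metadata is kept
h        = ds.Height;                 % dot notation: the data in one variable
obs1     = ds('Obs1',:);              % index an observation by name
firstVar = ds(:,1);                   % index a variable by number
males    = ds(ds.Gender=='Male',:);   % logical condition on a nominal variable
tf       = ismissing(ds);             % logical array, true where data is missing
```

Note that () returns a dataset array even for a single observation or variable, whereas dot notation returns the underlying data (here, a numeric column vector).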
This example shows several indexing and searching methods for dataset arrays.
Load the sample data.
load hospital;
size(hospital)
ans = 1×2
100 7
The dataset array has 100 observations and 7 variables.
Index a variable by name. Return the minimum age in the dataset array.
min(hospital.Age)
ans = 25
Delete the variable Trials.
hospital.Trials = [];
size(hospital)
ans = 1×2
100 6
Index an observation by name. Display measurements on the first five variables for the observation named PUE-347.
hospital('PUE-347',1:5)
ans = 
               LastName       Sex       Age    Weight    Smoker
    PUE-347    {'YOUNG'}      Female    25     114       false 
Index variables by number. Create a new dataset array containing the first four variables of hospital.
dsNew = hospital(:,1:4);
dsNew.Properties.VarNames(:)
ans = 4x1 cell
{'LastName'}
{'Sex' }
{'Age' }
{'Weight' }
Index observations by number. Delete the last 10 observations.
hospital(end-9:end,:) = [];
size(hospital)
ans = 1×2
90 6
Search for observations by logical condition. Create a new dataset array containing only females who smoke.
dsFS = hospital(hospital.Sex=='Female' & hospital.Smoker==true,:);
dsFS(:,{'LastName','Sex','Smoker'})
ans = 
               LastName         Sex       Smoker
    LPD-746    {'MILLER'   }    Female    true  
    XBR-291    {'GARCIA'   }    Female    true  
    AAX-056    {'LEE'      }    Female    true  
    DTT-578    {'WALKER'   }    Female    true  
    AFK-336    {'WRIGHT'   }    Female    true  
    RBA-579    {'SANCHEZ'  }    Female    true  
    HAK-381    {'MORRIS'   }    Female    true  
    NSK-403    {'RAMIREZ'  }    Female    true  
    ILS-109    {'WATSON'   }    Female    true  
    JDR-456    {'SANDERS'  }    Female    true  
    HWZ-321    {'PATTERSON'}    Female    true  
    GGU-691    {'HUGHES'   }    Female    true  
    WUS-105    {'FLORES'   }    Female    true  