Select Subsets of Observations

This example shows how to select an observation or subset of observations from a dataset array.

Load sample data.

Load the sample dataset array, hospital. Dataset arrays can have observation (row) names. This array has observation names corresponding to unique patient identifiers.

load hospital
hospital.Properties.ObsNames(1:10)
ans = 10x1 cell
    {'YPL-320'}
    {'GLI-532'}
    {'PNI-258'}
    {'MIJ-579'}
    {'XLK-030'}
    {'TFP-518'}
    {'LPD-746'}
    {'ATA-945'}
    {'VNL-702'}
    {'LQW-768'}

These are the first 10 observation names.
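
Before indexing, it can help to confirm the array's dimensions and check that the observation names really are distinct. The following is a minimal sketch, assuming the Statistics and Machine Learning Toolbox sample file is on the path; the variable names nObs, nVars, varNames, obsNames, and allUnique are illustrative and not part of the original example.

```matlab
load hospital                                 % sample dataset array used in this example

nObs  = size(hospital,1);                     % number of observations (rows)
nVars = size(hospital,2);                     % number of variables (columns)

varNames = hospital.Properties.VarNames;      % cell array of variable names
obsNames = hospital.Properties.ObsNames;      % cell array of observation names

% True when every patient identifier occurs exactly once
allUnique = numel(unique(obsNames)) == nObs;
```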

Index an observation by name.

You can use the observation names to index into the dataset array. For example, extract the last name, sex, and age for the patient with identifier XLK-030.

hospital('XLK-030',{'LastName','Sex','Age'})
ans = 
               LastName         Sex       Age
    XLK-030    {'BROWN'}        Female    49 
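
The same syntax should also accept a cell array of observation names, which returns the matching observations in the order requested. A short sketch, where ids and dsTwo are illustrative names:

```matlab
load hospital

% Select two patients by identifier and keep only two variables
ids   = {'XLK-030','TFP-518'};
dsTwo = hospital(ids,{'LastName','Age'});

% Dot indexing on the result returns the variable as an ordinary numeric vector
ages = dsTwo.Age;
```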

Index a subset of observations by number.

Create a new dataset array containing the first 50 patients.

ds50 = hospital(1:50,:);
size(ds50)
ans = 1×2

    50     7
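
Numeric indices do not have to be a contiguous range. As a sketch (the names n, dsOdd, dsLast5, and dsPick are illustrative), any vector of row numbers can be used:

```matlab
load hospital

n = size(hospital,1);              % total number of observations

dsOdd   = hospital(1:2:n,:);       % every second observation, starting with the first
dsLast5 = hospital(n-4:n,:);       % the last five observations
dsPick  = hospital([1 5 9],:);     % an arbitrary list of row numbers
```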

Search observations using a logical condition.

Create a new dataset array containing only male patients. To find the male patients, apply a logical condition to the variable Sex.

dsMale = hospital(hospital.Sex=='Male',:);
dsMale(1:10,{'LastName','Sex'})
ans = 
               LastName            Sex 
    YPL-320    {'SMITH'   }        Male
    GLI-532    {'JOHNSON' }        Male
    ATA-945    {'WILSON'  }        Male
    VNL-702    {'MOORE'   }        Male
    XUE-826    {'JACKSON' }        Male
    TRW-072    {'WHITE'   }        Male
    KOQ-996    {'MARTIN'  }        Male
    YUZ-646    {'THOMPSON'}        Male
    KPW-846    {'MARTINEZ'}        Male
    XBA-581    {'ROBINSON'}        Male
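
The condition can also be stored in its own logical vector, which makes it easy to count the matches or select the complement. A minimal sketch, assuming the illustrative names isMale, nMale, dsMen, and dsNotMale:

```matlab
load hospital

isMale = hospital.Sex == 'Male';       % logical vector, one element per observation
nMale  = nnz(isMale);                  % number of observations that satisfy the condition

dsMen     = hospital(isMale,:);        % observations where the condition is true
dsNotMale = hospital(~isMale,:);       % complement: observations where it is false
```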

Search observations using multiple conditions.

You can use multiple conditions to search the dataset array. For example, create a new dataset array containing only female patients older than 40.

dsFemale = hospital(hospital.Sex=='Female' & hospital.Age > 40,:);
dsFemale(1:10,{'LastName','Sex','Age'})
ans = 
               LastName            Sex       Age
    XLK-030    {'BROWN'   }        Female    49 
    TFP-518    {'DAVIS'   }        Female    46 
    QFY-472    {'ANDERSON'}        Female    45 
    UJG-627    {'THOMAS'  }        Female    42 
    BKD-785    {'CLARK'   }        Female    48 
    VWL-936    {'LEWIS'   }        Female    41 
    AAX-056    {'LEE'     }        Female    44 
    AFK-336    {'WRIGHT'  }        Female    45 
    KKL-155    {'ADAMS'   }        Female    48 
    RBA-579    {'SANCHEZ' }        Female    44 
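
Conditions can be combined with | (or) and ~ (not) as well as & (and). The sketch below names each condition first so the combinations are easier to read; isFemale, isOver40, and the three result arrays are illustrative names.

```matlab
load hospital

isFemale = hospital.Sex == 'Female';
isOver40 = hospital.Age > 40;

dsBoth    = hospital(isFemale & isOver40,:);      % female and older than 40
dsEither  = hospital(isFemale | isOver40,:);      % female, or older than 40, or both
dsNeither = hospital(~(isFemale | isOver40),:);   % male and aged 40 or younger
```

Every observation falls into either dsEither or dsNeither, so their row counts add up to the size of the original array.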

Select a random subset of observations.

Create a new dataset array containing a random subset of 20 patients from the dataset array hospital. By default, randsample draws the row indices without replacement, so no patient is selected twice.

rng('default') % For reproducibility
dsRandom = hospital(randsample(length(hospital),20),:);
dsRandom.Properties.ObsNames
ans = 20x1 cell
    {'DAU-529'}
    {'AGR-528'}
    {'RBO-332'}
    {'QOO-305'}
    {'RVS-253'}
    {'QEQ-082'}
    {'EHE-616'}
    {'HVR-372'}
    {'KOQ-996'}
    {'REV-997'}
    {'PUE-347'}
    {'LQW-768'}
    {'YLN-495'}
    {'HJQ-495'}
    {'ELG-976'}
    {'XUE-826'}
    {'MEZ-469'}
    {'UDS-151'}
    {'MIJ-579'}
    {'DGC-290'}
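
One possible alternative is randperm, which also returns distinct indices. This is a sketch only: idx and dsRandom2 are illustrative names, and the patients selected are not expected to match the randsample result above even with the same seed.

```matlab
load hospital
rng('default')                     % for reproducibility

n   = size(hospital,1);            % number of observations
idx = randperm(n,20);              % 20 distinct row numbers between 1 and n

dsRandom2 = hospital(idx,:);       % random subset in the order drawn
```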

Delete observations by name.

Delete the data for the patient with observation name HVR-372.

hospital('HVR-372',:) = [];
size(hospital)
ans = 1×2

    99     7

The dataset array now has one fewer observation (99 instead of 100).
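
Deletion should work with the other index types as well. The sketch below operates on a separate copy, loaded into a struct so that the workspace variable hospital is left as it is; S, dsCopy, toDelete, and present are illustrative names, and the sketch assumes the sample file stores the array under the name hospital.

```matlab
S = load('hospital');              % load into a struct instead of the workspace
dsCopy = S.hospital;               % work on a copy of the full array

% Delete several observations by name, skipping any names that are not present
toDelete = {'YPL-320','GLI-532'};
present  = ismember(toDelete,dsCopy.Properties.ObsNames);
dsCopy(toDelete(present),:) = [];

% Delete every remaining observation that satisfies a logical condition
dsCopy(dsCopy.Age > 45,:) = [];
```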
