Class: dataset
(Not Recommended) Print summary of dataset array
The dataset data type is not recommended. To work with heterogeneous data, use the MATLAB® table data type instead. See MATLAB table documentation for more information.
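As a minimal sketch of the recommended migration path, an existing dataset array can be converted to a table and summarized there. This assumes the Statistics and Machine Learning Toolbox is installed (it provides dataset2table and the hospital sample data); the variable name T is only illustrative.

```matlab
% Load the sample dataset array named "hospital"
load hospital

% Convert the dataset array to a table (the recommended data type)
T = dataset2table(hospital);

% Print a summary of the table and its variables
summary(T)
```

The table summary is laid out differently from the dataset summary described on this page, so the two printouts should not be expected to match line for line.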
summary(A)
s = summary(A)
summary(A) prints a summary of a dataset array and the variables that it contains.
s = summary(A) returns a scalar structure s that contains a summary of the dataset A and the variables that A contains. For more information on the fields in s, see Outputs.
Summary information depends on the type of the variables in the data set:
For numeric variables, summary computes a five-number summary of the data: the minimum, the first quartile, the median, the third quartile, and the maximum.
For logical variables, summary counts the number of true and false values in the data.
For categorical variables, summary counts the number of observations at each level.
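The three cases above can be tried on a small dataset array with one variable of each type. This is only a sketch: the variable names x, flag, and grp and their values are made up for illustration, and the dataset and nominal classes require the Statistics and Machine Learning Toolbox.

```matlab
% Hypothetical sample data, one variable per supported type
x    = [1; 2; 3; 4; 5];                     % numeric
flag = [true; false; true; true; false];    % logical
grp  = nominal({'a'; 'b'; 'a'; 'c'; 'a'});  % categorical (nominal)

% Build the dataset array; variable names are taken from the workspace names
A = dataset(x, flag, grp);

% Print the summary: five-number summary for x, true/false counts for flag,
% and per-level counts for grp
summary(A)
```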
The following list describes the fields in the structure s:
Description: A character array containing the dataset description.
Variables: A structure array with one element for each dataset variable in A. Each element has the following fields:
    Name: A character vector containing the name of the variable.
    Description: A character vector containing the variable's description.
    Units: A character vector containing the variable's units.
    Size: A numeric vector containing the size of the variable.
    Class: A character vector containing the class of the variable.
    Data: A scalar structure containing the following fields.
        For numeric variables:
            Probabilities: A numeric vector containing the probabilities [0.0 .25 .50 .75 1.0] and NaN (if any are present in the corresponding dataset variable).
            Quantiles: A numeric vector containing the values that correspond to 'Probabilities' for the corresponding dataset variable, and a count of NaNs (if any are present).
        For logical variables:
            Values: The logical vector [true false].
            Counts: A numeric vector of counts for each logical value.
        For categorical variables:
            Levels: A cell array containing the labels for each level of the corresponding dataset variable.
            Counts: A numeric vector of counts for each level.
'Data' is empty if the variable is not numeric, categorical, or logical. If a dataset variable has more than one column, then the corresponding 'Quantiles' or 'Counts' field is a matrix or an array.
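The fields described above can be read programmatically from the returned structure. The following sketch uses the hospital sample data; the helper variables names, ageInfo, and smokerInfo are illustrative only, and the code assumes the field layout listed above.

```matlab
% Load the sample dataset array and capture its summary as a structure
load hospital
s = summary(hospital);

% Collect the variable names so that elements can be looked up by name
names = {s.Variables.Name};

% Numeric variable: probabilities and the matching quantile values
ageInfo = s.Variables(strcmp(names, 'Age'));
disp(ageInfo.Data.Probabilities)
disp(ageInfo.Data.Quantiles)

% Logical variable: the values and the count for each of them
smokerInfo = s.Variables(strcmp(names, 'Smoker'));
disp(smokerInfo.Data.Values)
disp(smokerInfo.Data.Counts)
```

Looking elements up by name rather than by position keeps the code working if the order of the dataset variables changes.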
Summarize Fisher's iris data:
    load fisheriris
    species = nominal(species);
    data = dataset(species,meas);
    summary(data)

    species: [150x1 nominal]
        setosa    versicolor    virginica
            50            50           50

    meas: [150x4 double]
        min       4.3000         2         1    0.1000
        1st Q     5.1000    2.8000    1.6000    0.3000
        median    5.8000         3    4.3500    1.3000
        3rd Q     6.4000    3.3000    5.1000    1.8000
        max       7.9000    4.4000    6.9000    2.5000
Summarize the data in hospital.mat:
    load hospital
    summary(hospital)

    Dataset array created from the data file hospital.dat. The first column of the file ("id") is used for observation names. Other columns ("sex" and "smoke") have been converted from their original coded values into categorical and logical variables. Two sets of columns ("sys" and "dia", "trial1" through "trial4") have been combined into single variables with multivariate observations. Column headers have been replaced with more descriptive variable names. Units have been added where appropriate.

    LastName: [100x1 cell array of character vectors]

    Sex: [100x1 nominal]
        Female    Male
            53      47

    Age: [100x1 double, Units = Yrs]
        min    1st Q    median    3rd Q    max
         25       32        39       44     50

    Weight: [100x1 double, Units = Lbs]
        min    1st Q       median      3rd Q       max
        111    130.5000    142.5000    180.5000    202

    Smoker: [100x1 logical]
        true    false
          34       66

    BloodPressure: [100x2 double, Units = mm Hg]
    Systolic/Diastolic
        min            109         68
        1st Q     117.5000    77.5000
        median         122    81.5000
        3rd Q     127.5000         89
        max            138         99

    Trials: [100x1 cell, Units = Counts]
    From zero to four measurement trials performed