(Not Recommended) Arrays for statistical data
The dataset data type is not recommended. To work with heterogeneous data, use the MATLAB® table data type instead. See MATLAB table documentation for more information.
Dataset arrays collect heterogeneous data and metadata, including variable and observation names, into a single container variable. Dataset arrays are suitable for storing column-oriented or tabular data of the kind often stored as columns in a text file or spreadsheet, and can accommodate variables of different types, sizes, units, etc.
Dataset arrays can contain different kinds of variables, including numeric, logical, character, string, categorical, and cell. However, a dataset array is a different class than the variables that it contains. For example, even a dataset array that contains only variables that are double arrays cannot be operated on as if it were itself a double array. However, using dot subscripting, you can operate on a variable in a dataset array as if it were a workspace variable.
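The distinction between the container and its variables can be sketched as follows. The variable names Height and Weight and their values are hypothetical, chosen only for illustration; the example assumes the Statistics and Machine Learning Toolbox is installed.

```matlab
% Hypothetical numeric columns of equal length.
Height = [1.62; 1.75; 1.80];
Weight = [55; 72; 81];

% The constructor takes variable names from the workspace names.
ds = dataset(Height, Weight);

% The container is its own class, even though every variable is double.
isContainerDouble = isa(ds, 'double');   % false
isColumnDouble    = isa(ds.Weight, 'double');   % true

% Dot subscripting exposes each column as an ordinary double vector,
% so element-wise arithmetic works on the columns.
ds.BMI = ds.Weight ./ ds.Height.^2;

% To treat the whole array as numeric, convert it explicitly.
X = double(ds);
```

Here X is an ordinary numeric matrix with one column per dataset variable, so it no longer carries variable names or other metadata.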
You can subscript dataset arrays using parentheses much like ordinary numeric arrays, but in addition to numeric and logical indices, you can use variable and observation names as indices.
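As a minimal sketch of the index types mentioned above, the following uses a small hypothetical dataset array (the names Age, Smoker, and the observation names are illustrative assumptions). Parenthesis subscripting returns another dataset array in each case.

```matlab
% Hypothetical data with named observations.
Age    = [38; 43; 29];
Smoker = [true; false; false];
ds = dataset(Age, Smoker, 'ObsNames', {'Smith'; 'Jones'; 'Lee'});

% Numeric row indices, all variables.
a = ds(1:2, :);

% Logical row index combined with a variable name.
b = ds(ds.Age > 30, 'Age');

% Observation names for rows and a cell array of variable names for columns.
c = ds({'Lee', 'Smith'}, {'Smoker'});
```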
Use the dataset constructor to create a dataset array from variables in the MATLAB workspace. You can also create a dataset array by reading data from a text or spreadsheet file. You can access each variable in a dataset array much like fields in a structure, using dot subscripting. See the following section for a list of operations available for dataset arrays.
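Both construction routes might look like the sketch below. The variable names, the file contents, and the temporary file are all hypothetical; the file is written on the fly only so that the example is self-contained.

```matlab
% Route 1: build from workspace variables.
x = [1; 2; 3];
y = {'a'; 'b'; 'c'};
fromVars  = dataset(x, y);                   % names taken from the workspace
fromPairs = dataset({x, 'Id'}, {y, 'Label'}); % names given explicitly

% Route 2: read a delimited text file.
% A small hypothetical CSV file is created in the temporary folder first.
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, 'Name,Score\nAnn,90\nBob,85\n');
fclose(fid);

fromFile = dataset('File', fname, 'Delimiter', ',');
delete(fname);

% Dot subscripting works like field access on a structure.
scores = fromFile.Score;
```

For spreadsheets, the constructor also accepts an 'XLSFile' parameter in place of 'File'.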
dataset | (Not Recommended) Construct dataset array |
cat | (Not Recommended) Concatenate dataset arrays |
cellstr | (Not Recommended) Create cell array of character vectors from dataset array |
dataset2cell | (Not Recommended) Convert dataset array to cell array |
dataset2struct | (Not Recommended) Convert dataset array to structure |
datasetfun | (Not Recommended) Apply function to dataset array variables |
disp | (Not Recommended) Display dataset array |
display | (Not Recommended) Display dataset array |
double | (Not Recommended) Convert dataset variables to double array |
end | (Not Recommended) Last index in indexing expression for dataset array |
export | (Not Recommended) Write dataset array to file |
get | (Not Recommended) Access dataset array properties |
horzcat | (Not Recommended) Horizontal concatenation for dataset arrays |
intersect | (Not Recommended) Set intersection for dataset array observations |
isempty | (Not Recommended) True for empty dataset array |
ismember | (Not Recommended) Dataset array elements that are members of set |
ismissing | (Not Recommended) Find dataset array elements with missing values |
join | (Not Recommended) Merge dataset array observations |
length | (Not Recommended) Length of dataset array |
ndims | (Not Recommended) Number of dimensions of dataset array |
numel | (Not Recommended) Number of elements in dataset array |
replaceWithMissing | (Not Recommended) Insert missing data indicators into a dataset array |
replacedata | (Not Recommended) Replace dataset variables |
set | (Not Recommended) Set and display dataset array properties |
setdiff | (Not Recommended) Set difference for dataset array observations |
setxor | (Not Recommended) Set exclusive or for dataset array observations |
single | (Not Recommended) Convert dataset variables to single array |
size | (Not Recommended) Size of dataset array |
sortrows | (Not Recommended) Sort rows of dataset array |
stack | (Not Recommended) Stack dataset array from multiple variables into single variable |
subsasgn | (Not Recommended) Subscripted assignment to dataset array |
subsref | (Not Recommended) Subscripted reference for dataset array |
summary | (Not Recommended) Print summary of dataset array |
union | (Not Recommended) Set union for dataset array observations |
unique | (Not Recommended) Unique observations in dataset array |
unstack | (Not Recommended) Unstack dataset array from single variable into multiple variables |
vertcat | (Not Recommended) Vertical concatenation for dataset arrays |
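A few of the listed functions can be combined as in the following sketch, which uses a small hypothetical dataset array (the names Id and Score are assumptions made for illustration).

```matlab
% Hypothetical data.
Id    = [3; 1; 2];
Score = [70; 90; 80];
ds = dataset(Id, Score);

% Sort the observations by one variable.
sorted = sortrows(ds, 'Id');

% Convert all variables to a single double matrix.
M = double(sorted);

% Convert to a structure array with one element per observation.
S = dataset2struct(sorted);

% Stack two dataset arrays that share the same variables.
both = vertcat(ds, sorted);
```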
A dataset array D has properties that store metadata (information about your data). Access or assign to a property using P = D.Properties.PropName or D.Properties.PropName = P, where PropName is one of the following:
Description | (Not Recommended) Character vector describing dataset array |
DimNames | (Not Recommended) Two-element cell array of character vectors giving names of dimensions of dataset array |
ObsNames | (Not Recommended) Cell array of nonempty, distinct character vectors giving names of observations in dataset array |
Units | (Not Recommended) Units of variables in dataset array |
UserData | (Not Recommended) Variable containing additional information associated with dataset array |
VarDescription | (Not Recommended) Cell array of character vectors giving descriptions of variables in dataset array |
VarNames | (Not Recommended) Cell array giving names of variables in dataset array |
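Reading and assigning the properties listed above might look like this sketch. The data, description text, units, and observation names are hypothetical.

```matlab
% Hypothetical data.
Dose     = [10; 20; 40];
Response = [0.2; 0.5; 0.9];
D = dataset(Dose, Response);

% Assign metadata through D.Properties.PropName = P.
D.Properties.Description = 'Hypothetical dose-response data';
D.Properties.Units       = {'mg', 'fraction'};
D.Properties.ObsNames    = {'low'; 'mid'; 'high'};

% Read metadata through P = D.Properties.PropName.
names = D.Properties.VarNames;
units = D.Properties.Units;
```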
Copy semantics: Value. The dataset class is a value class. To learn how this affects your use of the class, see Comparing Handle and Value Classes in the MATLAB Object-Oriented Programming documentation.
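In practice, value semantics mean that assigning a dataset array to a new variable produces an independent copy. A minimal sketch, using a hypothetical one-variable dataset array:

```matlab
% Hypothetical data.
x = [1; 2; 3];
A = dataset(x);

% Assignment copies the array; B does not share state with A.
B = A;
B.x(2) = 99;

% A is unchanged by the modification to B.
originalValue = A.x(2);
copiedValue   = B.x(2);
```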
Load a dataset array from a .mat file and create some simple subsets:
load hospital
h1 = hospital(1:10,:)
h2 = hospital(:,{'LastName' 'Age' 'Sex' 'Smoker'})

% Access and modify metadata
hospital.Properties.Description
hospital.Properties.VarNames{4} = 'Wgt'

% Create a new dataset variable from an existing one
hospital.AtRisk = hospital.Smoker | (hospital.Age > 40)

% Use individual variables to explore the data
boxplot(hospital.Age,hospital.Sex)
h3 = hospital(hospital.Age<30,...
   {'LastName' 'Age' 'Sex' 'Smoker'})

% Sort the observations based on two variables
h4 = sortrows(hospital,{'Sex','Age'})