standardizeMissing

Insert standard missing values

Description

B = standardizeMissing(A,indicator) replaces the values in A that match indicator with standard missing values. A can be an array, table, or timetable.

Standard missing values depend on the data type:

  • NaN for double, single, duration, and calendarDuration

  • NaT for datetime

  • <missing> for string

  • <undefined> for categorical

  • ' ' for char

  • {''} for cell arrays of character vectors
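
As an illustration of the list above, the following sketch standardizes a sentinel value in a string array, a datetime array, and a duration array. The variable names and sentinel values are hypothetical and chosen only for this example.

```matlab
% String array: the sentinel "N/A" becomes <missing>
s = ["alpha","N/A","gamma"];
Bs = standardizeMissing(s,"N/A");

% Datetime array: a placeholder date becomes NaT
t = datetime(2020,1,1:3);
Bt = standardizeMissing(t,datetime(2020,1,2));

% Duration array: a placeholder duration becomes NaN
d = hours([1 -1 3]);
Bd = standardizeMissing(d,hours(-1));

% ismissing reports the standardized entries for every type
disp(ismissing(Bs))
disp(ismissing(Bt))
disp(ismissing(Bd))
```

In each case only the second element is replaced, so a single call to ismissing can detect it afterward regardless of data type.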

B = standardizeMissing(A,indicator,'DataVariables',vars) standardizes missing values in the variables specified by vars when A is a table or timetable.

Examples

Create a row vector and replace all instances of -99 with the standard missing value for double data types, NaN.

A = [0 1 5 -99 8 3 4 -99 16];
B = standardizeMissing(A,-99)
B = 1×9

     0     1     5   NaN     8     3     4   NaN    16

Create a table containing Inf and 'N/A' to represent missing values.

dblVar = [NaN;3;Inf;7;9];
cellstrVar = {'one';'three';'';'N/A';'nine'};
charVar = ['A';'C';'E';' ';'I'];
categoryVar = categorical({'red';'yellow';'blue';'violet';''});

A = table(dblVar,cellstrVar,charVar,categoryVar)
A=5×4 table
    dblVar    cellstrVar    charVar    categoryVar
    ______    __________    _______    ___________

     NaN      {'one'   }       A       red        
       3      {'three' }       C       yellow     
     Inf      {0x0 char}       E       blue       
       7      {'N/A'   }               violet     
       9      {'nine'  }       I       <undefined>

Replace all instances of Inf with NaN and replace all instances of 'N/A' with the empty character vector, ''.

B = standardizeMissing(A,{Inf,'N/A'})
B=5×4 table
    dblVar    cellstrVar    charVar    categoryVar
    ______    __________    _______    ___________

     NaN      {'one'   }       A       red        
       3      {'three' }       C       yellow     
     NaN      {0x0 char}       E       blue       
       7      {0x0 char}               violet     
       9      {'nine'  }       I       <undefined>

Replace instances of Inf and 'N/A' occurring in specified variables of a table with the standard missing value indicators.

Create a table containing Inf and 'N/A' to represent missing values.

a = {'alpha';'bravo';'charlie';'';'N/A'};
x = [1;NaN;3;Inf;5];
y = [57;732;93;1398;Inf];

A = table(a,x,y)
A=5×3 table
         a          x      y  
    ___________    ___    ____

    {'alpha'  }      1      57
    {'bravo'  }    NaN     732
    {'charlie'}      3      93
    {0x0 char }    Inf    1398
    {'N/A'    }      5     Inf

For the variables a and x, replace instances of Inf with NaN and 'N/A' with the empty character vector, ''.

B = standardizeMissing(A,{Inf,'N/A'},'DataVariables',{'a','x'})
B=5×3 table
         a          x      y  
    ___________    ___    ____

    {'alpha'  }      1      57
    {'bravo'  }    NaN     732
    {'charlie'}      3      93
    {0x0 char }    NaN    1398
    {0x0 char }      5     Inf

Inf in the variable y remains unchanged because y is not included in the 'DataVariables' name-value pair argument.

Input Arguments

Input data, specified as a vector, matrix, multidimensional array, table, or timetable. If A is a timetable, then standardizeMissing operates on the table data only and ignores NaT and NaN values in the vector of row times.

Data Types: double | single | char | string | cell | table | timetable | categorical | datetime | duration

Nonstandard missing-value indicator, specified as a scalar, vector, or cell array. The elements of indicator define the values that standardizeMissing treats as missing. If A is an array, then indicator must be a vector. If A is a table or timetable, then indicator can also be a cell array with entries of multiple data types.

An element of indicator matches entries of A that have the same data type. The following additional data type matches also apply between the elements of indicator and the entries of A:

  • double indicators match double, single, integer, and logical entries of A.

  • string and char indicators match categorical entries of A.

Example: B = standardizeMissing(A,'N/A') replaces the character vector 'N/A' with the empty character vector, ''.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | logical | char | string | cell | datetime | duration
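
The following sketch illustrates a vector indicator and two of the cross-type matches described above. The data and sentinel codes are hypothetical.

```matlab
% Several sentinel codes can be passed as one vector indicator
A = [4 -99 7 -1 10];
B = standardizeMissing(A,[-99 -1]);     % both codes become NaN

% A double indicator also matches single data
S = single([1.5 -99 3.5]);
BS = standardizeMissing(S,-99);         % BS(2) is NaN, class stays single

% A char indicator matches a categorical entry
C = categorical({'red','unknown','blue'});
BC = standardizeMissing(C,'unknown');   % BC(2) becomes <undefined>
```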

Table variables to standardize, specified as a variable name, a cell array of variable names, a numeric vector, a logical vector, a function handle, or a table vartype subscript. vars can be one of the following:

  • A character vector specifying a single table variable name

  • A cell array of character vectors where each element is a table variable name

  • A vector of table variable indices

  • A logical vector whose elements each correspond to a table variable, where true includes the corresponding variable and false excludes it

  • A function handle that returns a logical scalar, such as @isnumeric

  • A table vartype subscript

Example: 'Age'

Example: {'Height','Weight'}

Example: @iscategorical

Example: vartype('numeric')
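
As a sketch of the forms listed above, the following code selects the same numeric variables in three ways and then selects a single variable with a logical vector. The table T and the sentinel -99 are hypothetical, and the vartype form assumes a release that supports table vartype subscripts.

```matlab
% Hypothetical table that uses -99 as a sentinel in the numeric variables
name = {'a';'b';'c'};
height = [170;-99;182];
weight = [-99;65;80];
T = table(name,height,weight);

% Three ways to select both numeric variables
B1 = standardizeMissing(T,-99,'DataVariables',@isnumeric);
B2 = standardizeMissing(T,-99,'DataVariables',vartype('numeric'));
B3 = standardizeMissing(T,-99,'DataVariables',[2 3]);

% Logical vector: standardize height only and leave weight untouched
B4 = standardizeMissing(T,-99,'DataVariables',[false true false]);
```

A function handle or vartype subscript keeps working if variables are later added or reordered, whereas index and logical vectors depend on the variable order.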

Output Arguments

Standardized array or table, returned as a vector, matrix, multidimensional array, table, or timetable. B has the same size as A.

Data Types: double | single | char | string | cell | table | timetable | categorical | datetime | duration | calendarDuration

Algorithms

standardizeMissing treats leading and trailing white space differently for cell arrays of character vectors, character arrays, and categorical arrays.

  • For cell arrays of character vectors, standardizeMissing does not ignore white space. An element is replaced only if it exactly matches a character vector specified in indicator.

  • For character arrays, standardizeMissing ignores trailing white space.

  • For categorical arrays, standardizeMissing ignores leading and trailing white space.
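
The exact-match rule for cell arrays of character vectors can be sketched as follows. The data is hypothetical, and trimming with strtrim beforehand is one possible workaround rather than something standardizeMissing does itself.

```matlab
% Hypothetical cell array where one sentinel carries extra white space
raw = {'N/A';' N/A ';'ok'};

% Only the exact match is replaced; the padded entry is left as is
B1 = standardizeMissing(raw,'N/A');

% Trimming first lets both entries match the indicator
B2 = standardizeMissing(strtrim(raw),'N/A');
```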

Extended Capabilities

C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.

Introduced in R2013b