Define Import Options for Tables

Typically, you can import tables using the readtable function. However, sometimes importing tabular data requires additional control over the import process. For example, you might want to select the variables to import or handle rows with missing or error-causing data. To control the import process, you can create an import options object. The object has properties that you can adjust based on your import needs.

Create Import Options

To create an import options object for a sample data set, airlinesmall.csv, use the detectImportOptions function. The detectImportOptions function creates a DelimitedTextImportOptions object for this text file. For a full list of properties of the import options object, see the detectImportOptions reference page.

opts = detectImportOptions('airlinesmall.csv');

Customize Table-Level Import Options

The import options object has properties that you can adjust to control the import process. Some properties apply to the entire table while others apply to specific variables. Properties that affect the entire table include rules to manage error-causing or missing data. For example, remove rows with data that cause import errors by setting the ImportErrorRule to 'omitrow'. Replace missing values by setting the MissingRule to 'fill'. The FillValue property value determines what value replaces the missing values. For example, you can replace missing values with NaN.

opts.ImportErrorRule = 'omitrow';
opts.MissingRule = 'fill';

Customize Variable-Level Import Options

To get and set options for specific variables use the getvaropts, setvartype, and setvaropts functions. For example, view the current options for the variables named FlightNum, Origin, Dest, and ArrDelay, using the getvaropts function.

getvaropts(opts,{'FlightNum','Origin','Dest','ArrDelay'});

Change the data types for the variables using the setvartype function:

  • Since the values in the variable FlightNum are identifiers for the flight and not numerical values, change its data type to char.

  • Since the variables Origin and Dest designate a finite set of repeating text values, change their data type to categorical.

 opts = setvartype(opts,{'FlightNum','Origin','Dest','ArrDelay'},...
                        {'char','categorical','categorical','single'});

Change other properties using the setvaropts function:

  • For the FlightNum variable, remove any leading white spaces from the text by setting the WhiteSpaceRule property to trimleading.

  • For the ArrDelay variable, replace fields containing 0 or NA with the value specified in FillValue property by setting the TreatAsMissing property.

 opts = setvaropts(opts,'FlightNum','WhitespaceRule','trimleading');
 opts = setvaropts(opts,'ArrDelay','TreatAsMissing',{'0','NA'});

Import Table

Specify the variables to get, import them using readtable, and display the first 8 rows of the table.

opts.SelectedVariableNames = {'FlightNum','Origin','Dest','ArrDelay'}; 
T = readtable('airlinesmall.csv',opts);
T(1:8,:)
ans=8×4 table
    FlightNum    Origin    Dest    ArrDelay
    _________    ______    ____    ________

    {'1503'}      LAX      SJC         8   
    {'1550'}      SJC      BUR         8   
    {'1589'}      SAN      SMF        21   
    {'1655'}      BUR      SJC        13   
    {'1702'}      SMF      LAX         4   
    {'1729'}      LAX      SJC        59   
    {'1763'}      SAN      SFO         3   
    {'1800'}      SEA      LAX        11   

See Also

| | | | | | | | |

Related Topics