Define import options for database data
After you create an SQLImportOptions
object, you can customize
the import options for importing data from a database into MATLAB®. Import options include defining the data types and fill values for missing
data.
Create an SQLImportOptions
object with the databaseImportOptions
function.
ExcludeDuplicates
— Flag to exclude duplicatesfalse
(default) | true
Flag to exclude duplicates from imported data, specified as false
or true
. To exclude duplicates from the data in a database table or
the results of an SQL query, set the ExcludeDuplicates
property to
true
using dot notation.
Setting this property is the equivalent of using the DISTINCT
SQL
statement in ANSI SQL.
Data Types: logical
VariableNames
— Variable namesVariable names, specified as a cell array of character vectors. Each character vector in the cell array indicates the name of an imported database column from an SQL query or database table.
For a table or SQL query with only one database column, the cell array contains only one character vector.
The default variable names are the names of the columns in an SQL query or database table.
Example:
{'productNumber','stockNumber'}
Data Types: cell
VariableTypes
— Variable typesVariable types, specified as a cell array of character vectors. Each character vector in the cell array indicates the data type of an imported database column from an SQL query or database table. Each character vector must be a valid MATLAB data type.
For a table or SQL query with only one database column, the cell array contains only one character vector.
When you create the SQLImportOptions
object, the
databaseImportOptions
function automatically detects the data
type based on the data type of a database column. This table maps the data type of a
database column to the detected MATLAB data type.
Database Data Type | MATLAB Detected Data Type |
---|---|
|
|
|
|
|
|
|
|
If you are using the MySQL® native interface, this table maps the data type of a database column to the detected MATLAB data type.
MySQL Data Type | MATLAB Data Type |
---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
If you are using the PostgreSQL native interface, this table maps the data type of a database column to the detected MATLAB data type.
PostgreSQL Data Type | MATLAB Data Type |
---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
To update the VariableTypes
property, use the setoptions
function.
Example:
{'int64','int32'}
Data Types: cell
SelectedVariableNames
— Subset of variables to importSubset of variables to import, specified as a character vector, cell array of
character vectors, or numeric array that contains indices. Use the
SelectedVariableNames
property to determine the database columns
to import into the MATLAB workspace.
The values in the SelectedVariableNames
property must be equal
to the values in the VariableNames
property or a subset of these
values. By default, the SelectedVariableNames
property contains all
variable names specified in the VariableNames
property. When the
SelectedVariableNames
property specifies all variable names, the
sqlread
,
fetch
, and import functions of the
DatabaseDatastore
object import all database
columns.
Example:
{'productNumber','stockNumber'}
Example:
[1,2,3]
Data Types: double
| char
| cell
FillValues
— Fill value for missing dataFill value for missing data, specified as a cell array that contains one or more values. Each value can be one of these data types:
All integer classes
single
double
char
string
scalar
logical
datetime
array
categorical
array
missing
When you create the SQLImportOptions
object, the
databaseImportOptions
function automatically detects the fill
value for missing data based on the data type of the database column. This table maps
the data type of a database column to the detected MATLAB fill value.
Database Data Type | MATLAB Detected Fill Value |
---|---|
| '' |
| '' |
| NaN |
| false |
If you are using the MySQL native interface, this table maps the data type of a database column to the detected MATLAB data type.
Database Data Type | MATLAB Detected Fill Value |
---|---|
| '' (if the VariableTypes property
is char ) or <missing> (if the
VariableTypes property is
string ) |
| NaT |
| NaN |
| NaN |
| <undefined> |
If you are using the PostgreSQL native interface, this table maps the data type of a database column to the detected MATLAB data type.
Database Data Type | MATLAB Detected Fill Value |
---|---|
| false |
| NaN |
| <missing> |
| NaT |
| NaN |
| NaN |
| undefined |
To update the FillValues
property, use the setoptions
function.
Example:
{'',NaN}
Data Types: cell
VariableOptions
— Type-specific variable import optionsType-specific variable import options, returned as an array of variable import
options objects. The array contains an object corresponding to each variable specified
in the VariableNames
property. Each object in the array contains
properties that support the importing of data with a specific data type.
To query the current (or detected) options for a variable, use the getoptions
function.
To set and customize options for a variable, use the setoptions
function.
Example:
opts.VariableOptions
returns a collection of
SQLVariableImportOptions
objects, one corresponding to each
variable in the data.
VariableNamingRule
— Variable naming rule"modify"
(default) | "preserve"
Variable naming rule, specified as one of these values:
"modify"
— Remove non-ASCII characters from variable names
when the SQLImportOptions
function imports data.
"preserve"
— Preserve most variable names when the
SQLImportOptions
function imports data. For details, see Limitations.
If you are using the MySQL or PostgreSQL native interfaces, "preserve"
is the
default value.
The VariableNamingRule
property has these limitations:
These variable names are reserved identifiers for the table
data type: Properties
, RowNames
, and
VariableNames
.
The length of each variable name must be less than the number returned by
namelengthmax
.
Example: 'VariableNamingRule',"modify"
Data Types: string
getoptions | Retrieve import options for database data |
preview | Preview eight rows from database using import options |
reset | Reset to default import options for database data |
setoptions | Customize import options for database data |
Customize import options when importing data from a database table. Control the import options by creating an SQLImportOptions
object. Then, customize import options for different database columns. Import data using the sqlread
function.
This example uses the patients.xls
file, which contains the columns Gender
, Location
, SelfAssessedHealthStatus
, and Smoker
. The example also uses a Microsoft® SQL Server® Version 11.00.2100 database and the Microsoft SQL Server Driver 11.00.5058.
Create a database connection to a Microsoft SQL Server database with Windows® authentication. Specify a blank user name and password.
datasource = 'MS SQL Server Auth'; conn = database(datasource,'','');
Load patient information into the MATLAB® workspace.
patients = readtable('patients.xls');
Create the patients
database table using the patient information.
tablename = 'patients';
sqlwrite(conn,tablename,patients)
Create an SQLImportOptions
object using the patients
database table and the databaseImportOptions
function.
opts = databaseImportOptions(conn,tablename)
opts = SQLImportOptions with properties: ExcludeDuplicates: false VariableNamingRule: 'modify' VariableNames: {'LastName', 'Gender', 'Age' ... and 7 more} VariableTypes: {'char', 'char', 'double' ... and 7 more} SelectedVariableNames: {'LastName', 'Gender', 'Age' ... and 7 more} FillValues: {'', '', NaN ... and 7 more } VariableOptions: Show all 10 VariableOptions
Display the current import options for the variables selected in the SelectedVariableNames
property of the SQLImportOptions
object.
vars = opts.SelectedVariableNames; varOpts = getoptions(opts,vars)
varOpts = 1x10 SQLVariableImportOptions array with properties: Variable Options: (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) Name: 'LastName' | 'Gender' | 'Age' | 'Location' | 'Height' | 'Weight' | 'Smoker' | 'Systolic' | 'Diastolic' | 'SelfAssessedHealthStatus' Type: 'char' | 'char' | 'double' | 'char' | 'double' | 'double' | 'double' | 'double' | 'double' | 'char' MissingRule: 'fill' | 'fill' | 'fill' | 'fill' | 'fill' | 'fill' | 'fill' | 'fill' | 'fill' | 'fill' FillValue: '' | '' | NaN | '' | NaN | NaN | NaN | NaN | NaN | '' To access sub-properties of each variable, use getoptions
Change the data types for the Gender
, Location
, SelfAssessedHealthStatus
, and Smoker
variables using the setoptions
function. Because the Gender
, Location
, and SelfAssessedHealthStatus
variables indicate a finite set of repeating values, change their data type to categorical
. Because the Smoker
variable stores the values 0
and 1
, change its data type to logical
. Then, display the updated import options.
opts = setoptions(opts,{'Gender','Location','SelfAssessedHealthStatus'}, ... 'Type','categorical'); opts = setoptions(opts,'Smoker','Type','logical'); varOpts = getoptions(opts,{'Gender','Location','Smoker', ... 'SelfAssessedHealthStatus'})
varOpts = 1x4 SQLVariableImportOptions array with properties: Variable Options: (1) | (2) | (3) | (4) Name: 'Gender' | 'Location' | 'Smoker' | 'SelfAssessedHealthStatus' Type: 'categorical' | 'categorical' | 'logical' | 'categorical' MissingRule: 'fill' | 'fill' | 'fill' | 'fill' FillValue: <undefined> | <undefined> | 0 | <undefined> To access sub-properties of each variable, use getoptions
Import the patients
database table using the sqlread
function, and display the last eight rows of the table.
data = sqlread(conn,tablename,opts); tail(data)
ans=8×10 table
LastName Gender Age Location Height Weight Smoker Systolic Diastolic SelfAssessedHealthStatus
_____________ ______ ___ _________________________ ______ ______ ______ ________ _________ ________________________
{'Foster' } Female 30 St. Mary's Medical Center 70 124 false 130 91 Fair
{'Gonzales' } Male 48 County General Hospital 71 174 false 123 79 Good
{'Bryant' } Female 48 County General Hospital 66 134 false 129 73 Excellent
{'Alexander'} Male 25 County General Hospital 69 171 true 128 99 Good
{'Russell' } Male 44 VA Hospital 69 188 true 124 92 Good
{'Griffin' } Male 49 County General Hospital 70 186 false 119 74 Fair
{'Diaz' } Male 45 County General Hospital 68 172 true 136 93 Good
{'Hayes' } Male 48 County General Hospital 66 177 false 114 86 Fair
Display a summary of the imported data. The sqlread
function applies the import options to the variables in the imported data.
summary(data)
Variables: LastName: 100×1 cell array of character vectors Gender: 100×1 categorical Values: Female 53 Male 47 Age: 100×1 double Values: Min 25 Median 39 Max 50 Location: 100×1 categorical Values: County General Hospital 39 St. Mary s Medical Center 24 VA Hospital 37 Height: 100×1 double Values: Min 60 Median 67 Max 72 Weight: 100×1 double Values: Min 111 Median 142.5 Max 202 Smoker: 100×1 logical Values: True 34 False 66 Systolic: 100×1 double Values: Min 109 Median 122 Max 138 Diastolic: 100×1 double Values: Min 68 Median 81.5 Max 99 SelfAssessedHealthStatus: 100×1 categorical Values: Excellent 34 Fair 15 Good 40 Poor 11
Delete the patients
database table using the execute
function.
sqlquery = ['DROP TABLE ' tablename];
execute(conn,sqlquery)
Close the database connection.
close(conn)
The name-value pair argument 'VariableNamingRule'
has these limitations:
The SQLImportOptions
function returns an error when
you use the 'VariableNamingRule'
name-value pair
argument with the SQLImportOptions
object
opts
.
When the 'VariableNamingRule'
name-value pair
argument is set to the value 'modify'
:
The variable names Properties
,
RowNames
, and
VariableNames
are reserved
identifiers for the table
data
type.
The length of each variable name must be less than the
number returned by namelengthmax
.
close
| database
| databaseDatastore
| execute
| fetch
| sqlread
| sqlwrite
You have a modified version of this example. Do you want to open this example with your edits?