databaseDatastore

Datastore for data in database

Description

MATLAB® has various datastores that let you import large data sets into MATLAB for analysis. A DatabaseDatastore object is a type of datastore that contains data from a database table or the results from executing an SQL query in a relational database. For details about other datastores, see Getting Started with Datastore (MATLAB).

With a DatabaseDatastore object, you can preview and read records or chunks in a data set and reset the DatabaseDatastore to its initial state. Also, you can analyze a large data set in a database using tall arrays or MapReduce.

Reading data from DatabaseDatastore objects is the same as executing the fetch function on the data set. Using DatabaseDatastore objects provides advantages that enable you to:

Creation

Description

example

dbds = databaseDatastore(conn,source) creates a DatabaseDatastore object using the database connection. This datastore contains data from a database table or the results from an executed SQL query.

example

dbds = databaseDatastore(conn,source,Name,Value) specifies additional options using one or more name-value pair arguments. For example, 'ReadSize',100 retrieves 100 rows of data from the DatabaseDatastore object.

example

dbds = databaseDatastore(conn,source,opts) customizes the options for importing a large data set from a database using the SQLImportOptions object.

dbds = databaseDatastore(conn,source,opts,Name,Value) specifies additional options using one or more name-value pair arguments. For example, 'Catalog','toy_store' retrieves data from the toy_store database catalog.

Input Arguments

expand all

Database connection, specified as a connection object created with the database function.

Source, specified as a character vector or string scalar. The source indicates whether the DatabaseDatastore object stores data from a database table or the results from an executed SQL query.

Example: 'inventorytable'

Example: "SELECT productnumber,productname FROM producttable"

Data Types: char | string

Database import options, specified as an SQLImportOptions object.

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: databaseDatastore(conn,source,'ReadSize',100,'Catalog','toy_store') creates a DatabaseDatastore object and stores 100 rows of data from a table or SQL query using the toy_store database catalog.

Number of rows to return, specified as the comma-separated pair consisting of 'ReadSize' and a positive numeric scalar. Use this name-value pair argument to limit the number of rows for retrieval from the DatabaseDatastore object.

Example: 1000

Data Types: double

Database catalog name, specified as the comma-separated pair consisting of 'Catalog' and a character vector or string scalar. A catalog serves as the container for the schemas in a database and contains related metadata information. A database can have numerous catalogs.

Use the 'Catalog' name-value pair argument only when source is a database table.

Example: 'Catalog','toy_store'

Data Types: char | string

Database schema name, specified as the comma-separated pair consisting of 'Schema' and a character vector or string scalar. A schema defines the database tables, views, relationships among tables, and other elements. A database catalog can have numerous schemas.

Use the 'Schema' name-value pair argument only when source is a database table.

Example: 'Schema','dbo'

Data Types: char | string

Limitations

  • The DatabaseDatastore object supports only Microsoft® SQL Server® 2012 and later versions.

  • The DatabaseDatastore object does not support using a parallel pool with Parallel Computing Toolbox™ installed. To analyze data using tall arrays or to run MapReduce algorithms, set the global execution environment to be the local MATLAB session by using mapreducer. Enter this code:

    mapreducer(0)
    For details about controlling parallel resources, see Run mapreduce on a Parallel Pool (Parallel Computing Toolbox).

Properties

expand all

This property is read-only.

Database connection, specified as a connection object created using database.

This property is read-only.

SQL query, specified as a character vector that specifies the SQL query to execute in the database.

Data Types: char

This property is read-only.

Column names of the retrieved data table, specified as a cell array of one or more character vectors.

Data Types: char

Number of rows to read from the retrieved data table, specified as a nonnegative numeric scalar. To specify the number of rows to read, set the ReadSize property.

Example: dbds.ReadSize = 5000;

Data Types: double

Object Functions

hasdataDetermine if data in DatabaseDatastore is available to read
previewReturn subset of data from DatabaseDatastore
readRead data in DatabaseDatastore
readallRead all data in DatabaseDatastore
resetReset DatabaseDatastore to initial state
closeClose and invalidate database and driver resource utilizer
isPartitionableDetermine whether datastore is partitionable
isShuffleableDetermine whether datastore is shuffleable

Examples

collapse all

Create a database connection using a JDBC driver. To create this connection, you must configure a JDBC data source. For more information, see the configureJDBCDataSource function. Then, create a DatabaseDatastore object using the results from an SQL query and preview a large data set.

Create a database connection to the JDBC data source MSSQLServerJDBCAuth. This data source configures a JDBC driver to a Microsoft® SQL Server® database with Windows® authentication. Specify a blank user name and password.

datasource = "MSSQLServerJDBCAuth";
username = "";
password = "";
conn = database(datasource,username,password);

Create a DatabaseDatastore object using a database connection and an SQL query. This SQL query retrieves all flight data from the airlinesmall table. databaseDatastore executes the SQL query.

sqlquery = 'select * from airlinesmall';

dbds = databaseDatastore(conn,sqlquery)
dbds = 

  DatabaseDatastore with properties:

       Connection: [1×1 database.jdbc.connection]
            Query: 'select * from airlinesmall'
    VariableNames: {1×29 cell}
         ReadSize: 10000

dbds is a DatabaseDatastore object with these properties:

  • Connection -- Database connection object

  • Query -- Executed SQL query

  • VariableNames -- List of column names from the executed SQL query

  • ReadSize -- Maximum number of records to read from the executed SQL query

Display the database connection property.

dbds.Connection
ans = 

  connection with properties:

                  DataSource: 'MSSQLServerJDBCAuth'
                    UserName: ''
                      Driver: 'com.microsoft.sqlserver.j ...'
                         URL: 'jdbc:sqlserver://dbtb04:5 ...'
                     Message: ''
                        Type: 'JDBC Connection Object'
  Database Properties:

                  AutoCommit: 'on'
                    ReadOnly: 'off'
                LoginTimeout: 0
      MaxDatabaseConnections: 0

  Catalog and Schema Information:

              DefaultCatalog: 'toy_store'
                    Catalogs: {'master', 'model', 'msdb' ... and 2 more}
                     Schemas: {'db_accessadmin', 'db_backupoperator', 'db_datareader' ... and 15 more}

  Database and Driver Information:

         DatabaseProductName: 'Microsoft SQL Server'
      DatabaseProductVersion: '11.00.2100'
                  DriverName: 'Microsoft JDBC Driver 4.0 ...'
               DriverVersion: '4.0.2206.100'

The Message property is blank when the database connection is successful.

Preview the first eight records in the large data set returned by executing the SQL query in the DatabaseDatastore object.

preview(dbds)
ans =

  8×29 table

    Year    Month    DayofMonth    DayOfWeek    DepTime    CRSDepTime    ArrTime    CRSArrTime    UniqueCarrier    FlightNum    TailNum     ActualElapsedTime    CRSElapsedTime    AirTime    ArrDelay    DepDelay    Origin    Dest     Distance    TaxiIn    TaxiOut    Cancelled    CancellationCode    Diverted    CarrierDelay    WeatherDelay    NASDelay    SecurityDelay    LateAircraftDelay
    ____    _____    __________    _________    _______    __________    _______    __________    _____________    _________    ________    _________________    ______________    _______    ________    ________    ______    _____    ________    ______    _______    _________    ________________    ________    ____________    ____________    ________    _____________    _________________

    1990      9          11            2         1810         1812        1939         1930           'AA'           1426       'NA'                89                 78           'NA'          9          -2       'RST'     'ORD'       268       'NA'      'NA'          0              'NA'             0            'NA'            'NA'          'NA'          'NA'               'NA'       
    1990     10          27            6         1353         1355        1634         1640           'US'            112       'NA'               161                165           'NA'         -6          -2       'TPA'     'SYR'      1104       'NA'      'NA'          0              'NA'             0            'NA'            'NA'          'NA'          'NA'               'NA'       
    1990     10          23            2         1057         1055        1205         1155           'US'           1621       'NA'                68                 60           'NA'         10           2       'ROC'     'EWR'       246       'NA'      'NA'          0              'NA'             0            'NA'            'NA'          'NA'          'NA'               'NA'       
    1990     10           8            1         1515         1440        1609         1535           'NW'            749       'NA'                54                 55           'NA'         34          35       'MSP'     'FSD'       197       'NA'      'NA'          0              'NA'             0            'NA'            'NA'          'NA'          'NA'               'NA'       
    1990     10          19            5         1130         1120        1203         1154           'UA'            369       'NA'                93                 94           'NA'          9          10       'BUF'     'ORD'       473       'NA'      'NA'          0              'NA'             0            'NA'            'NA'          'NA'          'NA'               'NA'       
    1990     10          12            5         1755         1733        1858         1820           'DL'            590       'NA'                63                 47           'NA'         38          22       'BOS'     'BGR'       201       'NA'      'NA'          0              'NA'             0            'NA'            'NA'          'NA'          'NA'               'NA'       
    2001     11          22            4         1345         1355        1530         1549           'MQ'           4982       '#NAME?'           105                114           '90'        -19         -10       'JAX'     'MIA'       334       '8'       '7'           0              'NA'             0            'NA'            'NA'          'NA'          'NA'               'NA'       
    2001     11          26            1         2105         2110        2209         2237           'AA'           1947       'N3BäA1'            64                 87           '47'        -28          -5       'SFO'     'LAX'       337       '6'       '11'          0              'NA'             0            'NA'            'NA'          'NA'          'NA'               'NA'       

Close the DatabaseDatastore object and the database connection.

close(dbds)

Retrieve a large data set from a database table by creating a DatabaseDatastore object.

This example uses a Microsoft® SQL Server® Version 11.00.2100 database and the Microsoft SQL Server Driver 11.00.5058.

Create a database connection to a Microsoft SQL Server database with Windows® authentication. Specify a blank user name and password.

datasource = 'MS SQL Server Auth';
conn = database(datasource,'','');

Load flight information in the MATLAB® workspace.

flights = readtable('airlinesmall_subset.xlsx');

Create the flights database table using the flight information.

tablename = 'flights';
sqlwrite(conn,tablename,flights)

Create a DatabaseDatastore object using a database connection and the flights database table.

dbds = databaseDatastore(conn,tablename)
dbds = 
  DatabaseDatastore with properties:

       Connection: [1×1 database.odbc.connection]
            Query: 'SELECT * from flights'
    VariableNames: {1×29 cell}
         ReadSize: 10000

dbds is a DatabaseDatastore object with these properties:

  • Connection — Database connection object

  • Query — Executed SQL query

  • VariableNames — List of column names from the executed SQL query

  • ReadSize — Maximum number of records to read from the executed SQL query

Display the database connection property.

dbds.Connection
ans = 
  connection with properties:

                  DataSource: 'MS SQL Server Auth'
                    UserName: ''
                     Message: ''
                        Type: 'ODBC Connection Object'
  Database Properties:

                  AutoCommit: 'on'
                    ReadOnly: 'off'
                LoginTimeout: 15
      MaxDatabaseConnections: 0

  Catalog and Schema Information:

              DefaultCatalog: 'toy_store'
                    Catalogs: {'master', 'msdb', 'tempdb' ... and 1 more}
                     Schemas: {'dbo', 'guest', 'INFORMATION_SCHEMA' ... and 2 more}

  Database and Driver Information:

         DatabaseProductName: 'Microsoft SQL Server'
      DatabaseProductVersion: '11.00.2100'
                  DriverName: 'sqlncli11.dll'
               DriverVersion: '11.00.6518'

The Message property is blank when the database connection is successful.

Preview the first eight records in the data set returned by executing the SQL query in the DatabaseDatastore object.

preview(dbds)
ans=8×29 table
    Year    Month    DayofMonth    DayOfWeek    DepTime    CRSDepTime    ArrTime    CRSArrTime    UniqueCarrier    FlightNum    TailNum     ActualElapsedTime    CRSElapsedTime    AirTime    ArrDelay    DepDelay    Origin    Dest     Distance    TaxiIn    TaxiOut    Cancelled    CancellationCode    Diverted    CarrierDelay    WeatherDelay    SDelay    SecurityDelay    LateAircraftDelay
    ____    _____    __________    _________    _______    __________    _______    __________    _____________    _________    ________    _________________    ______________    _______    ________    ________    ______    _____    ________    ______    _______    _________    ________________    ________    ____________    ____________    ______    _____________    _________________

    1996      1          18            4         2117         2120        2305         2259           'HP'            415       'N637AW'           108                 99            85           6          -3       'COS'     'PHX'      551         5         18           0              NaN              0            NaN             NaN          NaN           NaN                NaN       
    1996      1          12            5         1252         1245        1511         1500           'HP'            610       'N905AW'            79                 75            58          11           7       'LAX'     'PHX'      370         3         18           0              NaN              0            NaN             NaN          NaN           NaN                NaN       
    1996      1          16            2         1441         1445        1708         1721           'HP'            211       'N165AW'            87                 96            74         -13          -4       'RNO'     'PHX'      601         4          9           0              NaN              0            NaN             NaN          NaN           NaN                NaN       
    1996      1           1            1         2258         2300        2336         2335           'HP'           1245       'N183AW'            38                 35            20           1          -2       'TUS'     'PHX'      110         6         12           0              NaN              0            NaN             NaN          NaN           NaN                NaN       
    1996      1           4            4         1814         1814        1901         1910           'US'            683       'N963VJ'            47                 56            34          -9           0       'DTW'     'PIT'      201         6          7           0              NaN              0            NaN             NaN          NaN           NaN                NaN       
    1996      1          31            3         1822         1820        1934         1925           'US'            757       'N912VJ'            72                 65            52           9           2       'PHL'     'PIT'      267         6         14           0              NaN              0            NaN             NaN          NaN           NaN                NaN       
    1996      1          18            4          729          730         841          843           'US'           1564       'N941VJ'            72                 73            58          -2          -1       'DCA'     'PVD'      357         3         11           0              NaN              0            NaN             NaN          NaN           NaN                NaN       
    1996      1          26            5         1704         1705        1829         1839           'NW'           1538       'N960N'             85                 94            69         -10          -1       'DTW'     'RIC'      456         3         13           0              NaN              0            NaN             NaN          NaN           NaN                NaN       

Close the DatabaseDatastore object and the database connection.

close(dbds)

Create a database connection using a JDBC driver. To create this connection, you must configure a JDBC data source. For more information, see the configureJDBCDataSource function. Then, create a DatabaseDatastore object by setting the ReadSize property, and preview a large data set.

Create a database connection to the JDBC data source MSSQLServerJDBCAuth. This data source configures a JDBC driver to a Microsoft® SQL Server® database with Windows® authentication. Specify a blank user name and password.

datasource = "MSSQLServerJDBCAuth";
username = "";
password = "";
conn = database(datasource,username,password);

Create a DatabaseDatastore object using a database connection and an SQL query. This SQL query retrieves all flight data from the airlinesmall table. Specify reading a maximum of 1000 records from the executed SQL query. databaseDatastore executes the SQL query.

sqlquery = 'select * from airlinesmall';

dbds = databaseDatastore(conn,sqlquery,'ReadSize',1000)
dbds = 

  DatabaseDatastore with properties:

       Connection: [1×1 database.jdbc.connection]
            Query: 'select * from airlinesmall'
    VariableNames: {1×29 cell}
         ReadSize: 1000

dbds is a DatabaseDatastore object with these properties:

  • Connection -- Database connection object

  • Query -- Executed SQL query

  • VariableNames -- List of column names from the executed SQL query

  • ReadSize -- Maximum number of records to read from the executed SQL query

Display the database connection property.

dbds.Connection
ans = 

  connection with properties:

                  DataSource: 'MSSQLServerJDBCAuth'
                    UserName: ''
                      Driver: 'com.microsoft.sqlserver.j ...'
                         URL: 'jdbc:sqlserver://dbtb04:5 ...'
                     Message: ''
                        Type: 'JDBC Connection Object'
  Database Properties:

                  AutoCommit: 'on'
                    ReadOnly: 'off'
                LoginTimeout: 0
      MaxDatabaseConnections: 0

  Catalog and Schema Information:

              DefaultCatalog: 'toy_store'
                    Catalogs: {'master', 'model', 'msdb' ... and 2 more}
                     Schemas: {'db_accessadmin', 'db_backupoperator', 'db_datareader' ... and 15 more}

  Database and Driver Information:

         DatabaseProductName: 'Microsoft SQL Server'
      DatabaseProductVersion: '11.00.2100'
                  DriverName: 'Microsoft JDBC Driver 4.0 ...'
               DriverVersion: '4.0.2206.100'

The Message property is blank when the database connection is successful.

Preview the first eight records in the large data set returned by executing the SQL query in the DatabaseDatastore object.

preview(dbds)
ans =

  8×29 table

    Year    Month    DayofMonth    DayOfWeek    DepTime    CRSDepTime    ArrTime    CRSArrTime    UniqueCarrier    FlightNum    TailNum     ActualElapsedTime    CRSElapsedTime    AirTime    ArrDelay    DepDelay    Origin    Dest     Distance    TaxiIn    TaxiOut    Cancelled    CancellationCode    Diverted    CarrierDelay    WeatherDelay    NASDelay    SecurityDelay    LateAircraftDelay
    ____    _____    __________    _________    _______    __________    _______    __________    _____________    _________    ________    _________________    ______________    _______    ________    ________    ______    _____    ________    ______    _______    _________    ________________    ________    ____________    ____________    ________    _____________    _________________

    1990      9          11            2         1810         1812        1939         1930           'AA'           1426       'NA'                89                 78           'NA'          9          -2       'RST'     'ORD'       268       'NA'      'NA'          0              'NA'             0            'NA'            'NA'          'NA'          'NA'               'NA'       
    1990     10          27            6         1353         1355        1634         1640           'US'            112       'NA'               161                165           'NA'         -6          -2       'TPA'     'SYR'      1104       'NA'      'NA'          0              'NA'             0            'NA'            'NA'          'NA'          'NA'               'NA'       
    1990     10          23            2         1057         1055        1205         1155           'US'           1621       'NA'                68                 60           'NA'         10           2       'ROC'     'EWR'       246       'NA'      'NA'          0              'NA'             0            'NA'            'NA'          'NA'          'NA'               'NA'       
    1990     10           8            1         1515         1440        1609         1535           'NW'            749       'NA'                54                 55           'NA'         34          35       'MSP'     'FSD'       197       'NA'      'NA'          0              'NA'             0            'NA'            'NA'          'NA'          'NA'               'NA'       
    1990     10          19            5         1130         1120        1203         1154           'UA'            369       'NA'                93                 94           'NA'          9          10       'BUF'     'ORD'       473       'NA'      'NA'          0              'NA'             0            'NA'            'NA'          'NA'          'NA'               'NA'       
    1990     10          12            5         1755         1733        1858         1820           'DL'            590       'NA'                63                 47           'NA'         38          22       'BOS'     'BGR'       201       'NA'      'NA'          0              'NA'             0            'NA'            'NA'          'NA'          'NA'               'NA'       
    2001     11          22            4         1345         1355        1530         1549           'MQ'           4982       '#NAME?'           105                114           '90'        -19         -10       'JAX'     'MIA'       334       '8'       '7'           0              'NA'             0            'NA'            'NA'          'NA'          'NA'               'NA'       
    2001     11          26            1         2105         2110        2209         2237           'AA'           1947       'N3BäA1'            64                 87           '47'        -28          -5       'SFO'     'LAX'       337       '6'       '11'          0              'NA'             0            'NA'            'NA'          'NA'          'NA'               'NA'       

Close the DatabaseDatastore object and the database connection.

close(dbds)

Customize import options when importing a large data set from a database table. Control the import options by creating an SQLImportOptions object. Then, customize the import options for database columns that contain logical data. Import and preview the data by creating a DatabaseDatastore object and using the preview function.

This example uses the airlinesmall_subset.xls spreadsheet, which contains the column Cancelled. Also, the example uses a Microsoft® SQL Server® Version 11.00.2100 database and the Microsoft SQL Server Driver 11.00.5058.

Create a database connection to a Microsoft SQL Server database with Windows® authentication. Specify a blank user name and password.

datasource = 'MS SQL Server Auth';
conn = database(datasource,'','');

Load flight information into the MATLAB® workspace.

flights = readtable('airlinesmall_subset.xlsx');

Create the flights database table using the flight information.

tablename = 'flights';
sqlwrite(conn,tablename,flights)

Create an SQLImportOptions object using the flights database table with the databaseImportOptions function.

opts = databaseImportOptions(conn,tablename);

Retrieve the default import options for the Cancelled variable.

varnames = 'Cancelled';
varOpts = getoptions(opts,varnames)
varOpts = 
  SQLVariableImportOptions with properties:

  Variable Properties :
               Name: 'Cancelled'
               Type: 'double'
          FillValue: NaN

Set the import options for the data type of the specified variable to logical. Also, set the import options to replace missing data in the specified variable with the fill value true.

opts = setoptions(opts,varnames,'Type','logical', ...
    'FillValue',true);

Create the DatabaseDatastore object to import a large data set using the import options.

dbds = databaseDatastore(conn,tablename,opts);

Import the logical data in the selected variable and display a preview of the data. The imported data shows that the variable has the logical data type.

opts.SelectedVariableNames = varnames;
data = preview(dbds);
cancelled = data.Cancelled
cancelled = 8×1 logical array

   0
   0
   0
   0
   0
   0
   0
   0

Delete the flights database table using the execute function.

sqlquery = ['DROP TABLE ' tablename];
execute(conn,sqlquery)

Close the database connection.

close(conn)

Customize import options when importing a large data set from a database table. Control the import options by creating an SQLImportOptions object. Then, customize the import options for database columns that contain logical data. Create a DatabaseDatastore object using the specified database catalog and schema. Import the database data and preview it by using the preview function with the DatabaseDatastore object.

This example uses the airlinesmall_subset.xls spreadsheet, which contains the column Cancelled. Also, the example uses a Microsoft® SQL Server® Version 11.00.2100 database and the Microsoft SQL Server Driver 11.00.5058.

Create a database connection to a Microsoft SQL Server database with Windows® authentication. Specify a blank user name and password.

datasource = 'MS SQL Server Auth';
conn = database(datasource,'','');

Load flight information into the MATLAB® workspace.

flights = readtable('airlinesmall_subset.xlsx');

Create the flights database table using the flight information and the toy_store database catalog and dbo database schema.

tablename = 'flights';
sqlwrite(conn,tablename,flights, ...
    'Catalog','toy_store','Schema','dbo')

Create an SQLImportOptions object using the flights database table and the databaseImportOptions function. Specify the toy_store database catalog and dbo database schema.

opts = databaseImportOptions(conn,tablename, ...
    'Catalog','toy_store','Schema','dbo');

Retrieve the default import options for the Cancelled variable.

varnames = 'Cancelled';
varOpts = getoptions(opts,varnames)
varOpts = 
  SQLVariableImportOptions with properties:

  Variable Properties :
               Name: 'Cancelled'
               Type: 'double'
          FillValue: NaN

Set the import options for the data type of the specified variable to logical. Also, set the import options to replace missing data in the specified variable with the fill value true.

opts = setoptions(opts,varnames,'Type','logical', ...
    'FillValue',true);

Create the DatabaseDatastore object to import a large data set using import options, the toy_store database catalog, and the dbo database schema.

dbds = databaseDatastore(conn,tablename,opts, ...
    'Catalog','toy_store','Schema','dbo');

Import the logical data in the selected variable and display a preview of the data. The imported data shows that the variable has the logical data type.

opts.SelectedVariableNames = varnames;
data = preview(dbds);
cancelled = data.Cancelled
cancelled = 8×1 logical array

   0
   0
   0
   0
   0
   0
   0
   0

Delete the flights database table from the toy_store database catalog and the dbo database schema by using the execute function.

sqlquery = ['DROP TABLE toy_store.dbo.' tablename];
execute(conn,sqlquery)

Close the database connection.

close(conn)

Introduced in R2014b