partitionRead

Import data from partitions of Apache Cassandra database table

Description

results = partitionRead(conn,keyspace,tablename) returns imported data by reading all Cassandra® database columns from all partitions of a Cassandra database table. The partitionRead function imports data from a Cassandra database into MATLAB® without using a Cassandra Query Language (CQL) query.

results = partitionRead(conn,keyspace,tablename,keyValue1...keyValueN) returns imported data by reading all Cassandra columns from one or more partitions specified by the partition key values.

results = partitionRead(___,'ConsistencyLevel',level) sets a consistency level to specify how many nodes must respond when the function reads data from the Cassandra database table, using any of the previous input argument combinations.

Examples

Using a Cassandra® database connection, import data from a Cassandra database table into MATLAB®. The Cassandra database contains a database table with employee data.

Create a Cassandra database connection using the local host address. conn is a cassandra object.

contactPoints = "localhost";
conn = cassandra(contactPoints);

Import employee data into MATLAB from the employeedata keyspace and the employees_by_job database table by using the Cassandra database connection.

keyspace = "employeedata";
tablename = "employees_by_job";
results = partitionRead(conn,keyspace,tablename);

Display the first few rows of the returned employee data.

head(results)
ans=8×13 table
      job_id       hire_date     employee_id    commission_pct    department_id      email       first_name      last_name      manager_id         office         performance_ratings     phone_number     salary
                                                                                                                                              building    room                                                   
    __________    ___________    ___________    ______________    _____________    __________    __________    _____________    __________    ________________    ___________________    ______________    ______

    "ST_CLERK"    08-Mar-2008        128             NaN               50          "SMARKLE"     "Steven"      "Markle"            120        "North"     171         [3×1 int32]        "650.124.1434"     2200 
    "ST_CLERK"    06-Feb-2008        136             NaN               50          "HPHILTAN"    "Hazel"       "Philtanker"        122        "North"     303         [        2]        "650.127.1634"     2200 
    "ST_CLERK"    12-Dec-2007        135             NaN               50          "KGEE"        "Ki"          "Gee"               122        "West"      287         [2×1 int32]        "650.127.1734"     2400 
    "ST_CLERK"    10-Apr-2007        132             NaN               50          "TJOLSON"     "TJ"          "Olson"             121        "North"     256         [        7]        "650.124.8234"     2100 
    "ST_CLERK"    14-Jan-2007        127             NaN               50          "JLANDRY"     "James"       "Landry"            120        "West"      273         [2×1 int32]        "650.124.1334"     2400 
    "ST_CLERK"    28-Sep-2006        126             NaN               50          "IMIKKILI"    "Irene"       "Mikkilineni"       120        "East"      246         [4×1 int32]        "650.124.1224"     2700 
    "ST_CLERK"    26-Aug-2006        134             NaN               50          "MROGERS"     "Michael"     "Rogers"            122        "East"      246         [3×1 int32]        "650.127.1834"     2900 
    "ST_CLERK"    09-Jul-2006        144             NaN               50          "PVARGAS"     "Peter"       "Vargas"            124        "North"     129         [3×1 int32]        "650.121.2004"     2500 

results is a table that contains these variables:

  • job_id — Job identifier

  • hire_date — Hire date

  • employee_id — Employee identifier

  • commission_pct — Commission percentage

  • department_id — Department identifier

  • email — Email address

  • first_name — First name

  • last_name — Last name

  • manager_id — Manager identifier

  • office — Office location (table that contains two variables for the building and room)

  • performance_ratings — Performance ratings

  • phone_number — Phone number

  • salary — Salary

Close the Cassandra database connection.

close(conn)

Using a Cassandra® database connection, import data from a Cassandra database table into MATLAB®. Use the value of the partition key in the database table to import data. The Cassandra database contains a database table with employee data.

Create a Cassandra database connection using the local host address. conn is a cassandra object.

contactPoints = "localhost";
conn = cassandra(contactPoints);

Import employee data into MATLAB from the employeedata keyspace and the employees_by_job database table by using the Cassandra database connection. This database table has the job_id partition key. Specify the IT_PROG value of the partition key to import all data for only those employees who are programmers.

keyspace = "employeedata";
tablename = "employees_by_job";
keyValue = "IT_PROG";
results = partitionRead(conn,keyspace,tablename,keyValue);

Display the returned employee data.

results
results=5×13 table
     job_id       hire_date     employee_id    commission_pct    department_id      email       first_name      last_name     manager_id         office         performance_ratings     phone_number     salary
                                                                                                                                            building    room                                                   
    _________    ___________    ___________    ______________    _____________    __________    ___________    ___________    __________    ________________    ___________________    ______________    ______

    "IT_PROG"    21-May-2007        104             NaN               60          "BERNST"      "Bruce"        "Ernst"           103        "North"     371         [        8]        "590.423.4568"     6000 
    "IT_PROG"    07-Feb-2007        107             NaN               60          "DLORENTZ"    "Diana"        "Lorentz"         103        "West"      133         [3×1 int32]        "590.423.5567"     4200 
    "IT_PROG"    05-Feb-2006        106             NaN               60          "VPATABAL"    "Valli"        "Pataballa"       103        "East"      231         [5×1 int32]        "590.423.4560"     4800 
    "IT_PROG"    03-Jan-2006        103             NaN               60          "AHUNOLD"     "Alexander"    "Hunold"          102        "West"      155         [2×1 int32]        "590.423.4567"     9000 
    "IT_PROG"    25-Jun-2005        105             NaN               60          "DAUSTIN"     "David"        "Austin"          103        "South"     393         [2×1 int32]        "590.423.4569"     4800 

results is a table that contains these variables:

  • job_id — Job identifier

  • hire_date — Hire date

  • employee_id — Employee identifier

  • commission_pct — Commission percentage

  • department_id — Department identifier

  • email — Email address

  • first_name — First name

  • last_name — Last name

  • manager_id — Manager identifier

  • office — Office location (table that contains two variables for the building and room)

  • performance_ratings — Performance ratings

  • phone_number — Phone number

  • salary — Salary

Close the Cassandra database connection.

close(conn)

Using a Cassandra® database connection, import data from a Cassandra database table into MATLAB®. Use the values of two partition keys in the database table to import data. The Cassandra database contains a database table with employee data.

Create a Cassandra database connection using the local host address. conn is a cassandra object.

contactPoints = "localhost";
conn = cassandra(contactPoints);

Import employee data into MATLAB from the employeedata keyspace and the employees_by_name database table by using the Cassandra database connection. This database table has the first_name and last_name partition keys. Specify the first and last names of two employees as values of the partition keys to import data for those two employees.

keyspace = "employeedata";
tablename = "employees_by_name";
keyValue1 = ["Christopher","Alexander"];
keyValue2 = ["Olsen","Hunold"];
results = partitionRead(conn,keyspace,tablename,keyValue1,keyValue2);

Display the returned employee data for the two employees.

results
results=2×13 table
     first_name      last_name     hire_date     employee_id    commission_pct    department_id      email       job_id      manager_id         office         performance_ratings        phone_number        salary
                                                                                                                                           building    room                                                         
    _____________    _________    ___________    ___________    ______________    _____________    _________    _________    __________    ________________    ___________________    ____________________    ______

    "Alexander"      "Hunold"     03-Jan-2006        103             NaN               60          "AHUNOLD"    "IT_PROG"       102        "West"      155         [2×1 int32]        "590.423.4567"           9000 
    "Christopher"    "Olsen"      30-Mar-2006        153             0.2               80          "COLSEN"     "SA_REP"        145        "South"     333         [4×1 int32]        "011.44.1344.498718"     8000 

results is a table that contains these variables:

  • first_name — First name

  • last_name — Last name

  • hire_date — Hire date

  • employee_id — Employee identifier

  • commission_pct — Commission percentage

  • department_id — Department identifier

  • email — Email address

  • job_id — Job identifier

  • manager_id — Manager identifier

  • office — Office location (table that contains two variables for the building and room)

  • performance_ratings — Performance ratings

  • phone_number — Phone number

  • salary — Salary

Close the Cassandra database connection.

close(conn)

Using a Cassandra® database connection, import data from a Cassandra database table into MATLAB®. Use the value of the partition key in the database table to import data. Specify a consistency level for returning results. The Cassandra database contains a database table with employee data.

Create a Cassandra database connection using the local host address. conn is a cassandra object.

contactPoints = "localhost";
conn = cassandra(contactPoints);

Import employee data into MATLAB from the employeedata keyspace and the employees_by_job database table by using the Cassandra database connection. This database table has the job_id partition key. Specify the IT_PROG value of the partition key to import all data for only those employees who are programmers. Also, specify the consistency level as a quorum.

keyspace = "employeedata";
tablename = "employees_by_job";
keyValue = "IT_PROG";
level = "QUORUM";
results = partitionRead(conn,keyspace,tablename,keyValue, ...
    'ConsistencyLevel',level);

Because the consistency level is QUORUM, the function returns results once most of the replica nodes have responded.

Display the returned employee data.

results
results=5×13 table
     job_id       hire_date     employee_id    commission_pct    department_id      email       first_name      last_name     manager_id         office         performance_ratings     phone_number     salary
                                                                                                                                            building    room                                                   
    _________    ___________    ___________    ______________    _____________    __________    ___________    ___________    __________    ________________    ___________________    ______________    ______

    "IT_PROG"    21-May-2007        104             NaN               60          "BERNST"      "Bruce"        "Ernst"           103        "North"     371         [        8]        "590.423.4568"     6000 
    "IT_PROG"    07-Feb-2007        107             NaN               60          "DLORENTZ"    "Diana"        "Lorentz"         103        "West"      133         [3×1 int32]        "590.423.5567"     4200 
    "IT_PROG"    05-Feb-2006        106             NaN               60          "VPATABAL"    "Valli"        "Pataballa"       103        "East"      231         [5×1 int32]        "590.423.4560"     4800 
    "IT_PROG"    03-Jan-2006        103             NaN               60          "AHUNOLD"     "Alexander"    "Hunold"          102        "West"      155         [2×1 int32]        "590.423.4567"     9000 
    "IT_PROG"    25-Jun-2005        105             NaN               60          "DAUSTIN"     "David"        "Austin"          103        "South"     393         [2×1 int32]        "590.423.4569"     4800 

results is a table that contains these variables:

  • job_id — Job identifier

  • hire_date — Hire date

  • employee_id — Employee identifier

  • commission_pct — Commission percentage

  • department_id — Department identifier

  • email — Email address

  • first_name — First name

  • last_name — Last name

  • manager_id — Manager identifier

  • office — Office location (table that contains two variables for the building and room)

  • performance_ratings — Performance ratings

  • phone_number — Phone number

  • salary — Salary

Close the Cassandra database connection.

close(conn)

Input Arguments

conn

Cassandra database connection, specified as a cassandra object.

keyspace

Keyspace, specified as a character vector or string scalar. If you do not know the keyspace, then access the Keyspaces property of the cassandra object using dot notation to view the keyspaces in the Cassandra database.

Example: "employeedata"

Data Types: char | string

tablename

Cassandra database table name, specified as a character vector or string scalar. If you do not know the name of the table, then use the tablenames function to find it.

Example: "employees_by_job"

Data Types: char | string
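
The two discovery steps described above can be sketched as follows. This is a minimal sketch that assumes a Cassandra instance reachable at localhost and the employeedata keyspace used in the examples on this page. The two-argument form of tablenames shown here is an assumption; check the tablenames reference page for the exact syntax.

```matlab
% Assumes a Cassandra instance is reachable at localhost.
conn = cassandra("localhost");

% View the keyspaces in the database through the Keyspaces property.
keyspaces = conn.Keyspaces

% List the tables in one keyspace.
% Assumed syntax: tablenames(conn,keyspace).
t = tablenames(conn,"employeedata")

close(conn)
```

Use the displayed names as the keyspace and tablename inputs to partitionRead.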

keyValue1...keyValueN

Partition key values, specified as one of these data types:

  • numeric scalar

  • numeric array

  • character vector

  • cell array of character vectors

  • string scalar

  • string array

  • logical

  • logical array

  • datetime array

  • duration array

If you do not specify the keyValue1...keyValueN input argument, then the partitionRead function imports data from all partitions of the Cassandra database table (same as the CQL query SELECT * FROM tablename).

Specify one key value for each partition key of the Cassandra database table. The maximum number of key values you can specify is the number of primary key columns, which comprise the partition keys and clustering columns of the Cassandra database table.

If you specify a scalar value, then the CQL query equivalent is an = condition in the CQL WHERE clause. If you specify an array of values, then the CQL query equivalent is an IN condition in the CQL WHERE clause.

If all partition key values are scalar values, then the partitionRead function imports data from one partition. If some partition key values are arrays, then the partitionRead function imports data by searching multiple partitions that correspond to all possible key combinations.
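
To illustrate the difference between scalar and array key values, here is a minimal sketch that reuses the employees_by_name table from the examples on this page, which has the first_name and last_name partition keys. The CQL statements in the comments are approximate equivalents written for illustration, not the exact queries that the function issues.

```matlab
% Assumes a Cassandra instance is reachable at localhost.
conn = cassandra("localhost");
keyspace = "employeedata";
tablename = "employees_by_name";

% Scalar key values: read exactly one partition.
% Approximate CQL equivalent:
%   SELECT * FROM employeedata.employees_by_name
%   WHERE first_name = 'Alexander' AND last_name = 'Hunold'
onePartition = partitionRead(conn,keyspace,tablename, ...
    "Alexander","Hunold");

% Array key values: search every combination of the listed values
% (2 first names x 2 last names = 4 candidate partitions).
% Approximate CQL equivalent:
%   SELECT * FROM employeedata.employees_by_name
%   WHERE first_name IN ('Christopher','Alexander')
%     AND last_name IN ('Olsen','Hunold')
multiplePartitions = partitionRead(conn,keyspace,tablename, ...
    ["Christopher","Alexander"],["Olsen","Hunold"]);

close(conn)
```

Only the combinations that exist as partitions in the table contribute rows to the result, so the array form can return fewer rows than the number of combinations searched.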

The following table describes supported Cassandra partition keys.

Supported Cassandra Partition Key    MATLAB Valid Data Types for One Partition                         MATLAB Valid Data Types for Multiple Partitions

ascii                                character vector or string scalar                                 cell array of character vectors or string array
bigint                               numeric scalar or logical scalar                                  numeric array or logical array
blob                                 numeric array                                                     cell array of numeric arrays
boolean                              numeric scalar or logical scalar                                  numeric array or logical array
date                                 datetime array, string scalar, or character vector                datetime array, string array, or cell array of character vectors
decimal                              numeric scalar, logical scalar, or java.math.BigDecimal scalar    numeric array, logical array, or java.math.BigDecimal array
double                               numeric scalar or logical scalar                                  numeric array or logical array
float                                numeric scalar or logical scalar                                  numeric array or logical array
inet                                 character vector or string scalar                                 cell array of character vectors or string array
int                                  numeric scalar or logical scalar                                  numeric array or logical array
smallint                             numeric scalar or logical scalar                                  numeric array or logical array
text                                 character vector or string scalar                                 cell array of character vectors or string array
time                                 duration array, string scalar, or character vector                duration array, string array, or cell array of character vectors
timestamp                            datetime array, string scalar, or character vector                datetime array, string array, or cell array of character vectors
timeuuid                             character vector or string scalar                                 cell array of character vectors or string array
tinyint                              numeric scalar or logical scalar                                  numeric array or logical array
uuid                                 character vector or string scalar                                 cell array of character vectors or string array
varchar                              character vector or string scalar                                 cell array of character vectors or string array
varint                               numeric scalar, logical scalar, or java.math.BigInteger           numeric array, logical array, or java.math.BigInteger array

These Cassandra partition keys are not supported:

  • counter

  • list

  • map

  • set

  • tuple

  • user-defined types (UDTs)

Example: ["MA","CT"]

Example: 1,2,'DataProvider1','AmbientTemp'

Data Types: double | logical | char | string | struct | cell | datetime | duration

level

Consistency level, specified as one of these values.

Consistency Level Value    Consistency Level Description

"ALL"                      Return query results when all replica nodes respond.
"QUORUM"                   Return query results when most replica nodes respond.
"LOCAL_QUORUM"             Return query results when most replica nodes in the local data center respond.
"ONE" (default)            Return query results when one replica node responds.
"TWO"                      Return query results when two replica nodes respond.
"THREE"                    Return query results when three replica nodes respond.
"LOCAL_ONE"                Return query results when one replica node in the local data center responds.
"SERIAL"                   Return query results for current (and possibly uncommitted) data for replica nodes in any data center.
"LOCAL_SERIAL"             Return query results for current (and possibly uncommitted) data for replica nodes in the local data center.

You can specify the value of the consistency level as a character vector or string scalar.

For details about consistency levels, see Configuring data consistency.

Data Types: char | string
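
To make "most replica nodes" concrete, the following sketch computes how many replicas must respond for a quorum. It assumes the standard Cassandra rule that a quorum is a majority of the replicas, floor(RF/2) + 1, for a single data center with replication factor RF. The replication factors listed are hypothetical values chosen for illustration.

```matlab
% Hypothetical replication factors for a single data center.
replicationFactor = 1:5;

% A quorum is a majority of the replicas: floor(RF/2) + 1.
quorumSize = floor(replicationFactor/2) + 1;

% Show the two side by side.
disp(table(replicationFactor',quorumSize', ...
    'VariableNames',{'ReplicationFactor','QuorumSize'}))
```

With a replication factor of 3, for example, a "QUORUM" read waits for 2 replica nodes, whereas "ALL" waits for all 3 and "ONE" waits for 1.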

Output Arguments

results

Imported data results, returned as a table. The table contains imported data from the partitions that correspond to the keyValue1...keyValueN input argument. Each Cassandra database column from the partitions becomes a variable in the table. The variable names match the names of the Cassandra database columns in the specified partitions.

The data types of the variables in the table depend on the Cassandra data types. For details about how CQL data types convert to MATLAB data types, see Convert CQL Data Types to MATLAB Data Types.
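
Because the output is an ordinary MATLAB table, standard table indexing applies to it. The following minimal sketch reads the IT_PROG partition from the examples on this page and accesses a few variables. It assumes that performance_ratings is returned as a cell array holding one numeric vector per row, as the displayed output in the examples suggests. The salary threshold is an arbitrary value chosen for illustration.

```matlab
% Assumes a Cassandra instance is reachable at localhost.
conn = cassandra("localhost");
results = partitionRead(conn,"employeedata","employees_by_job","IT_PROG");

% office is a nested table; reach its variables with dot notation.
buildings = results.office.building;

% Assumption: performance_ratings is a cell array with one numeric
% vector per row, so use brace indexing to extract the first entry.
firstRatings = results.performance_ratings{1};

% Filter rows with logical indexing (threshold chosen for illustration).
higherPaid = results(results.salary > 4500,:);

close(conn)
```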

Introduced in R2018b