subds = partition(ds,n,index)
partitions datastore ds into the number of parts specified by n and returns the partition corresponding to the index index.
subds = partition(ds,'Files',index)
partitions the datastore by files and returns the partition corresponding to the file at position index in the Files property.
subds = partition(ds,'Files',filename)
partitions the datastore by files and returns the partition corresponding to the file specified by filename.
Create a datastore for a large collection of files. For this example, use ten copies of the sample file airlinesmall.csv. To handle missing fields in the tabular data, specify the name-value pairs TreatAsMissing and MissingValue.
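The code for this step does not appear above. A minimal sketch of it, assuming the ten copies are obtained by repeating the file name with repmat and that the name-value pairs take the values shown later in the datastore display ('NA' and 0); the variable name files is illustrative:

```matlab
% Build a list that names the sample file ten times (assumed approach).
files = repmat({'airlinesmall.csv'},1,10);

% Create the datastore; treat 'NA' fields as missing and fill them with 0.
ds = tabularTextDatastore(files, ...
    'TreatAsMissing','NA','MissingValue',0);
```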
Partition the datastore into three parts and return the first partition. The partition function returns approximately the first third of the data from the datastore ds.
subds = partition(ds,3,1)
subds =
TabularTextDatastore with properties:
Files: {
' ...\matlab\toolbox\matlab\demos\airlinesmall.csv';
' ...\matlab\toolbox\matlab\demos\airlinesmall.csv';
' ...\matlab\toolbox\matlab\demos\airlinesmall.csv'
... and 1 more
}
FileEncoding: 'UTF-8'
AlternateFileSystemRoots: {}
ReadVariableNames: true
VariableNames: {'Year', 'Month', 'DayofMonth' ... and 26 more}
Text Format Properties:
NumHeaderLines: 0
Delimiter: ','
RowDelimiter: '\r\n'
TreatAsMissing: 'NA'
MissingValue: 0
Advanced Text Format Properties:
TextscanFormats: {'%f', '%f', '%f' ... and 26 more}
TextType: 'char'
ExponentCharacters: 'eEdD'
CommentStyle: ''
Whitespace: ' \b\t'
MultipleDelimitersAsOne: false
Properties that control the table returned by preview, read, readall:
SelectedVariableNames: {'Year', 'Month', 'DayofMonth' ... and 26 more}
SelectedFormats: {'%f', '%f', '%f' ... and 26 more}
ReadSize: 20000 rows
The Files property of the datastore contains a list of files included in the datastore. Check the number of files in the Files property of the datastore ds and the partitioned datastore subds. The datastore ds contains ten files and the partition subds contains the first four files.
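One way to perform this check is sketched below. It assumes ds and subds come from the steps above; the guard that rebuilds them is only there so the snippet runs on its own, and its constructor arguments are an assumption based on the displayed properties. The variable names numFilesAll and numFilesPart are illustrative.

```matlab
% Rebuild the example datastore and partition if they are not in the workspace.
if ~exist('ds','var') || ~exist('subds','var')
    ds = tabularTextDatastore(repmat({'airlinesmall.csv'},1,10), ...
        'TreatAsMissing','NA','MissingValue',0);
    subds = partition(ds,3,1);
end

% Count the files in the full datastore and in the first partition.
numFilesAll  = numel(ds.Files)     % the text above reports ten files
numFilesPart = numel(subds.Files)  % the text above reports four files
```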
Create a datastore from the sample file mapredout.mat, which is the output file of the mapreduce function.
ds = datastore('mapredout.mat');
Partition the datastore into three parts on three workers in a parallel pool.
numWorkers = 3;
p = parpool('local',numWorkers);
n = numpartitions(ds,p);

parfor ii = 1:n
    subds = partition(ds,n,ii);   % each iteration takes one partition
    while hasdata(subds)
        data = read(subds);       % read the partition one chunk at a time
    end
end
ds — Input datastore
Input datastore. You can use the datastore function to create a datastore object from your data.
n — Number of partitions positive integer
Number of partitions, specified as a positive integer.
Example: 3
Data Types: double
index — Index positive integer
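Index of the partition to return, specified as a positive integer. In the 'Files' syntax, index is the position of the file in the Files property.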
Example: 1
Data Types: double
filename — File name character vector | string scalar
File name, specified as a character vector or string scalar.
The value of filename must exactly match the file name contained in the Files property of the datastore. To ensure that the file names match exactly, specify filename using ds.Files{N}, where N is the index of the file in the Files property. For example, ds.Files{3} specifies the third file in the datastore ds.
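The two file-based syntaxes can be sketched as follows. This is an illustration only: it assumes a datastore built from the single sample file airlinesmall.csv, so N is 1 here rather than 3, and the variable names name, subdsByIndex, and subdsByName are hypothetical.

```matlab
% Datastore with a single file, so Files has exactly one entry.
ds = tabularTextDatastore('airlinesmall.csv');

% Take the stored file name from Files so that it matches exactly.
name = ds.Files{1};

% Partition by file position and by file name.
subdsByIndex = partition(ds,'Files',1);
subdsByName  = partition(ds,'Files',name);
```

Reading the name back from ds.Files avoids mismatches between a relative name typed by hand and the full path that the datastore stores.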