Write datastore to files
writeall(ds,outputLocation) writes the data from the input datastore ds to output files at the location specified in outputLocation. The number of output files is the same as the number of files referenced by the datastore.
writeall(ds,outputLocation,Name,Value) writes data with additional options specified by one or more name-value pair arguments. For example, you can specify 'OutputFormat' and a file extension such as 'csv' to specify the type of files that writeall creates.
Write a datastore to disk, and then create a new datastore for the written files. This process is useful for saving your work or sharing a datastore with a colleague.
Create a datastore for the airlinesmall.csv data set, treating 'NA' values as missing data. Select only the Month, Year, and UniqueCarrier variables.
ds = datastore('airlinesmall.csv');
ds.TreatAsMissing = 'NA';
ds.SelectedVariableNames = {'Month','Year','UniqueCarrier'};
Preview the datastore.
preview(ds)
ans=8×3 table
Month Year UniqueCarrier
_____ ____ _____________
10 1987 {'PS'}
10 1987 {'PS'}
10 1987 {'PS'}
10 1987 {'PS'}
10 1987 {'PS'}
10 1987 {'PS'}
10 1987 {'PS'}
10 1987 {'PS'}
Save this datastore to a new folder named ExampleData on the C:\ disk. (You can specify a different write location, especially if you are not using a Windows® computer.) Specify 'FolderLayout' as 'flatten' to put the datastore files directly in the target folder, without creating subfolders.
location = 'C:\ExampleData';
writeall(ds,location,'FolderLayout','flatten')
Clear ds from your workspace. To recover the datastore from disk, create a new datastore that references the same folder where the data was written.
clear ds
ds2 = datastore(location);
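The round trip above can also be sketched without a Windows-specific path. This variant assumes that airlinesmall.csv is on the MATLAB path and uses tempname (an assumption of this sketch, not part of the original example) to pick a scratch folder:

```matlab
% Portable sketch of the write-and-recover workflow.
% Assumption: airlinesmall.csv is available on the MATLAB path.
ds = datastore('airlinesmall.csv');
ds.TreatAsMissing = 'NA';
ds.SelectedVariableNames = {'Month','Year','UniqueCarrier'};

% tempname returns a unique path under the system temporary folder.
location = tempname;
writeall(ds,location,'FolderLayout','flatten')

% Recover the data by pointing a new datastore at the output folder.
clear ds
ds2 = datastore(location);
preview(ds2)
```

Only the selected variables are written, so the recovered datastore should expose just those three columns.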
When writing a datastore to files, you can choose to duplicate or flatten the folder structure by setting the 'FolderLayout' parameter.
Create an image datastore from an input location. First, write the datastore to an output location by duplicating the folder layout. Then, write the files as a flat list to the output location.
Create an image datastore from the location 'C:\input\', which contains some images of cars and trains:
imds = imageDatastore('C:\input\','IncludeSubfolders',true);
This folder structure is just an example. You can try the functionality with a folder structure on your machine that contains image files.
Examine the Folders property of the input datastore, which contains the fully qualified path to the input dataset.
imds.Folders
ans = 1×1 cell array
    {'C:\input\'}
Write the datastore to the output location 'C:\output\'. The default value of the 'FolderLayout' parameter is 'duplicate'. Therefore, the writeall function duplicates all the folders contained in the Folders property of the input datastore and then writes the files to the corresponding output folders.
writeall(imds,'C:\output\');
Now, write the files from the datastore to the output location 'C:\output\' as a flat list by setting 'FolderLayout' to 'flatten'. In this instance, the writeall function does not preserve the input folder layout and instead writes all the files to a flat output location.
writeall(imds,'C:\output\','FolderLayout','flatten');
For more information, see FolderLayout.
You can use the FilenamePrefix and FilenameSuffix parameters to add text to the beginning or end of the output file names. Create an image datastore from an input location. Write the datastore out to files and add text to the beginning of the file names. Then write the datastore again and add text to the end of the file names.
Create an image datastore from the location 'C:\input\', which contains some images of cars and trains:
imds = imageDatastore('C:\input\','IncludeSubfolders',true);
This folder structure is just an example. You can try the functionality with a folder structure on your machine that contains image files.
Now, write the files to an output location. Specify the value of FilenamePrefix as 'png_' to add that text to the beginning of each output file name.
writeall(imds,'C:\output\','FilenamePrefix','png_');
Write the files to a different output location. Specify the value of FilenameSuffix as '_image' to append that text to the end of each output file name.
writeall(imds,'C:\output2\','FilenameSuffix','_image');
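ds — Input datastore
Input datastore. You can use these datastores as input to writeall:
TabularTextDatastore
SpreadsheetDatastore
ParquetDatastore
ImageDatastore
AudioDatastore (Audio Toolbox)
You also can use these datastores with writeall if you supply either the 'OutputFormat' or 'WriteFcn' name-value pair:
FileDatastore
TransformedDatastore
CombinedDatastore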
outputLocation — Folder location to write data
Folder location to write data, specified as a character vector or string scalar. outputLocation can specify a full or relative path.
Example: outputLocation = '../../dir/data'
Example: outputLocation = "C:\Users\MyName\Desktop"
Data Types: char | string
Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.
Example: writeall(ds,outputLocation,'FolderLayout','flatten')
'OutputFormat' — Output file format
Output file format, specified as the comma-separated pair consisting of 'OutputFormat' and a character vector or string scalar.
Based on the type of the input datastore ds, select one of these values for the output files. You can select any tabular output format for any of the tabular datastores listed below.
Input Datastore | Output Format |
---|---|
TabularTextDatastore, SpreadsheetDatastore, ParquetDatastore | 'txt', 'csv', 'xlsx', 'xls', 'parquet', or 'parq' |
ImageDatastore | 'png', 'jpg', 'jpeg', 'tif', or 'tiff' |
AudioDatastore | 'wav', 'ogg', 'flac', 'mp4', or 'm4a' |
FileDatastore, TransformedDatastore, CombinedDatastore | All output formats are supported. |
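For some output formats, there are additional name-value pairs that you can use to refine the writing operation. Each of these pairs is described later on this page.
Output Format | Name-Value Pairs |
---|---|
'jpeg' or 'jpg' | 'Quality' |
'wav' or 'flac' | 'BitsPerSample' |
'mp4' or 'm4a' | 'BitRate' |
'txt' or 'csv' | 'WriteVariableNames', 'Delimiter', 'Encoding', 'QuoteStrings' |
'xls' or 'xlsx' | 'Sheet' |
Data Types: char | string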
'FolderLayout' — Layout of files in output folder
'duplicate' (default) | 'flatten'
Layout of files in output folder, specified as the comma-separated pair consisting of 'FolderLayout' and either 'duplicate' or 'flatten'.
'duplicate' — The writeall function replicates the folder structure contained within the input data location to the specified output location. The input data location is given by the Folders property of the input datastore.
'flatten' — The writeall function writes all the files from the input to the specified output folder without any subfolders.
Data Types: char | string
'FilenamePrefix' — Prefix to file name
Prefix to file name, specified as the comma-separated pair consisting of 'FilenamePrefix' and a character vector or string scalar.
The writeall function adds the specified prefix to the output file names. For example, this code adds today's date to the beginning of all output file names from the datastore:
prefixText = string(datetime('today'));
writeall(imds,'C:\myFolder','FilenamePrefix',prefixText);
Data Types: char | string
'FilenameSuffix' — Suffix to file name
Suffix to file name, specified as the comma-separated pair consisting of 'FilenameSuffix' and a character vector or string scalar.
The writeall function adds the specified suffix to the output file names. For example, this code adds the descriptive text 'jpeg_70per' to the end of all output file names from the datastore:
writeall(imds,'C:\myFolder','FilenameSuffix','jpeg_70per');
Data Types: char | string
'UseParallel' — Indicator to write in parallel
false (default) | true | 0 | 1
Indicator to write in parallel, specified as the comma-separated pair consisting of 'UseParallel' and either false or true (0 or 1).
By default, writeall writes in serial. If you set UseParallel to true, then writeall divides the writing operations into separate groups and runs the groups in parallel if:
Parallel Computing Toolbox™ is installed.
An open parallel pool exists or automatic pool creation is enabled in the Parallel Preferences.
Otherwise, writeall writes in serial regardless of the value for UseParallel.
Note
Parallel writing is not supported for CombinedDatastore objects or datastores resulting from transform applied to a CombinedDatastore.
Data Types: logical
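As a rough illustration of the fallback behavior described above, the following sketch requests parallel writing for a small image datastore. The generated noise images and the temporary folders are assumptions made only to keep the sketch self-contained:

```matlab
% Build a small input folder of synthetic images (illustrative data only).
inDir = tempname;
mkdir(inDir);
for k = 1:4
    img = uint8(randi([0 255],32,32,3));
    imwrite(img,fullfile(inDir,sprintf('img%d.png',k)));
end
imds = imageDatastore(inDir);

% Request parallel writing. If Parallel Computing Toolbox is not installed,
% or no pool is open and automatic pool creation is disabled, writeall
% writes in serial instead.
outDir = tempname;
writeall(imds,outDir,'UseParallel',true,'FolderLayout','flatten');

files = dir(fullfile(outDir,'*.png'));
```

If automatic pool creation is enabled, the first call may spend noticeable time starting the pool, so the parallel path tends to pay off only for larger datastores.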
'WriteFcn' — Custom writing function
Custom writing function, specified as the comma-separated pair consisting of 'WriteFcn' and a function handle. The specified function is responsible for creating the output files.
You can use the 'WriteFcn' name-value pair argument to write data in a variety of formats, even if writeall does not directly support the output format.
The custom writing function must accept at least three input arguments, data, writeInfo, and outputType:
function myWriteFcn(data,writeInfo,outputType)
data contains the output of the read method operating on the datastore.
writeInfo is an object of type matlab.io.datastore.WriteInfo with fields listed in the table.
Field | Description | Type |
---|---|---|
ReadInfo | The second output of the read method | struct |
SuggestedOutputName | A fully qualified, globally unique file name that meets the location and naming requirements | string |
Location | The location argument passed to datastore write | string |
outputType is the output format to be written to, for example "txt" or "csv".
A writing function that writes parquet files from a datastore is:
function myParquetWriteFcn(data, writeInfo, outputType)
    % Write the data only when the requested output format is parquet.
    if strcmp(outputType,"parquet")
        parquetwrite(writeInfo.SuggestedOutputName,data)
    end
end
To use myParquetWriteFcn as the writing function for a datastore ds, use these commands:
ds = parquetDatastore(location);
outputLocation = 'C:/tmp/MyData';
writeall(ds,outputLocation,'WriteFcn',@myParquetWriteFcn);
Data Types: function_handle
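To illustrate writing a format that is not in the 'OutputFormat' list, here is a minimal sketch that saves each image of an image datastore as a BMP file. The function name writeAsBmp, the synthetic input images, and the temporary folders are all hypothetical choices for this sketch. The mkdir guard is defensive, in case the output folder has not been created before the function runs:

```matlab
% Build a small input folder of synthetic images (illustrative data only).
inDir = tempname;
mkdir(inDir);
for k = 1:2
    img = uint8(randi([0 255],32,32,3));
    imwrite(img,fullfile(inDir,sprintf('img%d.png',k)));
end
imds = imageDatastore(inDir);

% Hand the writing step to the custom function defined below.
outDir = tempname;
writeall(imds,outDir,'WriteFcn',@writeAsBmp,'FolderLayout','flatten');

bmpFiles = dir(fullfile(outDir,'*.bmp'));

function writeAsBmp(data,writeInfo,~)
    % Reuse the suggested folder and base name, but swap in a .bmp extension.
    [folder,name] = fileparts(char(writeInfo.SuggestedOutputName));
    if ~isfolder(folder)
        mkdir(folder);
    end
    imwrite(data,fullfile(folder,[name '.bmp']));
end
```

The third input (outputType) is ignored here because the function always writes BMP files. A function meant to serve several formats would branch on it, as myParquetWriteFcn does.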
jpeg or jpg Output
'Quality' — Quality of JPEG-compressed file
75 (default) | scalar in the range [0,100]
Quality of the JPEG-compressed file, specified as the comma-separated pair consisting of 'Quality' and a scalar in the range [0,100], where 0 is lower quality and higher compression, and 100 is higher quality and lower compression.
Example: 'Quality',25
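A short sketch of combining 'OutputFormat' with 'Quality' follows. The synthetic input images and the temporary folders are assumptions made only so the sketch runs on its own:

```matlab
% Build a small input folder of synthetic PNG images (illustrative data only).
inDir = tempname;
mkdir(inDir);
for k = 1:3
    img = uint8(randi([0 255],64,64,3));
    imwrite(img,fullfile(inDir,sprintf('img%d.png',k)));
end
imds = imageDatastore(inDir);

% Convert the images to JPEG with stronger compression than the default of 75.
outDir = tempname;
writeall(imds,outDir,'OutputFormat','jpeg','Quality',25,'FolderLayout','flatten');

% List what was written; the pattern matches both .jpg and .jpeg extensions.
jpegFiles = dir(fullfile(outDir,'*.jp*g'));
```

Comparing the bytes field of jpegFiles across runs with different 'Quality' values is one way to see the compression trade-off on your own images.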
wav or flac Output
'BitsPerSample' — Number of output bits per sample
16 (default) | 8 | 24 | 32 | 64
Number of output bits per sample, specified as the comma-separated pair consisting of 'BitsPerSample' and 8, 16, 24, 32, or 64.
For FLAC files, only 8, 16, or 24 bits per sample are supported.
Example: 'BitsPerSample',32
mp4 or m4a Output
'BitRate' — Kilobits per second (kbit/s)
Number of kilobits per second (kbit/s) used for compressed audio files, specified as the comma-separated pair consisting of 'BitRate' and 64, 96, 128, 160, 192, 256, or 320. On Windows® 7 or later, the only valid values are 96, 128, 160, and 192.
In general, a larger BitRate value results in higher compression quality.
Example: 'BitRate',96
txt or csv Output
'WriteVariableNames' — Indicator for writing variable names as column headings
true (default) | false
Indicator for writing variable names as column headings, specified as the comma-separated pair consisting of 'WriteVariableNames' and either true or false.
Indicator | Behavior |
---|---|
true | The writing function includes variable names as the column headings of the output. This is the default behavior. |
false | The writing function does not include variable names in the output. |
'Delimiter' — Field delimiter character
Field delimiter character, specified as the comma-separated pair consisting of 'Delimiter' and a character vector or string scalar containing one of these specifiers:
Specifier | Field Delimiter |
---|---|
',' or 'comma' | Comma. This is the default behavior. |
' ' or 'space' | Space |
'\t' or 'tab' | Tab |
';' or 'semi' | Semicolon |
'|' or 'bar' | Vertical bar |
You can use the 'Delimiter' name-value pair only for delimited text files.
Example: 'Delimiter','space'
Data Types: char | string
'Encoding' — Character encoding scheme
'system' (default) | 'UTF-8' | 'ISO-8859-1' | 'windows-1251' | 'windows-1252' | ...
Character encoding scheme associated with the file, specified as the comma-separated pair consisting of 'Encoding' and 'system' or a standard character encoding scheme name, such as 'UTF-8', 'ISO-8859-1', 'windows-1251', or 'windows-1252'. When you do not specify any encoding or specify encoding as 'system', the writing function uses your system default encoding to write the file.
Example: 'Encoding','UTF-8'
Data Types: char | string
'QuoteStrings' — Indicator for writing quoted text
false (default) | true
Indicator for writing quoted text, specified as the comma-separated pair consisting of 'QuoteStrings' and either false or true. If 'QuoteStrings' is true, then the writing function encloses the text in double quotation marks and replaces any double-quote characters that appear as part of that text with two double-quote characters.
You can use the 'QuoteStrings' name-value pair only with delimited text files.
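The text-file options above can be combined in a single call. The following sketch assumes a tiny CSV file created on the fly; the table contents, file name, and temporary folders are illustrative only:

```matlab
% Create a small CSV file to act as the input data (illustrative data only).
inDir = tempname;
mkdir(inDir);
T = table([1;2;3],{'a';'b';'c'},'VariableNames',{'Id','Label'});
writetable(T,fullfile(inDir,'sample.csv'));
ds = tabularTextDatastore(inDir);

% Write tab-delimited text files with quoted text fields.
outDir = tempname;
writeall(ds,outDir,'OutputFormat','txt', ...
    'Delimiter','tab','QuoteStrings',true,'FolderLayout','flatten');

% Read the result back as raw text to inspect the delimiter and quoting.
txt = fileread(fullfile(outDir,'sample.txt'));
disp(txt)
```

Reading the file back with fileread rather than readtable keeps the delimiter and quotation marks visible, which is the point of the check.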
xls or xlsx Output
'Sheet' — Worksheet to write to
Worksheet to write to, specified as the comma-separated pair consisting of 'Sheet' and a character vector or a string scalar containing the worksheet name or a positive integer indicating the worksheet index. The worksheet name cannot contain a colon (:). To determine the names of sheets in a spreadsheet file, use sheets = sheetnames(filename). For more information, see sheetnames.
Specify the worksheet to write to by name or index:
name — If the specified sheet name does not exist in the file, then the writing function adds a new sheet at the end of the worksheet collection.
index — If the specified sheet index is larger than the number of worksheets, then the writing function appends empty sheets until the number of worksheets in the workbook equals the sheet index. The writing function also generates a warning indicating that it has added a new worksheet.
You can use the 'Sheet' name-value pair only with spreadsheet files.
Example: 'Sheet',2
Example: 'Sheet','MySheetName'
Data Types: char | string | single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
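As a sketch of writing to a named worksheet, the following code converts a small CSV-backed datastore to an .xlsx file and then checks the sheet names with sheetnames. The sheet name 'Results', the sample table, and the temporary folders are assumptions of this sketch:

```matlab
% Create a small CSV file to act as the input data (illustrative data only).
inDir = tempname;
mkdir(inDir);
T = table([1;2;3],{'a';'b';'c'},'VariableNames',{'Id','Label'});
writetable(T,fullfile(inDir,'sample.csv'));
ds = tabularTextDatastore(inDir);

% Write the data to a worksheet named 'Results' in an .xlsx file.
outDir = tempname;
writeall(ds,outDir,'OutputFormat','xlsx','Sheet','Results', ...
    'FolderLayout','flatten');

% Confirm which worksheets the output file contains.
outFile = fullfile(outDir,'sample.xlsx');
names = sheetnames(outFile);
disp(names)
```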
To run in parallel, set the 'UseParallel' option to true.
For more general information about parallel computing, see Run MATLAB Functions with Automatic Parallel Support (Parallel Computing Toolbox).
imageDatastore | parquetDatastore | spreadsheetDatastore | tabularTextDatastore | write | audioDatastore (Audio Toolbox)