Low-level file I/O functions allow the most control over reading or writing data to a file. However, these functions require that you specify more detailed information about your file than the easier-to-use high-level functions, such as importdata. For more information on the high-level functions that read text files, see Import Text Files.
If the high-level functions cannot import your data, use one of the following:
fscanf, which reads formatted data in a text or ASCII file; that is, a file you can view in a text editor. For more information, see Reading Data in a Formatted Pattern.
fgetl and fgets, which read one line of a file at a time, where a newline character separates each line. For more information, see Reading Data Line-by-Line.
fread, which reads a stream of data at the byte or bit level. For more information, see Import Binary Data with Low-Level I/O.
Note
The low-level file I/O functions are based on functions in the ANSI® Standard C Library. However, MATLAB® includes vectorized versions of the functions, to read and write data in an array with minimal control loops.
To import text files that importdata and textscan cannot read, consider using fscanf. The fscanf function requires that you describe the format of your file, but includes many options for this format description.
For example, create a text file mymeas.dat as shown. The data in mymeas.dat includes repeated sets of times, dates, and measurements. The header text includes the number of sets of measurements, N:
    Measurement Data
    N=3

    12:00:00
    01-Jan-1977
    4.21  6.55  6.78  6.55
    9.15  0.35  7.57  NaN
    7.92  8.49  7.43  7.06
    9.59  9.33  3.92  0.31
    09:10:02
    23-Aug-1990
    2.76  6.94  4.38  1.86
    0.46  3.17  NaN   4.89
    0.97  9.50  7.65  4.45
    8.23  0.34  7.95  6.46
    15:03:40
    15-Apr-2003
    7.09  6.55  9.59  7.51
    7.54  1.62  3.40  2.55
    NaN   1.19  5.85  5.05
    6.79  4.98  2.23  6.99
As with any of the low-level I/O functions, before reading, open the file with fopen, and obtain a file identifier. By default, fopen opens files for read access, with a permission of 'r'.
When you finish processing the file, close it with fclose(fid).
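The open, check, and close sequence can be sketched as follows. This is only an illustration: the temporary file, the variable names, and the error identifier 'example:openFailed' are hypothetical and not part of the mymeas.dat example. It relies on fopen returning -1 (plus a message) when a file cannot be opened, and on fclose returning 0 on success.

```matlab
% Sketch: create a small scratch file so the example is self-contained.
filename = [tempname '.dat'];            % hypothetical temporary file
fid = fopen(filename, 'w');
fprintf(fid, 'Measurement Data\n');
fclose(fid);

% Open for reading; 'r' is the default permission, shown here explicitly.
[fid, msg] = fopen(filename, 'r');
if fid == -1
    % fopen signals failure with -1 and a system-dependent message
    error('example:openFailed', 'Cannot open %s: %s', filename, msg);
end

firstLine = fgetl(fid);                  % any low-level read goes here

status = fclose(fid);                    % 0 indicates the file closed cleanly
delete(filename);                        % remove the scratch file
```

Checking the identifier before reading avoids a less informative "invalid file identifier" error from the first read call.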
Describe the data in the file with format specifiers, such as '%s' for text, '%d' for an integer, or '%f' for a floating-point number. (For a complete list of specifiers, see the fscanf reference page.)
To skip literal characters in the file, include them in the format description. To skip a data field, use an asterisk ('*') in the specifier.
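A quick way to experiment with these specifiers is sscanf, which applies the same kind of format description to text already in memory instead of a file. The input text and variable names below are made up for illustration.

```matlab
% Literal characters in the format are matched against the input and skipped.
n = sscanf('N=3', 'N=%d');                 % reads only the integer after 'N='

% An asterisk makes the specifier consume a field without returning it.
t = sscanf('temp 21.5 C', '%*s %f %*s');   % skips 'temp' and 'C', keeps the number
```

Here n holds the integer 3 and t holds the floating-point value 21.5; the skipped fields do not appear in the output.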
For example, consider the header lines of mymeas.dat:
    Measurement Data    % skip the first 2 words, go to next line:  %*s %*s\n
    N=3                 % ignore 'N=', read integer:  N=%d\n
                        % go to next line:  \n
    12:00:00
    01-Jan-1977
    4.21  6.55  6.78  6.55
    ...
To read the headers and return the single value for N:
N = fscanf(fid, '%*s %*s\nN=%d\n\n', 1);
By default, fscanf reapplies your format description until it cannot match the description to the data, or it reaches the end of the file.
Optionally, specify the number of values to read, so that fscanf does not attempt to read the entire file. For example, in mymeas.dat, each set of measurements includes a fixed number of rows and columns:
    measrows = 4;
    meascols = 4;
    meas = fscanf(fid, '%f', [measrows, meascols])';
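The effect of the size argument and the trailing transpose is easier to see on a tiny file. The following sketch writes a hypothetical 2-by-3 table to a temporary file and reads it back; the file and variable names are assumptions for illustration only.

```matlab
% Write a scratch file with two rows of three numbers.
filename = [tempname '.txt'];              % hypothetical temporary file
fid = fopen(filename, 'w');
fprintf(fid, '1 2 3\n4 5 6\n');
fclose(fid);

fid = fopen(filename);
% Request a 3-by-2 array: fscanf fills it column by column,
% so each row of the file lands in one column of the result.
raw = fscanf(fid, '%f', [3, 2]);
fclose(fid);
delete(filename);

% Transpose to recover the row/column layout of the file.
A = raw';
```

In this sketch raw is [1 4; 2 5; 3 6], and A is [1 2 3; 4 5 6], matching the orientation of the text in the file. The size argument lists the number of columns in the file first for exactly this reason.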
There are several ways to store mymeas.dat in the MATLAB workspace. In this case, read the values into a structure. Each element of the structure has three fields: mtime, mdate, and meas.
Note
fscanf fills arrays with numeric values in column order. To make the output array match the orientation of numeric data in a file, transpose the array.
    filename = 'mymeas.dat';
    measrows = 4;
    meascols = 4;

    % open the file
    fid = fopen(filename);

    % read the file headers, find N (one value)
    N = fscanf(fid, '%*s %*s\nN=%d\n\n', 1);

    % read each set of measurements
    for n = 1:N
        mystruct(n).mtime = fscanf(fid, '%s', 1);
        mystruct(n).mdate = fscanf(fid, '%s', 1);

        % fscanf fills the array in column order,
        % so transpose the results
        mystruct(n).meas = ...
            fscanf(fid, '%f', [measrows, meascols])';
    end

    % close the file
    fclose(fid);
MATLAB provides two functions that read lines from files and store them as character vectors: fgetl and fgets. The fgets function copies the line along with the newline character to the output, but fgetl does not.
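The difference between the two functions can be checked directly by reading the same line twice. The sketch below uses a hypothetical temporary file written with plain '\n' line endings; with other line endings the number of extra characters kept by fgets may differ.

```matlab
% Write a scratch file with two newline-terminated lines.
filename = [tempname '.txt'];              % hypothetical temporary file
fid = fopen(filename, 'w');
fprintf(fid, 'first line\nsecond line\n');
fclose(fid);

fid = fopen(filename);
a = fgetl(fid);      % first line, newline removed
frewind(fid);        % go back to the start of the file
b = fgets(fid);      % same line, newline kept
fclose(fid);
delete(filename);

extraChars = numel(b) - numel(a);          % characters kept only by fgets
lastCode   = double(b(end));               % character code of the final character
```

In this sketch a is 'first line', extraChars is 1, and lastCode is 10 (the newline character). Use fgets when the output must preserve line terminators, for example when copying a file line by line.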
The following example uses fgetl to read an entire file one line at a time. The function litcount determines whether a given character sequence (literal) appears in each line. If it does, the function prints the entire line preceded by the number of times the literal appears on the line.
    function y = litcount(filename, literal)
    % Count the number of times a given literal appears in each line.

    fid = fopen(filename);
    y = 0;
    tline = fgetl(fid);
    while ischar(tline)
        matches = strfind(tline, literal);
        num = length(matches);
        if num > 0
            y = y + num;
            fprintf(1,'%d:%s\n',num,tline);
        end
        tline = fgetl(fid);
    end
    fclose(fid);
Create an input data file called badpoem:
    Oranges and lemons,
    Pineapples and tea.
    Orangutans and monkeys,
    Dragonflys or fleas.
To find out how many times 'an' appears in this file, call litcount:
litcount('badpoem','an')
This returns:
    2: Oranges and lemons,
    1: Pineapples and tea.
    3: Orangutans and monkeys,
    ans =
         6
When you read a portion of your data at a time, you can use feof to check whether you have reached the end of the file. feof returns a value of 1 when the file pointer is at the end of the file. Otherwise, it returns 0.
Note
Opening an empty file does not move the file position indicator to the end of the file. Read operations, and the fseek and frewind functions, move the file position indicator.
When you use textscan, fscanf, or fread to read portions of data at a time, use feof to check whether you have reached the end of the file.
For example, suppose that the hypothetical file mymeas.dat has the following form, with no information about the number of measurement sets. Read the data into a structure with fields for mtime, mdate, and meas:
    12:00:00
    01-Jan-1977
    4.21  6.55  6.78  6.55
    9.15  0.35  7.57  NaN
    7.92  8.49  7.43  7.06
    9.59  9.33  3.92  0.31
    09:10:02
    23-Aug-1990
    2.76  6.94  4.38  1.86
    0.46  3.17  NaN   4.89
    0.97  9.50  7.65  4.45
    8.23  0.34  7.95  6.46
To read the file:
    filename = 'mymeas.dat';
    measrows = 4;
    meascols = 4;

    % open the file
    fid = fopen(filename);

    % make sure the file is not empty
    finfo = dir(filename);
    fsize = finfo.bytes;

    if fsize > 0

        % read the file
        block = 1;
        while ~feof(fid)
            mystruct(block).mtime = fscanf(fid, '%s', 1);
            mystruct(block).mdate = fscanf(fid, '%s', 1);

            % fscanf fills the array in column order,
            % so transpose the results
            mystruct(block).meas = ...
                fscanf(fid, '%f', [measrows, meascols])';

            block = block + 1;
        end

    end

    % close the file
    fclose(fid);
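The same feof-controlled loop works with fread when a binary file is processed in fixed-size pieces. The sketch below is an illustration under assumed names: the temporary file, the chunk size of 4, and the variables are all hypothetical.

```matlab
% Write ten bytes to a scratch binary file.
filename = [tempname '.bin'];              % hypothetical temporary file
fid = fopen(filename, 'w');
fwrite(fid, uint8(1:10), 'uint8');
fclose(fid);

fid = fopen(filename);
chunkSize = 4;                             % assumed chunk size for illustration
data = [];
numReads = 0;
while ~feof(fid)
    % fread returns a column vector; the last chunk may be short or empty
    chunk = fread(fid, chunkSize, 'uint8');
    data = [data; chunk];                  %#ok<AGROW> small example, growth is fine
    numReads = numReads + 1;
end
fclose(fid);
delete(filename);
```

Because feof becomes true only after a read reaches the end of the file, the loop body must tolerate a short or empty final chunk, as the concatenation above does.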
If you use fgetl or fgets in a control loop, feof is not always the best way to test for end of file. As an alternative, consider checking whether the value that fgetl or fgets returns is a character vector.
For example, the function litcount described in Reading Data Line-by-Line includes the following while loop and fgetl calls:
    y = 0;
    tline = fgetl(fid);
    while ischar(tline)
        matches = strfind(tline, literal);
        num = length(matches);
        if num > 0
            y = y + num;
            fprintf(1,'%d:%s\n',num,tline);
        end
        tline = fgetl(fid);
    end
This approach is more robust than testing ~feof(fid) for two reasons:
If fgetl or fgets find data, they return a character vector. Otherwise, they return a number (-1).
After each read operation, fgetl and fgets check the next character in the file for the end-of-file marker. Therefore, these functions sometimes set the end-of-file indicator before they return a value of -1. For example, consider the following three-line text file. Each of the first two lines ends with a newline character, and the third line contains only the end-of-file marker:
    123
    456
Three sequential calls to fgetl yield the following results:
    t1 = fgetl(fid);  % t1 = '123', feof(fid) = false
    t2 = fgetl(fid);  % t2 = '456', feof(fid) = true
    t3 = fgetl(fid);  % t3 = -1,    feof(fid) = true
This behavior does not conform to the ANSI specifications for the related C language functions.
Encoding schemes support the characters required for particular alphabets, such as those for Japanese or European languages. Common encoding schemes include US-ASCII and UTF-8.
If you do not specify an encoding scheme when opening a file for reading, fopen uses auto character-set detection to determine the encoding. If you do not specify an encoding scheme when opening a file for writing, fopen defaults to using UTF-8 in order to provide interoperability between all platforms and locales without data loss or corruption.
To determine the default, open a file, and call fopen again with the syntax:
    [filename, permission, machineformat, encoding] = fopen(fid);
If you specify an encoding scheme when you open a file, the following functions apply that scheme: fscanf, fprintf, fgetl, fgets, fread, and fwrite.
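As a rough sketch of specifying and then querying an encoding, the following writes one line containing a non-ASCII character and reads it back. The temporary file and variable names are hypothetical; 'n' requests the native machine format, which fopen takes as the argument before the encoding.

```matlab
% Open a scratch file for writing with an explicit encoding.
filename = [tempname '.txt'];              % hypothetical temporary file
fid = fopen(filename, 'w', 'n', 'UTF-8');

% Ask fopen which encoding is in effect for this identifier.
[~, ~, ~, encoding] = fopen(fid);

% Write text containing e-acute (character code 233).
fprintf(fid, 'caf%s\n', char(233));
fclose(fid);

% Read the line back using the same encoding.
fid = fopen(filename, 'r', 'n', 'UTF-8');
txt = fgetl(fid);
fclose(fid);
delete(filename);

codes = double(txt);                       % character codes of the text read back
```

Reading with the same encoding used for writing returns the original four characters; opening the file with a mismatched encoding would typically produce different character codes for the accented letter.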
For a complete list of supported encoding schemes, and the syntax for specifying the encoding, see the fopen reference page.