Hierarchical Data Format, Version 5 (HDF5) is a general-purpose, machine-independent standard for storing scientific data in files, developed by the National Center for Supercomputing Applications (NCSA). HDF5 is used in a wide range of engineering and scientific fields that need a standard way to store data so that it can be shared. For more information about the HDF5 file format, read the HDF5 documentation available at the HDF Web site (https://www.hdfgroup.org).
MATLAB® provides two methods to import data from an HDF5 file:
High-level functions that make it easy to import data, when working with numeric datasets
Low-level functions that enable more complete control over the importing process, by providing access to the routines in the HDF5 C library
Note
For information about importing data from HDF4 files, which have a separate, incompatible format, see Import HDF4 Files Programmatically.
MATLAB includes several functions that you can use to examine the contents of an HDF5 file and import data from the file into the MATLAB workspace.
Note
You can only use the high-level functions to read numeric datasets or attributes. To read non-numeric datasets or attributes, you must use the low-level interface.
h5disp: View the contents of an HDF5 file
h5info: Create a structure that contains all the metadata defining an HDF5 file
h5read: Read data from a dataset in an HDF5 file
h5readatt: Read data from an attribute associated with a dataset or group in an HDF5 file, or with the file itself (a global attribute)
For details about how to use these functions, see their reference pages, which include examples. The following sections illustrate some common usage scenarios.
HDF5 files can contain data and metadata, called attributes. HDF5 files organize the data and metadata in a hierarchical structure similar to the hierarchical structure of a UNIX® file system.
In an HDF5 file, the directories in the hierarchy are called groups. A group can contain other groups, data sets, attributes, links, and data types. A data set is a collection of data, such as a multidimensional numeric array or string. An attribute is any data that is associated with another entity, such as a data set. A link is similar to a UNIX file system symbolic link. Links are a way to reference objects without having to make a copy of the object.
Data types are a description of the data in the data set or attribute. Data types tell how to interpret the data in the data set.
To get a quick view into the contents of an HDF5 file, use the h5disp function.

    h5disp('example.h5')
    HDF5 example.h5
    Group '/'
        Attributes:
            'attr1':  97 98 99 100 101 102 103 104 105 0
            'attr2':  2x2 H5T_INTEGER
        Group '/g1'
            Group '/g1/g1.1'
                Dataset 'dset1.1.1'
                    Size:  10x10
                    MaxSize:  10x10
                    Datatype:   H5T_STD_I32BE (int32)
                    ChunkSize:  []
                    Filters:  none
                    Attributes:
                        'attr1':  49 115 116 32 97 116 116 114 105 ...
                        'attr2':  50 110 100 32 97 116 116 114 105 ...
                Dataset 'dset1.1.2'
                    Size:  20
                    MaxSize:  20
                    Datatype:   H5T_STD_I32BE (int32)
                    ChunkSize:  []
                    Filters:  none
            Group '/g1/g1.2'
                Group '/g1/g1.2/g1.2.1'
                    Link 'slink'
                        Type:  soft link
        Group '/g2'
            Dataset 'dset2.1'
                Size:  10
                MaxSize:  10
                Datatype:   H5T_IEEE_F32BE (single)
                ChunkSize:  []
                Filters:  none
            Dataset 'dset2.2'
                Size:  5x3
                MaxSize:  5x3
                Datatype:   H5T_IEEE_F32BE (single)
                ChunkSize:  []
                Filters:  none
    .
    .
    .
To explore the hierarchical organization of an HDF5 file, use the h5info function. h5info returns a structure that contains various information about the HDF5 file, including the name of the file.

    info = h5info('example.h5')

    info =

          Filename: 'matlabroot\matlab\toolbox\matlab\demos\example.h5'
              Name: '/'
            Groups: [4x1 struct]
          Datasets: []
         Datatypes: []
             Links: []
        Attributes: [2x1 struct]
By looking at the Groups and Attributes fields, you can see that the file contains four groups and two attributes. The Datasets, Datatypes, and Links fields are all empty, indicating that the root group does not contain any data sets, data types, or links. To explore the contents of the sample HDF5 file further, examine one of the structures in Groups. The following example shows the contents of the second structure in this field.

    level2 = info.Groups(2)

    level2 =

              Name: '/g2'
            Groups: []
          Datasets: [2x1 struct]
         Datatypes: []
             Links: []
        Attributes: []
In the sample file, the group named /g2 contains two data sets. The following figure illustrates this part of the sample HDF5 file organization.
To get information about a data set, such as its name, dimensions, and data type, look at either of the structures returned in the Datasets field.

    dataset1 = level2.Datasets(1)

    dataset1 =

          Filename: 'matlabroot\example.h5'
              Name: '/g2/dset2.1'
              Rank: 1
          Datatype: [1x1 struct]
              Dims: 10
           MaxDims: 10
            Layout: 'contiguous'
        Attributes: []
             Links: []
         Chunksize: []
         Fillvalue: []
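The structure returned by h5info can also be walked in a loop rather than indexed by hand. The following is a minimal sketch, assuming the sample file example.h5 is on the MATLAB path; the variable names are illustrative only. It prints each top-level group with the number of data sets and subgroups it holds, then asks h5info for the metadata of a single object by passing a location as the second argument.

```matlab
% Sketch: summarize the top-level groups of the sample file.
info = h5info('example.h5');
for k = 1:numel(info.Groups)
    grp = info.Groups(k);
    fprintf('%s: %d dataset(s), %d subgroup(s)\n', ...
        grp.Name, numel(grp.Datasets), numel(grp.Groups));
end

% h5info also accepts a location and returns metadata for that object only.
dsetInfo = h5info('example.h5', '/g2/dset2.1');
disp(dsetInfo)
```

The field names shown by disp can differ between MATLAB releases, so the sketch displays the whole structure instead of relying on a particular field.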
To read data from a data set in an HDF5 file, use the h5read function. As arguments, specify the name of the HDF5 file and the name of the data set. (To read the value of an attribute, you must use h5readatt.) To illustrate, this example reads the data set /g2/dset2.1 from the HDF5 sample file example.h5.

    data = h5read('example.h5','/g2/dset2.1')

    data =

        1.0000
        1.1000
        1.2000
        1.3000
        1.4000
        1.5000
        1.6000
        1.7000
        1.8000
        1.9000
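Attributes are read with h5readatt, which takes the file name, the location of the object that owns the attribute, and the attribute name. The following sketch assumes the sample file example.h5 is on the MATLAB path and reads the attribute attr2 attached to the root group, which the h5disp listing reports as a 2x2 integer attribute.

```matlab
% Sketch: read the attribute 'attr2' attached to the root group '/'.
attr2 = h5readatt('example.h5', '/', 'attr2');

% Show the class and size that MATLAB assigned to the attribute value.
fprintf('class: %s, size: %s\n', class(attr2), mat2str(size(attr2)));
```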
When the h5read function reads data from an HDF5 file into the MATLAB workspace, it maps HDF5 data types to MATLAB data types, as shown in the table below.
HDF5 Data Type | h5read Returns |
---|---|
Bit-field | Array of packed 8-bit integers |
Float | MATLAB single and double types, provided that they occupy 64 bits or fewer |
Integer types, signed and unsigned | Equivalent MATLAB integer types, signed and unsigned |
Opaque | Array of uint8 values |
Reference | Returns the actual data pointed to by the reference, not the value of the reference. |
Strings, fixed-length and variable length | Cell array of character vectors |
Enums | Cell array of character vectors, where each enumerated value is replaced by the corresponding member name |
Compound | 1-by-1 struct array; the dimensions of the dataset are expressed in the fields of the structure. |
Arrays | Array of values using the same datatype as the HDF5 array. For example, if the array is of signed 32-bit integers, the MATLAB array will be of type int32. |
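One way to check the mapping for a particular data set is to inspect the class of the value that h5read returns. The sketch below assumes the sample file example.h5 is on the MATLAB path and uses two data sets from the h5disp listing, which reports /g2/dset2.1 as H5T_IEEE_F32BE (single) and /g1/g1.1/dset1.1.1 as H5T_STD_I32BE (int32).

```matlab
% Sketch: compare the HDF5 type reported by h5disp with the MATLAB class.
floatData = h5read('example.h5', '/g2/dset2.1');        % 32-bit float in the file
intData   = h5read('example.h5', '/g1/g1.1/dset1.1.1'); % 32-bit integer in the file

fprintf('/g2/dset2.1 is read as %s\n', class(floatData));
fprintf('/g1/g1.1/dset1.1.1 is read as %s\n', class(intData));
```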
The example HDF5 file included with MATLAB includes examples of all these datatypes. For example, the data set /g3/string is a string.

    h5disp('example.h5','/g3/string')
    HDF5 example.h5
    Dataset 'string'
        Size:  2
        MaxSize:  2
        Datatype:   H5T_STRING
            String Length: 3
            Padding: H5T_STR_NULLTERM
            Character Set: H5T_CSET_ASCII
            Character Type: H5T_C_S1
        ChunkSize:  []
        Filters:  none
        FillValue:  ''

When you read the data from the file, MATLAB returns it as a cell array of character vectors.

    s = h5read('example.h5','/g3/string')

    s =

        'ab '
        'de '

    >> whos s
      Name      Size            Bytes  Class    Attributes

      s         2x1               236  cell
Compound data types are always returned as a 1-by-1 struct. The dimensions of the data set are expressed in the fields of the struct. For example, the data set /g3/compound2D has a compound datatype.

    h5disp('example.h5','/g3/compound2D')
    HDF5 example.h5
    Dataset 'compound2D'
        Size:  2x3
        MaxSize:  2x3
        Datatype:   H5T_COMPOUND
            Member 'a': H5T_STD_I8LE (int8)
            Member 'b': H5T_IEEE_F64LE (double)
        ChunkSize:  []
        Filters:  none
        FillValue:  H5T_COMPOUND

When you read the data from the file, MATLAB returns it as a 1-by-1 struct.

    data = h5read('example.h5','/g3/compound2D')

    data =

        a: [2x3 int8]
        b: [2x3 double]
In R2015a and later releases, MATLAB supports reading HDF5 datasets that are written using a third-party filter. To read the datasets using the dynamically loaded filter feature, you must:
Install the HDF5 filter plugin on your system as a shared library or a DLL.
Set the HDF5_PLUGIN_PATH environment variable to point to the installation.
For more information, see HDF5 Dynamically Loaded Filters.
Note
Writing HDF5 datasets using dynamically loaded filters is not supported.
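Before reading a filtered data set, it can help to confirm that the environment variable from the steps above is visible to the running MATLAB session. The following is a minimal sketch of such a check; it assumes the variable holds a single directory, and the messages are illustrative only.

```matlab
% Sketch: check whether HDF5_PLUGIN_PATH is visible to this MATLAB session.
pluginPath = getenv('HDF5_PLUGIN_PATH');

if isempty(pluginPath)
    disp('HDF5_PLUGIN_PATH is not set in this session.');
elseif exist(pluginPath, 'dir') ~= 7
    % Assumption: the variable names one directory, not a list of paths.
    fprintf('HDF5_PLUGIN_PATH is set to "%s", which is not an existing folder.\n', pluginPath);
else
    fprintf('HDF5_PLUGIN_PATH points to: %s\n', pluginPath);
end
```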
MATLAB provides direct access to dozens of functions in the HDF5 C library through low-level functions that correspond to them. In this way, you can access features of the HDF5 library from MATLAB, such as reading and writing complex data types and using the HDF5 subsetting capabilities. For more information, see Using the MATLAB Low-Level HDF5 Functions to Export Data.
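To give a sense of how the low-level interface differs from h5read, the following sketch reads the data set /g2/dset2.1 by opening and closing each HDF5 object explicitly. It assumes the sample file example.h5 is on the MATLAB path; the variable names are illustrative, and error handling is omitted for brevity.

```matlab
% Sketch: read one data set with the low-level HDF5 functions.
fileId = H5F.open('example.h5', 'H5F_ACC_RDONLY', 'H5P_DEFAULT'); % open read-only
dsetId = H5D.open(fileId, '/g2/dset2.1');                         % open the data set
lowLevelData = H5D.read(dsetId);                                  % read all elements

% Identifiers are released explicitly, mirroring the HDF5 C library.
H5D.close(dsetId);
H5F.close(fileId);

fprintf('Read %d element(s) of class %s\n', numel(lowLevelData), class(lowLevelData));
```

Each call maps to a routine of the same name in the HDF5 C library, which is why identifiers must be opened and closed by hand.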