Deep learning is a branch of machine learning that teaches computers to do what comes naturally to humans: learn from experience. Machine learning algorithms use computational methods to “learn” information directly from data without relying on a predetermined equation as a model. Deep learning is especially suited to image recognition, which is important in applications such as facial recognition, motion detection, and advanced driver assistance technologies, including autonomous driving, lane detection, pedestrian detection, and autonomous parking.
Deep Learning Toolbox™ provides simple MATLAB® commands for creating and interconnecting the layers of a deep neural network. Examples and pretrained networks make it easy to use MATLAB for deep learning, even without knowledge of advanced computer vision algorithms or neural networks.
For a free hands-on introduction to practical deep learning methods, see Deep Learning Onramp.
What Do You Want to Do? | Learn More
---|---
Perform transfer learning to fine-tune a network with your data | Start Deep Learning Faster Using Transfer Learning. Tip: Fine-tuning a pretrained network to learn a new task is typically much faster and easier than training a new network.
Classify images with pretrained networks | Pretrained Deep Neural Networks
Create a new deep neural network for classification or regression |
Resize, rotate, or preprocess images for training or prediction | Preprocess Images for Deep Learning
Label your image data automatically based on folder names, or interactively using an app | Train Network for Image Classification; Image Labeler (Computer Vision Toolbox)
Create deep learning networks for sequence and time series data |
Classify each pixel of an image (for example, road, car, pedestrian) | Getting Started with Semantic Segmentation Using Deep Learning (Computer Vision Toolbox)
Detect and recognize objects in images | Deep Learning, Semantic Segmentation, and Detection (Computer Vision Toolbox)
Classify text data | Classify Text Data Using Deep Learning
Classify audio data for speech recognition | Speech Command Recognition Using Deep Learning
Visualize what features networks have learned |
Train on CPU, GPU, multiple GPUs, in parallel on your desktop or on clusters in the cloud, and work with data sets too large to fit in memory | Deep Learning with Big Data on GPUs and in Parallel
To learn more about deep learning application areas, including automated driving, see Deep Learning Applications.
To choose whether to use a pretrained network or create a new deep network, consider the scenarios in this table.
 | Use a Pretrained Network for Transfer Learning | Create a New Deep Network
---|---|---
Training Data | Hundreds to thousands of labeled images (small) | Thousands to millions of labeled images |
Computation | Moderate computation (GPU optional) | Compute intensive (requires GPU for speed) |
Training Time | Seconds to minutes | Days to weeks for real problems |
Model Accuracy | Good, depends on the pretrained model | High, but can overfit to small data sets |
For more information, see Choose Network Architecture.
Deep learning uses neural networks to learn useful representations of features directly from data. Neural networks combine multiple nonlinear processing layers, using simple elements operating in parallel and inspired by biological nervous systems. Deep learning models can achieve state-of-the-art accuracy in object classification, sometimes exceeding human-level performance.
You train models using a large set of labeled data and neural network architectures that contain many layers, usually including some convolutional layers. Training these models is computationally intensive, and you can usually accelerate training by using a high-performance GPU. This diagram shows how convolutional neural networks combine layers that automatically learn features from many images to classify new images.
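As a rough illustration of what such an architecture looks like in code, the sketch below defines a small convolutional network layer by layer. The input size and the number of classes are placeholder values chosen for this sketch, not settings taken from a specific example.

% Minimal sketch of a small CNN layer array; the input size and class
% count below are placeholders.
layers = [
    imageInputLayer([28 28 1])                  % 28-by-28 grayscale images
    convolution2dLayer(3,16,'Padding','same')   % 16 3-by-3 filters
    reluLayer
    maxPooling2dLayer(2,'Stride',2)
    convolution2dLayer(3,32,'Padding','same')   % 32 3-by-3 filters
    reluLayer
    fullyConnectedLayer(10)                     % one output per class (10 here)
    softmaxLayer
    classificationLayer];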
Many deep learning applications use image files, and sometimes millions of image files. To access many image files for deep learning efficiently, MATLAB provides the imageDatastore function. Use this function to:
Automatically read batches of images for faster processing in machine learning and computer vision applications
Import data from image collections that are too large to fit in memory
Label your image data automatically based on folder names
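For instance, a call along the following lines creates a datastore from an image folder and labels each image from its subfolder name; the folder path here is only a placeholder.

% Minimal sketch; 'pathToImages' is a placeholder for your image folder.
imds = imageDatastore('pathToImages', ...
    'IncludeSubfolders',true, ...        % read images from all subfolders
    'LabelSource','foldernames');        % label images by folder name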
This example shows how to use deep learning to identify objects on a live webcam using only 10 lines of MATLAB code. Try the example to see how simple it is to get started with deep learning in MATLAB.
Run these commands to get the downloads if needed, connect to the webcam, and get a pretrained neural network.
camera = webcam; % Connect to the camera
net = alexnet;   % Load the neural network
If you need to install the webcam and alexnet add-ons, a message from each function appears with a link to help you download the free add-ons using Add-On Explorer. Alternatively, see Deep Learning Toolbox Model for AlexNet Network and MATLAB Support Package for USB Webcams.
After you install Deep Learning Toolbox Model for AlexNet Network, you can use it to classify images. AlexNet is a pretrained convolutional neural network (CNN) that has been trained on more than a million images and can classify images into 1000 object categories (for example, keyboard, mouse, coffee mug, pencil, and many animals).
Run the following code to show and classify live images. Point the webcam at an object, and the neural network reports what class of object it thinks the webcam is showing. It keeps classifying images until you press Ctrl+C. The code resizes the image for the network using imresize (Image Processing Toolbox).
while true
    im = snapshot(camera);       % Take a picture
    image(im);                   % Show the picture
    im = imresize(im,[227 227]); % Resize the picture for alexnet
    label = classify(net,im);    % Classify the picture
    title(char(label));          % Show the class label
    drawnow
end
In this example, the network correctly classifies a coffee mug. Experiment with objects in your surroundings to see how accurate the network is.
To watch a video of this example, see Deep Learning in 11 Lines of MATLAB Code.
To learn how to extend this example and show the probability scores of classes, see Classify Webcam Images Using Deep Learning.
For next steps in deep learning, you can use the pretrained network for other tasks. Solve new classification problems on your image data with transfer learning or feature extraction. For examples, see Start Deep Learning Faster Using Transfer Learning and Train Classifiers Using Features Extracted from Pretrained Networks. To try other pretrained networks, see Pretrained Deep Neural Networks.
Transfer learning is commonly used in deep learning applications. You can take a pretrained network and use it as a starting point to learn a new task. Fine-tuning a network with transfer learning is much faster and easier than training from scratch. You can quickly make the network learn a new task using a smaller number of training images. The advantage of transfer learning is that the pretrained network has already learned a rich set of features that can be applied to a wide range of other similar tasks.
For example, if you take a network trained on thousands or millions of images, you can retrain it to classify new objects using only hundreds of images. You can effectively fine-tune a pretrained network with much smaller data sets than the original training data. If you have a very large data set, then transfer learning might not be faster than training a new network.
Transfer learning enables you to:
Transfer the learned features of a pretrained network to a new problem
Fine-tune a pretrained network faster and more easily than training a new network
Reduce training time and dataset size
Perform deep learning without needing to learn how to create a whole new network
For an interactive example, see Transfer Learning with Deep Network Designer.
For a programmatic example, see Train Deep Learning Network to Classify New Images.
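As a rough sketch of that workflow, the code below keeps the early layers of a pretrained AlexNet, replaces the final layers to match the new classes, and retrains on a labeled image datastore. The imdsTrain datastore and the training option values are assumptions chosen for illustration, not settings from a specific example.

% Minimal transfer learning sketch; imdsTrain is an assumed imageDatastore
% of labeled training images.
net = alexnet;                                   % pretrained network
inputSize = net.Layers(1).InputSize;             % [227 227 3] for AlexNet
augTrain = augmentedImageDatastore(inputSize(1:2),imdsTrain);  % resize images

layersTransfer = net.Layers(1:end-3);            % keep all but the last 3 layers
numClasses = numel(categories(imdsTrain.Labels));
layers = [
    layersTransfer
    fullyConnectedLayer(numClasses, ...
        'WeightLearnRateFactor',20, ...          % learn faster in the new layers
        'BiasLearnRateFactor',20)
    softmaxLayer
    classificationLayer];

options = trainingOptions('sgdm', ...
    'MiniBatchSize',10, ...                      % placeholder values
    'MaxEpochs',6, ...
    'InitialLearnRate',1e-4);                    % small rate to preserve learned features
netTransfer = trainNetwork(augTrain,layers,options);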
Feature extraction allows you to use the power of pretrained networks without investing time and effort into training. Feature extraction can be the fastest way to use deep learning. You extract learned features from a pretrained network and use those features to train a classifier, such as a support vector machine (SVM; requires Statistics and Machine Learning Toolbox™). For example, if an SVM trained using alexnet features can achieve >90% accuracy on your training and validation set, then fine-tuning with transfer learning might not be worth the effort to gain some extra accuracy. If you perform fine-tuning on a small data set, then you also risk overfitting. If the SVM cannot achieve good enough accuracy for your application, then fine-tuning is worth the effort to seek higher accuracy.
For an example, see Extract Image Features Using Pretrained Network.
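As a rough sketch of that workflow, the code below extracts activations from a deeper layer of AlexNet and trains a multiclass SVM on them with fitcecoc (Statistics and Machine Learning Toolbox). The imdsTrain and imdsTest datastores and the choice of the 'fc7' layer are assumptions for illustration.

% Minimal feature extraction sketch; imdsTrain and imdsTest are assumed
% labeled imageDatastore objects.
net = alexnet;
inputSize = net.Layers(1).InputSize;
augTrain = augmentedImageDatastore(inputSize(1:2),imdsTrain);  % resize images
augTest  = augmentedImageDatastore(inputSize(1:2),imdsTest);

featuresTrain = activations(net,augTrain,'fc7','OutputAs','rows');  % learned features
featuresTest  = activations(net,augTest,'fc7','OutputAs','rows');

classifier = fitcecoc(featuresTrain,imdsTrain.Labels);  % multiclass SVM
YPred = predict(classifier,featuresTest);
accuracy = mean(YPred == imdsTest.Labels)               % fraction correctly classified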
Neural networks are inherently parallel algorithms. You can take advantage of this parallelism by using Parallel Computing Toolbox™ to distribute training across multicore CPUs, graphical processing units (GPUs), and clusters of computers with multiple CPUs and GPUs.
Training deep networks is extremely computationally intensive, and you can usually accelerate training by using a high-performance GPU. If you do not have a suitable GPU, you can train on one or more CPU cores instead. You can train a convolutional neural network on a single GPU or CPU, on multiple GPUs or CPU cores, or in parallel on a cluster. Using GPU or parallel options requires Parallel Computing Toolbox.
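In practice, you typically select the training hardware through the 'ExecutionEnvironment' training option, roughly as sketched below; the solver and the other option values are placeholders for illustration.

% Minimal sketch: choose where training runs. 'auto' uses a GPU if one is
% available; 'multi-gpu' and 'parallel' require Parallel Computing Toolbox.
options = trainingOptions('sgdm', ...
    'ExecutionEnvironment','multi-gpu', ...  % or 'cpu', 'gpu', 'auto', 'parallel'
    'MiniBatchSize',64, ...                  % placeholder value
    'MaxEpochs',4);                          % placeholder value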
You do not need multiple computers to solve problems using data sets too large to fit in memory. You can use the imageDatastore function to work with batches of data without needing a cluster of machines. However, if you have a cluster available, it can be helpful to take your code to the data repository rather than moving large amounts of data around.
To learn more about deep learning hardware and memory settings, see Deep Learning with Big Data on GPUs and in Parallel.