Getting Started with Object Detection Using Deep Learning

Object detection using deep learning provides a fast and accurate means to predict the location of an object in an image. Deep learning is a powerful machine learning technique in which the object detector automatically learns the image features required for detection tasks. Several deep learning techniques for object detection are available, such as Faster R-CNN, you only look once (YOLO) v2, and single shot detection (SSD).

Applications for object detection include:

Image classification
Scene understanding
Self-driving vehicles
Surveillance

Create Training Data for Object Detection

Use a labeling app to interactively label ground truth data in a video, image sequence, image collection, or custom data source. You can label object detection ground truth using rectangle labels, which define the position and size of the object in the image.
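Once labeling is finished, the labels are typically exported to the workspace and converted into training data. The following is a minimal sketch, not a definitive workflow. It assumes that gTruth (a hypothetical variable name) is a groundTruth object exported from a labeling app and that it contains rectangle labels for an image collection.

```matlab
% Assumption: gTruth is a groundTruth object exported from a labeling app
% and holds rectangle (bounding box) labels for an image collection.

% Convert the labeled ground truth into an image datastore and a
% box label datastore.
[imds,blds] = objectDetectorTrainingData(gTruth);

% Combine the two so that each read returns {image, boxes, labels}.
trainingDS = combine(imds,blds);

% Read one sample to check the contents.
sample = read(trainingDS);
I      = sample{1};   % image
bboxes = sample{2};   % M-by-4 boxes in [x y width height] format
labels = sample{3};   % M-by-1 categorical class labels
```

If the ground truth comes from a video or a custom data source, objectDetectorTrainingData writes sampled frames to disk, so additional options may be needed in that case.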

Augment and Preprocess Data

Data augmentation provides a way to train with limited data sets. Minor changes, such as translating, cropping, or otherwise transforming an image, produce new, distinct images that you can use to train a robust detector. Datastores are a convenient way to read and augment collections of data. Use imageDatastore and boxLabelDatastore to create datastores for images and labeled bounding box data.

Augment Bounding Boxes for Object Detection (Deep Learning Toolbox)
Preprocess Images for Deep Learning (Deep Learning Toolbox)
Preprocess Data for Domain-Specific Deep Learning Applications (Deep Learning Toolbox)

For more information about augmenting training data using datastores, see Datastores for Deep Learning (Deep Learning Toolbox) and Perform Additional Image Processing Operations Using Built-In Datastores (Deep Learning Toolbox).
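The datastore workflow described in this section can be sketched as follows. This is an illustrative example under stated assumptions rather than a recommended augmentation pipeline: it assumes the vehicle sample data set that ships with Computer Vision Toolbox is available, and flipAndShift is a hypothetical helper name introduced here. The key point is that the same geometric transformation must be applied to the image and to its bounding boxes.

```matlab
% Assumption: the vehicle sample data set shipped with Computer Vision
% Toolbox is available. Substitute your own table of labeled data.
data = load('vehicleTrainingData.mat');
trainingData = data.vehicleTrainingData;
dataDir = fullfile(toolboxdir('vision'),'visiondata');
trainingData.imageFilename = fullfile(dataDir,trainingData.imageFilename);

% Create datastores for the images and for the labeled boxes, then combine.
imds = imageDatastore(trainingData.imageFilename);
blds = boxLabelDatastore(trainingData(:,2:end));
ds   = combine(imds,blds);

% Apply the augmentation each time a sample is read.
augmentedDS = transform(ds,@flipAndShift);
sample = read(augmentedDS);

function out = flipAndShift(in)
% Hypothetical helper: random horizontal flip and small x-translation.
% in is a 1-by-3 cell array: {image, boxes, labels}.
I = in{1};
tform = randomAffine2d('XReflection',true,'XTranslation',[-10 10]);
rout  = affineOutputView(size(I),tform);
Iaug  = imwarp(I,tform,'OutputView',rout);

% Warp the boxes with the same transform. Boxes that fall mostly
% outside the output view are dropped.
[bboxes,indices] = bboxwarp(in{2},tform,rout,'OverlapThreshold',0.25);

if isempty(indices)
    out = in;                          % keep the original if all boxes are lost
else
    out = {Iaug,bboxes,in{3}(indices)};
end
end
```

Because the transform is applied at read time, each training epoch sees a different random variation of the same underlying images.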

Create Object Detection Network

Each object detector contains a unique network architecture. For example, the Faster R-CNN detector uses a two-stage network for detection, whereas the YOLO v2 detector uses a single stage. Use functions like fasterRCNNLayers or yolov2Layers to create a network. You can also design a network layer by layer using the Deep Network Designer (Deep Learning Toolbox).

Pretrained Deep Neural Networks (Deep Learning Toolbox)
Design a YOLO v2 Detection Network
Design an R-CNN, Fast R-CNN, and a Faster R-CNN Model
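As a rough illustration of the function-based approach, the sketch below builds a YOLO v2 network on top of a pretrained backbone. The image size, class count, anchor boxes, and choice of feature layer are illustrative assumptions, and the code assumes that the ResNet-50 support package for Deep Learning Toolbox is installed.

```matlab
% Assumptions: the ResNet-50 support package is installed, and the values
% below are illustrative rather than tuned for any particular data set.
imageSize  = [224 224 3];   % network input size
numClasses = 1;             % number of object classes to detect

% Anchor boxes, one per row, in [height width] format. In practice these
% can be estimated from labeled boxes, for example with estimateAnchorBoxes.
anchorBoxes = [43 59; 18 22; 23 29; 84 109];

% Pretrained backbone and the layer whose output feeds the detection head.
baseNetwork  = resnet50;
featureLayer = 'activation_40_relu';

% Assemble the YOLO v2 detection network as a layer graph.
lgraph = yolov2Layers(imageSize,numClasses,anchorBoxes,baseNetwork,featureLayer);
```

Choosing an earlier feature layer keeps more spatial resolution, which tends to help with small objects, while a later layer gives more abstract features at a coarser resolution.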

Train Detector and Evaluate Results

Use the trainFasterRCNNObjectDetector, trainYOLOv2ObjectDetector, and trainSSDObjectDetector functions to train an object detector. Use the evaluateDetectionMissRate and evaluateDetectionPrecision functions to evaluate the training results.
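A training and evaluation run might look like the following sketch for the YOLO v2 case. The variable names trainingDS, testDS, and lgraph are hypothetical: the code assumes they already exist in the workspace as two combined image and box label datastores and a YOLO v2 layer graph, and that the images already match the network input size. The training option values are placeholders, not tuned settings.

```matlab
% Assumptions: trainingDS and testDS are combined image/box-label
% datastores, lgraph is a YOLO v2 layer graph, and the images already
% match the network input size. All three names are hypothetical.
options = trainingOptions('sgdm', ...
    'InitialLearnRate',1e-3, ...
    'MiniBatchSize',16, ...
    'MaxEpochs',20, ...
    'Shuffle','every-epoch');

% Train the detector. info holds training progress such as the loss.
[detector,info] = trainYOLOv2ObjectDetector(trainingDS,lgraph,options);

% Run the trained detector on held-out data. The result is a table of
% boxes, scores, and labels, one row per image.
detectionResults = detect(detector,testDS);

% Average precision, with the recall and precision curve data.
[ap,recall,precision] = evaluateDetectionPrecision(detectionResults,testDS);

% Log-average miss rate, with false positives per image and miss rate data.
[logAvgMissRate,fppi,missRate] = evaluateDetectionMissRate(detectionResults,testDS);
```

Evaluating on data that was not used for training gives a more honest picture of how the detector generalizes.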

Detect Objects Using Deep Learning Detectors

Detect objects in an image using the trained detector. For example, the partial code shown below runs the trained detector on an image I. Call the detect object function on a fasterRCNNObjectDetector, yolov2ObjectDetector, or ssdObjectDetector object to return bounding boxes, detection scores, and the categorical labels assigned to the bounding boxes.

% input_image is the path to an image file; detector is a trained detector object.
I = imread(input_image);
[bboxes,scores,labels] = detect(detector,I);
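To inspect the detections visually, the returned boxes can be drawn on the image. This is a small sketch that assumes detector is a trained detector object and I is an image already in the workspace. The threshold value is illustrative.

```matlab
% Assumptions: detector is a trained detector object and I is an image
% already loaded into the workspace.

% Keep only detections whose score is at least 0.5 (illustrative value).
[bboxes,scores,labels] = detect(detector,I,'Threshold',0.5);

% Draw each box and annotate it with its detection score.
if ~isempty(bboxes)
    annotated = insertObjectAnnotation(I,'rectangle',bboxes,scores);
else
    annotated = I;   % nothing detected, show the image unchanged
end

figure
imshow(annotated)
```

Raising the threshold reduces false positives at the cost of missing lower-confidence objects.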
