This example shows how to generate CUDA® MEX for a you only look once (YOLO) v2 object detector. A YOLO v2 object detection network is composed of two subnetworks: a feature extraction network followed by a detection network. This example generates code for the network trained in the Object Detection Using YOLO v2 Deep Learning example from Computer Vision Toolbox™. For more information, see Object Detection Using YOLO v2 Deep Learning. You can modify this example to generate CUDA MEX for the network imported in the Import Pretrained ONNX YOLO v2 Object Detector example from Computer Vision Toolbox. For more information, see Import Pretrained ONNX YOLO v2 Object Detector.
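The referenced Object Detection Using YOLO v2 Deep Learning example assembles such a network by attaching a YOLO v2 detection subnetwork to a pretrained feature extraction network. The following is a minimal sketch of that idea using the yolov2Layers function from Computer Vision Toolbox; the anchor boxes, class count, and feature layer name are illustrative assumptions, not values taken from this example.

% Sketch (illustrative values): build a YOLO v2 network from a pretrained
% ResNet-50 feature extraction network.
imageSize = [224 224 3];            % network input size
numClasses = 1;                     % one object class, for example, vehicle
anchorBoxes = [1 1; 4 6; 5 3; 9 6]; % illustrative anchor boxes
baseNetwork = resnet50;             % pretrained feature extraction network
featureLayer = 'activation_40_relu';% illustrative feature extraction layer
lgraph = yolov2Layers(imageSize,numClasses,anchorBoxes,baseNetwork,featureLayer);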
CUDA-enabled NVIDIA® GPU with compute capability 3.2 or higher.
NVIDIA CUDA toolkit and driver.
NVIDIA cuDNN library.
Environment variables for the compilers and libraries. For information on the supported versions of the compilers and libraries, see Third-party Products (GPU Coder). For setting up the environment variables, see Setting Up the Prerequisite Products (GPU Coder); a minimal sketch of setting these variables from MATLAB follows this list.
GPU Coder Interface for Deep Learning Libraries support package. To install this support package, use the Add-On Explorer.
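For illustration only, you can set the environment variables from the MATLAB command line with setenv. The variable names follow the GPU Coder setup documentation, and the paths are placeholders for a typical Linux installation; adjust both for your system.

% Illustrative only: point GPU Coder at the CUDA toolkit and cuDNN installs.
% Variable names and paths are assumptions for a typical Linux setup.
setenv('CUDA_PATH','/usr/local/cuda');
setenv('NVIDIA_CUDNN','/usr/local/cudnn');
setenv('PATH',[getenv('PATH') ':/usr/local/cuda/bin']);
setenv('LD_LIBRARY_PATH',[getenv('LD_LIBRARY_PATH') ':/usr/local/cuda/lib64']);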
Use the coder.checkGpuInstall function to verify that the compilers and libraries necessary for running this example are set up correctly.
envCfg = coder.gpuEnvConfig('host');
envCfg.DeepLibTarget = 'cudnn';
envCfg.DeepCodegen = 1;
envCfg.Quiet = 1;
coder.checkGpuInstall(envCfg);
net = getYOLOv2();
Downloading pretrained detector (98 MB)...
The DAG network contains 150 layers, including convolution, ReLU, and batch normalization layers and the YOLO v2 transform and YOLO v2 output layers. To display an interactive visualization of the deep learning network architecture, use the analyzeNetwork function.
analyzeNetwork(net);
The yolov2_detect Entry-Point Function
The yolov2_detect.m entry-point function takes an image input and runs the detector on the image using the deep learning network saved in the yolov2ResNet50VehicleExample.mat file. The function loads the network object from the yolov2ResNet50VehicleExample.mat file into a persistent variable yolov2Obj and reuses the persistent object on subsequent detection calls.
type('yolov2_detect.m')
function outImg = yolov2_detect(in)
% Copyright 2018-2019 The MathWorks, Inc.

persistent yolov2Obj;

if isempty(yolov2Obj)
    yolov2Obj = coder.loadDeepLearningNetwork('yolov2ResNet50VehicleExample.mat');
end

% Pass in input
[bboxes,~,labels] = yolov2Obj.detect(in,'Threshold',0.5);

% Convert categorical labels to cell array of character vectors for MATLAB
% execution
if coder.target('MATLAB')
    labels = cellstr(labels);
end

% Annotate detections in the image.
outImg = insertObjectAnnotation(in,'rectangle',bboxes,labels);
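Before generating code, you can optionally run the entry-point function directly in MATLAB to confirm that the detector works. The following is a minimal sketch; the image file name highway_test.png is hypothetical, so substitute any RGB image that contains vehicles.

% Hypothetical test image; substitute your own RGB image.
I = imread('highway_test.png');
in = imresize(I,[224 224]);
out = yolov2_detect(in);   % runs the detector in MATLAB, not the generated MEX
imshow(out)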
To generate CUDA code for the yolov2_detect.m entry-point function, create a GPU code configuration object for a MEX target and set the target language to C++. Use the coder.DeepLearningConfig function to create a CuDNN deep learning configuration object and assign it to the DeepLearningConfig property of the GPU code configuration object. Run the codegen command, specifying an input size of [224,224,3]. This value corresponds to the input layer size of the YOLO v2 network.
cfg = coder.gpuConfig('mex');
cfg.TargetLang = 'C++';
cfg.DeepLearningConfig = coder.DeepLearningConfig('cudnn');
codegen -config cfg yolov2_detect -args {ones(224,224,3,'uint8')} -report
Code generation successful: To view the report, open('codegen/mex/yolov2_detect/html/report.mldatx').
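The -args value above fixes the MEX input to a 224-by-224-by-3 uint8 image. As an illustrative variation that is not part of the original example, you can describe a variable-size input with coder.typeof, as sketched below. This assumes that the detect method of the detector supports variable-size inputs in code generation; check the code generation limitations for your release before using it.

% Sketch (assumption: the detector supports variable-size inputs in codegen).
% Upper bounds of Inf mark the first two dimensions as variable size.
inputType = coder.typeof(uint8(0),[Inf Inf 3],[1 1 0]);
codegen -config cfg yolov2_detect -args {inputType} -report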
Set up the video file reader and read the input video. Create a video player to display the video and the output detections.
videoFile = 'highway_lanechange.mp4';
videoFreader = vision.VideoFileReader(videoFile,'VideoOutputDataType','uint8');
depVideoPlayer = vision.DeployableVideoPlayer('Size','Custom','CustomSize',[640 480]);
Read the video input frame-by-frame and detect the vehicles in the video using the detector.
cont = ~isDone(videoFreader);
while cont
    I = step(videoFreader);
    in = imresize(I,[224,224]);
    out = yolov2_detect_mex(in);
    step(depVideoPlayer, out);
    % Exit the loop if the video player figure window is closed
    cont = ~isDone(videoFreader) && isOpen(depVideoPlayer);
end
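After the loop finishes, you can release the file reader and video player so that they free their resources. This cleanup step is not part of the original example; it is a suggested addition using the standard System object release method.

% Suggested cleanup (not in the original example): release System object resources.
release(videoFreader);
release(depVideoPlayer);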