Balance image blocks using bounding boxes and big images
locationSet = balanceBoxLabels(boxLabels,bigLabeledImages,levels,blockSize,numObservations) balances the bounding box labels, boxLabels, that are contained in the big images, bigLabeledImages. The function returns locationSet, a blockLocationSet object that contains numObservations block locations, each of size blockSize.
locationSet = balanceBoxLabels(___,Name,Value) specifies options using one or more name-value pair arguments in addition to the input arguments from the previous syntax.
Load box labels data that contains boxes and labels for one image. The height and width of each box is [20,20].
d = load('balanceBoxLabelsData.mat');
bboxes = d.BoxLabels.Boxes;
labels = d.BoxLabels.Labels;
boxLabels = table(bboxes,labels);
Find the class imbalance in the box labels.
blds = boxLabelDatastore(boxLabels);
tbl1 = countEachLabel(blds);
figure;
h1 = histogram('Categories',tbl1.Label,'BinCounts',tbl1.Count);
Measure the class imbalance with the coefficient of variation of the label counts. A value greater than 1 indicates that the classes are imbalanced.
CVBefore = std(tbl1.Count)/mean(tbl1.Count)
CVBefore = 1.5746
Set the number of observations by finding the mean of the count of each class, and multiplying it by the number of classes.
numClasses = height(tbl1);
numObservations = mean(tbl1.Count)*numClasses;
Create a big image of size 500-by-500 pixels.
bigImages = bigimage(zeros([500,500]));
Set the image size of each observation.
blockSize = [50,50];
Set the resolution levels for the big image objects.
levels = 1;
Balance the box labels.
locationSet = balanceBoxLabels(boxLabels,bigImages,levels,blockSize,numObservations);
Balancing box labels for 1 images with [==================================================] 100% [==================================================] 100% Balancing box labels complete.
Count the labels that are contained within the image blocks.
bldsBalanced = boxLabelDatastore(boxLabels,locationSet);
tbl2 = countEachLabel(bldsBalanced);
Check whether the box labels are balanced by comparing the new and original histograms of label counts. If the labels are not balanced, use a different value for the number of blocks, numObservations. The histograms show that the box labels are balanced.
hold on;
h2 = histogram('Categories',tbl2.Label,'BinCounts',tbl2.Count);
title(h2.Parent,'Balanced Class Labels');
Check if the coefficient of variation value is less than the original value.
CVAfter = std(tbl2.Count)/mean(tbl2.Count)
CVAfter = 0.3731
boxLabels — Labeled bounding box data
Labeled bounding box data, specified as a table with two columns.
The first column contains bounding boxes and must be a cell vector. Each element in the cell vector is an M-by-4 matrix that describes M boxes, with each row in the format [x, y, width, height].
The second column must be a cell vector that contains the label names corresponding to each bounding box. Each element in the cell vector must be an M-by-1 categorical or string vector.
To create a box label table from ground truth data:
Use the Image Labeler or Video Labeler app to label your ground truth. Export the labeled ground truth data to your workspace.
Create a bounding box label datastore using the objectDetectorTrainingData function.
You can obtain the boxLabels table from the LabelData property of the box label datastore returned by objectDetectorTrainingData (blds.LabelData).
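When no labeling app output is at hand, a table in the two-column format described above can also be assembled by hand. The following is a minimal sketch with made-up coordinates and class names. It assumes that each table row holds the boxes and labels of one image, which the description above does not state explicitly.

```matlab
% Hypothetical data: image 1 has two boxes (M = 2), image 2 has one (M = 1).
% Each box row uses the [x, y, width, height] format.
boxesImage1 = [10 10 20 20; 40 60 20 20];
boxesImage2 = [25 30 20 20];

% First column: cell vector of M-by-4 matrices.
bboxes = {boxesImage1; boxesImage2};

% Second column: cell vector of M-by-1 categorical label vectors.
labels = {categorical(["car"; "truck"]); categorical("car")};

boxLabels = table(bboxes,labels);
```

Each element of labels must have as many entries as the matching element of bboxes has rows.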
bigLabeledImages — Labeled big images
bigimage object | vector of bigimage objects
Labeled big images, specified as a bigimage object or vector of bigimage objects containing pixel label images.
levels — Resolution levels
Resolution levels of blocks from each big image in bigLabeledImages, specified as a positive integer scalar or a vector of positive integers whose length equals the length of the bigLabeledImages vector. If you specify a scalar value, then all big labeled images supply blocks at the same resolution level.
Data Types: double
blockSize — Block size
Block size of read data, specified as a two-element row vector of positive integers, [numrows,numcols]. The first element specifies the number of rows in the block. The second element specifies the number of columns.
numObservations — Number of block locations
Number of block locations to return, specified as a positive integer.
Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.
Example: 'OverlapThreshold',1
'OverlapThreshold' — Overlap threshold
1 (default) | scalar in the range [0,1]
Overlap threshold, specified as the comma-separated pair consisting of 'OverlapThreshold' and a scalar in the range [0,1].
When the overlap between a bounding box and a cropping window is greater than the
threshold, boxes in the boxLabels
input are clipped to the image
block window border. When the overlap is less than the threshold, the boxes are
discarded. When you lower the threshold, part of an object can get discarded. To
reduce the amount an object can be clipped at the border, increase the threshold.
Increasing the threshold can also cause less-balanced box labels.
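The amount of overlap between the bounding box and a cropping window is defined as area(bbox ∩ window) / area(bbox), that is, the fraction of the bounding box area that lies inside the window.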
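As an illustration of how the threshold acts on a single box, the sketch below computes the overlap as the fraction of the box area that falls inside a block window and compares it with a threshold. The box, window, and threshold values are hypothetical, and the base MATLAB function rectint treats both rectangles as continuous regions, which may differ slightly from the pixel conventions used internally by balanceBoxLabels.

```matlab
% Rectangles use the [x, y, width, height] format.
bbox      = [40 40 20 20];   % hypothetical bounding box
window    = [1 1 50 50];     % hypothetical 50-by-50 block window
threshold = 0.25;            % hypothetical 'OverlapThreshold' value

% Fraction of the box area that lies inside the window.
% The intersection is 11-by-11, so the overlap is 121/400 = 0.3025.
overlap = rectint(bbox,window) / (bbox(3)*bbox(4));

% Following the description above: a box whose overlap exceeds the
% threshold is kept (and clipped to the window); otherwise it is discarded.
keepBox = overlap > threshold;
```

With the default threshold of 1, the same box would not pass the comparison, because only about 30% of its area lies inside the window.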
'Verbose' — Display progress information
true or 1 (default) | false or 0
Display progress information, specified as the comma-separated pair consisting of 'Verbose' and a numeric or logical 1 (true) or 0 (false). Set this argument to true to display progress information.
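A call that combines both name-value pairs might look like the following sketch. It reuses the data file and the big image from the example on this page. The values 100 for numObservations and 0.5 for 'OverlapThreshold' are arbitrary illustrations rather than recommendations.

```matlab
% Load the box labels used in the example on this page.
d = load('balanceBoxLabelsData.mat');
bboxes = d.BoxLabels.Boxes;
labels = d.BoxLabels.Labels;
boxLabels = table(bboxes,labels);

% Big image, resolution level, and block size as in the example.
bigImages = bigimage(zeros([500,500]));
levels = 1;
blockSize = [50,50];

% Hypothetical settings: request 100 blocks, use a lower overlap
% threshold, and suppress the progress display.
numObservations = 100;
locationSet = balanceBoxLabels(boxLabels,bigImages,levels,blockSize, ...
    numObservations,'OverlapThreshold',0.5,'Verbose',false);
```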
locationSet — Balanced box labels
blockLocationSet object
Balanced box labels, returned as a blockLocationSet object. The object contains numObservations locations of balanced blocks, each of size blockSize.
To balance box labels, the function oversamples classes that are less represented in the big image. The box labels are counted across the dataset and sorted based on each class count. Each image is split into several quadrants, based on the blockSize input value. The algorithm randomly picks several blocks within each quadrant with less-represented classes. The blocks without any objects are discarded. The balancing stops once the specified number of blocks is selected.
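The idea of oversampling less-represented classes can be sketched with a toy selection loop. This is not the balanceBoxLabels implementation: it ignores quadrants and image geometry, and the per-block class counts are made up. It only shows how repeatedly picking a random block that contains the currently rarest class evens out the running class totals.

```matlab
% Toy data: blockCounts(i,k) is the number of boxes of class k inside
% candidate block i. The last block contains no objects.
blockCounts = [5 0 0
               4 1 0
               0 2 0
               0 0 1
               3 0 1
               0 0 0];
numToSelect = 6;   % hypothetical number of blocks to select

% Discard blocks without any objects.
candidates = find(sum(blockCounts,2) > 0);

selected = zeros(numToSelect,1);
total = zeros(1,size(blockCounts,2));   % running count per class
for n = 1:numToSelect
    % Find the class that is currently least represented.
    [~,rarest] = min(total);
    % Pick a random block that contains that class. Sampling is with
    % replacement, so blocks holding rare classes can be chosen repeatedly.
    pool = candidates(blockCounts(candidates,rarest) > 0);
    pick = pool(randi(numel(pool)));
    selected(n) = pick;
    total = total + blockCounts(pick,:);
end
```

The loop stops after numToSelect blocks, mirroring the stopping rule described above. The vector total then holds the class counts of the selected blocks.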
You can check the success of balancing by comparing the histograms of label count before and after balancing. You can also check the coefficient of variation value. For best results, the value should be less than the original value. For more information, see Coefficient of Variation on the National Institute of Standards and Technology (NIST) website.