ocrText class

Object for storing OCR results

Description

ocrText contains recognized text and metadata collected during optical character recognition (OCR). The ocr function returns the ocrText object. You can access the information contained in the object with the ocrText properties. You can also locate text that matches a specific pattern with the object’s locateText method.

Properties

`Text` — Text recognized by OCR
array of characters

Text recognized by OCR, specified as an array of characters. The text includes white space and new line characters.

`CharacterBoundingBoxes` — Bounding box locations
M-by-4 matrix

Bounding box locations, stored as an M-by-4 matrix. Each row of the matrix contains a four-element vector, [x y width height]. The [x y] elements correspond to the upper-left corner of the bounding box. The [width height] elements correspond to the size of the rectangular region in pixels. The bounding boxes enclose text found in an image by the ocr function. The width and height of bounding boxes that correspond to new line characters are set to zero. Character modifiers found in languages such as Hindi, Tamil, and Bengali are also contained in bounding boxes with zero width and height.
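Because new line characters and character modifiers carry zero-size boxes, it can help to drop those rows before drawing or measuring. The following is a minimal sketch that uses hand-written stand-in data rather than a real ocr result; the variable names and the [x y] values in the zero-size row are hypothetical.

```matlab
% Hypothetical stand-ins for ocrResults.Text and
% ocrResults.CharacterBoundingBoxes (not produced by a real ocr call).
txt    = ['Hi' char(10) 'A'];      % char(10) is a new line character
bboxes = [10 10 8 12;              % 'H'
          19 10 4 12;              % 'i'
          23 10 0  0;              % new line: zero width and height
          10 30 9 12];             % 'A'

% A box is drawable only if both width and height are positive.
isVisible    = bboxes(:,3) > 0 & bboxes(:,4) > 0;
visibleBoxes = bboxes(isVisible, :);
visibleChars = txt(isVisible);
```

The filtered boxes can then be passed to a drawing function such as insertShape without producing degenerate rectangles.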

`CharacterConfidences` — Character recognition confidence
array

Character recognition confidence, specified as an array. The confidence values are in the range [0, 1]. A confidence value, set by the ocr function, should be interpreted as a probability. The ocr function sets the confidence values for spaces between words and for new line characters to NaN, because spaces and new line characters are not explicitly recognized during OCR. You can use the confidence values to locate likely misclassified text within the image by flagging or eliminating characters with low confidence.
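As an illustration of confidence-based filtering, the sketch below flags characters whose confidence falls under a threshold. The data is a hand-written stand-in for the Text and CharacterConfidences properties, and the 0.5 threshold is an arbitrary assumption, not a recommended value.

```matlab
% Hypothetical stand-ins for ocrResults.Text and
% ocrResults.CharacterConfidences (one confidence per character).
txt         = 'He1lo wor1d';
confidences = [0.98 0.95 0.40 0.97 0.96 NaN 0.93 0.94 0.91 0.35 0.99]';

threshold = 0.5;                 % assumed cutoff, tune for your images

% Comparisons with NaN are false, so the space is never flagged.
isLow    = confidences < threshold;
lowIdx   = find(isLow);          % positions of suspect characters
lowChars = txt(isLow);           % the suspect characters themselves
```

The same logical index can select rows of CharacterBoundingBoxes to highlight the suspect characters in the image.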

`Words` — Recognized words
cell array

Recognized words, specified as a cell array.

`WordBoundingBoxes` — Bounding box location and size
M-by-4 matrix

Bounding box location and size, stored as an M-by-4 matrix. Each row of the matrix contains a four-element vector, [x y width height], that specifies the upper-left corner and size of a rectangular region in pixels.
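Some downstream code expects corner coordinates instead of [x y width height]. The sketch below shows one way to convert, using made-up box values in place of a real WordBoundingBoxes matrix; treating x + width and y + height as the far edges is an assumption about the convention you need.

```matlab
% Hypothetical stand-in for ocrResults.WordBoundingBoxes.
wordBoxes = [20 15 60 18;
             90 15 45 18];

% Upper-left corner is given directly; the opposite corner is
% obtained by adding the width and height.
x1 = wordBoxes(:,1);
y1 = wordBoxes(:,2);
x2 = x1 + wordBoxes(:,3);
y2 = y1 + wordBoxes(:,4);

corners = [x1 y1 x2 y2];         % one [x1 y1 x2 y2] row per word
```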

`WordConfidences` — Recognition confidence
array

Recognition confidence, specified as an array. The confidence values are in the range [0, 1]. A confidence value, set by the ocr function, should be interpreted as a probability. The ocr function sets the confidence values for spaces between words and for new line characters to NaN, because spaces and new line characters are not explicitly recognized during OCR. You can use the confidence values to locate likely misclassified text within the image by flagging or eliminating words with low confidence.
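The word-level properties line up row by row, so a single logical index can filter all three. This is a minimal sketch with invented values standing in for Words, WordConfidences, and WordBoundingBoxes; the 0.5 cutoff is an assumption.

```matlab
% Hypothetical stand-ins for the word-level ocrText properties.
words     = {'MathWorks'; 'lnc'; 'Natick'};
wordConf  = [0.95; 0.42; 0.88];
wordBoxes = [20 15 90 18;
             115 15 30 18;
             20 40 60 18];

minConfidence = 0.5;             % assumed cutoff

% Keep only the words recognized with sufficient confidence.
keep          = wordConf >= minConfidence;
reliableWords = words(keep);
reliableBoxes = wordBoxes(keep, :);
```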

Methods

locateText

Locate text pattern

Examples

Find and Highlight Text in an Image

businessCard = imread('businessCard.png');
ocrResults = ocr(businessCard);
bboxes = locateText(ocrResults, 'MathWorks', 'IgnoreCase', true);
Iocr = insertShape(businessCard, 'FilledRectangle', bboxes);
figure; imshow(Iocr);

Find Text Using Regular Expressions

businessCard = imread('businessCard.png');
ocrResults = ocr(businessCard);
bboxes = locateText(ocrResults, 'www.*com', 'UseRegexp', true);
img = insertShape(businessCard, 'FilledRectangle', bboxes);
figure; imshow(img);

Extended Capabilities

C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.

Usage notes and limitations:

Use in a MATLAB Function block is not supported.
The Words property cannot be accessed in code generation. Use the Text property in place of the Words property to access the OCR results.
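Since Words is unavailable in generated code, one option is to recover word boundaries from Text directly. The sketch below is an assumption about how that might be done, not a documented workflow: it finds the start and end index of each whitespace-delimited run using isspace, diff, and find, and has not been verified against MATLAB Coder. The sample text is hypothetical.

```matlab
% Hypothetical stand-in for ocrResults.Text.
txt = ['MathWorks Inc' char(10) 'Natick MA'];

% Mark non-whitespace characters, then pad with zeros so that words
% touching either end of the text still produce a start and a stop.
isWordChar = double(~isspace(txt));
edges      = diff([0 isWordChar 0]);

starts = find(edges == 1);       % index of the first character of each word
stops  = find(edges == -1) - 1;  % index of the last character of each word

firstWord = txt(starts(1):stops(1));
numWords  = numel(starts);
```

Working with index pairs avoids building a cell array of words, which keeps the data in fixed numeric arrays.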
