OCR or Images

In our Planning a Project section (link), we reference the key considerations behind this decision. Put simply, the choice between OCR and images comes down to the following rule of thumb:

"For text-driven UIs with a standard font, use OCR; use image recognition for icons and logos."

There are naturally edge cases to this rule. They are described below, along with the approach to solving each one.

Complex Fonts (Character Collections):

In this example, the text "OCR will struggle here" is rendered in a custom font that mimics handwriting. OCR will struggle to interpret the various lines and shapes in the font, even though it is readable to the human eye. You could attempt to tune the properties of the OCR engine; however, a stronger and simpler solution is to create a character collection (LINK) containing an image of each character in the font. The initial overhead of capturing each character is offset by high accuracy in automation, which other tools would struggle to match.

Text Is Not A Dictionary Word And Less Than 3 Characters:

Using OCR for content that is shorter than 3 characters carries a high risk of false positive matches. An example of this is an on-screen keyboard on a desktop machine.

In both examples below, the scripts are attempting to write 'Hello World' in the text editor.

In the Bad Practice Example, when the automation attempted to click the character "e", the engine matched the "esc" key instead. Again, we could engineer a complex script to solve this; however, by using a character collection, as detailed above, the Good Practice Example produces the correct output.
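
To illustrate why short text is risky, here is a minimal Python sketch (not SenseTalk, and not how any particular OCR engine works internally). It assumes, purely for illustration, that a text search behaves like a case-insensitive substring match over every label on screen. The `KEY_LABELS` list and the `ocr_candidates` helper are hypothetical.

```python
# Simplified model: treat a text search as a case-insensitive substring
# match against every label visible on screen. This is an assumption made
# for illustration, not a description of a real OCR engine.
KEY_LABELS = ["esc", "q", "w", "e", "r", "t", "y", "enter", "delete"]  # hypothetical keyboard


def ocr_candidates(query, labels):
    """Return every label that a naive substring search would match."""
    q = query.lower()
    return [label for label in labels if q in label.lower()]


# A single character matches several keys, and "esc" comes first.
print(ocr_candidates("e", KEY_LABELS))
# A longer string is unambiguous.
print(ocr_candidates("enter", KEY_LABELS))
```

Under this model, the search for "e" has four candidates while the search for "enter" has one, which is why an image of each key (a character collection) is the more reliable choice for very short text.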


When Using OCR Would Require Complex Code:

In this example, we have a generic menu panel. The first approach would be to try to use OCR throughout; however, we would expect OCR to struggle with the center-aligned text.

Solving this with OCR would require a complex scripted workaround. The simplest solution is to capture images of the icons for all sub-menu elements that have one. For the top menu, which is solely text, we can confidently use OCR.
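
Taken together, the rule of thumb and the three edge cases above can be sketched as a small decision helper. This is an illustrative Python sketch rather than part of any tool's API: the `Element` fields and the `choose_method` function are hypothetical names, and the ordering of the checks is one reasonable reading of the guidance above.

```python
from dataclasses import dataclass


@dataclass
class Element:
    """Hypothetical description of one on-screen target."""
    text: str = ""                # visible text; empty for a pure icon or logo
    has_icon: bool = False        # an icon or logo is available to capture
    standard_font: bool = True    # False for handwriting-style or custom fonts
    dictionary_word: bool = True  # False for strings such as single key labels


def choose_method(el: Element) -> str:
    """Apply the rule of thumb first, then the edge cases, in order."""
    # Icons and logos: use image recognition.
    if not el.text or el.has_icon:
        return "image"
    # Complex fonts: OCR struggles, so use a character collection.
    if not el.standard_font:
        return "character collection"
    # Short, non-dictionary text: high risk of false positives with OCR.
    if len(el.text) < 3 and not el.dictionary_word:
        return "character collection"
    # Text-driven UI with a standard font: use OCR.
    return "ocr"


print(choose_method(Element(text="Desktops")))
print(choose_method(Element(text="e", dictionary_word=False)))
```

For the menu panel example, a sub-menu entry with an icon would be described as `Element(text="Settings", has_icon=True)` and resolved to an image, while a text-only top menu entry would fall through to OCR.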

A Summary of What You Have Learned

Knowing when to use OCR and when to use images is key to keeping automation scripts as maintainable as possible.

Bad Practice Example

// Step 1
click image:"chrome"
waitfor 20, image:"refresh"

typetext "",returnkey
waitfor 20, image:"nopCommerce"

// Step 2
moveto image:"Computers"
click image:"Desktops"
waitfor 20, "HomeComputersDesktops"

// Step 3
click image:"DigitalStormVANQUISH"
waitfor 20, image:"DigitalStormVANQUISH3CustomPerformancePC"

// Step 4
click image:"ADDTOCART"
waitfor 20, image:"Succesfully Added to Cart"

// Step 5
moveto image:"Basket"
click image:"GOTOCART"
waitfor 20, image:"Shoppingcart"
assert that imagefound(0,"DigitalStormVANQUISH3CustomPerformance")

Good Practice Example

// Step 1
click image:"chrome" // Icon
waitfor 20, image:"refresh" // Icon

set the searchRectangle to (0,70,1920,1039) // Optimises OCR
typetext "",returnkey
waitfor 20, image:"nopCommerce" // Logo

// Step 2
moveto text:"Computers"
wait 1 // Optimises OCR
click text:"Desktops"
wait 1.5 // Optimises OCR
waitfor 20, text:"Categories"

// Step 3
click text:"Digital Storm VANQUISH 3"
wait 1.5 // Optimises OCR
waitfor 20, text:"Digital Storm Vanquish 3 Desktop PC"

// Step 4
click text:"ADD TO CART"
wait 2 // Optimises OCR
waitfor 20,text:"The product has been added to your shopping cart"

// Step 5
moveto image:"Basket"
wait 1
click text:"GO TO CART"
wait 2 // Optimises OCR
waitfor 20, text:"Shopping Cart"
assert that imagefound(0,text:"Digital Storm VANQUISH 3 Custom Performance PC")
