Working with Optical Character Recognition
When you want to find text on your system under test (SUT) and capturing an image of the text is not practical, you can rely on the optical character recognition (OCR) capabilities in Eggplant Functional and SenseTalk. OCR is most useful when searching for dynamic text, although you might find many other practical uses.
We highly recommend that you take time to understand the OCR concepts discussed on this page, as well as the Text Properties used with OCR, so that you can make appropriate adjustments to your OCR searches and achieve better search results in your test environment.
Succeeding with OCR: Using The Most Common OCR Properties
SenseTalk includes a number of text properties that allow you to tailor your OCR searches to your situation and environment. Using a tailored OCR search improves the reliability of the search and helps you get the best results. For a full list of OCR properties, as well as information on which properties can be used for reading text vs. searching for text, see Text Properties.
When using properties to tailor your OCR search, carefully consider which properties to use, and how many. If a search doesn't work, it can be tempting to keep adding properties, but that is not always the best approach. Sometimes removing a property is necessary. Try properties one at a time first, then add more as needed. Use the OCR Tuner to experiment with properties and see what works. For more about troubleshooting OCR searches, see Troubleshooting OCR.
Each of the following methods might be used to improve OCR recognition or speed up searches. For more information, see Improving the Speed of OCR Searches.
Search Rectangles
It is almost always helpful to add a search rectangle, which limits what part of the screen OCR searches. When OCR searches the entire screen, it is not only slower, but it can also be less accurate because it is more likely to come up with extra possible matches or no matches at all.
Search rectangles are typically defined using images, though coordinates can also be passed to this property. The hot spot of the captured image defines the point used (see Using the Hot Spot for more information on moving this point).
Using images is ideal because it allows the location of the rectangle to be dynamic as elements display in different places on the SUT screen. For instance, the text might appear in a window that does not always display in the same location on the SUT screen. If that window contains an icon, you can capture an image of the icon and use it as an anchor for your image-defined search rectangle.
If it is not possible to use images to set the search rectangle, you can use screen coordinates. The Cursor Location toolbar icon in the Viewer window is helpful in this endeavor because it shows the current location of the mouse on the SUT. For instructions on customizing your Viewer window toolbar, see Customize the Toolbar.
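As a conceptual illustration only, the following Python sketch (not SenseTalk, and not how Eggplant Functional is implemented) shows how two points, such as the hot spots of two found images or two pairs of screen coordinates, span a rectangle that limits where a search looks. The function names and the sample coordinates are hypothetical.

```python
def make_search_rectangle(point_a, point_b):
    """Return (left, top, right, bottom) for the rectangle spanned by two points."""
    (ax, ay), (bx, by) = point_a, point_b
    return (min(ax, bx), min(ay, by), max(ax, bx), max(ay, by))


def inside(rect, point):
    """True when the point falls within the rectangle (edges included)."""
    left, top, right, bottom = rect
    x, y = point
    return left <= x <= right and top <= y <= bottom


# Hypothetical hot-spot locations of a top-left and a bottom-right anchor image.
rect = make_search_rectangle((420, 310), (560, 340))

# Hypothetical text locations on the screen; only those inside the rectangle are searched.
candidates = {"price": (480, 325), "page title": (200, 40)}
searched = [name for name, location in candidates.items() if inside(rect, location)]
print(searched)
```

Because everything outside the rectangle is skipped, there is less screen to process and fewer chances for an unrelated match.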
Example: Reading Dynamic Text from a Website
You might have a test that navigates to the Google Finance page, searches for a specific company, and then reads the stock price. To make sure that OCR reads the value of the price reliably, and nothing else, you can define a search rectangle using images. The code used could be as simple as this:
Log ReadText ("TLImage","BRImage")
You need to capture two images for the above code to work. This example uses the images TLImage and BRImage to define the top left and bottom right corners of the search rectangle. To capture these images, choose one or more elements of the screen that are stable in relation to the text that OCR reads.
This example uses a single element of the screen, the "Company" label.
TLImage is an image of the "Company" label with the hot spot moved to the upper left corner of the area where the stock price is displayed.
Without moving the capture area, capture BRImage with the hot spot moved to the bottom right corner of the area where the stock price is displayed.
After you capture the images and write the SenseTalk code, the example above can read the stock price for any company in Google Finance.
"TLImage" and "BRImage" are used to set the search rectangle for a successful ReadText() search using OCR.
Contrast
This property causes Eggplant Functional to see in black and white only. Whatever color is being used as the ContrastColor (the background color) turns white, and everything else turns black. For black text on a white background, the code looks like this:
Click (Text: "hello", Contrast:On, ContrastColor: White, ValidCharacters: "hello", Searchrectangle: ("UpperLeftImage", "LowerRightImage"))
In this situation, every pixel close to white (within the standard tolerance range of 45) is turned white, and everything else is read as black. This can help OCR to more clearly read the text.
What it actually sees is this image, free of any anti-aliasing:
What OCR Sees with the Contrast On
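The black-and-white conversion described above can be modeled in a few lines. This is a Python sketch for illustration, not Eggplant Functional's implementation. It assumes that "close to" the contrast color means each RGB channel differs by no more than the tolerance, and the function name and sample pixels are hypothetical.

```python
WHITE = (255, 255, 255)
BLACK = (0, 0, 0)


def apply_contrast(pixels, contrast_color=WHITE, tolerance=45):
    """Turn pixels near contrast_color white and every other pixel black.

    Assumption: "near" means each RGB channel differs by at most `tolerance`.
    """
    def is_background(pixel):
        return all(abs(c - ref) <= tolerance for c, ref in zip(pixel, contrast_color))

    return [[WHITE if is_background(p) else BLACK for p in row] for row in pixels]


# One row of pixels: white background, light gray anti-aliasing, darker edge, black text.
row = [(255, 255, 255), (220, 220, 220), (120, 120, 120), (0, 0, 0)]
print(apply_contrast([row]))
```

In this model the light gray anti-aliasing pixel falls within the tolerance and becomes background, while the darker edge pixel becomes part of the text.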
To see the text image sent to the OCR engine with your Contrast settings, use the OCR Tuner panel. This panel has a live display showing you what will be sent to the OCR engine for recognition with any combination of contrast-related settings (excluding EnhanceLocalContrast).
When looking for text on a gray background (or another color of medium value), it can get a little more complex. It helps to set the ContrastTolerance a little lower (down to 20 or so), which narrows the range of pixels that OCR might try to turn white. Notice in the image above that a pixel between the “h” and the “e” in “hello” turned black, joining the two letters together. In this instance, OCR was still able to read the letters, but in other circumstances this could make the letters more difficult for OCR to process. This is why it is always a good idea to try out your settings and see how they work before running a full script.
Click (Text:"hello", Contrast:On, ContrastColor: White, ContrastTolerance: 20)
If the ContrastColor is not set for a Contrast search, then it defaults to the color of the pixel in the top left corner of the search area.
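To see why a lower tolerance matters on a medium-value background, consider this Python sketch (illustrative only, not Eggplant Functional's implementation). It counts how many pixels would be treated as background at two tolerance values. When no contrast color is given, it falls back to the top-left pixel, as described above. The per-channel comparison, the function name, and the sample pixel values are assumptions.

```python
def count_background_pixels(pixels, contrast_color=None, tolerance=45):
    """Count pixels that would be treated as background (turned white).

    If contrast_color is None, use the top-left pixel of the search area.
    Assumption: a pixel matches when each RGB channel is within `tolerance`.
    """
    if contrast_color is None:
        contrast_color = pixels[0][0]
    return sum(
        all(abs(c - ref) <= tolerance for c, ref in zip(pixel, contrast_color))
        for row in pixels
        for pixel in row
    )


# Hypothetical 2x3 search area: medium gray background, two blended edge pixels, dark text.
area = [
    [(128, 128, 128), (128, 128, 128), (100, 100, 100)],
    [(128, 128, 128), (90, 90, 90), (20, 20, 20)],
]
print(count_background_pixels(area, tolerance=45))  # default tolerance
print(count_background_pixels(area, tolerance=20))  # narrower tolerance
```

With the narrower tolerance, the two blended edge pixels no longer count as background, so fewer pixels are turned white.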
Determining the Background Color
If the background/contrast color is not known, there are two ways to find the RGB values of the background color in a specific location on any platform.
Method 1: Use the Color Picker (Mac)
To use the color picker, follow the steps below:
- Click the Find Text icon in the toolbar of the remote screen window. This opens the Find Text panel.
- Click the Text or Background color box. The Colors system panel will display.
- Click the eye-dropper icon for the color picker, and then hover over the text or background and select the color you want to use for your contrast color setting. Be careful not to select pixels in the anti-aliasing.
Finding the Contrast Color on Mac, Using the Color Picker
Method 2: Use the ColorAtLocation() function
To use the ColorAtLocation() function, follow the steps below:
- Open the Viewer window in Live Mode. If the Cursor Location toolbar icon is not already on the toolbar, see Customize the Toolbar.
- Now do the following to use the ColorAtLocation function:
  - Move the mouse over the background whose RGB color value you want.
  - Using the coordinates of the mouse on the SUT, shown in the field of the Cursor Location toolbar icon, run this line of code in the ad hoc do box (AHDB) found at the bottom of the run window, or from a script:
put colorAtLocation(x,y) // where (x,y) refers to the coordinates found in the remote screen window
Running this line of code returns the RGB value for the color at the location specified.
Enhance Local Contrast
Enable this property if you want OCR to automatically increase the local contrast of the text image being sent to the OCR engine. This property may aid recognition when some or all of the text being read has relatively low contrast, such as blue text on a dark background. When Contrast is turned on, this property has no effect, so it is only useful when Contrast is turned off.
Log ReadText(("TLImage","BRImage"), enhanceLocalContrast: On)
ValidCharacters and ValidWords
The ValidCharacters and ValidWords properties restrict what OCR returns. This includes what it considers a match, and what it returns when used to read text.
The value you pass to these properties is given preference by OCR. In essence, you are giving OCR a hint as to what it is looking for. For example, if you set ValidWords:"cat" and the word "oat" appears on the SUT screen, OCR might read it as "cat".
ValidCharacters
The ValidCharacters property tells OCR which glyphs to look for, and which to ignore. For example, it can be used to prevent OCR from misreading the letter “O” as the number “0” or vice versa. This can be done manually for optimal control of your script, or by using an asterisk (*) to automatically set ValidCharacters to the text being searched for.
ValidCharacters is most useful when reading text in a situation where you don't know the text ahead of time, but need to limit the character set. For instance, this can be used to assist in distinguishing letters from numbers when working with money.
Examples
//Setting the validCharacters manually:
Log ReadText(["TLImage","BRImage"], ValidCharacters:"$£€.,0123456789") -- reads a numeric value including currency symbols
//Using ValidCharacters with a variable:
Put "Charlie" into MyText
Click (Text: MyText, ValidCharacters: MyText)
//Setting the ValidCharacters to the text being searched for, using an asterisk:
Click (Text:"CoDe13v9065", ValidCharacters:"*", SearchRectangle:("UpperLeftImage","LowerRightImage"))
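The asterisk shorthand can be pictured with a short Python sketch. It is a simplified model, not the OCR engine's actual logic: it treats the valid characters as a plain filter on a candidate reading, which is an assumption, and both function names are hypothetical.

```python
def resolve_valid_characters(search_text, valid_characters):
    """Expand "*" to the characters of the search text; otherwise use the value as given."""
    if valid_characters == "*":
        return set(search_text)
    return set(valid_characters)


def uses_only_valid_characters(candidate, allowed):
    """True when every character of the candidate reading is in the allowed set."""
    return all(ch in allowed for ch in candidate)


allowed = resolve_valid_characters("CoDe13v9065", "*")
print(uses_only_valid_characters("CoDe13v9065", allowed))  # reading with the digit zero
print(uses_only_valid_characters("CoDe13v9O65", allowed))  # reading with a capital letter O
```

Because the capital letter "O" is not part of the search text, a reading that contains it falls outside the allowed set in this model, which is the kind of O versus 0 confusion the property is meant to prevent.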
ValidWords
The ValidWords property is similar to ValidCharacters, except that it enforces not only which characters are considered valid, but also the placement of those characters. This is why ValidWords is often used when searching for a specific phrase.
ValidWords overrides the set Language property because it essentially creates a new language library containing only the words provided to ValidWords.
Example
Put "Charlie Brown" into mytext
Click (text: mytext, searchRectangle:("RT1","RT2"), validwords:mytext)
Telling OCR What to Ignore
Working with OCR is all about providing OCR with the right properties for your text search. Most commonly these properties steer OCR toward what it should recognize, but in some cases, it is better to steer it away from improper results by telling it what not to recognize. The properties below specifically tell OCR what to ignore when working with text.
Ignore Spaces
As a property in a text search, IgnoreSpaces can be used to do exactly that: ignore any spaces within the search. For example, this approach could be helpful in a scenario where spacing between characters is not consistent. Sometimes OCR sees spaces where there are not any, or ignores spaces where they exist. Setting the IgnoreSpaces property to ON causes OCR to match “flowerpot” with a search for “flower pot” and vice versa. When this property is used, OCR strips spaces from both the string for which it is searching and the string that it finds to come up with a match.
Click (Text:"flower pot", validCharacters:"*", ignoreSpaces:ON, searchRectangle:("UpperLeftImage","LowerRightImage"))
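The comparison that IgnoreSpaces performs, as described above, amounts to removing spaces from both strings before matching. A minimal Python sketch of that idea follows. The function name is hypothetical, and this is an illustration, not Eggplant Functional's code.

```python
def matches_ignoring_spaces(search_text, found_text):
    """Compare two strings after removing spaces from both."""
    return search_text.replace(" ", "") == found_text.replace(" ", "")


print(matches_ignoring_spaces("flower pot", "flowerpot"))    # space missing on screen
print(matches_ignoring_spaces("flowerpot", "flo wer pot"))   # extra spaces seen by OCR
print(matches_ignoring_spaces("flower pot", "flower bed"))   # different text
```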
Ignore Underscores
The ignoreUnderscores property causes OCR text searches to treat underscores as spaces during searches. For example, the string "My_Computer" would match "My_Computer" or "My Computer". The ignoreUnderscores property is on by default because the OCR engine sometimes fails to recognize underscores.
Click (Text:"My_Computer", validCharacters:"*", ignoreUnderscores:ON, searchRectangle:("UpperLeftImage","LowerRightImage"))
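Treating underscores as spaces can be sketched the same way in Python. This is an illustration under the assumption that the substitution is applied to both the search string and the recognized string. The function name is hypothetical.

```python
def matches_ignoring_underscores(search_text, found_text):
    """Compare two strings with underscores treated as spaces on both sides."""
    def normalize(text):
        return text.replace("_", " ")

    return normalize(search_text) == normalize(found_text)


print(matches_ignoring_underscores("My_Computer", "My Computer"))  # underscore read as a space
print(matches_ignoring_underscores("My_Computer", "My_Computer"))  # underscore recognized
print(matches_ignoring_underscores("My_Computer", "MyComputer"))   # no separator at all
```

In this sketch the last case does not match, because the property replaces underscores with spaces instead of removing them. Combining it with IgnoreSpaces would cover that situation.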
Ignore Newlines
A newline is a type of return character that creates a line break. When enabled, ignoreNewlines causes OCR text searches to ignore line breaks, so a search will match a string even if it's broken over several lines. This property is only available for text searches (not available with ReadText).
Click (Text:"Constantine Papadopoulos", IgnoreNewlines:On) -- A long name like this might wrap to a second line in the interface of the application under test; with IgnoreNewlines enabled, OCR can still find it.
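The effect of ignoring line breaks can be modeled in Python as follows. The sketch assumes that a line break is treated like a single space when matching, which the text above does not spell out. The function name and the sample screen text are hypothetical.

```python
def found_ignoring_newlines(search_text, screen_text):
    """Search for text after collapsing whitespace runs, including line breaks, to one space."""
    def flatten(text):
        return " ".join(text.split())

    return flatten(search_text) in flatten(screen_text)


# Hypothetical recognized text in which a long name wrapped onto a second line.
screen_text = "Name: Constantine\nPapadopoulos\nRole: Admin"

print("Constantine Papadopoulos" in screen_text)                         # plain search
print(found_ignoring_newlines("Constantine Papadopoulos", screen_text))  # line breaks ignored
```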