Working with Optical Character Recognition
When you want to find text on your system under test (SUT) and capturing an image of the text is not practical, you can rely on the optical character recognition (OCR) capabilities in Eggplant Functional and SenseTalk. OCR is most useful when searching for dynamic text, although you might find many other practical uses.
We highly recommend that you take time to understand the OCR concepts discussed on this page, as well as the Text Properties used with OCR, so that you can make appropriate adjustments to your OCR searches and achieve better search results in your test environment.
Succeeding with OCR: Using the Most Common OCR Properties
SenseTalk includes a number of text properties that allow you to tailor your OCR searches to your situation and environment. Using a tailored OCR search improves the reliability of the search and helps you get the best results. For a full list of OCR properties, as well as information on which properties can be used for reading text vs. searching for text, see Text Properties.
When using properties to tailor your OCR search, it is important to carefully consider which properties to use, and how many. If a search doesn't work, it can be tempting to keep adding properties, but that is not always the best approach. Sometimes removing a property is necessary. Try using properties individually first, then add more as needed. Use the OCR Tuner to experiment with properties and see what works. For more about troubleshooting OCR searches, see Troubleshooting OCR.
Each of the following methods might be used to improve OCR recognition or speed up searches. For more information, see Improving the Speed of OCR Searches.
Search Rectangles
It is almost always helpful to add a search rectangle, which limits what part of the screen OCR searches. When OCR searches the entire screen, it is not only slower, but it can also be less accurate because it is more likely to come up with extra possible matches or no matches at all.
Search rectangles are typically defined using images, though you can also pass coordinates to the searchRectangle property. The hot spot of each captured image defines the point used (see Using the Hot Spot for more information on moving this point).
Using images is ideal because it allows the location of the rectangle to be dynamic as elements display in different places on the SUT screen. For instance, the text might appear in a window that does not always display in the same location on the SUT screen. There might be an icon on that window that you can capture an image of and use as an anchor for your image-defined search rectangle.
If it is not possible to use images to set the search rectangle, you can use screen coordinates. The Cursor Location toolbar icon in the Viewer window is helpful in this endeavor because it shows the current location of the mouse on the SUT. For instructions on customizing your Viewer window toolbar, see Customize the Toolbar.
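With coordinates in hand, a coordinate-based search rectangle can be sketched like this (the coordinate values here are hypothetical; substitute the ones you read from the Cursor Location display):

```
-- Limit OCR searches to a fixed region of the SUT screen
-- (100, 200) and (400, 300) are hypothetical top left and bottom right corners
set the searchRectangle to ((100, 200), (400, 300))
log ReadText(the searchRectangle)
set the searchRectangle to () -- reset so later searches use the full screen
```

Resetting the searchRectangle afterward keeps the narrowed region from silently affecting later searches in the same script.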
Example: Reading Dynamic Text from a Website
You might have a test that navigates to the Google Finance page, searches for a specific company, and then reads the stock price. To make sure that OCR reads the value of the price reliably, and nothing else, you can define a search rectangle using images. The code used could be as simple as this:
log ReadText("TLImage", "BRImage")
You need to capture two images for the above code to work. In this example, you use the images TLImage and BRImage to define the top left and bottom right corners of the search rectangle. To capture these images, choose one or more elements of the screen that are stable in relation to the text OCR reads.
This example uses a single element of the screen, the "Company" label.
TLImage is an image of the "Company" label with the hot spot moved to the upper left corner of the area where the stock price is displayed. Without even moving the capture area, capture BRImage with the hot spot moved to the bottom right corner of that same area.
After capturing the images and writing the SenseTalk code, the above example is able to successfully read the stock price for any company in Google Finance.
"TLImage" and "BRImage" are used to set the search rectangle for a successful ReadText() search using OCR .
Contrast
This property causes Eggplant Functional to see in black and white only. Whatever color is being used as the ContrastColor (the background color) turns white, and everything else turns black. For black text on a white background, the code looks like this:
Click (Text: "hello", Contrast:On, ContrastColor: White, ValidCharacters: "hello", Searchrectangle: ("UpperLeftImage", "LowerRightImage"))
In this situation, every pixel close to white (within the standard tolerance range of 45) is turned white, and everything else is read as black. This can help OCR to more clearly read the text.
What it actually sees is this image, free of any anti-aliasing:
What OCR Sees with the Contrast On
To see the text image sent to the OCR engine with your Contrast settings, use the OCR Tuner panel. This panel has a live display showing you what will be sent to the OCR engine for recognition with any combination of contrast-related settings (excluding EnhanceLocalContrast).
When looking for text on a gray background (or another color of medium value), things can get a little more complex. It is good to set the ContrastTolerance a little lower (down to 20 or so), which narrows the range of pixels that OCR might turn white. Notice in the image above that a pixel between the “h” and the “e” in “hello” turned black, joining the two letters together. In this instance, OCR was still able to read the letters, but in other circumstances this could make the letters even more difficult for OCR to process. This is why it is always a good idea to try things out and see how they work before running a full script.
Click (Text:"hello", Contrast:On, ContrastColor: White, ContrastTolerance: 20)
If the ContrastColor is not set for a Contrast search, it defaults to the color of the pixel in the top left corner of the search area.
Determining the Background Color
If the background/contrast color is not known, there are two ways to find the RGB values of the background color in a specific location on any platform.
Method 1: Use the Color Picker (Mac)
To use the color picker, follow the steps below:
- Click the Find Text icon in the toolbar of the remote screen window. This opens the Find Text panel.
- Click the Text or Background color box. The Colors system panel displays.
- Click the eye-dropper icon for the color picker, then hover over the text or background and select the color you want to use for your contrast color setting. Be careful not to select pixels in the anti-aliasing.
Finding the Contrast Color on Mac, Using the Color Picker
Method 2: Use the ColorAtLocation() Function
To use the ColorAtLocation() function, follow the steps below:
- Open the Viewer window in Live Mode. If the Cursor Location toolbar icon is not already on the toolbar, see Customize the Toolbar.
- Move the mouse over the background whose RGB color value you want to find.
- Using the coordinates of the mouse on the SUT, shown in the field of the Cursor Location toolbar icon, run this line of code in the ad hoc do box (AHDB) at the bottom of the Run window, or from a script:
put colorAtLocation(x,y) // where (x,y) refers to the coordinates found in the remote screen window
Running this line of code returns the RGB value for the color at the location specified.
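The returned value can then be fed straight back into a contrast search, so the script adapts to whatever background color is present. A minimal sketch, assuming (x, y) are coordinates on the background near the text:

```
-- (x, y): hypothetical coordinates on the background near the target text
put ColorAtLocation(x, y) into bgColor
Click (Text: "hello", Contrast: On, ContrastColor: bgColor)
```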