Text-Reading Functions
The following information describes the SenseTalk commands and functions you can use in Eggplant Functional that work with text recognition through optical character recognition (OCR) searches or by using the ReadCharacters() Function for character searches.
ReadText Function
Behavior: Returns the text in a given screen rectangle or near a designated point.
Parameters: A required pair of images, single image, rectangle (coordinate pair), or point indicating the region of the screen to read. (Using a pair of images usually produces the best results.)
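Example:
The region forms listed above might be passed as shown in this minimal sketch. The image names "UpperLeft", "LowerRight", and "StatusLabel" and all coordinates are hypothetical placeholders, not values from a real suite.
log readText("UpperLeft","LowerRight") // Pair of images: reads the region of the screen defined by the two images
log readText("StatusLabel") // Single image: uses one found image to indicate the region to read
log readText([[100,200],[400,240]]) // Rectangle: a coordinate pair giving the top left and bottom right corners
log readText(250, 220) // Point: reads the text near the given x and y screen coordinates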
The ReadText function also takes an optional text property list, which can improve text search results. SenseTalk includes many OCR properties that you can use with the ReadText function and other OCR searches. For a full list of OCR properties, see Text Properties. The following properties are typically the most useful in helping the OCR engine recognize text:
- Contrast: Whether or not the SUT display is converted to a high-contrast two-color image before it is sent to OCR for analysis. If contrast is on, a color referred to as the "contrast color" (which can be set using the ContrastColor property) is considered the primary color of the SUT display, and all other colors are treated as the secondary color. Text can be found in either color. The Contrast property is available for use with both searching for (finding) text and reading text.
  - ContrastColor: If Contrast is on, the contrast color is considered the primary color of the SUT display, and all other colors are treated as the secondary color. For instructions on finding the background color, see Determining the Background Color.
  - ContrastTolerance: When Contrast is on, contrastTolerance sets the maximum per-channel color difference that is allowed for a pixel to be seen as the contrast color.
- ValidCharacters: The ValidCharacters property limits the characters that may be found by the OCR text engine. ValidCharacters can be limited to the characters in the string you are searching for by setting the string to "*". This can be useful if you are trying to "force" a text match from characters that are not being recognized. If OCR determines that characters are present in the defined area but they do not match characters provided in the ValidCharacters string, it will return "^". The ValidCharacters property is overridden by any ValidWords property that is set.
- ValidWords: Limiting the words that OCR can consider a match allows you to steer the OCR engine toward a successful match, or force the engine to recognize your text string correctly. You can use the asterisk (*) as a wildcard so that the OCR engine looks only for the words included in your original text string. This property determines the words provided to OCR as a language dictionary that may be found by the OCR text engine. This means that it overrides the Language property; for more information, see Customize the OCR Engine Dictionary. The ValidWords property also overrides any ValidCharacters setting, so alternate orderings of letters in words provided will also be matched. For example, if ValidWords is set to "top", "pot" will also be found.
- IgnoreSpaces: The ignoreSpaces property causes OCR text searches to disregard spaces in your text string. For example, the string "My Computer" would match "MyComputer" or "M y C o m p u t e r". The ignoreSpaces property is on by default. This is because the OCR sometimes reads spaces that are not intended, especially in strings that are not discrete words, and in text with unusual letter-spacing.
- IgnoreUnderscores: The ignoreUnderscores property causes OCR text searches to treat underscores as spaces during searches. For example, the string "My_Computer" would match "My_Computer" or "My Computer". The ignoreUnderscores property is on by default, because the OCR sometimes fails to recognize underscores.
- EnhanceLocalContrast: Enable this property if you want OCR to automatically increase the local contrast of the text image being sent to the OCR engine. This property may aid recognition when some or all of the text being read has relatively low contrast, such as blue text on a dark background. When contrast is turned on, this property has no effect, so it is only useful when Contrast is turned off.
- Language: The natural language of the text you are searching for. (For a list of supported languages, see OCR Language Support.) OCR uses this as a guide, giving preference to words specified in the dictionary it is using. More than one language can be specified. Eggplant Functional comes with numerous languages by default, and additional languages are available for purchase. If no language is specified, OCR will still read text; it just won't have a dictionary to compare its findings to. You can also create a Custom OCR Dictionary.
- ValidPattern: This property takes a regular expression value and returns only characters or words that match the pattern specified. For information on regular expression characters that can be used with SenseTalk, see Using Patterns in SenseTalk. If you want OCR to prefer a pattern but not require it, see PreferredPattern.
See Using OCR Properties in Searches for more information about using these common text properties.
For a full list of all properties available for use with ReadText() see Text Properties. You might need to experiment with these properties in your specific environment to see which ones provide the best results for your text searches.
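Example:
The following sketch shows one way several of the properties described above might be passed to readText. The image names "FieldTopLeft" and "FieldBottomRight" are hypothetical, and the property values are illustrative only; which settings help depends on your environment.
put readText("FieldTopLeft","FieldBottomRight", validCharacters:"0123456789-") into OrderNumber // Limits the characters OCR may return to digits and the dash
log OrderNumber // Logs the text read with the restricted character set
log readText("FieldTopLeft","FieldBottomRight", language:"English", enhanceLocalContrast:on) // Reads the same region using the English dictionary as a guide; enhanceLocalContrast only has an effect while contrast is off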
Example:
put the last character of readtext("UpperLeft","LowerRight",contrast:on, contrastColor:[0,0,128],contrastTolerance:25) // Prints only the last character returned by the readText function
Example:
Log trimAll(ReadText(["UpperLeft","LowerRight"], ValidPattern: "[A-Za-z]+\.py")) // Enforces that readText use a regular expression to read from the screen, and trims all white space, such as tabs and carriage returns, from the output.
Example:
Use code similar to this to read to the left of a label:
put ImageLocation("ClassSchedule") into ILC // Stores the location of the hot spot of "ClassSchedule" in a variable. The hot spot is located at the center of the image.
put [(ILC+[20,60]), (imageRectangle("ReadingAssignmentHeader").BottomLeft + [-10,34])] into TableRectangle // Uses the locations of ClassSchedule and ReadingAssignmentHeader to establish the location of the first table cell you want to read
repeat 7 times // Indicates how many rows will be read
log ReadText(TableRectangle) // Logs the content of the cell
add [[0,30],[0,30]] to TableRectangle // Shifts the read area 30 pixels down to cover the next cell
end repeat
Practice:
Use the class schedule under the readTable() Function to practice the next three examples.
Example:
log readText(imageLocation("ReadingAssignmentHeader").x, imageLocation(text:"Introduction").y, singleColumnMode:true) // Reads around a single point based on the x coordinate of the column and the y coordinate of the row
Example:
// Use code similar to this to read every row in a particular table column
put ImageLocation("ClassSchedule") into ILC // Stores the location of the hot spot of "ClassSchedule" in a variable. The hot spot is located at the center of the image.
put ((ILC+(20,60)), (imageRectangle("ReadingAssignmentHeader").BottomLeft+(-10,34))) into TableRectangle // Uses the locations of ClassSchedule and ReadingAssignmentHeader to establish the location of the first table cell you want to read
repeat 7 times // Indicates how many rows will be read
log ReadText(TableRectangle) // Logs the content of the cell
add ((0,30),(0,30)) to TableRectangle // Shifts the read area 30 pixels down to cover the next cell
end repeat
Example:
Use code similar to this to read columns for specific rows:
set Characters to 0..9 &&& dash &&& "Chapters" // Puts all numbers, -, and the letters in "Chapters" into a variable as a list
put the readTextSettings into RTS // Stores the current readTextSettings in a variable
set the readTextSettings to {validCharacters:Characters} // Sets the readTextSettings based on the list of validCharacters stored in Characters. Using the readTextSettings is appropriate when executing multiple readText or readTable functions that require the same set of readTextSettings
put [2,4,7] into Weeks // Stores a list of the row identifiers of interest into a variable
put imageRectangle("WeekHeader") into WeekHeader // Stores the rectangle for the found image "WeekHeader"
put imageRectangle("ReadingAssignmentHeader") into ReadingHeader // Stores the rectangle for the found image "ReadingAssignmentHeader"
repeat with each Week of Weeks // Repeats once for each row identifier in Weeks
put [WeekHeader.BottomLeft, [WeekHeader.Right, remoteScreenSize().y]] into WeekColumn // Creates a rectangle based on location of the Weeks column in the table
put imageRectangle(text:Week, searchRectangle:WeekColumn) into RowNum // Stores the rectangle for the row identifier into a variable
put readText(ReadingHeader.Left, RowNum.Top-4, ReadingHeader.Right, RowNum.Bottom+4) into Reading // Reads a rectangle based on the row position and the column position
If Reading is not empty then // Checks whether the return of the readText function is empty
Log Reading
else
Log "No chapter."
end if
end repeat
set the ReadTextSettings to RTS // Sets the ReadTextSettings back to the original property list