OCR Language Support
Using the Language Property
Language dictionaries can be specified either in Text Preferences or in line with any OCR search in Eggplant Functional scripts by using the Language property. The OCR engine provides its own system dictionaries for the languages that have full built-in dictionary support. The property values listed in Supported OCR Language Properties are case sensitive.
The Language property can be used for both reading and searching for text using OCR. For more on the difference between reading and searching for text, see How to Use OCR.
Examples:
//Using the Language property to find text (searching)
Click (Text:"Aubergine", Language:"French")
//Using the Language property to restrict the readText() function
//"TLImage" and "BRImage" are captured images that define a search rectangle by indicating its top left and bottom right corners, respectively.
log ReadText(("TLImage","BRImage"), Language:"French")
Supported OCR Language Properties
Eggplant Functional comes with numerous languages out of the box. Additional languages are also available for purchase.
Custom OCR Dictionaries
In addition to selecting specific languages, you can use SenseTalk properties to customize the OCR engine dictionary. You can add specific words that you want text searches to recognize, and you can list words that you want to prohibit the OCR engine from recognizing. For complete information about creating a custom dictionary, see Customize the OCR Engine Dictionary.
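As a rough illustration of combining a language with a custom dictionary, the sketch below assumes that the properties for adding and prohibiting words are named ExtraWords and ProhibitedWords; confirm the exact names and accepted values in Customize the OCR Engine Dictionary before relying on them. The words "Eggplantium" and "Aubergine" are hypothetical and chosen only for illustration.

```sensetalk
// Sketch only. ExtraWords and ProhibitedWords are assumed property names;
// verify them in Customize the OCR Engine Dictionary.
// "TLImage" and "BRImage" are captured images marking the top left and
// bottom right corners of the search rectangle.
// "Eggplantium" (a word to add) and "Aubergine" (a word to prohibit) are hypothetical.
log ReadText(("TLImage","BRImage"), Language:"French", ExtraWords:"Eggplantium", ProhibitedWords:"Aubergine")
```

Setting these properties in line keeps the customization scoped to a single read, in the same way that the Language property can be set in line instead of in Text Preferences.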
Languages Supported
The table below lists all languages supported for use with OCR.
* Denotes Full Dictionary Support
Abkhaz | Faeroese | Kurdish | Rundi |
Adyghe | Fijian | Lak | Russian * |
Afrikaans | Finnish * | Lappish | RussianOldSpelling * |
Agul | French * | Latin * | RussianWithAccent * |
Albanian | Frisian | Latvian * | Samoan |
Altaic | Friulian | Lezgin | Selkup |
Arabic | GaelicScottish | Lithuanian * | SerbianCyrillic |
ArmenianEastern * | Gagauz | Luba | SerbianLatin |
ArmenianGrabar * | Galician | Macedonian | Shona |
ArmenianWestern * | Ganda | Malagasy | Sioux (Dakota) |
Awar | German * | Malay | Slovak * |
Aymara | GermanNewSpelling * | Malinke | Slovenian * |
AzeriCyrillic | GermanLuxembourg | Maltese | Somali |
AzeriLatin * | Greek * | Mansi | Sorbian |
Bashkir * | Guarani | Maori | Sotho |
Basque | Hani | Mari | Spanish * |
Belarusian | Hausa | Maya | Sunda |
Bemba | Hawaiian | Miao | Swahili |
Blackfoot | Hebrew | Minankabaw | Swazi |
Breton | Hungarian * | Mixed (Russian and English) * | Swedish * |
Bugotu | Icelandic | Mohawk | Tabassaran |
Bulgarian * | Ido | Moldavian | Tagalog |
Buryat | Indonesian * | Mongol | Tahitian |
Catalan * | Ingush | Mordvin | Tajik |
Chamorro | Interlingua | Nahuatl | Tatar * |
Chechen | Irish | Nenets | Tinpo (Jingpo) |
ChinesePRC | Italian * | Nivkh | Tongan |
ChineseTaiwan | Japanese * | Nogay | Tswana |
Chukcha | Japanese+English * | Norwegian (NorwegianNynorsk and NorwegianBokmal) * | Tun |
Chuvash | JapaneseModern | NorwegianBokmal * | Turkish * |
Corsican | Kabardian | NorwegianNynorsk * | Turkmen |
CrimeanTatar | Kalmyk | Nyanja | TurkmenLatin |
Croatian * | KarachayBalkar | Occidental | Tuvin |
Crow | Karakalpak | Ojibway | Udmurt |
Czech * | Kasub | Ossetic | UighurCyrillic |
Danish * | Kawa | Papiamento | UighurLatin |
Dargwa | Kazakh | PidginEnglish (Tok Pisin language) | Ukrainian * |
Dungan | Khakas | Polish * | UzbekCyrillic |
Dutch * | Khanty | PortugueseBrazilian * | UzbekLatin |
DutchBelgian | Kikuyu | PortugueseStandard * | Visayan (Cebuano) |
English * | Kirgiz | Provencal | Welsh |
EskimoCyrillic | Kongo | Quechua | Wolof |
EskimoLatin | Korean * | RhaetoRomanic | Xhosa |
Esperanto | KoreanHangul * | Romanian * | Yakut |
Estonian * | Koryak | RomanianMoldavia | Zapotec |
Even | Kpelle | Romany | Zulu |
Evenki | Kumyk | Ruanda |
Eggplant Functional scripts also recognize other keywords as predefined Language property values, as shown in Other Supported Keywords.
Other Supported Keywords
Basic | CMC7 | E13B | Pascal |
C++ | Cobol | Fortran | OCRA |
Chemistry | Digits | Java | OCRB |
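Because these keywords are recognized as predefined language properties, a keyword can presumably be passed as the Language value in the same way as a language name. The following is a minimal sketch under that assumption, reusing the "TLImage" and "BRImage" corner images from the earlier examples. The value "12345" is a hypothetical piece of on-screen text.

```sensetalk
// Sketch only: assumes a keyword is passed exactly like a language name.
// Reading text with the Digits keyword in place of a language name.
log ReadText(("TLImage","BRImage"), Language:"Digits")

// Searching for text with the same keyword ("12345" is a hypothetical value).
Click (Text:"12345", Language:"Digits")
```

As with the language names, type the keyword exactly as it is listed, because the property values are case sensitive.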