Elements of the SenseTalk Pattern Language
The SenseTalk pattern language lets you define patterns that match strings in text. As explained in SenseTalk Pattern Language Basics, its pattern-matching capabilities are built on top of regular expressions (regex), but you write the patterns in an easy-to-read syntax.
You can define simple patterns, such as any three digits in a row. You can also define complex patterns that have optional or alternative portions and that can vary in length. Every pattern, however simple or complex, is built from a number of basic pattern elements.
Pattern Language Syntax
Pattern definitions in the SenseTalk pattern language consist of the pattern description enclosed in angle brackets (< ... >).
Syntax:
{pattern} < patternLanguageExpression >
Note that the word pattern is optional in the pattern language and is typically omitted.
A pattern definition (represented in the syntax above by patternLanguageExpression) can be a single element, such as 7 digits to find any occurrence of seven digits in a row. However, most patterns include a sequence of elements, or subpatterns. To specify the sequence, list the subpatterns one after another, separated by commas, by the word then, by placing each element on a new line, or by some combination of these options.
For example, the following are all equivalent ways to define a pattern for a Social Security number:
set ssn to <3 digits then "-", 2 digits then "-", 4 digits>
set ssn to <3 digits, "-", 2 digits, "-", 4 digits>
set ssn to <3 digits then "-"
2 digits, "-",
then 4 digits>
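Because the pattern language is built on top of regular expressions, it can help to see roughly what the Social Security pattern above might look like as a plain regex. The following Python sketch is an illustration only: the exact regex that SenseTalk generates is not stated here, the names SSN_REGEX and find_ssns are hypothetical, and it assumes that digits means the ASCII characters 0 through 9.

```python
import re

# Assumed regex counterpart of <3 digits, "-", 2 digits, "-", 4 digits>.
# Each element of the SenseTalk sequence becomes one regex fragment,
# and the fragments are written one after another in the same order.
SSN_REGEX = re.compile(r"[0-9]{3}-[0-9]{2}-[0-9]{4}")


def find_ssns(text):
    """Return every substring of text that has the 3-2-4 digit shape."""
    return SSN_REGEX.findall(text)


if __name__ == "__main__":
    print(find_ssns("Records: 123-45-6789 and 987-65-4321."))
```

The SenseTalk version names each piece (3 digits, then a hyphen, and so on), while the regex packs the same sequence into symbols, which is the readability difference the pattern language is meant to address.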
You can use the word or to specify alternative choices in a subpattern. For example, <"cat" or "cow"> matches either cat or cow.
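As a rough comparison, alternation with or corresponds to the | operator in a regular expression. The Python sketch below is an assumption-labeled illustration, not SenseTalk's actual implementation: ANIMAL_REGEX and find_animals are hypothetical names, and the sketch matches case-sensitively, which may differ from how SenseTalk treats case.

```python
import re

# Assumed regex counterpart of <"cat" or "cow">.
# The | operator tries the left alternative first, then the right one.
ANIMAL_REGEX = re.compile(r"cat|cow")


def find_animals(text):
    """Return every occurrence of either alternative, in order of appearance."""
    return ANIMAL_REGEX.findall(text)


if __name__ == "__main__":
    print(find_animals("a cat chased a cow past a dog"))
```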
You can use parentheses to group elements when necessary. For example,
<"cat" or "cow" then 2 digits>