# Using Patterns in SenseTalk

The SenseTalk pattern language lets you search for patterns within a source text with SenseTalk's natural language syntax. For a general description of the pattern language, see SenseTalk Pattern Language Basics. For complete information about how to define patterns, see Elements of the SenseTalk Pattern Language.

The sections below explain how to use patterns in SenseTalk scripts.

## Operators, Commands, and Functions for Pattern Matching

Patterns are supported by all SenseTalk operators that look for the presence or location of some specified text within a larger string, as well as commands and functions that work with text:

`set ssn to <3 digits then dash then 2 digits then dash then 4 digits>`

`put every offset of ssn in textblock into ssnList // Finds the offset (first character position) of all occurrences of ssn pattern match in textblock and places those positions into the ssnList list`

You can use any of the following operators, commands, and functions with the pattern language:

- `contains` operator
- `is in` operator
- `begins with` operator
- `ends with` operator
- `offset`, `range`, `every offset`, `every range` functions
- `replace` command
- `delete` command
- `split` command, `split by` function
- `number of occurrences of` function
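As a minimal sketch of how a few of these might be combined with a stored pattern, the following reuses the `ssn` pattern from above. The contents of `textblock` are hypothetical sample data, and results are not shown because they depend on the text being searched.

```sensetalk
set ssn to <3 digits then dash then 2 digits then dash then 4 digits>
set textblock to "Employee SSN: 123-45-6789 (on file)" -- hypothetical sample text

put textblock contains ssn -- tests whether the pattern occurs anywhere in textblock
put the offset of ssn in textblock -- character position where the first match starts
put the number of occurrences of ssn in textblock -- counts the matches found
```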

In addition, some operators and functions are designed specifically for use with patterns.

### `Matches` Operator

Use the `matches` operator to test whether a variable or expression is an exact match for a pattern. One of the operands must be a pattern, and the other is treated as a string value. This operator returns `true` if the pattern fully matches the entire string. If the pattern matches only part of the string or doesn't match at all, the result of the `matches` operator is `false`.

**Example:**

`put 83 matches <digit,digit> --> True`

`put <"x", 3 chars, "y"> matches "xyzzy" --> True`

Note that the `matches` operator returns `true` if the pattern can potentially match the full value, even if the usual match of the pattern might not return the full string (due to lazy quantifiers being used):

`put <"$", digits> matches "$895" --> True`

`put the occurrence of <"$", digits> in "$895" --> "$8"`

For information about lazy and greedy matches as well as quantifiers, see **Elements of the SenseTalk Pattern Language**.

### `Match` and `Every Match` Functions

Use the `match` and `every match` functions to locate a pattern within text. The `match` function finds the first occurrence of the specified pattern, and `every match` finds every occurrence of the pattern in the source.

When the `match` function finds a match for a pattern, it returns a match property list containing a number of properties, depending on the pattern. For a basic pattern, the property list includes `text`, which is the matched text, and `text_range`, which is the range where that text was found.

The `every match` function returns a list of property lists, where each property list includes the same information as the `match` function.

**Example:**

`put the match of <3 digits> in "123456789" --> {text:"123", text_range:"1" to "3"}`

`put every match of <3 digits> in "123456789" --> [{text:"123", text_range:"1" to "3"},{text:"456", text_range:"4" to "6"},{text:"789", text_range:"7" to "9"}]`
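Because `match` returns a property list, the individual properties can be read after storing the result in a variable. The sketch below assumes ordinary SenseTalk property access with dot notation; the variable name `found` is hypothetical.

```sensetalk
set found to the match of <3 digits> in "123456789" -- found is a hypothetical variable name

put found.text -- the matched text
put found.text_range -- the range where that text was found
```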

For detailed information about these functions, see **Pattern Matching Functions**.

### `Occurrence` and `Every Occurrence` Functions

Use the `occurrence` and `every occurrence` functions to return the matched text for a defined pattern. The `occurrence` function returns the first match found, and `every occurrence` returns a list of every match found in the source text.

`put the occurrence of <"(",chars,")"> in "sqrt(42)" --> "(42)"`

`put every occurrence of <3 digits> in "123456" --> [123,456]`

For detailed information about these functions, see **Pattern Matching Functions**.

## Using Variables and Expressions in a Pattern

When you define a pattern, you can use any of the elements described in Pattern Elements. However, sometimes a pattern must include elements that vary from one use to the next.

For example, consider a simple date pattern that matches dates in the form *15-Aug-2018*. The following pattern finds such dates:

`set datePattern to <1 or 2 digits, "-", 3 letters, "-", 4 digits>`

But suppose you don't want to find every date, but only those in a particular month, which will be determined elsewhere in the script. In this case, you can use a variable in place of that element in the pattern. The pattern is constructed using the variable's value at runtime:

`set month to "Oct"`

`set datePattern to <1 or 2 digits, "-", month, "-", 4 digits>`

The value of the variable `month` is substituted in the pattern so only dates in October are valid matches for the pattern.
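To illustrate, a pattern built this way might be applied to text containing dates from several months. The sample string below is hypothetical, and the result is not shown.

```sensetalk
set month to "Oct"
set datePattern to <1 or 2 digits, "-", month, "-", 4 digits>

-- Hypothetical sample text containing dates from two different months
set notes to "Due 3-Oct-2018, renewed 15-Aug-2018, closed 21-Oct-2018"
put every occurrence of datePattern in notes -- only dates whose month is "Oct" can match
```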

Here is a similar example that sets the `month` variable to the name of the current month by using the `monthName()` function:

`set month to the abbreviated monthName -- Use the current month for the pattern`

`set datePattern to <1 or 2 digits, "-", month, "-", 4 digits>`

You can also use an expression directly in a pattern, so the example above could be done like this:

`set datePattern to <1 or 2 digits, "-", the abbreviated monthName, "-", 4 digits>`

More complex expressions involving operators must be enclosed in parentheses. To find dates in the current month from the previous year, this pattern could be used:

`set datePattern to <1 or 2 digits, "-", the abbreviated monthName, "-", (the year - 1)>`

## Embedded Patterns

In addition to variables that insert a specific string into the pattern (e.g., "Oct"), you can use pattern definitions in variables. This ability lets you embed patterns within larger pattern definitions. This technique is useful for constructing complex patterns, which can be built up in manageable pieces.

For example, you might need a pattern to find US phone numbers. A phone number in the US consists of a 3-digit area code and a 7-digit number (broken up into groups of 3 and 4 digits). The area code is optional within the local area, and there are several common ways of separating the groups of digits.

For this example, we'll construct a pattern that can match phone numbers like any of these:

`555-1212`

`(800) 123-4567`

`(123)654-1111`

`888-222-0987`

Let's break it down into pieces, starting with the different ways the area code might appear:

`set areaCodeParen to <"(", 3 digits, ")", maybe a space>`

`set areaCodeDash to <3 digits, "-">`

We want to allow for either of these options, so we use `or` to construct our `areaCode` pattern:

`set areaCode to <areaCodeParen or areaCodeDash>`

Finally, we define the `localNumber` pattern and put it all together:

`set localNumber to <3 digits, "-", 4 digits>`

`set phoneNumber to <maybe areaCode then localNumber>`

This approach results in code that is more readable and understandable than if we were to define the entire pattern at once, which would have looked like this:

`set phoneNumber to <maybe (("(", 3 digits, ")", maybe a space) or (3 digits, "-")), 3 digits, "-", 4 digits>`
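Once the pieces are assembled, the `phoneNumber` pattern can be used like any other pattern. The following sketch gathers the definitions from this section and applies them with `every occurrence`; the `contactInfo` text is hypothetical sample data, and the result is not shown.

```sensetalk
-- Build the pattern in pieces, as described above
set areaCodeParen to <"(", 3 digits, ")", maybe a space>
set areaCodeDash to <3 digits, "-">
set areaCode to <areaCodeParen or areaCodeDash>
set localNumber to <3 digits, "-", 4 digits>
set phoneNumber to <maybe areaCode then localNumber>

-- Hypothetical sample text to search
set contactInfo to "Office: (800) 123-4567, cell: 888-222-0987"
put every occurrence of phoneNumber in contactInfo
```

Keeping each piece in its own variable also makes it possible to test `areaCode` or `localNumber` on their own when the combined pattern does not behave as expected.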