Pattern Matching Functions

SenseTalk's Pattern Language lets you define patterns in natural language and use them to search for and match text. Patterns can be used in place of fixed text strings with a number of SenseTalk operators, commands, and functions.

The matches operator tests whether some text matches a pattern.
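
As a minimal illustration of the matches operator, the line below tests a three-character string against the <3 digits> pattern used later on this page. The operand order (text first, pattern second) is an assumption based on the description above, not a confirmed syntax.

```sensetalk
-- Assumed operand order: text on the left, pattern on the right
put "847" matches <3 digits> -- should be true, because "847" consists of exactly three digits
```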

In addition, the operators and functions described below are designed specifically for use with patterns.

Match, Every Match Functions

Behavior: Use the match and every match functions to locate a pattern within text. The match function finds the first occurrence of the specified pattern, and every match finds every occurrence of the pattern in the source.

Parameters:

  • the match of pattern (required): Can be specified using the Pattern Language Syntax within angle brackets ( < ... > ) or as a variable whose value is a pattern.
  • in source (required): Can be specified as a quoted string, in a variable, or as an expression that yields text.
  • after position (optional): A character position within the text. The search for the pattern begins at the character immediately after this position.
  • before position (optional): A character position within the text. The search for the pattern ends at the character immediately before this position.
  • caseSensitivity (optional): Any of the standard case sensitivity phrases (caseSensitive, with case, etc.), which determine whether the search is case sensitive. Default: as set by the caseSensitive local property.
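
The optional position and case sensitivity phrases narrow how the source is searched. The lines below are a sketch that reuses the sample text from the examples further down. The variable name sampleText is illustrative, and the comments describe the search window implied by the parameter descriptions rather than verified output.

```sensetalk
set sampleText to "1bc3 8472QX905"

-- Search only the characters after position 6, that is "472QX905"
put the match of <3 digits> in sampleText after position 6

-- Search only the characters before position 6, that is "1bc3 "
put the match of <digit> in sampleText before position 6

-- Request a case-sensitive search regardless of the caseSensitive local property
put the match of <"QX"> in sampleText caseSensitive
```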

Syntax:
{the} match of pattern [in | within] source { [before | after] [ {position | location} position | {the} end ] } {caseSensitivity}
every match of pattern [in | within] source { [before | after] [ {position | location} position | {the} end ] } {caseSensitivity}

match( pattern, source {, position {, caseSensitive {, treatPositionAsBefore }}} )
everyMatch( pattern, source {, position {, caseSensitive {, treatPositionAsBefore }}} )

note

When using the match() or everyMatch() traditional function call syntax, the first two parameters are required and the remaining three are optional. The caseSensitive parameter in this case is a boolean (default: False) indicating whether the search is case sensitive. The treatPositionAsBefore parameter is a boolean (default: False) that specifies whether the search should occur before the location given by the position parameter rather than after.
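
The function call form takes the same options positionally. This sketch passes them in the order given in the syntax above. The variable name sampleText is illustrative, and no output is claimed for these calls.

```sensetalk
set sampleText to "1bc3 8472QX905"

-- position 6, other parameters left at their defaults: search after position 6
put match(<3 digits>, sampleText, 6)

-- caseSensitive set to true: "QX" must appear with exactly this capitalization
put match(<"QX">, sampleText, 1, true)

-- treatPositionAsBefore set to true: search the text before position 6 instead of after it
put everyMatch(<digit>, sampleText, 6, false, true)
```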

Returns: One or more property lists describing the locations in the source text where the pattern was found. The match function returns one property list, and every match returns a list containing one property list for each match found. Each match property list contains at least two properties:

  • text: The full text that was matched.
  • text_range: The range of characters in the source where the matched text was located.
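
Because each result is a property list, its properties can be read individually. The sketch below assumes SenseTalk's usual dot notation and "the ... of" forms for property access. The variable names found and hits are illustrative, and the values in the comments are taken from the examples later in this section.

```sensetalk
set found to the match of <3 digits> in "1bc3 8472QX905"
put found.text -- the matched text, 847 in this case
put found.text_range -- where it was found, 6 to 8 in this case

-- every match returns a list, so individual results can be picked out by item number
set hits to every match of <3 digits> in "123456789"
put the number of items in hits -- 3, one property list per match
put the text_range of item 2 of hits -- 4 to 6
```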

When the pattern contains one or more capture groups, the match property list also includes a pair of properties for each capture group, where name stands for the name given to that group:

  • name: The property named after the capture group, which holds the text the group matched.
  • name_range: The range where the capture group was found.

note

Because the full matched text is always returned with the property name text, you should not use the name text for any capture groups within a pattern.

Example:

put the match of <punctuation> within "Green 1: 112-14" --> {text:":", text_range:8 to 8}

Example:

put the match of <3 digits> in "1bc3 8472QX905" --> {text:"847", text_range:6 to 8}

Example:

put match(<digit, space, max digits>, "1bc3 8472QX905") --> {text:"3 8472", text_range:4 to 9}

Example:

put every match of <3 digits> in "123456789" --> [{text:"123", text_range:1 to 3},{text:"456", text_range:4 to 6},{text:"789", text_range:7 to 9}]

Example:

put everyMatch (<3 digits>, "987654321") --> [{text:"987", text_range:1 to 3},{text:"654", text_range:4 to 6},{text:"321", text_range:7 to 9}]

Occurrence, Every Occurrence Functions

Behavior: The occurrence and every occurrence functions return the matched text for a defined pattern. The occurrence function returns the first match found, and every occurrence returns a list of every match found in the source.

Parameters:

  • the occurrence of pattern (required): Can be specified in the pattern language syntax within angle brackets ( < ... > ) or as a variable. For information about the pattern language syntax, see Pattern Language Syntax.
  • in source (required): Can be specified as a quoted string, in a variable, or as an expression that yields text.
  • after position (optional): A character position within the text. The search for the pattern begins at the character immediately after this position.
  • before position (optional): A character position within the text. The search for the pattern ends at the character immediately before this position.
  • caseSensitivity (optional): Any of the standard case sensitivity phrases (caseSensitive, with case, etc.), which determine whether the search is case sensitive. Default: as set by the caseSensitive local property.

Syntax:
{the} occurrence of pattern [in | within] source { [before | after] [ {position | location} position | {the} end ] } {caseSensitivity}
every occurrence of pattern [in | within] source { [before | after] [ {position | location} position | {the} end ] } {caseSensitivity}

occurrence( pattern, source {, position {, caseSensitive {, treatPositionAsBefore }}} )
everyOccurrence( pattern, source {, position {, caseSensitive {, treatPositionAsBefore }}} )

note

The words occurrence and instance are interchangeable in SenseTalk scripts: wherever the word occurrence is used in a script, you can use the word instance instead.

Tech Talk

When using the occurrence() or everyOccurrence() traditional function call syntax, the first two parameters are required and the remaining three are optional. The caseSensitive parameter in this case is a boolean (default: False) indicating whether the search is case sensitive. The treatPositionAsBefore parameter is a boolean (default: False) that specifies whether the search should occur before the location given by the position parameter rather than after.

Returns: Text that matches the pattern, if any; otherwise empty.
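
Since occurrence returns empty when nothing matches, a script can test the result before using it. A minimal sketch, assuming the standard is empty test; the variable name found and the sample text are illustrative.

```sensetalk
set found to the occurrence of <3 digits> in "no numbers here"

-- The sample text contains no digits, so found is empty and the first branch runs
if found is empty then
    put "No three-digit run found"
else
    put found
end if
```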

Example:

put occurrence of <3 digits> in "1bc3 8472QX905" --> "847"

Example:

put the instance of <"$", digits> in "$895" --> "$8"

Example:

put occurrence (<"$", digits>, "$895") --> "$8"

Example:

put every instance of <3 digits> in "123456789" --> ["123","456","789"]

Example:

put occurrence of <max digits> in "Issue G429 was resolved on 15-Jun-2018" --> 429
put the range of <max digits> in "Issue G429 was resolved on 15-Jun-2018" --> 8 to 10
put every instance of <max digits> in "Issue G429 was resolved on 15-Jun-2018" after position 10 --> [15,2018]

Example:

set KingQuotes to {{
Darkness cannot drive out darkness; only light can do that. Hate cannot drive out hate; only love can do that.
The ultimate measure of a man is not where he stands in moments of comfort and convenience, but where he stands at times of challenge and controversy.
Faith is taking the first step even when you don't see the whole staircase.
Our lives begin to end the day we become silent about things that matter.
Injustice anywhere is a threat to justice everywhere.
I look to a day when people will not be judged by the color of their skin, but by the content of their character.
I have decided to stick with love. Hate is too great a burden to bear.
The time is always right to do what is right.
Life's most persistent and urgent question is, "What are you doing for others?"
We must learn to live together as brothers or perish together as fools.
}}

set LWords to <word beginning with "L", chars, word break>
put every occurrence of LWords in KingQuotes
--> [light,love,lives,look,love,Life,learn,live]

set JWords to <"J" at start of a word, chars, end of word>
put every match of JWords in KingQuotes
--> [{text:"justice", text_range:447 to 453},{text:"judged", text_range:507 to 512}]

Related: