Working with Chunks
On this page:
- Storing into Chunks
- Storing into Chunk Ranges
- Storing into Chunks with Patterns
- Storing into Nonexistent Chunks
- Storing into Multiple Chunks
- Deleting Chunks
- Counting Chunks
- Number Function
- Testing for Presence of a Chunk Value
- Is Among Operator
- Determining Chunk Position of a Value
- Counting Occurrences of a Chunk Value
- Iterating Over All Chunks in a Value
- Extracting a List of Chunks Using Each Expressions
Storing into Chunks
In addition to accessing a portion of a value, chunk expressions can also be used to store into a portion of a value, provided the thing being accessed is a container.
put "Jack Peterson" into name
put "d" into char 3 of last word of name
put "e" into char -2 of name
put "Olaf" into first word of name
put name -- "Olaf Pedersen"
You can also store something before or after a chunk:
put "The plant is growing" into phrase
put "egg" before word 2 of phrase
put " purple" after word 1 of phrase
put phrase -- "The purple eggplant is growing"
Storing into Chunk Ranges
When storing into chunk ranges, the entire range will be replaced:
put "The great grey green gooey goblin" into monster
put "ugly" into words 2 to 5 of monster
put monster -- "The ugly goblin"
Storing into Chunks with Patterns
You can use occurrence and match chunks with patterns to store into a value, just as with other chunk types. This includes storing into a range of matches in the source string.
set text to "[a]hello[b]bonjour[c]hola[d]"
set marker to <"[", character, "]">
put occurrences 2 to 3 of marker in text --> ([b],[c])
put "$$$" into occurrences 2 to 3 of marker in text
put text --> [a]hello$$$hola[d]
For information about using patterns, see SenseTalk Pattern Language Basics.
Storing into Nonexistent Chunks
put "mercury,venus,mars" into gods
put "saturn" into item 5 of gods
put gods -- "mercury,venus,mars,,saturn"
Here, the word saturn was put into the fifth text item of a value that previously had only 3 text items. To accommodate the request, two additional commas were automatically inserted before the word saturn so that it would become the new fifth item. The actual character inserted matches the current setting of the itemDelimiter property.
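As a sketch of the delimiter behavior just described, the following changes the itemDelimiter before storing past the end of a value. The variable name shades and the ";" delimiter are chosen only for illustration, and the output shown assumes the inserted delimiter follows the current itemDelimiter setting, as stated above.

```sensetalk
set the itemDelimiter to ";" -- illustrative choice; the default delimiter is a comma
put "red;green" into shades -- a value with only two text items
put "blue" into item 4 of shades -- item 3 does not exist yet, so it is created empty
put shades -- "red;green;;blue"
```

Because the itemDelimiter is a global property, a script that changes it this way would normally set it back afterward.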
When storing into list items beyond the end of a list, the results are similar:
put [dog, cat, mouse] into pets
put rabbit into item 7 of pets
put pets -- [dog,cat,mouse,,,,rabbit]
For lines, the behavior is very similar to that for text items. But because the lineDelimiter can be a list of several possible delimiters, any one of which could indicate a new line, it can't be used to provide the inserted delimiter. Instead, a separate global property called the lineFiller provides the delimiter string (by default, Return) that is inserted as many times as needed to fill the text out to the requested line number.
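A minimal sketch of the line behavior described above, assuming the lineFiller is still at its default value of Return. The variable name notes and its contents are hypothetical.

```sensetalk
put "alpha" & return & "beta" into notes -- a two-line value
put "epsilon" into line 5 of notes -- lines 3 and 4 are filled in using the lineFiller
put the number of lines in notes -- 5
put line 5 of notes -- "epsilon"
```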
For word chunks beyond the end of the text, a simple delimiter is not enough. Because a word delimiter can be any amount of whitespace, simply inserting more spaces won't add more words. So the wordFiller global property provides a placeholder "word" (by default, "?") to insert along with spaces to fill out the text to the desired number of words:
put "one two three" into someWords
put "seven" into word 7 of someWords
put someWords -- "one two three ? ? ? seven"
For character chunks, the characterFiller global property (by default, ".") provides text to be repeated as needed to fill the text out to the desired character position:
put "abcdefg" into alpha
put "z" into character 26 of alpha
put alpha -- "abcdefg..................z"
When a negative chunk number larger than the number of chunks is used, the result is similar to the above descriptions for all chunk types, but with fillers or delimiters added at the beginning of the value to achieve the expected result:
put "abc" into backfill
put "X" into character -7 of backfill
put backfill -- "X...abc"
Related Global Properties
As described above, SenseTalk includes global properties that provide filler text for cases where storing into a character, line, or word chunk extends a value beyond its current limits. These three properties, the characterFiller, the lineFiller, and the wordFiller, are described in detail on Local and Global Properties for Chunk Expressions.
Storing into Multiple Chunks
You can store into multiple chunks at once by supplying a list of chunk numbers:
put "The great grey green gooey goblin" into monster
put "G" into chars [5,11,16,22,28] of monster
put monster -- "The Great Grey Green Gooey Goblin"
You can store multiple values at once by supplying a list of values as well as of chunk numbers:
put ["Old","Ugly"] into words (5,2) of monster
put monster -- "The Ugly Grey Green Old Goblin"
Deleting Chunks
In addition to being stored into, chunks of containers can be deleted using the delete command (described in detail in Text and Data Manipulation):
Example:
put [dog, cat, gorilla, mouse] into pets
delete item 3 of pets -- [dog, cat, mouse]
Example:
put "My large, lumpy lout of a lap dog is lost." into ad
delete words 2 to 7 of ad -- "My dog is lost."
Counting Chunks
To find out how many of a given chunk type are present in some value, use the number function:
Example:
get the number of characters in "extraneously" -- 12
Example:
put number of words in "I knew an old woman" -- 5
Example:
if the number of items in list is less than 12 then ...
Number Function
Behavior: The number function counts the characters, words, lines, text items, list items, keys, values, or bytes in a value. Use it whenever you need to determine how many chunks of a particular type are present in a value. If the value is empty, the result is always zero. In addition to the usual text chunks and bytes, when expression is an object or property list, chunks can be "keys" or "values" to count the keys or values defined in the object.
Syntax:
Syntax definitions for language elements follow these formatting guidelines:
- boldface: Indicates words and characters that must be typed exactly
- italic: Indicates expressions or other variable elements
- {} (curly braces): Indicate optional elements.
- [] (square brackets) separated by | (vertical pipes): Indicate alternative options where one or the other can be used, but not both.
Example syntax:
open file fileName {for [reading | writing | readwrite | appending | updating]}
In this example, "open file" is required and must be typed exactly. "fileName" is a variable element; it is the path to and name of the file being opened. The following expression is optional and indicates why the file is being opened. If this expression is added, "for" is required and must be typed exactly. One of the following must be included, but only one, and they also must be typed exactly: "reading", "writing", "readwrite", "appending", or "updating".
Example:
put "I wept because I had no answers, until I met a man who had no questions." into quote
put the number of characters in quote -- 72
put the number of words in quote -- 16
put the number of items in quote -- 2
put the number of lines in quote -- 1
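The keys, values, and empty-value cases mentioned in the Behavior description can be sketched as follows. The property list person and its contents are hypothetical, and the curly-brace property list notation is assumed to be available in the SenseTalk version in use.

```sensetalk
put {name:"Olaf", city:"Oslo", age:42} into person -- hypothetical property list
put the number of keys in person -- 3
put the number of values in person -- 3
put the number of words in "" -- 0, because the value is empty
```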
Testing for Presence of a Chunk Value
You can find out whether a particular value is present as one of the chunks of another value using the is among or is not among operator.
Is Among Operator
Behavior: The is among operator tests whether a particular value is present among the characters, words, lines, text items, list items, keys, values, or bytes in a value. It returns true only if the target value is equal to one of the specified chunks. Contrast this with the is in and contains operators, which test only whether one text string is a substring of another (see the second example). In addition to the usual text chunks, when expression is an object or property list, chunks can be "keys" or "values" to test whether targetValue is one of the keys or values of the object.
Example:
put "be" is among the words of "To be or not to be" -- true
Example:
put "be" is among the words of "I believe I am a bee" -- false
Example:
put 7 is among the items of [5,5+1,5+2,5+3] -- true
Example:
put "M" is not among the characters of "Avogadro" -- true
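To make the contrast with the is in operator concrete, and to illustrate the keys and values forms mentioned in the Behavior description, here is a short sketch. The property list person is hypothetical, and the curly-brace property list notation is assumed to be available.

```sensetalk
put "be" is among the words of "I believe I am a bee" -- false: no word is equal to "be"
put "be" is in "I believe I am a bee" -- true: "be" is a substring of "believe"
put {name:"Olaf", city:"Oslo"} into person -- hypothetical property list
put "city" is among the keys of person -- true
put "Oslo" is among the values of person -- true
```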
Determining Chunk Position of a Value
You can find the ordinal position of characters, words, lines, text items, and list items within a value (searches are case-insensitive unless “considering case” or “with case” is specified). The number 0 will be returned if the target expression is not found:
Syntax:
{the} chunk number containing targetValue within sourceValue {considering case | ignoring case}
Example:
put "The rain, in Spain, is mainly in the plain" into text
put the character number of "t" within text -- 1
put character number of "t" within text considering case -- 34
put the text item number of " in Spain" within text -- 2
put the word number of "mainly" within text -- 6
put the line number of "another line" within text -- 0
To find the number of the word, line, or item that contains a value (rather than one that is equal to the value), use the word "containing" instead of "of":
put the word number of "main" within text -- 0
put the word number containing "main" within text -- 6
put the text item number containing "Spain" within text -- 2
Counting Occurrences of a Chunk Value
To count how many times a particular chunk value occurs within a source value, use the number of occurrences or number of instances function.
Example:
put the number of occurrences of "a" among the chars of "banana" -- 3
Example:
put the number of instances of "be" among the words of "to be or not to be" -- 2
Example:
put the number of occurrences of 15 among the items delimited by "-" of "315-15-4152" -- 1
Example:
put the number of occurrences of <digit> in "64W x 8H" // 3
If a specific chunk type is not named, characters are assumed unless the source value is a list or an object, in which case list items or property values are assumed, respectively:
put number of occurrences of "a" in "banana" -- 3
Example:
put the number of instances of 3 in (1,3,5,6,3,2) -- 2
Example:
put number of occurrences of "Do" in "Do,re,mi,do" -- 2
For case-sensitive comparisons, use “considering case” (or set the caseSensitive property to true).
Example:
put number of instances of "Do" in "Do,re,mi,do" considering case -- 1
As a special case, "among the characters of" can be used to count occurrences not only of a single character but also of a sequence of characters.
Example:
put number of instances of "na" among the chars of "banana" -- 2
Iterating Over All Chunks in a Value
To do something with each of the chunks within a value, use the repeat with each form of the repeat command (which is also described in Script Structure and Control Flow).
Example:
repeat with each line in file "/tmp/output"
  if the first word of it is "Error:" then put it
end repeat
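The same form works with other chunk types. The following sketch counts the words longer than five characters in a text value; the variable name longWordCount is hypothetical, and the loop relies on it holding the current chunk on each pass, as in the example above.

```sensetalk
put 0 into longWordCount
repeat with each word in "Wisdom begins in wonder"
  -- "it" holds the current word on each pass through the loop
  if the length of it is greater than 5 then add 1 to longWordCount
end repeat
put longWordCount -- 3
```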
Extracting a List of Chunks Using Each Expressions
Any expression of the form each chunkType of sourceValue will yield a list containing all of the chunks of that type (if chunkType is omitted, item will be assumed).
Example:
put each character of "Sweet!" -- ["S","w","e","e","t","!"]
Example:
put each word of "Wisdom begins in wonder" -- ["Wisdom","begins","in","wonder"]
More interestingly, an each expression can be part of a larger expression. Within the larger expression, operators apply to each item of the list rather than to the list as a whole.
Example:
put "Z" & each character of "Cat" -- ["ZC","Za","Zt"]
Example:
put 2 + each item of "1,2,5,6" -- [3,4,7,8]
Example:
put the length of each word in "Wisdom begins in wonder" -- [6,6,2,6]
put each word of "Wisdom begins in wonder" begins with "w" -- [true,false,false,true]
Parentheses limit the scope of an each expression: an operator or function outside the parentheses applies to the resulting list as a whole rather than to each individual item.
Example:
put sum of the length of each word in "Wisdom begins in wonder" -- [6,6,2,6]
put sum of (the length of each word in "Wisdom begins in wonder") -- 20
An each expression can also include a where clause to select a subset of the items in the list. The word each can be used within the where clause to refer to each source item.
Example:
put each word of "Wisdom begins in wonder" where each begins with "w" -- ["Wisdom","wonder"]
Example:
put each item of [1,2,3,4,5,6,7,8,9] where the square root of each is an integer -- [1,4,9]
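A where clause can be combined with the parenthesized form shown earlier so that a function applies to the filtered list as a whole. This is a sketch under that assumption; the variable name numberList is hypothetical.

```sensetalk
put [1,2,3,4,5,6,7,8,9] into numberList
-- the parentheses end the each expression, so sum applies to the whole filtered list [1,4,9]
put sum of (each item of numberList where the square root of each is an integer) -- 14
```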