Binary Data Manipulation with SenseTalk
Data Values
Raw data can be represented directly in a script using a pair of hexadecimal digits for each byte of the data, enclosed in angle brackets, < and >.
put <00> into nullByte -- a single byte, with a value of zero
put <48656c6c6f> into secretMessage -- five bytes of data
The put before and put after forms of the put command can be used to insert additional binary data before or after an existing value.
put <20467269 656e6421> after secretMessage -- append more data
When two known binary data values are compared for equality, they are compared byte for byte to see that they have exactly the same binary contents.
put secretMessage is <48656c6c6f20467269656e6421> -- true
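The put before form works the same way as put after, but adds the new bytes at the front. The following is a small sketch using only the constructs shown above; the variable name payload and the byte values are made up for illustration.

```sensetalk
put <776f726c64> into payload -- five bytes of data
put <68656c6c6f20> before payload -- prepend six more bytes
put payload is <68656c6c6f20776f726c64> -- expected: true (eleven bytes, new data first)
```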
Syntax:
< hexadecimalData >
Syntax definitions for language elements follow these formatting guidelines:
- boldface: Indicates words and characters that must be typed exactly
- italic: Indicates expressions or other variable elements
- {} (curly braces): Indicate optional elements.
- [] (square brackets) separated by | (vertical pipes): Indicate alternative options where one or the other can be used, but not both.
Example syntax:
open file fileName {for [reading | writing | readwrite | appending | updating]}
In this example, "open file" is required and must be typed exactly. "fileName" is a variable element; it is the path to and name of the file being opened. The expression that follows is optional and indicates why the file is being opened. If this expression is added, "for" is required and must be typed exactly. Exactly one of the following must then be included, also typed exactly: "reading", "writing", "readwrite", "appending", or "updating".
The hexadecimalData must consist of an even number of hexadecimal digits, 0 through 9 and A through F (the examples on this page use both uppercase and lowercase letters). Spaces may be used to break up the sequence for readability.
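As a brief illustration of these rules, the sketch below writes the same five bytes two ways. It assumes that spaces may fall between any two bytes and that letter case is not significant, which is consistent with the examples on this page but should be confirmed in your environment. The variable names are hypothetical.

```sensetalk
put <48 65 6C 6C 6F> into spacedUpper -- spaces between bytes, uppercase digits
put <48656c6c6f> into compactLower -- no spaces, lowercase digits
put spacedUpper is compactLower -- expected: true, since the bytes are identical
```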
AsData Function, As Data Operator
Behavior: The asData function, most often called using the as data operator, converts any value to its binary representation.
Use the asData function or as data operator when you want to tell SenseTalk to treat a value as binary data. This is especially useful for reading or writing a file or URL in its raw binary form (as described later in this chapter), but can also be used at any time to work with or display a value in its binary form.
When two known binary data values are compared for equality, they are compared byte for byte to see that they have exactly the same binary contents. Use as data to ensure that such a binary comparison is made.
Syntax:
asData(aValue)
aValue as data
The as data operator is usually more readable and natural to use than the asData function, but is otherwise identical in functionality.
Examples:
put "abcdefg" as data -- <61626364 656667>
put file "picture.jpg" as data into rawImageData -- read file contents as data
if file "monet.png" as data is equal to oldData as data then
	// Do something
end if
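To show that the function and operator forms are interchangeable, here is a short sketch that converts the same string both ways and compares the results. The variable names are illustrative only.

```sensetalk
put "abc" as data into viaOperator -- operator form
put asData("abc") into viaFunction -- function form
put viaFunction is viaOperator -- expected: true, both hold the same bytes
```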
Byte Chunks
The byte chunk type extends SenseTalk's chunk expressions to binary data, with all of the flexibility that chunk expressions offer for text. A byte chunk can be used to access a single byte or a range of bytes within a data value:
put <010203040506> into myData
put byte 2 of myData -- <02>
put bytes 3 to 4 of myData -- <0304>
put the last 3 bytes of myData -- <040506>
As with other chunk types, a byte chunk is a container, so it can be used to change the data:
put <010203040506> into myData -- <010203040506>
put <AABB> into bytes 2 to 5 of myData -- <01AABB06>
put <77> after byte 2 of myData -- <01AA77BB06>
delete the first 2 bytes of myData -- <77BB06>
Syntax:
byte byteNumber of dataSource
bytes firstByte to lastByte of dataSource
A byte chunk expression is always treated as data in its immediate context (so there is no need to specify as data with it). The dataSource doesn't need to be specified as data. A non-data value will be converted to data automatically before the requested bytes are extracted.
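The automatic conversion can be sketched with a plain text value as the source. This assumes the default string encoding represents each of these ASCII characters as a single byte (as UTF-8 does); the variable name greeting is hypothetical.

```sensetalk
put "Hello" into greeting -- an ordinary text value, not data
put byte 1 of greeting -- expected: <48>, the byte for "H" under the assumed encoding
put bytes 1 to 2 of greeting -- expected: <4865>
```

No as data is needed on either side here, because the byte chunk itself requests the data form of the value.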
Delete Byte Chunk
The delete command can also be used to delete a byte chunk from within a binary data value. This will reduce the overall size of the data by the number of bytes that are deleted.
Example:
delete byte 5 of myData
delete the last 2 bytes of myData
delete bytes 100 to 199 of myData
Binary Data Files
One of the most important uses of binary data in scripts is when reading and writing data in binary (non-text) files. There are several ways to work with binary data files, depending on your needs.
Simple Data File Access
The easiest way to access a text file is to treat the file directly as a container. The same approach works for binary data files: use the as data operator to indicate that the bytes of the file should be read directly:
put file "horse.tiff" as data into tiffData -- read the entire file at once
Writing data to a file can be done in the same way:
put rawBudgetData as data into file "budget.dat" -- write a data file
Remote URL File Access
Accessing a remote file through a URL works exactly the same as a local file. Simply specify URL instead of file, provide the URL instead of the file path, and use as data:
put URL "http://some.company.com/horse.jpg" as data into jpgData
put file "budget.dat" as data into URL remoteBudgetFileURL
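Putting the read and write forms together, a round trip through a file might look like the sketch below. The path /tmp/sample.dat is a hypothetical location chosen for illustration, and the script assumes it is writable.

```sensetalk
-- Hypothetical path used only for this illustration
put <0102030405> as data into file "/tmp/sample.dat" -- write five raw bytes
put file "/tmp/sample.dat" as data into roundTrip -- read them back as data
put roundTrip is <0102030405> -- expected: true if the bytes were stored unchanged
```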
Full Binary File Access
When more sophisticated processing is needed, the standard set of file commands including open file, read from file, write to file, seek in file, and close file can be used. The read and write commands have special options available for reading and writing numbers in binary data in a variety of formats. See the description of the read and write commands in File and Folder Interaction and Socket, Process, and Stream Input and Output.
In addition to those numeric data types, the byte chunk type can be used with the read command to read any given number of bytes as data:
read 20 bytes from file "singer.tiff" into formatData
To write binary data into an open file at the current location, just specify as data:
write orbitalCoordinates as data to file jupiter
The as data operator can be omitted if the value being written is specifically data already, such as when writing selected bytes from a data value:
write bytes 1 to 16 of temperatureRecord to file saturn
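The open, read, and close sequence described in this section could be combined roughly as follows. This is a minimal sketch, not a definitive recipe: the path is hypothetical, the file is assumed to exist and contain at least 16 bytes, and the for reading option follows the open file syntax summarized earlier on this page. See File and Folder Interaction for the exact options.

```sensetalk
-- Hypothetical path; substitute a real binary file
put "/tmp/header.bin" into headerPath
open file headerPath for reading
read 16 bytes from file headerPath into headerData -- first 16 bytes, as data
close file headerPath
put bytes 1 to 4 of headerData -- display the first four bytes
```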
Data Conversions
Binary data values are automatically converted to text whenever needed. There are many contexts in which this may happen, including when writing a value to a file or when a value is displayed. To force a value to be temporarily treated as data and avoid this conversion, use the as data operator:
put encryptedPassword as data into file "key" -- write binary data to a file
put "Secret" as data -- display in binary format: <53656372 6574>
Whenever a string value is converted from text to data, the current setting of the defaultStringEncoding global property is used to control how each character is encoded into the binary data. Conversions in the other direction—from data to text—are controlled by the defaultDataFormat global property.
Base64Encode, Base64Decode Functions, as Base64 Operator
Behavior: The base64Encode function converts binary data to a base64 text representation. This function may also be called using the as base64 operator. The base64Decode function takes text in the base64 format and converts it back to binary data.
Use these functions or the as base64 operator for converting binary data to or from the base64 format. Base64 is a standard format commonly used in a number of applications including email via MIME, and storing complex data in XML.
Syntax:
base64Encode(aValue)
base64Decode(aValue)
aValue as base64
The as base64 operator is often more readable and natural to use than the base64Encode function, but is otherwise identical in functionality.
Examples:
put base64Encode(sourceData) into file "/tmp/datastore"
put base64Decode of file "/tmp/datastore" into restoredData
if file "/tmp/datastore" is equal to oldData as base64 then
	// Do something
end if
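An encode and decode round trip can be sketched without touching the file system. The variable names are illustrative, and as data is applied on both sides of the final comparison so that the values are compared byte for byte.

```sensetalk
put "Hello" as data into originalData -- start from known binary data
put base64Encode(originalData) into encodedText -- base64 text representation
put base64Decode(encodedText) into decodedData -- back to binary data
put decodedData as data is originalData as data -- expected: true
```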