Binary Data Manipulation with SenseTalk
Most SenseTalk scripts work with data in the form of text and numbers, and sometimes other types of values such as dates or colors. When needed, SenseTalk can also deal with data in its binary form (the raw bits and bytes that are stored on a computer).
Data Values
Raw data can be represented directly in a script using a pair of hexadecimal digits for each byte of the data, enclosed in angle brackets, < and >.
put <00> into nullByte // a single byte, with a value of zero
put <48656c6c6f> into secretMessage // five bytes of data
The put before and put after forms of the put command can be used to insert additional binary data before or after an existing value.
put <20467269 656e6421> after secretMessage // append more data
When two known binary data values are compared for equality, they are compared byte for byte to see that they have exactly the same binary contents.
put secretMessage is <48656c6c6f20467269656e6421> --> True
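As a minimal sketch of the put before form, which the examples above do not show, the following builds the same five-byte value in two steps. The variable name greeting is only illustrative.

```sensetalk
put <6c6c6f> into greeting // three bytes of data
put <4865> before greeting // prepend two more bytes: <48656c6c6f>
put greeting is <4865 6c6c 6f> --> True
```

The spaces inside the last data value are there for readability only and do not affect the comparison.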
Syntax:
< hexadecimalData >
The hexadecimalData must consist of an even number of hexadecimal digits (0 through 9 and A through F, in upper or lower case). Spaces may be used to break the sequence up for readability.
AsData Function, As Data Operator
Behavior: The asData function, most often called using the as data operator, converts any value to its binary representation.
Use the asData function or as data operator when you want to tell SenseTalk to treat a value as binary data. This is especially useful for reading or writing a file or URL in its raw binary form (as described later in this chapter), but can also be used at any time to work with or display a value in its binary form.
When two known binary data values are compared for equality, they are compared byte for byte to see that they have exactly the same binary contents. Use as data to ensure that such a binary comparison is made.
Syntax:
asData( aValue )
aValue as data
The as data operator is usually more readable and natural to use than the asData function, but is otherwise identical in functionality.
Examples:
put "abcdefg" as data --> <61626364 656667>
put file "picture.jpg" as data into rawImageData // read file contents as data
if file "monet.png" as data is equal to oldData as data then
// Do something
end if
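The equivalence of the two forms can be sketched as follows. The byte values shown assume the default string encoding, under which the letters a, b, and c are stored as single bytes, as in the "abcdefg" example above.

```sensetalk
put asData("abc") --> <616263>
put "abc" as data --> <616263>
put asData("abc") is ("abc" as data) --> True
```

The parentheses in the last line are not strictly required; they are included here only to make the grouping explicit.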
Byte Chunks
The byte chunk type extends SenseTalk's chunk expressions to binary data, bringing the full flexibility of chunk expressions to it. The byte chunk type can be used to access a single byte or a range of bytes within a data value:
put <010203040506> into myData
put byte 2 of myData --> <02>
put bytes 3 to 4 of myData --> <0304>
put the last 3 bytes of myData --> <040506>
As with other chunk types, a byte chunk is a container, so it can be used to change the data:
put <010203040506> into myData -- <010203040506>
put <AABB> into bytes 2 to 5 of myData -- <01AABB06>
put <77> after byte 2 of myData -- <01AA77BB06>
delete the first 2 bytes of myData -- <77BB06>
Syntax:
byte byteNumber of dataSource
bytes firstByte to lastByte of dataSource
A byte chunk expression is always treated as data in its immediate context (so there is no need to specify as data with it). The dataSource doesn't need to be specified as data. A non-data value will be converted to data automatically before the requested bytes are extracted.
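As a short illustration of the automatic conversion, a byte chunk can be taken directly from a text value. The results shown assume the default string encoding, under which "Hello" is stored as the five bytes <48656c6c6f>.

```sensetalk
put byte 1 of "Hello" --> <48>
put bytes 1 to 2 of "Hello" --> <4865>
```

Because the chunk itself is treated as data, each result is displayed in binary format without writing as data.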
Delete Byte Chunk
The delete command can also be used to delete a byte chunk from within a binary data value. This will reduce the overall size of the data by the number of bytes that are deleted.
Example:
delete byte 5 of myData
delete the last 2 bytes of myData
delete bytes 100 to 199 of myData
Binary Data Files
One of the most important uses of binary data in scripts is when reading and writing data in binary (non-text) files. There are several ways to work with binary data files, depending on your needs.
Simple Data File Access
The easiest way to access a text file is to treat the file directly as a container. The same approach will work for binary data files, by simply using the as data operator to indicate that the bytes of the file should be read directly:
put file "horse.tiff" as data into tiffData // read the entire file at once
Writing data to a file can be done in the same way:
put rawBudgetData as data into file "budget.dat" // write a data file
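Putting the two directions together, a round trip might look like the sketch below. The path "/tmp/sample.dat" and the variable names are hypothetical, chosen only for illustration.

```sensetalk
put <01020304> into sampleBytes
put sampleBytes as data into file "/tmp/sample.dat" // write four raw bytes
put file "/tmp/sample.dat" as data into roundTrip // read them back as data
put roundTrip is sampleBytes // expected to be True if the file was written and read unchanged
```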
Remote URL File Access
Accessing a remote file through a URL works exactly the same as a local file. Simply specify URL instead of file, provide the URL instead of the file path, and use as data:
put URL "http://some.company.com/horse.jpg" as data into jpgData
put file "budget.dat" as data into URL remoteBudgetFileURL
Full Binary File Access
When more sophisticated processing is needed, the standard set of file commands including open file, read from file, write to file, seek in file, and close file can be used. The read and write commands have special options available for reading and writing numbers in binary data in a variety of formats. See the description of the read and write commands in File and Folder Interaction and Socket, Process, and Stream Input and Output.
In addition to those numeric data types, the byte chunk type can be used with the read command to read any given number of bytes as data:
read 20 bytes from file "singer.tiff" into formatData
To write binary data into an open file at the current location, just specify as data:
write orbitalCoordinates as data to file jupiter
The as data operator can be omitted if the value being written is specifically data already, such as when writing selected bytes from a data value:
write bytes 1 to 16 of temperatureRecord to file saturn
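The commands above can be combined into a complete open, write, read, and close sequence. This is a minimal sketch, not a definitive recipe: the path, the variable names, and the byte values are hypothetical, and it assumes that the open file command accepts the for writing and for reading options described in File and Folder Interaction.

```sensetalk
put "/tmp/header.dat" into headerPath // hypothetical file path

open file headerPath for writing
write <CAFEBABE 00000034> as data to file headerPath // eight raw bytes
close file headerPath

open file headerPath for reading
read 4 bytes from file headerPath into magicNumber // first four bytes, as data
close file headerPath

put magicNumber is <CAFEBABE> // expected to be True if the write succeeded
```

Closing the file after writing and reopening it for reading keeps the example simple; a script that needs to do both on one open file could use seek in file to reposition instead.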
Data Conversions
Binary data values are automatically converted to text whenever needed. There are many contexts in which this may happen, including when writing a value to a file or when a value is displayed. To force a value to be temporarily treated as data and avoid this conversion, use the as data operator:
put encryptedPassword as data into file "key" // write binary data to a file
put "Secret" as data --> <53656372 6574> (displays in binary format)
Whenever a string value is converted from text to data, the current setting of the defaultStringEncoding global property is used to control how each character is encoded into the binary data. Conversions in the other direction, from data to text, are controlled by the defaultDataFormat global property.
Base64Encode, Base64Decode Functions, as Base64 Operator
Behavior: The base64Encode function converts binary data to a base64 text representation. This function may also be called using the as base64 operator. The base64Decode function takes text in the base64 format and converts it back to binary data.
Use these functions or the as base64 operator for converting binary data to or from the base64 format. Base64 is a standard format commonly used in a number of applications, including email via MIME and storing complex data in XML.
Syntax:
base64Encode( aValue )
base64Decode( aValue )
aValue as base64
The as base64 operator is often more readable and natural to use than the base64Encode function, but is otherwise identical in functionality.
Examples:
put base64Encode(sourceData) into file "/tmp/datastore"
put base64Decode of file "/tmp/datastore" into restoredData
if file "/tmp/datastore" is equal to oldData as base64 then
// Do something
end if
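A small in-memory round trip, with no files involved, can be sketched as follows. The variable names are illustrative. The encoded text noted in the comment is the standard base64 form of these five bytes; it assumes base64Encode adds no line breaks or other decoration for a value this short.

```sensetalk
put <48656c6c6f> into originalData // five bytes of data
put originalData as base64 into encodedText // standard base64 for these bytes is SGVsbG8=
put base64Decode(encodedText) into decodedData
put decodedData is originalData // expected to be True after a lossless round trip
```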