Binary Data Manipulation with SenseTalk

Most SenseTalk scripts work with data in the form of text and numbers, and sometimes other types of values such as dates or colors. When needed, SenseTalk can also deal with data in its binary form (the raw bits and bytes that are stored on a computer).

Data Values

Raw data can be represented directly in a script using a pair of hexadecimal digits for each byte of the data, enclosed in angle brackets, < and >.

put <00> into nullByte // a single byte, with a value of zero
put <48656c6c6f> into secretMessage // five bytes of data

The put before and put after forms of the put command can be used to insert additional binary data before or after an existing value.

put <20467269 656e6421> after secretMessage // append more data

When two known binary data values are compared for equality, they are compared byte for byte to see that they have exactly the same binary contents.

put secretMessage is <48656c6c6f20467269656e6421> --> True

Syntax:
< hexadecimalData >

The hexadecimalData must consist of an even number of hexadecimal digits (0 through 9 and A through F, in uppercase or lowercase), two digits for each byte. Spaces may be used to break the sequence up for readability.
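
As a quick check, the sketch below writes the same five bytes two ways, with and without spaces and in both letter cases (both cases appear in this chapter's examples). The variable names packedBytes and spacedBytes are arbitrary.

```sensetalk
put <48656C6C6F> into packedBytes // uppercase digits, no spaces
put <48 65 6c 6c 6f> into spacedBytes // lowercase digits, one byte per group
put packedBytes is spacedBytes // should display True: both hold the same five bytes
```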

AsData Function, As Data Operator

Behavior: The asData function, most often called using the as data operator, converts any value to its binary representation.

Use the asData function or as data operator when you want to tell SenseTalk to treat a value as binary data. This is especially useful for reading or writing a file or URL in its raw binary form (as described later in this chapter), but can also be used at any time to work with or display a value in its binary form.

When two known binary data values are compared for equality, they are compared byte for byte to see that they have exactly the same binary contents. Use as data to ensure that such a binary comparison is made.

Syntax:
asData( aValue )
aValue as data

The as data operator is usually more readable and natural to use than the asData function, but is otherwise identical in functionality.

Examples:

put "abcdefg" as data -->  <61626364 656667>
put file "picture.jpg" as data into rawImageData // read file contents as data
if file "monet.png" as data is equal to oldData as data then
// Do something
end if

Byte Chunks

The byte chunk type extends SenseTalk's chunk expressions to binary data, bringing the same flexibility to working with bytes that other chunk types offer for text. The byte chunk type can be used to access a single byte or a range of bytes within a data value:

put <010203040506> into myData
put byte 2 of myData --> <02>
put bytes 3 to 4 of myData --> <0304>
put the last 3 bytes of myData --> <040506>

As with other chunk types, a byte chunk is a container, so it can be used to change the data:

put <010203040506> into myData --  <010203040506>
put <AABB> into bytes 2 to 5 of myData -- <01AABB06>
put <77> after byte 2 of myData -- <01AA77BB06>
delete the first 2 bytes of myData -- <77BB06>

Syntax:
byte byteNumber of dataSource
bytes firstByte to lastByte of dataSource

A byte chunk expression is always treated as data in its immediate context (so there is no need to specify as data with it). The dataSource doesn't need to be specified as data. A non-data value will be converted to data automatically before the requested bytes are extracted.
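
For instance, a text value can be used directly as the dataSource. The sketch below assumes the default string encoding, under which "abcdefg" converts to <61626364 656667> as in the as data example earlier; sampleText is an arbitrary variable name.

```sensetalk
put "abcdefg" into sampleText
put byte 1 of sampleText // expected <61>, the byte for "a"
put bytes 2 to 3 of sampleText // expected <6263>, the bytes for "bc"
```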

Delete Byte Chunk

The delete command can also be used to delete a byte chunk from within a binary data value. This will reduce the overall size of the data by the number of bytes that are deleted.

Examples:

delete byte 5 of myData
delete the last 2 bytes of myData
delete bytes 100 to 199 of myData
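
A worked sketch with a small known value makes the effect on size easier to see. The starting value is made up for illustration, and the comments show the contents expected after each command.

```sensetalk
put <0102030405060708> into myData -- <0102030405060708> (8 bytes)
delete byte 5 of myData -- <01020304060708> (7 bytes)
delete the last 2 bytes of myData -- <0102030406> (5 bytes)
```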

Binary Data Files

One of the most important uses of binary data in scripts is when reading and writing data in binary (non-text) files. There are several ways to work with binary data files, depending on your needs.

Simple Data File Access

The easiest way to access a text file is to treat the file directly as a container. The same approach will work for binary data files, by simply using the as data operator to indicate that the bytes of the file should be read directly:

put file "horse.tiff" as data into tiffData // read the entire file at once

Writing data to a file can be done in the same way:

put rawBudgetData as data into file "budget.dat" // write a data file
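
Taken together, the two forms allow a simple round trip. The following is a sketch only: the path /tmp/sample.dat and the variable names are placeholders, and the final comparison is expected to be True only if the file was written and read back unchanged.

```sensetalk
put <CAFEBABE 0102> into originalData
put originalData as data into file "/tmp/sample.dat" // write the six bytes
put file "/tmp/sample.dat" as data into readBack // read them back as data
put (readBack as data) is (originalData as data) // expected True when the contents match
```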

Remote URL File Access

Accessing a remote file through a URL works exactly the same as a local file. Simply specify URL instead of file, provide the URL instead of the file path, and use as data:

put URL "http://some.company.com/horse.jpg" as data into jpgData
put file "budget.dat" as data into URL remoteBudgetFileURL

Full Binary File Access

When more sophisticated processing is needed, the standard set of file commands (open file, read from file, write to file, seek in file, and close file) can be used. The read and write commands have special options for reading and writing numbers as binary data in a variety of formats. See the descriptions of the read and write commands in "File and Folder Interaction" and in "Socket, Process, and Stream Input and Output".

In addition to those numeric data types, the byte chunk type can be used with the read command to read any given number of bytes as data:

read 20 bytes from file "singer.tiff" into formatData

To write binary data into an open file at the current location, just specify as data:

write orbitalCoordinates as data to file jupiter

The as data operator can be omitted if the value being written is already data, such as when writing selected bytes from a data value:

write bytes 1 to 16 of temperatureRecord to file saturn
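
Put together, reading a file header might look like the sketch below. It assumes that the open file and close file commands named above take the file path in the same form as read does. The file name comes from the earlier example, and interpreting the bytes is left to the script.

```sensetalk
open file "singer.tiff" // open the file before reading from it
read 20 bytes from file "singer.tiff" into formatData
close file "singer.tiff" // release the file when done
put bytes 1 to 4 of formatData // display the first four bytes that were read
```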

Data Conversions

Binary data values are automatically converted to text whenever needed. There are many contexts in which this may happen, including when writing a value to a file or when a value is displayed. To force a value to be temporarily treated as data and avoid this conversion, use the as data operator:

put encryptedPassword as data into file "key"  // write binary data to a file
put "Secret" as data --> <53656372 6574> (displays in binary format)

Whenever a string value is converted from text to data, the current setting of the defaultStringEncoding global property is used to control how each character is encoded into the binary data. Conversions in the other direction—from data to text—are controlled by the defaultDataFormat global property.

Base64Encode, Base64Decode Functions, as Base64 Operator

Behavior: The base64Encode function converts binary data to a base64 text representation. This function may also be called using the as base64 operator. The base64Decode function takes text in the base64 format and converts it back to binary data.

Use these functions or the as base64 operator for converting binary data to or from the base64 format. Base64 is a standard format commonly used in a number of applications including email via MIME, and storing complex data in XML.

Syntax:
base64Encode( aValue )
base64Decode( aValue )
aValue as base64

The as base64 operator is often more readable and natural to use than the base64Encode function, but is otherwise identical in functionality.

Examples:

put base64Encode(sourceData) into file "/tmp/datastore"
put base64Decode of file "/tmp/datastore" into restoredData
if file "/tmp/datastore" is equal to oldData as base64 then
// Do something
end if
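
As a concrete check, the five bytes <48656c6c6f> used earlier in this chapter have the base64 form SGVsbG8=. The sketch below assumes short values are encoded without added line breaks; the variable name encodedText is arbitrary.

```sensetalk
put <48656c6c6f> as base64 into encodedText
put encodedText // expected SGVsbG8=
put base64Decode(encodedText) as data // expected <48656c6c6f>
```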