TSN - Tcl SECS Notation


Tcl SECS Notation (TSN) is a text-based representation of SEMI Equipment Communications Standard 2 (SECS-II) binary messages. The notation is similar to the representations of SECS-II messages found in SEMI Standards, except that it follows the list formatting conventions of the Tcl programming language. TSN is well suited for use in Tcl applications because the usual text and list manipulation features of the Tcl language apply to it directly and efficiently. In contrast to other popular software tools, TSN message libraries are not compiled prior to use; the Tcl SECS software uses TSN messages directly.

Using TSN, a SECS-II data item is represented as a space-delimited list. The first element in the list is a Type Code, which indicates the type of the data item. An example Type Code is "U2", which indicates two-byte unsigned integers. The second and any additional elements in a data item list specify values of the indicated type.

For example, the list:

{U2 34 56}

specifies an array of two-byte unsigned integers consisting of the values 34 and 56.

The example list:

{B 0}

specifies a single, one-byte binary code of 0.

The example:

{A "this is a string"}

represents a sequence of ASCII character data consisting of the characters inside of the quotation marks.
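Because each item is an ordinary Tcl list, it can be taken apart with the standard list commands. The following is a minimal sketch, assuming Tcl 8.5 or later for lassign; the procedure name tsnSplit is hypothetical and is not part of the TSN software.

```tcl
# Hypothetical helper: split a TSN data item into its Type Code and values.
proc tsnSplit {item} {
    set typeCode [lindex $item 0]       ;# first list element is the Type Code
    set values   [lrange $item 1 end]   ;# remaining elements are the values
    return [list $typeCode $values]
}

foreach item [list {U2 34 56} {B 0} {A "this is a string"}] {
    lassign [tsnSplit $item] typeCode values
    puts "$typeCode has [llength $values] value(s)"
}
```

Note that the ASCII item has a single value: the quoted text is one list element even though it contains spaces.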

The following types of SECS-II data items are supported. For each type, a Type Code and a description are given, along with the numeric code that SEMI Standards use to refer to the type. The Type Code specifies the binary representation that is passed "over the wire" when the TSN message is used for communication. In general, a developer must carefully ensure that the type of data item passed in a message corresponds exactly to the type of item expected by the receiver. The TSN Type Code is not case sensitive; either alphabetic case may be used to represent a type of data.

L
List. A list item specifies an ordered set of items. The items included in a list may be of any type, including lists. In fact, it is common to have nesting of lists to three or four levels.
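Nested lists can be walked recursively with plain Tcl list commands. The sketch below computes the nesting depth of an item; the procedure name tsnDepth and the sample message are hypothetical illustrations, not part of the TSN software.

```tcl
# Hypothetical helper: nesting depth of a TSN item (0 for a non-list item).
proc tsnDepth {item} {
    # Type Codes are not case sensitive and may carry an optional :size suffix.
    set typeCode [string toupper [lindex [split [lindex $item 0] :] 0]]
    if {$typeCode ne "L"} {
        return 0
    }
    set deepest 0
    foreach child [lrange $item 1 end] {
        set d [tsnDepth $child]
        if {$d > $deepest} { set deepest $d }
    }
    return [expr {$deepest + 1}]
}

# Invented sample message with three levels of list nesting.
set msg {L {U4 17} {L {A "lot-7"} {L {U1 1} {U1 2}}}}
puts [tsnDepth $msg]   ;# prints 3
```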

A
ASCII text (SEMI "20"). The A Type Code specifies a sequence of ASCII text characters, commonly called a string. Strings are delimited with double quotes or braces. Tcl allows you to specify non-printable characters by using escape sequences inside double quotes; see the Tcl manual page. For example, the string "test\x0a" has an embedded newline character.
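The effect of the escape sequence can be checked directly in Tcl. A short sketch using only core commands; the variable names are arbitrary:

```tcl
# Build the item with list so the newline survives as part of one element.
set item [list A "test\x0a"]
set text [lindex $item 1]

puts [string length $text]            ;# 5: four letters plus the newline
scan [string index $text end] %c code
puts $code                            ;# 10, the character code of newline
```

Within braces, Tcl performs no backslash substitution, so {test\x0a} would be the eight literal characters instead.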

B
Binary (SEMI "10"). The B Type Code specifies one-byte binary codes. In TSN, each B data value is specified as an integer from 0 through 255.

F4 and F8
Floating point (SEMI "44" and "40"). The F4 and F8 Type Codes are used to specify 4-byte and 8-byte IEEE floating point values. The TSN conversion routines do not allow values outside the range of numbers that have normalized representations. For F4, the range of absolute values is >= 1.1754943508222875e-38 and <= 3.4028234663852886e+38. For F8, the range of absolute values is >= 2.2250738585072014e-308 and <= 1.7976931348623158e+308. Compiler runtime libraries on some platforms slightly diminish the obtainable range.

I1, I2, I4, and I8
Signed integers (SEMI "31", "32", "34", and "30"). The I1, I2, I4, and I8 Type Codes are used to specify 1-byte, 2-byte, 4-byte, and 8-byte signed integers. The possible range for each type is:
Type Code   Range
I1          -128 to 127
I2          -32768 to 32767
I4          -2147483648 to 2147483647
I8          -9223372036854775808 to 9223372036854775807

For the specification of integer values, a leading 0x may be used to indicate hexadecimal notation. Similarly, a leading 0 indicates octal notation.

TF
Boolean (SEMI "11"). The TF Type Code specifies a one-byte boolean data type. In TSN, True is represented by 1, T, Y, True, or Yes, in upper, lower, or mixed case. Similarly, False is represented by 0, F, N, False, or No in any case. TSN messages that are generated by the SECS software always use 1 for True and 0 for False. You may also use the Type Code BOOLEAN or BL to specify this data type. System formatted TSN always uses the TF code.
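The accepted spellings can be reduced to the generated form with a few lines of Tcl. This is a sketch only, assuming Tcl 8.5 or later for the ni operator; tsnNormalizeBool is a hypothetical helper and does not describe how the SECS software is implemented.

```tcl
# Hypothetical helper: rewrite a boolean item using the TF Type Code and
# the values 1 or 0, the form the text says the SECS software generates.
proc tsnNormalizeBool {item} {
    set typeCode [string toupper [lindex $item 0]]
    if {$typeCode ni {TF BOOLEAN BL}} {
        error "not a boolean Type Code: [lindex $item 0]"
    }
    set result [list TF]
    foreach value [lrange $item 1 end] {
        switch -- [string tolower $value] {
            1 - t - y - true - yes { lappend result 1 }
            0 - f - n - false - no { lappend result 0 }
            default { error "not a TSN boolean value: $value" }
        }
    }
    return $result
}

puts [tsnNormalizeBool {bl Yes n TRUE 0}]   ;# TF 1 0 1 0
```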

U1, U2, U4, and U8
Unsigned integers (SEMI "51", "52", "54", and "50"). The U1, U2, U4, and U8 Type Codes are used to specify 1-byte, 2-byte, 4-byte, and 8-byte unsigned integers. The possible range for each type is:
Type Code   Range
U1          0 to 255
U2          0 to 65535
U4          0 to 4294967295
U8          0 to 18446744073709551615
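The signed and unsigned limits above lend themselves to a table-driven check before a message is sent. The sketch below assumes Tcl 8.6 or later (for string is entier); the names tsnIntRange and tsnIntItemFits are hypothetical, and the TSN conversion routines remain the authority on what is accepted.

```tcl
# Hypothetical range table built from the limits listed above.
set tsnIntRange {
    I1 {-128 127}
    I2 {-32768 32767}
    I4 {-2147483648 2147483647}
    I8 {-9223372036854775808 9223372036854775807}
    U1 {0 255}
    U2 {0 65535}
    U4 {0 4294967295}
    U8 {0 18446744073709551615}
}

# Return 1 if every value of an integer item fits its Type Code, else 0.
proc tsnIntItemFits {item} {
    global tsnIntRange
    set typeCode [string toupper [lindex $item 0]]
    if {![dict exists $tsnIntRange $typeCode]} {
        error "not an integer Type Code: $typeCode"
    }
    lassign [dict get $tsnIntRange $typeCode] min max
    foreach value [lrange $item 1 end] {
        if {![string is entier -strict $value] || $value < $min || $value > $max} {
            return 0
        }
    }
    return 1
}

puts [tsnIntItemFits {U2 34 56}]   ;# 1
puts [tsnIntItemFits {U1 256}]     ;# 0
```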


V
Variant data (SEMI "22"), also known as Localized Character Strings. For this item type, the standard provides the following encoding values:

Type Code       Encoding
V0              none - cannot be used
V1              Unicode 2.0
V2              UTF-8
V3              ASCII 7-bit
V4              ISO 8859-1 (ISO Latin-1, Western Europe)
V5              ISO 8859-11 (Thai, * not installed)
V6              TIS 620 (Thai)
V7              IS 13194 (* not installed)
V8              Shift JIS (Japanese)
V9              EUC-JP (Japanese)
VA              EUC-KR (Korean)
VB              gb1988 (Simplified Chinese)
VC              EUC-CN (Simplified Chinese)
VD              Big5 (Traditional Chinese)
VE              EUC-TW (* not installed)
VF - V7FFF      reserved
V8000 - VFFFF   user defined

Internal to the Tcl software and in DMH messages, string values are represented as UTF-8 data on all platforms. These strings are converted as needed to different encodings when they are displayed or sent in SECS messages. Tcl also understands embedded Unicode characters and converts them to UTF-8 byte sequences as they are encountered. Source code files or data files that use non-default encodings are usable in Tcl; see the encoding command. The V<code> Type Codes in the table above are used with a single Tcl string argument, which is converted to the specified encoding during conversion to SECS data.

gemhost put S65F65 {V1 {hello world sent as Unicode}}

#set le_question "Ou est le caf\u00e9, Fran\u00e7ois?"
set le_question "Ou est le café, François?"
set mon_msg [list V4 $le_question]

The plain V Type Code can also be used to specify the item data as a list of integer codes. The standard requires that the item data always begin with two bytes that specify the encoding of the data. For example, the Unicode string for "hello" could be specified as:

   set uhello {V 0 1 0x00 0x68 0x00 0x65 0x00 0x6c 0x00 0x6c 0x00 0x6f}

Integer data such as Unicode is always transmitted during network communication in Network Byte Order, which is big-endian. Thus the high-order byte of each character, which is 0 for ordinary ASCII characters, is sent before the low-order byte. Network Byte Order is so commonly assumed and understood that the authors of E5-0301 neglected to mention it with respect to Unicode. The SECS standards do specify Network Byte Ordering for all other data items such as integers. On little-endian platforms such as Windows NT, it is common to represent the data in little-endian form, so there is always the possibility of encountering a naive, erroneous implementation.
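The byte layout described above can be produced from a Tcl string with core commands. This is a sketch limited to characters that fit in 16 bits; tsnUnicodeItem is a hypothetical helper, and in practice the V1 Type Code performs the conversion for you.

```tcl
# Hypothetical helper: build a plain V item carrying 16-bit Unicode text,
# with each character written high byte first (Network Byte Order).
proc tsnUnicodeItem {text} {
    set item [list V 0 1]    ;# two leading bytes select encoding 1, Unicode 2.0
    foreach ch [split $text ""] {
        scan $ch %c code
        if {$code > 0xffff} {
            error "character outside the 16-bit range"
        }
        lappend item [format 0x%02x [expr {$code >> 8}]]     ;# high byte first
        lappend item [format 0x%02x [expr {$code & 0xff}]]   ;# then the low byte
    }
    return $item
}

puts [tsnUnicodeItem hello]
```

For the input hello, the result is the same list of codes as the hand-written example of the plain V Type Code.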

As indicated in the table above, some of the less common character set encodings are supported by the software logic, but the character encoding tables are not installed by default. If you intend to use these less common encodings, you need to install additional data tables in the Tcl library encoding directory.

You should design your application so that the specific binary representation of a value is of minor consequence. Within a Tcl application, TSN and the string-based nature of Tcl allow you to manipulate SECS data as list elements or string text without regard to the SECS-II Type Code. For data types that are numeric, you can also apply numeric functions and expressions such as the expr and incr commands.
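As a small illustration of treating numeric item values as ordinary Tcl data, the following sketch sums and increments the values of an item using only core commands:

```tcl
set item {U2 34 56}

# Sum the values with expr, skipping the Type Code in element 0.
set total 0
foreach value [lrange $item 1 end] {
    set total [expr {$total + $value}]
}
puts $total                       ;# 90

# Increment the first value; incr treats the list element as a number.
set first [lindex $item 1]
incr first
lset item 1 $first
puts $item                        ;# U2 35 56
```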

Here are some examples of TSN representations. In these examples, some Tcl programming statements are shown so that typical usage contexts may be seen.

set msg {L {A "m 1.03"} {A "ver001"}}
$sp1 whenmsg S1F1 "$sp1 put S1F2 $msg ; $sp1 whenmsg again"

The above example registers a two-element list message that is sent whenever a SECS Stream 1, Function 1 message is received by the secsport connection named by $sp1. The message consists of two ASCII strings. The Tcl language uses either double quotes or braces to delimit the elements of lists. Additional space characters, or other whitespace characters such as newlines, are ignored in the context of parsing lists. Tcl performs substitution of variable values within double quotes but not within braces. In the above example, substitution of the values of the variables sp1 and msg occurs in the second statement because it is delimited by double quotes. Tcl also substitutes the results of command evaluations within double quotes but not within braces. A command evaluation is indicated by using square brackets.
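The difference between the two kinds of delimiters is easy to see at a Tcl prompt. A minimal sketch with arbitrary variable names:

```tcl
set name "echx1"

set quoted "A $name"                     ;# variable substitution inside double quotes
set braced {A $name}                     ;# braces keep the text literally
set viaCmd "A [string toupper $name]"    ;# command evaluation in square brackets

puts $quoted    ;# A echx1
puts $braced    ;# A $name
puts $viaCmd    ;# A ECHX1
```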

Here is a set of examples where the same TSN message is constructed using different variations of substitution and command evaluation.

set MDLN "echx1"
set SOFTREV "1.03a"
set s1f2_reply "L {A $MDLN} {A $SOFTREV}"

In the above example, the braces do not prevent substitution of the $MDLN and $SOFTREV values because the braces are buried in the quoted string and are not interpreted as delimiters.

set s1f2_reply [list  L  [list  A  $MDLN]  [list  A  $SOFTREV]]
set s1f2_reply "L [list A $MDLN] [list A $SOFTREV]"
set s1f2_reply "L \
{A $MDLN} \
{A $SOFTREV}"

The last example demonstrates that a quoted string can extend over multiple input lines. Backslash characters (\) can be used to keep the newline characters out of the quoted string, but this is not strictly necessary. The following alternative also works:

set s1f2_reply "L {A $MDLN}
{A $SOFTREV}"

Here is another example, which builds the same message using the lappend (list append) command.

set s1f2_reply L
lappend s1f2_reply [list A $MDLN]
lappend s1f2_reply [list A $SOFTREV]

Here's a tip for type A data, since it can contain spaces or even quotation marks. The list command that we have used in "[list A $MDLN]" is safer than the direct substitution in "{A $MDLN}". The latter expression causes an error if MDLN has embedded spaces. If we want to allow for the possibility that MDLN has embedded space characters, we can safely perform substitution as "{A {$MDLN}}". You do not need additional braces with numeric or boolean data since it never contains additional spaces or problematic characters.
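The effect of an embedded space can be observed by counting list elements. A short sketch; the model name is an invented value:

```tcl
set MDLN "ech x1"              ;# invented model name with an embedded space

set direct "A $MDLN"           ;# the text is: A ech x1
set safe   [list A $MDLN]      ;# the text is: A {ech x1}
set braced "A {$MDLN}"         ;# the text is: A {ech x1}

puts [llength $direct]         ;# 3: the string was split into two values
puts [llength $safe]           ;# 2: Type Code plus one string value
puts [llength $braced]         ;# 2
```

The list command also copes with values that contain braces or quotation marks, which the added braces alone do not.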

Here is an example that demonstrates using hexadecimal notation within an array of binary codes.

$sp1 put S2F25 {B 10 11 12 13 0x1b 0x7f 0xff 0}

As the example shows, you may freely mix hexadecimal notation with the usual decimal representation of integers.
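When a uniform decimal view of such mixed values is wanted, for logging for instance, the core format command converts each one. A minimal sketch:

```tcl
set item {B 10 11 12 13 0x1b 0x7f 0xff 0}

set decimal {}
foreach value [lrange $item 1 end] {
    lappend decimal [format %d $value]   ;# format accepts 0x notation as input
}
puts $decimal    ;# 10 11 12 13 27 127 255 0
```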

The SECS-II standard also requires the ability to send or receive zero-length arrays. These can be specified by indicating only the Type Code, or by indicating an empty string for the values. For example, here are several alternatives for a zero-length list:

{ L }
"L {}"
{L {}}
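In Tcl list terms these alternatives differ slightly: the first has only one element, while the others have an empty second element. The sketch below counts values while treating both forms as zero-length; tsnCount is a hypothetical helper for non-text items and makes no claim about how the TSN software parses them.

```tcl
# Hypothetical helper: number of values in a non-text TSN item, treating a
# single empty-string value as "no values", per the alternatives above.
proc tsnCount {item} {
    set values [lrange $item 1 end]
    if {[llength $values] == 1 && [lindex $values 0] eq ""} {
        return 0
    }
    return [llength $values]
}

foreach item [list { L } "L {}" {L {}} {U2 34 56}] {
    puts "[list $item] has [tsnCount $item] value(s)"
}
```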

TSN notation allows the user to optionally specify an array size when specifying the Type Code of a SECS-II item. The array size is indicated by appending a colon and an integer size to the Type Code. For example:

set msg {I4:2 123 987}
set ZLL L:0

No spaces are allowed between the Type Code and the optional size specification. The size specification is optional because it is determined by the TSN software from the provided data. The size specification can be used by the application developer as a check when constructing messages. The conversion software (the command TSN_to_secs) returns an error if the indicated size of a text item is smaller than the actual data, or if the indicated size of a non-text array does not match the actual size.

% TSN_to_secs "A:6 sfdsfsfsd"
Specified optional size is too small for actual string data
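The two size rules can be approximated in plain Tcl when you want to check an item before handing it to the conversion command. This is only a sketch: tsnCheckSize is a hypothetical helper, it treats only the A Type Code as text, and TSN_to_secs remains the authority.

```tcl
# Hypothetical check of an optional size, following the two rules above.
# Returns an empty string when the item passes, else a short description.
proc tsnCheckSize {item} {
    lassign [split [lindex $item 0] :] typeCode size
    if {$size eq ""} {
        return ""                     ;# no size given, nothing to check
    }
    set values [lrange $item 1 end]
    if {[string toupper $typeCode] eq "A"} {
        set actual [string length [lindex $values 0]]
        if {$size < $actual} {
            return "size $size is too small for $actual characters"
        }
    } elseif {$size != [llength $values]} {
        return "size $size does not match [llength $values] values"
    }
    return ""
}

puts [tsnCheckSize {I4:2 123 987}]       ;# prints an empty line: sizes agree
puts [tsnCheckSize "A:6 sfdsfsfsd"]
```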

Messages that are received by your application and converted to TSN by the command secs_to_TSN show the optional size by default. The -nosize option suppresses it.

% secs_to_TSN {1 0}
% secs_to_TSN -nosize {1 0}

The explicit size information can be suppressed for received messages on secsport or hsms connections by using the TSNSIZE configuration option.
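If TSN text that already carries sizes reaches your code, the suffixes can also be removed with list operations. A rough sketch; tsnStripSize is a hypothetical helper working on TSN text, not the -nosize option or the TSNSIZE setting, and the sample message is invented.

```tcl
# Hypothetical helper: return a copy of a TSN item with every optional
# :size suffix removed, descending into nested lists.
proc tsnStripSize {item} {
    set typeCode [lindex [split [lindex $item 0] :] 0]
    if {[string toupper $typeCode] ne "L"} {
        return [lreplace $item 0 0 $typeCode]
    }
    set out [list $typeCode]
    foreach child [lrange $item 1 end] {
        lappend out [tsnStripSize $child]
    }
    return $out
}

puts [tsnStripSize {L:2 {A:5 hello} {U2:2 34 56}}]   ;# L {A hello} {U2 34 56}
```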


See also: TSN_to_secs, Tcl, secs_to_TSN, hsms, secsport

To parse TSN messages, refer to the manual pages lindex, lrange, rset, scan, split and vset.

To create TSN messages, refer to the manual pages append, concat, format, join, lappend, list, lrange, lset, and string.


Ed Hume, Hume Integration Software, Austin, TX