You should try to master the conventional portion of SNOBOL4 first. When you're comfortable with it, you can move on to pattern matching. Pattern matching by itself is a very large subject, and this manual can only offer an introduction. The sample programs accompanying Vanilla SNOBOL4, as well as the many SNOBOL4 books available from Catspaw, can be studied for a deeper understanding of patterns and their application.
We'll begin by discussing data types, operators, and variables.
14 -234 0 0012 +12832 -9395 +0
These are incorrect in SNOBOL4:
13.4 fractional part is not allowed
49723 larger than 32767
- number must contain at least one digit
3,076 comma is not allowed
Use the CODE.SNO program to test different integer values. Try
both legal and illegal values. Here are some sample test lines:
Enter SNOBOL4 statements:
? OUTPUT = 42
42
? OUTPUT = -825
-825
? OUTPUT = 73768
Compilation error: Erroneous integer, re-enter:
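The same values can also be tried from a program file instead of CODE.SNO. The following is a minimal sketch, assuming the usual layout in which statements without a label begin with a blank or tab, comment lines begin with an asterisk, and END starts in the first column. Every value stays at or below the 32767 limit.

```snobol4
* Leading zeros are accepted; the value printed is 12.
        OUTPUT = 0012
* A signed value within the allowed range.
        OUTPUT = -9395
* The largest value this section treats as legal.
        OUTPUT = 32767
END
```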
Normally, the maximum length of a string is 5,000 characters, although you can tell SNOBOL4 to accept longer strings. A string of length zero (no characters) is called the null string. At first, you may find the idea of an empty string disturbing: it's a string, but it has no characters. Its role in SNOBOL4 is similar to the role of zero in the natural number system.
Strings may appear literally in your program, or may be created during execution. To place a literal string in your program, enclose it in apostrophes (') or double quotation marks ("). Either may be used, but the beginning and ending marks must be the same. The string itself may contain one type of mark if the other is used to enclose the string. The null string is represented by two successive marks, with no intervening characters. Here are some samples to try with CODE.SNO:
? OUTPUT = 'STRING LITERAL'
STRING LITERAL
? OUTPUT = "So is this"
So is this
? OUTPUT = ''
? OUTPUT = 'WHO COINED THE WORD "BYTE"?'
WHO COINED THE WORD "BYTE"?
? OUTPUT = "WON'T"
WON'T
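As a small sketch of the quoting rule in program form, the lines below enclose each kind of mark inside the other. SIZE is a built-in function that this section has not introduced; it is used here only on the assumption that reporting a string's length makes the null string easier to see.

```snobol4
* Double quotation marks inside an apostrophe-delimited literal.
        OUTPUT = 'SHE SAID "HELLO"'
* An apostrophe inside a literal delimited by double quotation marks.
        OUTPUT = "IT'S A STRING"
* The null string has length 0; "WON'T" has length 5.
        OUTPUT = SIZE('')
        OUTPUT = SIZE("WON'T")
END
```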
SNOBOL4 operators require either one or two items of data, called operands. For example, the minus sign (-) can be used with one object. In this form, the operator is considered unary:
-6
or as a binary operator with two operands:
4 - 1
In the first case, the minus sign negates the number. The second
example subtracts 1 from 4. The minus sign's meaning depends
on the context in which it appears. SNOBOL4 has a very simple
rule for determining if an operator is binary or unary:
Unary operators are placed immediately to the left of their operand. No blank or tab character may appear between operator and operand.
Binary operators have one or more blank or tab characters on each side.
The blank or tab requirement for binary operators causes problems for programmers first learning SNOBOL4. Most other languages make these white space characters optional. Omitting the right hand blank after a binary operator will produce a unary operator, and while the statement may be syntactically correct, it will probably produce unexpected results. Fortunately, blanks and binary operators quickly become a way of SNOBOL4 life, and after some initial forgetfulness there are few problems.
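A short sketch of how one missing blank changes a statement's meaning. The second statement relies on concatenation, the operator written as white space between two operands: with the right-hand blank omitted, the minus sign attaches to the 1, and the remaining blank joins 4 and -1 as strings.

```snobol4
* Blanks on both sides: binary minus, so this prints 3.
        OUTPUT = 4 - 1
* Blank on the left only: unary minus applies to 1, and the
* blank between 4 and -1 is read as concatenation, printing 4-1.
        OUTPUT = 4 -1
END
```

Both statements compile without complaint, which is why this mistake is easy to overlook.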
Operation: Assignment
Symbol: = (equals sign)
You've already met one binary operator, the equals sign (=).
It appeared in the first sample program:
OUTPUT = 'Hello world!'
It assigns, or transfers, the value of the object on the right
('Hello world!') to the object on the left (variable OUTPUT).
Operation: Arithmetic
Symbols: **, *, /, +, -
These characters provide the arithmetic operations -- exponentiation,
multiplication, division, addition, and subtraction
respectively. Each is assigned a priority, so SNOBOL4 knows
which to perform first if more than one appear in an expression.
Exponentiation is performed first, followed by multiplication,
division, and finally addition and subtraction. SNOBOL4 is
unusual in giving multiplication higher priority than division;
most programming languages treat them equally.
You may use parentheses to change the order of operations. Division of an integer by another integer will produce a truncated integer result; the fractional part is discarded. Try the following:
? OUTPUT = 3 - 6 + 2
-1
? OUTPUT = 2 * (10 + 4)
28
? OUTPUT = 7 / 4
1
? OUTPUT = 3 ** 5
243
? OUTPUT = 10 / 2 * 5
1
? OUTPUT = (10 / 2) * 5
25
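One way to check your understanding of the priorities is to write each expression twice, once bare and once with the grouping spelled out in parentheses. This sketch assumes only the priority order given above; each pair should print the same value.

```snobol4
* Multiplication outranks division, so both lines print 1.
        OUTPUT = 10 / 2 * 5
        OUTPUT = 10 / (2 * 5)
* Exponentiation, then multiplication, then addition: both print 19.
        OUTPUT = 1 + 2 * 3 ** 2
        OUTPUT = 1 + (2 * (3 ** 2))
END
```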
When the same operator occurs more than once in an expression,
which one should be performed first? The governing principle is
called associativity, and is either left or right. Multiple
instances of *, /, + and - are performed left to right, while
**'s are performed right to left. Again, parentheses may be used
to change the default order. Try a few examples:
? OUTPUT = 24 / 4 / 2
3
? OUTPUT = 24 / (4 / 2)
12
? OUTPUT = 2 ** 2 ** 3
256
? OUTPUT = (2 ** 2) ** 3
64
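The same paired-statement approach can illustrate associativity. In this sketch the parenthesized line of each pair shows the grouping that the bare line receives by default.

```snobol4
* Subtraction groups left to right: both lines print 12.
        OUTPUT = 20 - 5 - 3
        OUTPUT = (20 - 5) - 3
* Exponentiation groups right to left: both lines print 512.
        OUTPUT = 2 ** 3 ** 2
        OUTPUT = 2 ** (3 ** 2)
END
```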
Here's the first bit of SNOBOL4 magic: what happens if either
operand is a string rather than an integer or real number? The
action taken is one which is widespread throughout the SNOBOL4
language; the system tries to convert the operand to a suitable
data type. Given the statement
? OUTPUT = 14 + '54'
68
SNOBOL4 detects the addition of an integer and a string, and
tries to convert the string to a numeric value. Here the conversion
succeeds, and the integers 14 and 54 are added together. If
the characters in the string do not form an acceptable integer,
SNOBOL4 produces the error message "Illegal data type."
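A brief sketch of conversions that succeed. The conversion applies to whichever operand is a string, or to both, as long as each string forms an acceptable integer.

```snobol4
* Both operands are strings that form valid integers: prints 42.
        OUTPUT = '12' + '30'
* A string operand mixed with an integer operand: prints 42.
        OUTPUT = '6' * 7
END
```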
SNOBOL4 is strict about the composition of strings being converted to numeric values: leading or trailing blanks or tabs are not allowed. The null string is permitted, and converted to integer 0. Try producing some arithmetic errors:
? OUTPUT = 14 + ' 54'
Execution error #1, Illegal data type
Failure
? OUTPUT = 'A' + 1
Execution error #1, Illegal data type
Failure
Note: Error numbers are listed in Chapter 9 of the Reference Manual, "System Messages."
Operation: Concatenation
Symbols: blank or tab
This is the fundamental operator for assembling strings. Two
strings are concatenated simply by writing one after the other,
with one or more blank or tab characters between them. There is
no explicit symbol for concatenation (it is special in this
regard); the white space between two objects serves to define
this operator. The blank or tab character merely specifies the
operation; it is not included in the resulting string.
The string that results from concatenation is the right string appended to the end of the left. The two strings remain unchanged and a third string emerges as the result. Try a few simple concatenations with CODE.SNO:
? OUTPUT = 'CONCAT' 'ENATION'
CONCATENATION
? OUTPUT = 'ONE,' 'TWO,' 'THREE'
ONE,TWO,THREE
? OUTPUT = 'A' 'B' 'C'
ABC
? OUTPUT = 'BEGINNING ' 'AND ' 'END.'
BEGINNING AND END.
The string resulting from concatenation cannot be longer than
the maximum allowable string size.
The concatenation operator works only on character strings, but if an operand is not a string, SNOBOL4 will convert it to its string form. For example,
? OUTPUT = (20 - 17) ' DOG NIGHT'
3 DOG NIGHT
? OUTPUT = 19 (12 / 3)
194
In the first case, concatenation's right operand is the string
' DOG NIGHT', but the left operand is an integer expression
(20 - 17). SNOBOL4 performs the subtraction, converts the result
to the string '3', and produces the final result '3 DOG NIGHT'.
In the second example, the integer operands are converted to the
strings '19' and '4', to produce the result string '194'. This
is not exactly good math, but it is correct concatenation.
You must be careful, however. If you accidentally omit an operator, SNOBOL4 will think you intended to perform concatenation. In the example above, perhaps we omitted a minus sign and had really meant to say:
? OUTPUT = 19 - (12 / 3)
15
It is always possible for concatenation to automatically convert
a number to a string. But there is one important exception
when SNOBOL4 doesn't try to do this: if either operand is the
null string, the other operand is returned unchanged. It is not
coerced into the string data type. If the first example were
changed to:
? OUTPUT = (20 - 17) ''
3
the result is the INTEGER 3. You will use this aspect
of null string concatenations extensively in your SNOBOL4 programming.
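Because the integer 3 and the string '3' print identically, the difference is easier to see by asking for the data type of each result. The sketch below uses DATATYPE, a built-in function not introduced in this section, on the assumption that it returns the name of its argument's type as a string.

```snobol4
* Concatenating with the null string leaves the integer untouched,
* so this should report an integer type.
        OUTPUT = DATATYPE((20 - 17) '')
* Concatenating with a non-null string forces conversion,
* so this should report a string type.
        OUTPUT = DATATYPE((20 - 17) ' DOG NIGHT')
END
```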
Before we proceed, let's think about the null string one more time as the string equivalent of the number zero. First of all, adding zero to a number does not change its value, and concatenating the null string with an object doesn't change it, either. Second, just as a calculator is cleared to zero before adding a series of numbers, the null string can serve as the starting place for concatenating a series of strings.
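The calculator analogy can be sketched as a short program. The variable name RESULT is chosen here only for illustration; it starts as the null string and grows by one piece per statement.

```snobol4
* Clear the accumulator to the null string.
        RESULT = ''
* Append one piece at a time to the end of the current value.
        RESULT = RESULT 'ONE,'
        RESULT = RESULT 'TWO,'
        RESULT = RESULT 'THREE'
* Prints ONE,TWO,THREE
        OUTPUT = RESULT
END
```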
There aren't many interesting unary operators at this point in your tour of SNOBOL4. Most of them appear in connection with pattern matching, discussed later. Note, however, that all unary operations are performed before binary operations, unless precedence is altered by parentheses.
Operation: Arithmetic
Symbols: +, -
These unary operators require a single numeric operand, which
must immediately follow the operator, without an intervening
blank or tab. Unary minus (-) changes the arithmetic sign of its
operand; unary plus (+) leaves the sign unchanged. If the
operand is a string, SNOBOL4 will try to convert it to a number.
The null string is converted to integer 0. Coercing a string to
a number with unary plus is a noteworthy technique. Try unary
plus and minus with CODE.SNO:
? OUTPUT = -(3 * 5)
-15
? OUTPUT = +''
0
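As a sketch of the coercion technique, the program below keeps a numeric string in one variable and its converted form in another. The names TEXT and N are illustrative, and DATATYPE is a built-in function not introduced in this section, used on the assumption that it reports the type of its argument.

```snobol4
* TEXT holds the two-character string '42'.
        TEXT = '42'
* Unary plus converts the string to a number without changing its sign.
        N = +TEXT
* The first line should report a string type, the second an integer type.
        OUTPUT = DATATYPE(TEXT)
        OUTPUT = DATATYPE(N)
* Unary minus on the converted value prints -42.
        OUTPUT = -N
END
```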
A variable is a place to store an item of data. The number of variables you may have is unlimited, provided you give each one a unique name. Think of a variable as a box, marked on the outside with a permanent name, able to hold any data value or type. Many programming languages require that you formally declare what kind of entity the box will contain -- integer, real, string, etc. -- but SNOBOL4 is more flexible. A variable's contents may change repeatedly during program execution. The size of the box contracts or expands as necessary. One moment it might contain an integer, then a 2,000 character string, then the null string; in fact, any SNOBOL4 data type.
There are only a few rules about composing a variable's name when it appears in your program:
Here are some correct SNOBOL4 names:
WAGER P23 VerbClause SUM.OF.SQUARES Buffer
Normally, SNOBOL4 performs "case-folding" on names. Lower-case
alphabetic characters are changed to upper-case when they appear
in names -- Buffer and BUFFER are equivalent. Naturally, case-folding
of data does not occur within a string literal. Case-folding
can be disabled by the command line option /C.
In some languages, the initial value of a new variable is undefined. SNOBOL4 guarantees that a new variable's initial value is the null string. However, except in very small programs, you should always initialize variables. This prevents unexpected results when a program is modified or a program segment is reexecuted.
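A minimal sketch contrasting the two habits. NEVER.SET and TOTAL are illustrative names, and SIZE is a built-in function not introduced in this section, used here on the assumption that it returns a string's length.

```snobol4
* NEVER.SET has not been assigned, so it holds the null string: prints 0.
        OUTPUT = SIZE(NEVER.SET)
* Explicit initialization states the intended starting value.
        TOTAL = 0
        TOTAL = TOTAL + 5
* Prints 5
        OUTPUT = TOTAL
END
```

If the statements that update TOTAL were later executed a second time, the explicit assignment of 0 is what would restore a known starting point.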
You store something in a variable by making it the object of an assignment operation. You can retrieve its contents simply by using it wherever its value is needed. Using a variable's value is nondestructive; the value in the box remains unchanged. Try creating some variables using CODE.SNO:
? ABC = 'EGG'
? OUTPUT = ABC
EGG
? D = 'SHELL'
? OUTPUT = abc d (Same as ABC D)
EGGSHELL
? OUTPUT = NONESUCH (New variable is null)
? OUTPUT = ABC NULL D
EGGSHELL
? N1 = 43
? D = 17
? OUTPUT = N1 + D
60
? output = ABC D
EGG17
OUTPUT is a variable with special properties; when a value is
stored in its box, it is also displayed on your screen. There is
a corresponding variable named INPUT, which reads data from your
keyboard. Its box has no permanent contents. Whenever SNOBOL4
is asked to fetch its value, a complete line is read from the
keyboard and used instead. If INPUT were used twice in one
statement, two separate lines of input would be read. Try these
examples:
? OUTPUT = INPUT
TYPE ANYTHING YOU DESIRE
TYPE ANYTHING YOU DESIRE
? TWO.LINES = INPUT '-AND-' INPUT
FIRST LINE
SECOND LINE
? OUTPUT = TWO.LINES
FIRST LINE-AND-SECOND LINE
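The two special variables can be combined in a program file as well. This sketch assumes the program is run with the keyboard as its input; NAME is an illustrative variable name.

```snobol4
* Fetching the value of INPUT reads one complete line from the keyboard.
        NAME = INPUT
* Storing a value in OUTPUT displays it on the screen.
        OUTPUT = 'HELLO, ' NAME
END
```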
SNOBOL4 variables are global in scope -- any variable may be
referenced anywhere in the program.