Getting started with #snobol – a programming curiosity

Snobol is a computer language from the 1960s with some intriguing pattern-matching facilities. It was designed originally for the symbolic manipulation of polynomials. At first glance it looks like an “esoteric programming language” – a fun but useless language – but there is some real power in it. You could use it for parsing XML, for example, and that may turn out to be easier in Snobol than with many other toolkits out there. Think “regular expressions on acid”.

Snobol makes a virtue out of gotos, and, to me anyway, programs read like a bunch of jump tables.

Snobol’s successor is the Icon Programming Language, which is in turn succeeded by Unicon. I have not been able to produce working code out of Icon or Unicon, though. I seem to be having more luck with Snobol.

You can install Snobol4 from Arch’s AUR. If you are using a lesser distribution, then you will need to build it from source:

A Snobol statement has three parts to it: LABEL STATEMENT JUMP. LABEL is an optional identifying name for the statement, STATEMENT is the statement to be executed, and JUMP is an optional section where you jump to a label based on whether the statement succeeded or failed. A label has to start in the first column, so if you don’t want a label, you should use at least one space to start the line.
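To make the three parts concrete, here is a minimal sketch of a counting loop. The names count and again are made up for illustration, and I am assuming case folding is switched on (as the rest of this post does), so lowercase names like output work.

```snobol
* A comment line starts with an asterisk in column 1.
* Statements without a label begin with at least one space.
        count = 0
* "again" is a label, so it starts in column 1.
again   count = count + 1
        output = 'pass ' count
* LT succeeds while count is less than 3, so :S(again) loops back.
        LT(count, 3)                        :S(again)
END
```

The first two statements have no label and no jump, the line starting with again has a label, and the LT line has a jump. When LT finally fails, execution just falls through to END.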

So here’s the problem I was interested in solving: I have a string: ‘baz,,go,F.mAgic’ and I want to extract the string variables, which I define as “F.” followed by some alphabetical characters.

I can write the line:

 inp = 'baz,,go,F.mAgic'

which just sets the string inp. Note that there is a leading space. Next I define the letters of the alphabet:

 ALPHAS = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'

The complete program looks as follows:

   inp = 'baz,,go,F.mAgic'
   ALPHAS = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
myloop inp ('F.' SPAN(ALPHAS)) . hit REM . inp :F(finis)
   output = 'Computer says:' hit :(myloop)
finis output = 'Finished'
END

The line beginning myloop contains all three parts of a typical statement. The label name is myloop, the statement is

 inp ('F.' SPAN(ALPHAS)) . hit REM . inp

and the conditional part is

 :F(finis)

What the statement is doing is matching inp against the pattern ('F.' SPAN(ALPHAS)) (being “F.” followed by as many alphas as possible) and assigning the matched text to the variable “hit”, whilst the remainder of the string, matched by the built-in pattern REM, is assigned to the variable “inp” (in this case “re-assigned”). Strictly speaking this is a Snobol pattern rather than a regular expression.
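Here is a small sketch of the same pattern on its own, without the loop, so that the two assignments are easier to see. The variables text and rest and the label nope are my own names for illustration. This time the remainder goes into a separate variable instead of overwriting the subject.

```snobol
        ALPHAS = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
        text = 'x=1,F.abc,tail'
* Match 'F.' plus a run of letters. The matched text goes into hit,
* and whatever follows it in the subject goes into rest.
        text ('F.' SPAN(ALPHAS)) . hit REM . rest    :F(nope)
        output = 'hit  = ' hit
        output = 'rest = ' rest                      :(END)
nope    output = 'no match'
END
```

With that input, hit should end up as F.abc and rest as ,tail, because SPAN stops at the first character that is not in ALPHAS.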

Snobol is an interesting language because it has the notion of success and failure. A statement can either succeed (a match was possible) or it can fail (a match was not possible).

My jump condition :F(finis) means that in the case that the statement failed (signified by F), I should jump to the label finis. It is also possible to do a jump based on success (use S instead of F), or to jump unconditionally (just use a colon and a label enclosed in parentheses).

In my example, if I have finished all possible matches, I jump to the label finis. I could have written :F(END) which would have halted execution.
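The three kinds of jump can be sketched together in one short program. The labels found, missing and done are invented for this example, and the S and F jumps are combined in a single jump field.

```snobol
        line = 'hello world'
* Try to match the literal 'world' somewhere in line.
* Jump to found on success, or to missing on failure.
        line 'world'                    :S(found)F(missing)
found   output = 'matched'              :(done)
missing output = 'no match'
done    output = 'bye'
END
```

Here the match succeeds, so the program prints matched, jumps unconditionally over the missing line, and prints bye.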

If the statement succeeds in my program, it just executes the next line, which is

 output = 'Computer says:' hit :(myloop)

If I write output = …, it means to output whatever is on the RHS of the expression. So it prints the phrase ‘Computer says:’ followed by the value of hit. I then unconditionally jump to myloop again.

So basically, I search for all matches to the pattern, consuming the string as I go, and bail out when I can’t match any more.
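The consuming behaviour is easier to see with more than one match in the input. This is a sketch using a made-up input string and the :F(END) variation, so it stops as soon as no more matches are found.

```snobol
        inp = 'F.one,x,F.two,y'
        ALPHAS = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
* Each pass captures one match in hit and shrinks inp to whatever
* followed that match, so the next pass starts after it.
loop    inp ('F.' SPAN(ALPHAS)) . hit REM . inp      :F(END)
        output = hit                                 :(loop)
END
```

The first pass leaves ,x,F.two,y in inp, the second leaves ,y, and the third pass fails and ends the program, so it should print F.one and then F.two.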

The line

finis output = 'Finished'

is a courtesy message saying that we’ve finished the program.

The program then executes END, which obviously stops the program.

Suppose I save the program in a file called spat.sno. I can then execute it using a command like:

snobol4 -b spat.sno

The -b flag just suppresses some banner information that I don’t want printed.

The output from the program is

Computer says:F.mAgic
Finished

I have made everything available as a gist. Good fun. Very retro. And it could be quite a useful programming language.


About mcturra2000

Computer programmer living in Scotland.