#forthlang: Problem Oriented Language

I saw a construct in Perl 5:

use vars qw($var1 $var2);

It is a pragma that predeclares package (global) variables, so it is roughly equivalent to

our $var1; our $var2;

I guess there must be an equivalent in Raku, but I don’t know what it is.

There’s a bigger idea waiting to get out of that simple statement: that of using something (the “vars”) on something else (the qw). It’s not function application, but macro application.

I played for a little while with Raku’s experimental macro facility to see if I could create a macro that did the equivalent of my. So I could say something like:

mymy $foo;

and that would be the exact equivalent of

my $foo;

I couldn’t figure it out. Any rakuers out there know how to do it?

I’m beginning to wonder if the idea of creating metalanguages via macros is going to be a busted flush.

Slangs are good, but look complicated, and I haven’t committed to making one, yet.

Let’s turn to Forth, now. I’ll be using my own implementation throughout this post. You could undoubtedly do the same thing using a standardised Forth.

The usual way of declaring a variable in Forth is as follows:

variable foo

Nothing difficult about that. Suppose you want to be able to create a whole bunch of them in one go. Repeatedly typing the word variable is tedious. It would be nice if we could do something like:

vars: foo bar baz

Can we do that in Forth? Of course we can! Here’s an example implementation:

: vars: begin parse-word dup while $create 0 , repeat drop ;

Don’t worry if you don’t understand the explanations which follow. My aim is more to get you hyped about the possibilities in Forth, rather than a deep understanding.

Here, : defines a word (in this case vars:), ; finishes the definition, and begin … while … repeat is obviously a looping construct.

We call parse-word to find the next word in the input stream. The first time round the loop, it will be “foo”. When there is nothing left to parse, the while test fails and the loop ends. Otherwise we call $create, which creates that word in the dictionary. “0 ,” lays down the value 0 onto the heap. So it basically does the same as “variable”, except that $create takes the variable name from the stack rather than from the input stream.

Actually, here is how the word VARIABLE can be defined:

: VARIABLE create 0 , ;

It is very close to the fragment

$create 0 ,

that we used in our own definition.
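For readers without my Forth to hand, here is a rough sketch of the same idea in standard Forth (Forth 2012, as found in Gforth). It is an assumption-laden translation rather than my actual code: there is no $create in the standard, so this version peeks at the input with PARSE-NAME, rewinds >IN, and lets the ordinary VARIABLE re-read the name. It only handles names on the current line.

```forth
\ Sketch in standard Forth (Forth 2012), not the dialect used in this post.
: vars: ( "name1" ... "nameN" -- )
  begin
    >in @            \ remember where parsing will resume
    parse-name nip   \ length of the next name; 0 at end of line
  while
    >in !            \ rewind so VARIABLE sees the same name
    variable
  repeat
  drop ;             \ discard the unused saved position

vars: foo bar baz
42 bar !
bar @ .              \ prints 42
```

Standard VARIABLE does not promise an initial value, which is why the example stores to bar before fetching from it.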

Here’s how we would type out a string in Forth:

"hello" type

Suppose, as an exercise, we wanted to swap the order. You could do it like this:

' type "hello" swap execute

What is going on here? The tick (') operator looks at the next word in the input stream (in this case TYPE) and puts its execution token on the stack. Basically, that is a pointer to a function (with some frills). “hello” puts a pointer to the string “hello” onto the stack. SWAP swaps the order of the string pointer and the execution token so that the latter is now on top of the stack. EXECUTE executes the token at the top of the stack. In other words, it calls TYPE. TYPE takes the top of the stack (now “hello”) and prints it.
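In a standard Forth the same trick looks slightly different, because a string is an address/length pair rather than a single pointer. A minimal sketch, assuming a system such as Gforth where s" also works at the interpreter prompt:

```forth
\ Standard Forth sketch: s" leaves two cells ( c-addr u ),
\ so the execution token sits two cells down and ROT brings it up.
' type s" hello" rot execute      \ prints hello

\ Inside a definition, ['] takes the place of '
: greet ( -- ) ['] type s" hello" rot execute ;
greet                             \ prints hello
```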

We can wrap these concepts into a word:

: APPLY ' parse-word pt , ; immediate

Now we can write

apply type "hello"

within a compiled word.

I have introduced two new words here: PT, and IMMEDIATE. PT stands for “process token”. It has somewhat complex behaviour, but in this case, it will take the string on the top of the stack and embed it into the heap. The comma then embeds what’s left on the stack (in this case the execution token for “TYPE”).

IMMEDIATE marks the latest word being defined (in this case APPLY) as an immediate word, i.e. a word that is executed straight away, rather than being compiled into the heap during compilation.
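To show the immediate-word idea outside my own Forth, here is a hedged sketch in standard Forth. PT does not exist there, so this version hands the argument to EVALUATE, which compiles it because we are in compile state, and then uses COMPILE, to lay down the action. The argument has to be a single whitespace-delimited token, so a number or a word name works but a quoted string with spaces does not.

```forth
\ Sketch in standard Forth (Forth 2012); EVALUATE stands in for PT.
: apply ( "action" "arg" -- )
  ' >r                  \ xt of the action, parked on the return stack
  parse-name evaluate   \ compile the argument (a number becomes a literal)
  r> compile, ;         \ then compile a call to the action
immediate

: f apply . 12 ;
f                       \ prints 12
```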

I’ve heard it said that DOES> is the pearl of Forth. I actually think it might be IMMEDIATE, as it allows us to achieve macro-like behaviour.

Let’s be even more ambitious! Suppose we wanted to apply the same word to several objects. I chose to define the words “<<” and “>>” to bracket the collection. So then I’d be able to do something like:

: f << type "foo" "bar" "baz" >> ;

and it will print out foobarbaz. Here’s an implementation:

: << ' begin parse-word dup ">>" str= not 
     while pt dup , repeat drop drop ; immediate

OK, a bit hairy that one. In simple terms, what it’s doing is taking the first word as an action (via the tick) and parsing each subsequent word. If that word is “>>”, then it knows it has finished. Otherwise, it embeds the word onto the heap, embeds the action onto the heap, and loops.
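The same bracket can be sketched in standard Forth. As before, this is an approximation under assumptions: EVALUATE replaces PT, each item must be a single token, and the whole << ... >> group must sit on one line. The ABORT" guard is my addition so that a forgotten >> does not loop forever.

```forth
\ Sketch in standard Forth (Forth 2012), not the dialect used in this post.
: << ( "action" "item1" ... ">>" -- )
  ' >r                         \ keep the action's xt on the return stack
  begin
    parse-name dup 0= abort" << is missing its closing >>"
    2dup s" >>" compare        \ non-zero while the token is not >>
  while
    evaluate                   \ compile the item
    r@ compile,                \ compile the action right after it
  repeat
  2drop r> drop ; immediate

: f << . 12 13 >> ;
f                              \ prints 12 13
```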

Here’s another example:

: f << . 12 13 >> ;

Running F will output 12 and 13 to the console. We can decompile F to see what << actually did. Typing SEE F gives the output:

: f

It compiled in a literal 12, the DOT (to print the top of the stack), a literal 13, and another DOT.

Maybe that was too much work for too little payoff. It does, however, lift our code into something that looks much more like a POL (Problem-Oriented Language), aka DSL (Domain Specific Language).

In my accounts package, for example, there are things called ntrans (normal transactions) and etrans (equity transactions). I keep a transaction file. It is a plain-text file. Here is a typical entry:

ntran 2020-02-05 msc cb 2.53 "rs components"

It declares itself to be a normal transaction, has a date, a debit account, a credit account, an amount, and a description.

Now, in a language like Perl I would have to separate out the command and dispatch on it, then separate out the components for the fields (date, etc.). I guess I could write a grammar. I doubt that it would be as compact as what I might be able to produce in Forth:

: ntran << !! date dr cr amount desc >> ..ntran ;

Here, I have defined the word “NTRAN” to exactly coincide with the commands in my file. No need to separate parsing from dispatching on commands. The << and >> are as I defined above. DATE, DR, etc. are just predeclared variables, and !! is defined so that it parses the input stream and puts the values that it finds into the variables. “..NTRAN” then does some further processing.
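To make the “file line is the program” idea concrete, here is a much plainer sketch in standard Forth that does not use << or !! at all. Every name in it (t-date, field!, quoted!, show-ntran) is hypothetical and chosen for illustration; in particular I avoid calling a variable cr, since CR is already a standard word. Each field is stored as an address/length pair pointing into the current input line, so the values are only valid until the next line is read.

```forth
\ Sketch in standard Forth (Forth 2012); all names here are illustrative.
2variable t-date    2variable t-dr    2variable t-cr
2variable t-amount  2variable t-desc

: field!  ( a-addr "token" -- )  parse-name rot 2! ;
: quoted! ( a-addr "<quote>text<quote>" -- )
  [char] " parse 2drop        \ skip up to and including the opening quote
  [char] " parse rot 2! ;     \ keep the text up to the closing quote

: .field ( a-addr -- )  2@ type space ;
: show-ntran ( -- )
  t-date .field  t-dr .field  t-cr .field  t-amount .field  t-desc .field ;

: ntran ( "date" "dr" "cr" "amount" "desc" -- )
  t-date field!  t-dr field!  t-cr field!  t-amount field!  t-desc quoted!
  show-ntran ;

ntran 2020-02-05 msc cb 2.53 "rs components"
```

Loading the transaction file with the interpreter is then all the parsing and dispatching there is: each line starts with a word, and that word consumes the rest of the line.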

This, to me, is the real beauty of Forth. Its syntax can be what you want it to be, and you can produce very declarative-looking expressions. At its highest level, it looks like a home-brewed EBNF; a grammar language that I chose myself.

That’s not just agile development, that’s ninja development!

Truthfully, my Forth skills aren’t that good. I have yet to wrap my head around the way “proper” Forths work. My own Forth is based on my stab at understanding them. As the saying goes, when you’ve seen one Forth, then you’ve seen one Forth. If I were to write a Forth again, I might actually be tempted to use the excellent JonesForth as a basis. Actually, I find that if one looks at various Forths, one inevitably finds little quirks and twists that make you think “hey, that’s a cool idea, I want that in my Forth.”

A couple of ideas I borrowed from John Walker’s atlast Forth, for instance, are to dispense with the float stack and just use the data stack, and to write strings as regular strings (“like so”) rather than z” like so”. I also introduced a lexical type, so that I can play with grammars.

Ficl is a Forth which says you can change the tokeniser. Way cool! Maybe it’s a way of producing a really good compiler.

One thing about Forth is that it does take a while to get into. I’m attracted to it because of the elegance with which you can create DSLs. I’m amazed by some of the mind-blowing stuff that real Forthers have produced, and how you can, potentially, create programs that have an elegant simplicity, almost as if they were haikus.

One needs to gain sufficient proficiency with IMMEDIATE, POSTPONE, and the meta stuff in order to do this. It’s not an easy journey. Most tutorials on Forth just tell you how to jiggle the stack. No wonder people are put off.

One thing I have been contemplating lately is incorporating polymorphism into Forth.

The desirability of generics is actually fairly evident from when you start a Forth implementation. One thing you do is fill TIB (the input buffer) using fgets, which you then parse. It is also desirable to be able to parse strings. How to do that? Well, my own Forth currently avoids that issue by not doing it.

You could do dispatching on type: fill TIB one way if you’re reading from a file, and another if you’re using a string.

The alternative is to use polymorphism, where you have a table of virtual functions. It seems more difficult to implement, but then again, you do have a mechanism that is more easily extended.

When you think about going down that route, you begin to wonder how else to use it. For example, why do we use “.” to output the top of the stack as an integer, “F.” to output a float, and “TYPE” to output a string? Why not just “.”?
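A table of virtual functions is cheap to sketch in standard Forth. The following is only an illustration of the idea, not something my Forth does: because the stack is untyped, each value has to travel with its table, and the names int-vt, str-vt and print are invented for the example. Strings are passed as counted strings so that every value fits in one cell.

```forth
\ Sketch in standard Forth; one-slot "vtables" holding a print method.
: .int ( n -- )       . ;
: .str ( c-addr -- )  count type space ;

create int-vt  ' .int ,      \ table for integers
create str-vt  ' .str ,      \ table for counted strings

: print ( x vt -- )  @ execute ;   \ fetch slot 0 and call it

: greeting ( -- c-addr )  c" hello" ;

42 int-vt print              \ prints 42
greeting str-vt print        \ prints hello
```

Adding a float printer would mean adding one more table, with no change to print itself, which is the extensibility argument made above.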

Maybe that introduces too much complexity. One might argue that polymorphism is just a fad, a simple “inversion of control”. But there’s an interesting talk by Rich Hickey: Simple Made Easy. He talks about the difference between simple, easy, and complex. It’s a very interesting talk.

One thing he mentions is that the roots of “simple” are “sim” and “plex”, which means “one twist”. The opposite is complex, with multiple twists, or “braided together”. Thus switch statements are complex according to this definition, as they represent different execution strands that have been corralled together. Polymorphism doesn’t do this, and is therefore simple.

Well, I hope that’s whetted your appetite for interesting ideas in languages. Have fun!

About mcturra2000

Computer programmer living in Scotland.