“Will Hare replace C? Or Rust? Or Zig? Or anything else?” (link)
Zig, Nim, Rust, Odin, and occasionally D, as languages that appear on Hacker News.
It inspired me to start mucking around with a silly exercise. It is based on a soup of ideas: bootstrapping C implementations, quines (self-replicating code), Forth, and Val Schorre’s paper on Meta II (link)
So I came up with Daft, a minimalist language that can produce a quine of itself. Here is a quine in Daft:
S>S2TT
>S2TT
What the hell, and what’s the point?
Meta II is a “meta-circular compiler”, which means that it can compile itself. The problem is that it needs it’s own assembly instructions to do it. You have to write that assembler. So we have the classical chicken-and-egg problem.
My idea is to dispense with writing that assembler, and instead writing a different assembler, an assembler that is much easier to implement.
Forth seems an interesting idea, but can we make it even simpler? Yes, we can. We use ASCII characters as opcodes. This is what the above “program” uses. Specifically:
S - push the current heap pointer onto a stack, read a string from stdin until the newline, putting the read string on the heap
2 - duplicate the top item on the stack
> - read the next char from stdin, and echo it to stdout
T - type the zero-terminated string pointed to by the top of the stack to stdout, then a newline, and pop the top of the stack
With just 4 instructions, we are able to produce a quine.
Now, admittedly, this isn’t enough in itself to make a general-purpose compiler. It’s just a thought experiment. There are no test conditions or jump instructions, for example, that would be needed to create a compiler.
Here is the implementation of the interpreter:
#include <stdio.h>
int stack[100];
int sidx = 0;
char heap[1000];
int hidx = 0;
int pop(void) { return stack[--sidx]; }
void push(int val) { stack[sidx++] = val; }
void dup(void) { stack[sidx++] = stack[sidx-1]; }
int main()
{
int c;
while((c = getchar()) >=0) {
switch(c) {
case 'S':
push(hidx);
while((c = getchar()) != '\n') {
heap[hidx++] = c;
}
heap[hidx++] = 0;
break;
case '2':
dup();
break;
case 'T':
puts(heap + pop());
break;
case '>':
putchar(getchar());
break;
}
}
return 0;
}
40 lines of code. That’s all. The question is: what would we need to add to have the same power as Meta-II?
And what will this buy us? Well, we could create a grammar for some language that we were interest in writing. Maybe it’s not a computer language, maybe it’s a string for something like RTTTL, which is a ringtone description language. I did one of those in C using lex and yacc.
Meta II and lex and yacc require separate compilation phases. My system could potentially embed an interpreter inside the runtime itself, like Forth, but geared more towards generalised language. The whole thing is so small that you could use it in a microcontroller.
To get in and out of the interpreter, maybe you could use a symbol like “¬”. code blocks could be implemented between “[” “]” pairs, with escape sequences for strings. Bracketed pairs would be inefficient, but then it would mean that you don’t have to implement backtracking of addresses, and all that.
Anyway, just some ideas. I might see if I can develop it further.
Hey, WordPress seems to be getting worse by the month.
Link to github code.