Toying with a data query #grammar in #perl6

I have written an accounts package in C++. I have only a minuscule knowledge of COBOL, which ought to be well-suited to the task, but I have found it rather cumbersome to wrangle. COBOL is still interesting, though, in the sense that it tries to be a declarative language for exactly this kind of domain.

The thought occurred to me that it would be great if we had a much better DSL for handling data. COBOL, but better, plus SQL, but better, and as a first-class construct of the language, and untethered to a database.

Perl 6 seemed an obvious choice for this purpose.

My experiment is very preliminary at the moment, and the code consists of ~100 lines. You can view the complete source here.

So, you create a record structure like so:

record person
        name string;
        age  int;
end-record

That’s not spectacular so far, of course. Pascal, COBOL, C, etc. already give you syntax for creating structures. It’s where we’re going that is perhaps more important. The grammar for this is straightforward:

grammar rec {
	token TOP { <record-spec>* }
	token record-spec { <ws>* 'record' \s+ <rec-name> <field-descriptor>+ <ws> 'end-record' <ws> }
	token rec-name { \S+ }
	token field-descriptor { <ws>* <field-name> <ws>+ <field-type> <ws>* ';' }
	token field-name { \S+ }
	token field-type { <[a..z]>+ }
	token ws { <[\r\n\t\ ]> }
}

Due to my newness with P6 (perl 6), construction took a long time. I should also probably be working in terms of rules and protos instead of just using tokens all the time. The feature set of P6 is vast, so just getting something working is good enough for me at this stage.
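
For what it’s worth, here is a sketch of how a rule-based version might look. Rules imply :sigspace, so most of the explicit <ws> handling melts away. To be clear, this is only a guess at a tidier form (the name ‘rec2’ is mine), not what the working code above uses; protos would come into play once there are several statement types to dispatch between.

# a sketch only: rules instead of tokens, so whitespace is handled implicitly
grammar rec2 {
        rule TOP { <record-spec>* }
        rule record-spec { 'record' <rec-name> <field-descriptor>+ 'end-record' }
        rule field-descriptor { <field-name> <field-type> ';' }
        token rec-name { \S+ }
        token field-name { \S+ }
        token field-type { <[a..z]>+ }
}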

Two classes naturally suggest themselves: Fields, which store information about the name and type of a field, and Recs, which aggregate the fields into a record. The definition of a Field is straightforward:

class Field {
        has Str $.namo is rw;
        has Str $.type is rw;
}

A Rec is a little more complicated:

class Rec {
        has Str $.namo;
        has Field @.fields;
        has %.flup; # look up from field name to an indexed number
        has @.fnames; # field names

        method add_field(Field $f) {
                @.fields.push: $f;
                @.fnames.push: $f.namo;
                %.flup{$f.namo} = %.flup.elems;
        }
}

‘namo’ is the name we want to assign to the record, and ‘fields’ is a list of the component Fields. We also want a couple of convenience variables: ‘flup’, to obtain an index number for the position of a field, and ‘fnames’, just the names of the fields as an array.
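
Just to make that concrete, here is a quick (made-up) usage check of the two classes:

my $f1 = Field.new(namo => 'name', type => 'string');
my $f2 = Field.new(namo => 'age',  type => 'int');
my $r  = Rec.new(namo => 'person');
$r.add_field($f1);
$r.add_field($f2);
say $r.fnames;       # [name age]
say $r.flup{'age'};  # 1, because 'age' is the second field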

We create a class of actions, qryActs, which creates some data structures for us:

class qryActs {
	has Rec %.recs is rw;

	method record-spec ($/) { 
		my $r = Rec.new(namo => $<rec-name>.Str);
		for $<field-descriptor> -> $fd {
			$r.add_field($fd.made);
		}
		%.recs{$<rec-name>.Str} = $r;
	}

	method field-descriptor ($/) { make Field.new(namo => $<field-name>.Str, type => $<field-type>.Str); }
}

We parse our textual description ($desc) of the record(s) using:

my $qa = qryActs.new;
my $r1 = rec.parse($desc, :actions($qa));

and we extract the column names for the record like so:

my @cols = $qa.recs{"person"}.fnames;

Let’s define some inputs to play with:

my $inp = q:to"FIN";
adam    26
joe     23
mark    51
FIN

and extract them to an array of arrays, splitting up the input by newlines and whitespace:

my @m = (split /\n/, (trim-trailing $inp)).map( -> $x { split /\s+/, $x ; } );

‘@cols’ is useful, because we can use the P6 module ‘Text::Table::Simple’ to print a table:

use Text::Table::Simple;

sub print_table(@data) {
        lol2table(@cols, @data).join("\n").say;
}

and print out what we have so far:

print_table @m;
O------O-----O
| name | age |
O======O=====O
| adam | 26  |
| joe  | 23  |
| mark | 51  |
--------------

Ideally we want to incorporate this into our grammar; perhaps something like this:

import tabbed file "mydata.txt" of person into people
show people
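
None of this is implemented yet; the following is only a guess at how the grammar might grow to accommodate such statements (all the rule and token names below are invented):

grammar rec-with-io {
        rule TOP { <statement>* }
        rule statement { <import-stmt> | <show-stmt> }
        rule import-stmt { 'import' <format> 'file' '"' <file-name> '"'
                           'of' <rec-name> 'into' <table-name> }
        rule show-stmt { 'show' <table-name> }
        token format { 'tabbed' | 'csv' }   # more formats in due course
        token file-name { <-["]>+ }         # anything up to the closing quote
        token rec-name { \S+ }
        token table-name { \S+ }
}

An actions class would then do the actual file reading and stash the resulting rows under the table name.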

An inlining facility would be useful. There is a world of possibilities for importation: we may want to import CSV files, QIF files, all sorts. Perhaps some way of extending the language is required, given the general nature of importation. Or perhaps an external utility would be a better and more flexible approach.

What we would like next is a way to filter data. Suppose we wanted a table of all people who were less than 50 years old. We want to write ‘age < 50’ for such a query. I created a new grammar to handle that:

grammar predi {
        token TOP { <ws>* <arg> <ws>* <rel> <ws>* <arg> <ws>* }
        token arg { <field-name> | <value> }
        token field-name { <[a..z]> \S+ }
        token value { <[0..9]>+ }
        token ws { <[\r\n\t\ ]> }
        token rel { '<' }
}

and a function that calls the grammar to create a subset:

sub filter-sub($pred-str) {
        my $pr = predi.parse($pred-str);

        sub get-val($idx, $row) {
                my $v = $pr<arg>[$idx];
                my $ret;
                if $v<field-name>:exists {
                        my $fnum = $qa.recs{"person"}.flup{$v};
                        $ret = $row[$fnum];
                } else {
                        $ret = $v<value> ;
                }
                $ret;
        }

        my @filtered;
        for @m -> $row { 
                my $v1 = get-val(0, $row);
                my $v2 = get-val(1, $row);
                if $v1 < $v2 { @filtered.append: $row; }
        }
                        
        @filtered;
}

The grammar is only primitive at the moment. It does not allow for logical operations, and has ‘<‘ hard-wired as a comparison operator. Still, it is at least possible to do:

my @some = filter-sub("age < 50");
print_table @some;
O------O-----O
| name | age |
O======O=====O
| adam | 26  |
| joe  | 23  |
--------------

How cool is that?

There are many, many directions in which this idea can expand. As a first step, the predicate logic really needs to be completed and merged into the main grammar.
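
For instance, the rel token could be widened and the hard-wired comparison swapped for a small dispatch table. This is just a sketch, assuming we keep the current two-argument predicate shape (the names ‘predi2’ and ‘%cmp’ are mine):

grammar predi2 {
        token TOP { <ws>* <arg> <ws>* <rel> <ws>* <arg> <ws>* }
        token arg { <field-name> | <value> }
        token field-name { <[a..z]> \S+ }
        token value { <[0..9]>+ }
        token ws { <[\r\n\t\ ]> }
        token rel { '<=' | '>=' | '<' | '>' | '=' }
}

# map each operator to a two-argument test
my %cmp =
        '<'  => -> $a, $b { $a <  $b },
        '<=' => -> $a, $b { $a <= $b },
        '>'  => -> $a, $b { $a >  $b },
        '>=' => -> $a, $b { $a >= $b },
        '='  => -> $a, $b { $a == $b };

# inside filter-sub, the test would then become something like:
#     if %cmp{$pr<rel>.Str}($v1, $v2) { @filtered.append: $row; }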

One possibility for extension: maybe we do not know the names or the types of the records initially. So there would need to be a way of creating data structures on the fly, as you might need for a generic dataframe library.
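
A rough sketch of that idea: build a Rec at runtime from a header line, defaulting every column to ‘string’ until we can do better (the sub name is invented):

sub rec-from-header(Str $name, Str $header --> Rec) {
        my $r = Rec.new(namo => $name);
        for $header.split(/\s+/) -> $col {
                $r.add_field(Field.new(namo => $col, type => 'string'));
        }
        $r;
}

my $dyn = rec-from-header('person', 'name age');
say $dyn.fnames;   # [name age]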

Conversely, maybe you do know the record layout ab initio, and you would like to generate static C++ or COBOL code as a back-end. You would then be able to create a very fast processor.
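
As a crude illustration of what that back-end might look like: walk a Rec and emit a C++ struct. The %cpp-type mapping and the to-cpp sub are my own inventions, not part of the current code:

my %cpp-type = string => 'std::string', int => 'int';

sub to-cpp(Rec $r --> Str) {
        my @out = 'struct ' ~ $r.namo ~ ' {';
        for $r.fields -> $f {
                @out.push('    ' ~ %cpp-type{$f.type} ~ ' ' ~ $f.namo ~ ';');
        }
        @out.push('};');
        @out.join("\n");
}

say to-cpp($qa.recs{"person"});
# struct person {
#     std::string name;
#     int age;
# };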

Other obvious extensions: support for derived fields, report-writing, nested records, user-level record editing, table joining, heuristics for guessing datatypes (think dataframes), statistics, effortless serialisation, and grouping.

One last thing. Although the idea of using natural languages for programming is discredited, I’m wondering: what about constructed languages? Here, I am thinking along the lines of Esperanto or Lojban. Esperanto, because you can always deduce the object, subject, adjectives, etc. in a sentence merely by looking at the word. Lojban, because it is apparently an unambiguous language aimed at the precise expression of ideas. An idea so insane that it might just work?

 


Debenhams disappoints market

DEB (Debenhams) issued its interims and strategy review today. Shares are down 3.5% to 53.35p as of writing, so evidently not what the market was expecting.

LFL (like-for-like) sales were up 0.5%, but gross margins were down 30bps. EBITDA was down 2.5%. Group profit before tax was in line with expectations, and net debt was down. So coasting along, nothing exciting to report.

Looking at my records, I bought some DEB shares in Jun 2015 at 94p, where, if I recall correctly, it had a high Stockopedia ranking. I sold in Dec 2016 at 55p. DEB seemed lackluster, and I decided to move on. I see that I hadn’t missed much.

DEB shares have been trading sideways since Jul 2016; much like its underlying business, I see. It’s fair to say that DEB’s best days are behind it, and now it just seems to bumble along. Maybe a turnaround is possible, but I suspect its long-term fate is ultimately not a happy one. There’s still some life in it yet, though.

The Stockopedia score is 74, so I would not be looking to buy. On the other hand, its value score is 98, PE is 8.6, dividend yield is 6.2%, PBV is 0.77, and EV/EBITDA is 4.22. So there’s clearly some value there if you don’t mind a bit of a wet blanket for a share.

If trading improves in the next report, then their shares should do well. On the balance of probabilities, I would say that it won’t happen, though. Things seem a bit tight retail-wise, so a sudden turnaround in fortunes seems far from a done deal.

I have not been in Debenhams in many years, and my own feeling on them was that their goods were a bit pricey.

I’m sure Paul Scott will be commenting on DEB today, and it will be interesting to see if our views match. If not, then just do what Paul says.

Meanwhile, I’m off to Curry’s today: I am interested in buying a second monitor-cum-TV for my computer. My old TV is on its last legs, and I thought I would use that opportunity to upgrade both my computing rig and my TV. I am not looking for a large TV, but I do want it to be 1080p.

But I digress.

Stay safe out there.

53.35p.


Red pencillers

I have just started reading “Software tools in Pascal”, by Brian W. Kernighan and P.J. Plauger. Even better, I am “borrowing” it for free from Open Library.

My interest was prompted by my work on “neoleo”, my fork of the GNU oleo spreadsheet program. The program is a real rat’s nest of code, where simple modifications are difficult. I am interested in ripping most of the interface code from it, and creating a kind of “neoleo is to spreadsheets what ed is to text editors”.

Neoleo consists of ~55k of (mostly) C code. It would be great if I could reduce that code by 90%. I have already stripped out its Motif code. Taking out the X11 interface is high on my agenda. Neoleo has an “abstract interface” designed to work with curses, Motif and X11. By reducing the code to only one interface, there would no longer be any need for this abstraction.

The interface code consists of a very large body of code, and it is complicated. It does keymap translations, window frame stacking, keymaps which are context-sensitive to the type of window frame, and so on. If I were to eliminate all this interfacing, thereby reducing neoleo to a program that processes stdin and outputs to stdout, I could eliminate a lot of code and complexity. It has yet to be determined if I could remove 90%, but it should be a high percentage. Migrating to C++ should also help reduce the code base, and memory leaks, too.

Modern computers are more accessible than they have ever been, what with office suites and so on. Whilst on the one hand that is good, I think we have lost the point of what computers are supposed to do: to automate tasks. With all this modernity, we have managed to regress the computer from an automatic processing machine to a device that has to be used manually.

That being the case, I think we need to roll back our thinking. Here is what the book’s authors said, even back in 1981 [emph mine]:

Suppose you have a 5000-line Pascal program and you need to find all references to the variable time … How would you do it?

One possibility is to get a listing and mark it up with a red pencil. But it doesn’t take much imagination to see what’s wrong with red-pencilling a hundred pages of computer paper. …

Far too many programmers are red pencillers. Some are literal red pencillers who do things by hand that should be done by machine. Others are figurative red pencillers whose use of the machine is so clumsy and awkward that it might as well be manual.

The book explains how UNIX programs could be constructed, written by a guy who was instrumental in UNIX’s creation.

I think it behooves us as programmers to read through this old material with an eye to how we approach writing applications.

It is time to recount my XML anecdote: a few years ago, I was working on a Fortran program (not my own) that took a plain text file containing the daily oil production for a platform. The file basically consisted of several arrays, of fixed size, with each value on its own line.

Fortran is really good at reading arrays from files: you just use the read statement. Think about it: with just a few of these statements, you could load the data, no parsing necessary. Fortran gave you that.

One day, I happened to pass by a meeting in which an outside programming consultant from a big IT firm was chatting with some of our oil and gas non-programming consultants. I overheard him say something along the lines of “and of course, if we think we need that output, we could always put it in the XML file”.

I was thinking to myself, great, now to read in data we will have to rely on third-party libraries and scan through a data hierarchy in order to find the data we need.

I should also point out that the Fortran program compiled without incident when I installed it on my machine. I was amazed that compiling programs could Just Work (TM). If this XML idea was introduced, then it would be very difficult to integrate into Fortran, and create many external dependencies.

That, my friends, is why we need to relearn the computing lessons from the 50’s, 60’s and 70’s in order to create high-quality programs. And why I am reading the book.

 



#Fedora Linux is a clusterfark

I have not used Fedora for a few years, so I thought it was worth a try before the new Ubuntu came out. I downloaded Fedora-Workstation-Live-x86_64-25-1.3.iso (where are the torrents?), and installed it to my machine.

Installation was fairly straightforward, but not as intuitive as Ubuntu.

Minor grumbles are that the desktop failed to default to a UK keyboard, even though I set it up during installation, and that it played havoc with my clock when I rebooted into Windows. Fedora had evidently mixed up local and universal time for the hardware clock. I had undoubtedly overlooked a configuration option during installation.

But the situation deteriorated. There were a number of other problems which peeved me. I wrote them down, but threw the piece of paper away. So the rest of this article is a watered-down version of what I experienced.

Let’s install vim. Oh, turns out vim is incompatible with vim-minimal, or whatever, which is already installed. You do get vi, though. So let’s get rid of vi … except that it also removes “sudo”. Really? Yes, really. They seem to be mutually-dependent. If you have one, you have to have the other. Great; not. It turns out that you need to install vim-something-else (my memory fails me at this point), rather than vim. It seems like unnecessarily-confused packaging to me.

Carrying on …

I like the program “cdargs”, which is a command-line directory selector. It’s an old program, but it does its job well. The package is no longer available in Fedora, which is disappointing. Arch and Ubuntu support it, so why not Fedora?

OK, so I decided to download the sources and compile it myself. I also needed to install autoconf in order to create the configure script. When I ran autoreconf, though, it said something about Perl4 being an incompatible architecture. WTF, dude?

As far as I can tell, this is not a problem with my downloaded sources of cdargs, but with the Fedora distro itself. I can only presume that the powers that be had some weird incompatible config parameters set when they built the binaries. This is an unforgivable sin, as it means that you basically cannot build software on the system.

At this point, I came to the conclusion that Fedora was not for me.

The desktop looked pretty; so there’s that, I guess. I wouldn’t necessarily say functional, though. Window decorations consumed too much real estate for my liking. Plus there are no minimise and restore buttons. The shortcomings of the Gnome desktop have been expounded in great detail elsewhere, so I won’t bother continuing now.

Windowing systems had basically been perfected in the 90s. The only innovation since the taskbar has been Aero Snap. Pretty much anything else since then has been reinventing the wheel badly, but with nicer colours. We seem to gleefully discard the lessons of the past.

My Arch system rarely gives me problems, despite being continually updated, and supposedly “unstable”. I run LXDE, which has nice small xterms, so I can open lots of them. Fedora’s terminal seems to want to gobble up about a fifth of the screen. Needless to say, LXDE consumes a fraction of the memory of Gnome.

That’s me done with Fedora, then. Ubuntu is better. My favourite is Arch, but I admit it is not for everybody. Slackware is also worthy of respect, but its mileage is somewhat limited. Fedora, and Red Hat for that matter, seem very sloppy in their thinking.

Verdict: computer says “No”.


Magic Hat – RM. stays in

The MHP (Magic Hat Portfolio) on Stockopedia (http://www.stockopedia.com/fantasy-funds/magic-hat-463/) is an experiment by me to see if a human can improve on a mechanical Greenblatt Magic Formula screen. I am trying to weed out “mistakes” that I feel the screening commits: unseasoned companies, scams, foreign companies (particularly Chinese), fishy accounting, and statistical quirks. Apart from that, I am agnostic as to the sector the company operates in, although I will try to avoid heavy concentration in any one sector. I will mostly apply “Strategic Ignorance”, by which I mean that I won’t try to be clever in my stockpicking. My picking will be mostly mechanical. A summary of most of my Magic Hat articles can be found on the web page http://www.markcarter.me.uk/money/greenblatt.htm. This will allow you to see, at a glance, what shares have been bought and sold in the past, as well as what shares have been rejected from consideration and why.

Not much to say this month. Software and computer services company RM is due for eviction from the portfolio today. It still passes the Greenblatt Screen, and has a Stockopedia StockRank of 95. So it stays in. That was easy.

Take care out there.


$TAST.L – Tasty – crashes on prelims

Restaurant group TAST (Tasty) announced its prelims today (https://goo.gl/Xcb824), sending the shares down 38% to 71.25p. This comes as a personal pain to me, as I am a holder of the shares. They announced revenues up 28%, gross profit up 26%, but the killer blow was:

Trading since the year end has proved challenging and the Directors are now expecting headline operating profit for 2017 to be below that achieved in 2016

The group has also decided to slow down its expansion from 15 new restaurant openings to 7, which dented its credentials as a growth company, and hence the price investors are willing to pay for growth.

Paul Scott, at Stockopedia, gives the following opinion (https://goo.gl/qBWZ4u):

I’m prepared to give the company the benefit of the doubt … I’ve satisfied myself that I’m happy to continue holding

The bulletin boards have been uniformly negative, saying that the shares are overvalued even at current prices.

It looks like the market anticipated the problems we see before us, as the chart from Stockopedia demonstrated.

The shares started going wrong near September of last year. What’s interesting is that the 50dMA was in a downtrend, and provided heavy resistance.

I had bought some shares in TAST on two occasions: October 2012 and April 2013. I was up around 200% at one point, but that has been scaled back. I am up only 18% as of writing.

I have been unhappy with the performance of my portfolio since 2016, and TAST adds to the growing list of disappointments for me. I have thought about it, though, and I will continue to hold TAST for now. My thinking seems to be very much along Paul’s lines. I had not topped up, however. I am very loath to catch falling knives these days. It also demonstrates, once again, that no matter how confidently you might think something will work out, you can never be sure.

Although it does not attract much attention, I am finding that my public Magic Hat portfolio over on Stockopedia has been acceptable in terms of its performance. For my regular write-ups, I basically just work through the list of Greenblatt Screen stocks, and pick any one that has a stock rank of at least 90. It’s so easy, quick and requires no skill! You won’t obtain the kind of returns that someone like Paul Scott or others can obtain, but you should be able to obtain market-beating returns.

Other Stockopedia screens that I would class as my favourites: CAN-SLIM (its performance has been really remarkable, and I think that there is a likelihood that its performance is durable), Naked Trader-esque screen, Richard Driehaus Screen, and Winning Income & Growth. There are others which are pretty good, too, so if your preferences differ, then fine. I like the idea of having a favoured screen in each category, and rotating between them. That way, if value is doing poorly, say, then maybe growth, or perhaps momentum, etc. is doing well.

Stay safe out there.

71.25p


Magic Hat – DFS in, IHG out

The MHP (Magic Hat Portfolio) on Stockopedia (http://www.stockopedia.com/fantasy-funds/magic-hat-463/) is an experiment by me to see if a human can improve on a mechanical Greenblatt Magic Formula screen. I am trying to weed out “mistakes” that I feel the screening commits: unseasoned companies, scams, foreign companies (particularly Chinese), fishy accounting, and statistical quirks. Apart from that, I am agnostic as to the sector the company operates in, although I will try to avoid heavy concentration in any one sector. I will mostly apply “Strategic Ignorance”, by which I mean that I won’t try to be clever in my stockpicking. My picking will be mostly mechanical. A summary of most of my Magic Hat articles can be found on the web page http://www.markcarter.me.uk/money/greenblatt.htm. This will allow you to see, at a glance, what shares have been bought and sold in the past, as well as what shares have been rejected from consideration and why.

IHG is ejected from the portfolio by rotation. DFS enters. SPO was higher on the list, but it is the subject of a special situation, which is best avoided in an automated portfolio. It is not clear that the fund will correctly account for the changes.

I now have an even lazier way of selecting candidates to enter the portfolio: I look for Stock Ranks of at least 90. I still give preference to companies outside AIM, and avoid foreign ones.

IHG made an 18% gain for the portfolio, which I am pleased with.

MHP has been on a tear since November, outstripping the wider indices by a significant margin. It would not surprise me if much of the gains reversed over the next 6 months. What I notice about the MHP is that it steadily ekes out gains over the indices, and seems to hold onto them better in a downturn. The portfolio has not made Buffett-esque gains, but it has handily beaten the market. I dare say that compared to professional fund managers, it would be in the top quartile.

The indices have gained quite a boost from the recovery in commodities over the last year. The portfolio has not invested in them, so it is pleasing to see that the portfolio has outperformed despite this fact. Two stocks in the portfolio have doubled, and one nearly so. The fact that three out of the twelve stocks have more or less doubled is gratifying.

I have noticed that some stocks that I have sold from the portfolio have since gone on to perform well. That is why I emphasise that MHP is largely a robotic portfolio, so shares sold do not necessarily mean that you should sell. My suspicion is that you could hold shares for two years, rather than just a year, and expect to perform at least as well.

Stay safe out there.
