Escaping the grip of the #Python

I’m a programmer with knowledge of a fair range of languages: Basic (obviously), C, C++ (urgh!), Fortran, Java, VBA, Lisp, Scheme, Smalltalk, Forth, and my beloved Python.

I did my PhD using Fortran, and have worked professionally in C, maybe with a bit of C++. I’ve toyed with Lisp, Scheme and Forth. My go-to language is Python. It’s the language to use when you just want to get things done.

Perhaps the most earth-shattering languages I’ve seen are Lisp, Smalltalk, and Forth. I think that they are also siren languages, so beguiling with their music, yet you always end up dashed on a rocky shore. I’ve always managed to snarl up my Smalltalk images, for example. Lisp is very good. It does have “the library problem”, although QuickLisp alleviates much of this. I would go so far as to say that QuickLisp is the best thing that happened to Lisp in a decade. In the end, I think Lisp can be an overwrought solution. The pathnames are something of a mystery to me. I also had problems with FFI calls – calls to libraries couldn’t seem to be frozen into CLisp applications, which was a real problem if I had the notion of deploying a solution. Lisp is a beautiful and powerful language with so much expressiveness; yet, at the end of the day, it seems too much work to accomplish something simple. Lisp may make hard things possible, but it also makes simple things hard.

Forth. It’s interesting. I love the idea of it as a high-level assembler. I’ve tried a variety of Forths, and was very interested in trying out colorForth. colorForth has modes, and 25 keys. That’s right! You only use 25 keys. The keyboard layout isn’t, of course, standard. “But Mark”, I hear you cry, “how do you type something as simple as the number ‘1’?” To which I reply: “You keep pressing keys in the hope that you get to the mode that allows numeric input”. In the end, it was all a big WTF? It’s a great language provided you don’t want to accomplish anything.

VBA is actually quite a good language, and I have to remind myself not to be too snobbish about it. In many respects, I don’t think there’s too much “wrong” with it as a language.

It’s a bit like Fortran, in that respect. I once joked that all programmers should be forced to write in Fortran. Why? Because Fortran is such a simple language with few libraries that it forces you to think in terms of simple solutions to problems.

Java? It works, in its own bloat-tastic way.

Perl? I tried it once. Didn’t like it.

Ruby? Pretty good, and there are some neat twists to it. But I always keep going back to Python. I don’t know why.

Tcl? Interesting language. It should probably be used more. Again, as for Ruby, I always end up going back to Python.

So I had come to the conclusion about a year back that Thou Shalt Write In Python. It Just Works (TM). On all platforms. All other Gods are False Gods, and only disappoint in the end.

Then two things happened to me: RPi (Raspberry Pi), and Py3 (Python 3). I bought my RPi last year to act as a little NAS and eco-webserver. I continue to be amazed at how effective such a small and cheap machine is for its purpose. I am thinking – only thinking, mind – about replacing it with something beefier.

I have an accounts package that I wrote using Python (2). I have experimented with other open-source software, but none of it was satisfactory to me. John Wiegley’s “ledger” is interesting. I found it could be a bit idiosyncratic. Also, it didn’t output in a way that made further processing easy. John had originally written ledger in Lisp, but then abandoned that for C++ using Boost. I’ve never used Boost before, but it struck me as part of the bloat problem. Ironically, I reckon if John had stuck to Lisp, the application would not have been that much bigger, it would have been just as cross-platform, plus the user would get a scripting engine to extend the functionality. The latter is essentially “for free”. Such is the power of Lisp.

Back to my Python accounts package. It took a long time to run on my RPi – longer than 30 seconds. That seemed rather slow for such a simple application of about 1600 lines. The code base was over 3000 lines at one point. Fortunately, Python has a really good profiler, and I was able to make massive inroads into that figure. I also decided to rewrite the input parsing using flex and C, and link it to Python using swig.

This hybrid approach proved to be finicky, although it did speed up my program considerably. So I was able to reduce a program that took nearly 40s down to around 5s. Here’s my time output:
real 0m4.741s
user 0m4.280s
sys 0m0.300s

Contrast this with my main PC:
real 0m0.365s
user 0m0.312s
sys 0m0.012s

As you can see, RPis are considerably slower than regular PCs. My PC is no speed demon, but it is quite capable. It runs faster than my work laptop. Not that my laptop is necessarily underspecced. But, when you have to authenticate log-in via America, run virus-scan software, connect to corporate networks, are forced to run corporate screensavers, with the whole of technical support in India, some with a tenuous grip on the English language, then yes, it WILL be slow.

I decided to do a re-write of the software in C. I had forgotten much of the syntax, so there was a lot of “guess the syntax” work involved. What I discovered was that C was not “that” bad. It was more verbose than Python, obviously, but I wasn’t sitting there thinking that I needed 10X as much C code as I did for Python. Here’s my timing output on the RPi:
real 0m0.800s
user 0m0.750s
sys 0m0.030s

You can see there’s a considerable speedup. Included in that, though, is a driver program written in Lua. Although I haven’t formally timed the program without Lua, anecdotally, I reckon that the inclusion of Lua does introduce a significant performance penalty, despite Lua’s reputation for speed. A good fix would be for me to strip out Lua entirely, and just hard-code whatever it is that I did in Lua.

I should mention that none of the above timings relate to the download capabilities of my Python code. I own shares, and I use Python to connect to Google to download the latest share prices. For much of the project’s life I downloaded share prices sequentially (i.e. one after the other), which was quite time-consuming of course because you tend to have a lot of latency in making requests. I wanted to speed things up. I’ve never done threading before, so I Googled around to look for some code on threading. I found some, and implemented it successfully. That provided quite a performance boost when my accounts package ran in download mode, because all those latencies now overlap and are essentially compressed into one, rather than being drawn out in a series.

It was around this time that I decided to switch to Python 3. I’m not sure of my exact reasoning, but I think I figured that there were some nice libraries in 3 that weren’t in 2. Upgrading my software from 2 to 3 actually took more work than I was expecting; and I soured a little on the Python experience. I had introduced a small compatibility layer so that, touch wood, my software should run on both Python 2 and 3. There was a lot of fiddle and faff with getting the right modules, installing them, and so on, and I began to feel that “hey, this doesn’t feel like Python, this feels like getting stuff to compile on Slackware”. Also, when I tried to compile my code on cygwin, Python was giving me a hard time with my custom parser.

This all brought me up to a point of a few days ago. Would I be better off switching to another language? The candidates were C, golang, and Ocaml. I had never used golang or Ocaml before, except to the extent of compiling “hello world”. The syntax of golang seemed rather mysterious, and I had read some good and bad things about Ocaml. Ocaml is reckoned to be blindingly fast, but there do appear to be debates as to whether or not it is suitable for programming “in the large”.

I was mindful of the fact that I wanted to do concurrent HTTP requests, rather than waiting for each one to finish before starting the next. It was something that looked eminently suitable for golang. Golang also seemed to adopt a “batteries included” approach. I guess I was worried about package installation for Ocaml, and whether or not it was going to be suitable for me, work well on cygwin, and so on.

So I opted for golang. I don’t think I’d know if Ocaml would be better unless I tried it. In Ocaml’s favour, it does have a repl, is symbolic, and has an interface to curl. I’ve never used curl before, but I hear that it does concurrent downloads out of the box. So maybe Ocaml would be a better fit, and something I should try down the road.

For now, I’m running my project using golang. What are my initial impressions? Well, the syntax is still somewhat mysterious, and I haven’t gotten my head around what something like []interface{} actually “means”. I’m understanding the declarations somewhat better. Mostly I’m aping what I see elsewhere. There’s still a lot of mystery there, though, and I’m puzzled as to how interfaces actually “work”.
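For what it’s worth, here is a minimal sketch of what []interface{} amounts to, as far as I understand it: a slice in which each element can hold a value of any type, with the real type recovered at run time using a type switch. The describe function and the sample values are made up purely for illustration; they are not from my accounts package.

```go
package main

import "fmt"

// describe reports which concrete type is stored in an empty-interface value.
// (Hypothetical helper, for illustration only.)
func describe(v interface{}) string {
	switch x := v.(type) {
	case int:
		return fmt.Sprintf("int %d", x)
	case string:
		return fmt.Sprintf("string %q", x)
	case float64:
		return fmt.Sprintf("float64 %.2f", x)
	default:
		// %T prints the dynamic type of whatever was stored.
		return fmt.Sprintf("other %T", x)
	}
}

func main() {
	// Each element of the slice may hold a value of a different type.
	row := []interface{}{42, "ACME", 3.5, true}
	for _, v := range row {
		fmt.Println(describe(v))
	}
}
```

The empty interface interface{} lists no methods, so every type satisfies it. That is why the slice will accept anything, and also why you need the type switch (or a type assertion) before you can do anything useful with the elements.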

Packaging and layout are a bit confusing. I think golang wants to mandate Thou Shalt lay out directory structures in a certain way, whereas I, like most programmers, think I’m a unique snowflake who wants to set up package structures my own way, for my own particular reasons. My original solution was to use symbolic links to where the real source code was. But that won’t work if you want to compile on a Windows machine, as Windows doesn’t support proper symbolic links, bless their monopoly. You can’t compile it using cygwin, either, because golang for cygwin doesn’t exist. If you try to compile golang on cygwin, it throws a tantrum, and throws its toys out of the pram.

All is not lost, though, because you can specify multiple paths in your GOPATH. I think that’s a way forward.

Golang does have a few neat touches though. It’s incredibly fast to compile for starters, although it does create monster executables. One thing I really like is “go get”, so that I can include projects from places like github. This is even easier to use than pip in Python, or QuickLisp in Lisp. I can’t help but get the feeling that all this will lead to Dependency Hell. You might need a particular version of a foreign package – although I guess you can manually check out a particular tagged version from a website. If not, then I guess you could fork the foreign package, and create your own tags. That way you’ll definitely be building against a known version. But, what happens if you have two projects which rely on two different versions of a package? That’s Dependency Hell, and something that seems against the golang way. Maybe some would call it a bug, whilst golang would call it a feature? This is, no doubt, why some golang programmers are trying to come up with their own equivalent of Python virtualenvs.

The neat thing is that I managed to get my concurrent webscraping going in golang even more easily than with Python. On that score, Python felt like trying to hammer a square peg into a round hole, whereas in golang, the solution seemed more “natural”. I also found decoding the output to be easier than under Python. Python seemed to fight me, whereas golang did not. It could be, though, that I had learnt a bit about the requirements and problems of my solution when I implemented it in Python, and was able to obviate much of it when implementing it in golang.
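To give a flavour of why it felt natural, here is a rough sketch of the general pattern: one goroutine per share symbol, with a sync.WaitGroup to wait for them all. This is not my actual code. The names fetchAll and fakeFetch, the symbols, and the 100ms delay are all made up, and fakeFetch merely sleeps to simulate network latency instead of making a real HTTP request.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// fetchAll calls fetch once per symbol, each call in its own goroutine,
// and collects the results into a map keyed by symbol.
func fetchAll(symbols []string, fetch func(string) (string, error)) map[string]string {
	var (
		mu      sync.Mutex     // guards results, since maps are not safe for concurrent writes
		wg      sync.WaitGroup // lets us wait until every goroutine has finished
		results = make(map[string]string, len(symbols))
	)
	for _, s := range symbols {
		wg.Add(1)
		go func(sym string) {
			defer wg.Done()
			price, err := fetch(sym)
			if err != nil {
				price = "error: " + err.Error()
			}
			mu.Lock()
			results[sym] = price
			mu.Unlock()
		}(s)
	}
	wg.Wait()
	return results
}

// fakeFetch is a stand-in for a real HTTP request (an assumption for
// illustration): it sleeps to mimic latency, then returns a dummy value.
func fakeFetch(sym string) (string, error) {
	time.Sleep(100 * time.Millisecond)
	return "price-of-" + sym, nil
}

func main() {
	start := time.Now()
	prices := fetchAll([]string{"AAA", "BBB", "CCC"}, fakeFetch)
	// The three simulated requests overlap, so the elapsed time should be
	// close to one delay rather than three.
	fmt.Println(prices, time.Since(start))
}
```

Passing the fetch function in as a parameter is just a convenience for the sketch: a real version would put the HTTP request there, and the rest of the pattern would stay the same.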

So that’s where I am at the moment. I intend to do a full port of my accounting package from Python to Golang. I reckon that, once I’ve finished, moving the port between Windows and Linux should be fairly painless, too.

If I get that working, then the next step might be to try the equivalent in Ocaml, and see how they compare.

For those with a bit of a retro vibe, Golang reminds me a little of AmigaE, a programming language created by Wouter van Oortmerssen for the Amiga. E was inspired by C, had lists, a module system, and many other goodies, and was very fast to compile. Although Wouter doesn’t command the sheer respect that Rob Pike does with Golang, it must be remembered that E was actually used for commercial applications, so E shouldn’t be considered a mere toy programming language.

Ah. Nostalgia. It ain’t what it used to be.

About mcturra2000

Computer programmer living in Scotland.
This entry was posted in Computers.

2 Responses to Escaping the grip of the #Python

  1. LT says:

    Considered releasing your accounting package? As you say there are quite a few open ones out there but I’ve never found one that feels right – quite a few of them struggle when you introduce shares/commodities.

    • mcturra2000 says:

      I do have a little project written in C over at Github, although it doesn’t do any downloading of commodities. It’s actually reasonably short, and has an interface with lua.

      In principle, I don’t mind open-sourcing my Python code, but it is, shall we say, highly tailored to my own weird requirements. People will also need swig to compile a module. So, although it works for me, I think others will find it too idiosyncratic. Having said that, if you’re REALLY interested, then give me another shout, and I’ll try to bash it into a usable form.

      OTOH, maybe my rewrite into golang would be better, as I could then distribute pre-compiled binaries for Windows with ease.
