More thoughts on C++ vs Haskell

I have always joked that when C++ finally becomes Haskell, Bjarne can retire.

I installed Microsoft Visual Studio the other day, and I decided to toy around with writing an MFC (Microsoft Foundation Classes) GUI program. What I noticed is that Microsoft has its own representation of strings (CString), and there is a lot of type-casting between different types. You need to do this if you want to set a text field in a GUI from a CString, for example. It seems to be necessary to litter your code with these typecasts, and it is not always clear what you are supposed to be casting from, or casting to.

I can see why C++ is regarded as being for experts rather than beginners. If modern C++ were used, then all of this casting would be unnecessary. We would use std::string instead of CString, and pass the string into the function directly, without casting. So much easier, and so much safer.

I am keeping my eye on C++17, and I am eager to see it finalised, and to see compilers implement its features. I do not use Clang, preferring instead GCC. I do not know which one will win the race to be feature-complete, but my impression is that they both work hard at it. It seems likely that users of MSVS are going to be among the last people to receive the goodies.

The features that I am looking forward to the most are filesystems and variants/optionals. In regard to filesystems, I of course know that there is a close Boost implementation, and that C provides some functionality in any case. I want something higher level than C, though. I also want standardisation. Standardisation is great, because I no longer have to worry about whether I am using a POSIX system, or a Microsoft system. I just make a standard library call.
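To give a flavour of what "just make a standard library call" might look like, here is a minimal sketch assuming the interface that ended up in C++17 as std::filesystem. The helper name total_file_size is my own invention for illustration, and it only looks at files directly inside the directory, not in subdirectories.

```cpp
#include <cstdint>
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical helper: sums the sizes (in bytes) of the regular files that
// sit directly inside dir. Subdirectories are skipped, not descended into.
// Throws fs::filesystem_error if dir does not exist or cannot be read.
std::uintmax_t total_file_size(const fs::path& dir)
{
    std::uintmax_t total = 0;
    for (const fs::directory_entry& entry : fs::directory_iterator(dir)) {
        if (entry.is_regular_file()) {
            total += entry.file_size();
        }
    }
    return total;
}
```

The same source should compile on a POSIX system and on Windows, which is the whole point. Older GCC releases may need an extra link flag for the filesystem library.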

I am glad to see that C++ is moving towards a “batteries included” model. External libraries are good and all, but let’s be honest, who doesn’t interact with filesystems? Perhaps you have written plenty of code that does not use filesystems. In which case, be sure not to tell me about it, because you’re an exception, not the general rule.

Another library we need is for networking. Although not as ubiquitous as filesystem handling, it is of sufficient importance that we need a standard, cross-platform way of doing it, too. I also want some high-level functions for networking, so I don’t have to grub around with nuances if I don’t want to. I want to be able to call http_get("http://www.itjustworks.com") and fetch a page. If I need it multithreaded, then I can add that myself.

Things like JSON parsing libraries don’t need to be part of the standard. There are plenty of libraries to choose from, and more importantly, they are not platform-specific.

Enough about libraries, let’s talk about variant/optional. I think this could be the secret gamechanger in C++. There is a good talk by Ben Deane on YouTube (https://youtu.be/ojZbFIQSdl8) on “Using Types Effectively”.

C++ is finally … finally … making a move to having sum types aka ADT (Algebraic Data Types). I can’t help but think that C++ programmers have been hacking classes for years, all for the want of ADTs. OK, even C had union types, so we could always simulate ADTs. But now we are moving in the direction of having ADTs properly.

It remains to be seen how C++ can guarantee that every alternative of a sum type is handled, with the strictness of Haskell. Would the C++ language need to be extended? Perhaps if we wanted to process a sum type we could supply a tuple of lambdas, one per alternative, and with sufficient static analysis the correctness could be determined at compile time. The program may look a bit of a mess, mind.
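As far as I can tell, the "bundle of lambdas" idea can already be approximated with std::variant and std::visit from C++17, with no language extension. The sketch below uses the commonly seen "overloaded" helper idiom; the Shape, Circle and Rect types are made up purely for illustration.

```cpp
#include <variant>

// The "overloaded" idiom: inherits from several lambdas so that the
// result has one operator() per lambda.
template <class... Fs>
struct overloaded : Fs... {
    using Fs::operator()...;
};
template <class... Fs>
overloaded(Fs...) -> overloaded<Fs...>;

// Hypothetical sum type: a Shape is either a Circle or a Rect.
struct Circle { double radius; };
struct Rect   { double width; double height; };
using Shape = std::variant<Circle, Rect>;

constexpr double kPi = 3.14159265358979323846;

// std::visit needs the visitor to be callable for every alternative, so
// deleting one of the lambdas below turns into a compile-time error.
double area(const Shape& shape)
{
    return std::visit(overloaded{
        [](const Circle& c) { return kPi * c.radius * c.radius; },
        [](const Rect& r)   { return r.width * r.height; }
    }, shape);
}
```

It is not as tidy as a Haskell case expression, but the compiler does complain if an alternative is forgotten, which is the property I was after.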

Ben talked about “total functions”, a concept that has important implications. Conceptually, a “total function” is one where every possible input has an output. So, incrementing an integer by 1 is a total function (ignoring the case of arithmetic overflow), because adding 1 to an integer always gives another integer. The square root function is not total, however, because if you supply a negative number, you do not receive a real number out.
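One way to make square root total, sketched here with C++17's std::optional, is to widen the return type so that "no real answer" is itself a valid result. The name safe_sqrt is hypothetical, and treating NaN as "no answer" is my own choice.

```cpp
#include <cmath>
#include <optional>

// Hypothetical total version of sqrt: every double maps to a value of the
// return type. Negative inputs and NaN map to "nothing".
std::optional<double> safe_sqrt(double x)
{
    if (!(x >= 0.0)) {   // false for negative numbers and for NaN
        return std::nullopt;
    }
    return std::sqrt(x);
}
```

The caller can no longer forget the awkward case, because the return type spells it out.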

Collections are also problematic, and a very important class of problem. Suppose you wanted to return the first element in a vector? Usually you’re OK, but what if the vector is empty? Also, maps. What if you try to look up a non-existent key in a map?

These are not total functions, and they can cause exceptions to be thrown (map::at) or, worse, undefined behaviour (front() on an empty vector). However, the problem can be solved using optionals. So a map lookup function can return an optional value. Ben suggests that this can completely change the way we write code, particularly reusable code.
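A minimal sketch of what such total accessors might look like with std::optional follows. The names lookup and first are my own, not anything from the standard library, and both return copies rather than references to keep things simple.

```cpp
#include <map>
#include <optional>
#include <vector>

// Hypothetical total map lookup: a missing key yields "nothing" instead of
// an exception.
template <class K, class V>
std::optional<V> lookup(const std::map<K, V>& m, const K& key)
{
    const auto it = m.find(key);
    if (it == m.end()) {
        return std::nullopt;
    }
    return it->second;
}

// Hypothetical total "first element": an empty vector yields "nothing".
template <class T>
std::optional<T> first(const std::vector<T>& v)
{
    if (v.empty()) {
        return std::nullopt;
    }
    return v.front();
}
```

A caller would then write something like `if (auto v = lookup(m, key)) { use(*v); }`, where use stands in for whatever comes next.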

I could imagine a neat trick, then, where you could write something like:
for (; c = next_object(), is_just(c); ) { process(c); }
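With std::optional the same trick can probably be written more directly, because an optional converts to bool in a condition. Here is a sketch under that assumption; the Source class and sum_all are invented just to have something to loop over.

```cpp
#include <cstddef>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical source: hands out its stored values one at a time, and
// "nothing" once they have run out.
class Source {
public:
    explicit Source(std::vector<int> items) : items_(std::move(items)) {}

    std::optional<int> next_object()
    {
        if (pos_ >= items_.size()) {
            return std::nullopt;
        }
        return items_[pos_++];
    }

private:
    std::vector<int> items_;
    std::size_t pos_ = 0;
};

// The loop runs for as long as next_object() returns a value.
int sum_all(Source& src)
{
    int total = 0;
    while (const auto c = src.next_object()) {
        total += *c;   // "process" the value; here we just add it up
    }
    return total;
}
```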

Once you explore that rabbit hole, though, be warned, you inevitably end up in the land of Monads. Because, if you use something like optional, then you will inevitably have to deal with whether the option is valid, or not. You then either have to litter your code with checks, or basically decide that it is easier (!??) to start binding functions to monads, and letting the monads take care of them.
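To make "binding functions to monads" a little more concrete, here is a rough sketch of a bind for std::optional. C++17 does not provide one, so bind_opt is a helper I am making up, and half_if_even is only there to have something to chain.

```cpp
#include <optional>
#include <utility>

// Hypothetical bind for optional: if there is a value, feed it to f (which
// must itself return an optional); otherwise pass "nothing" along.
template <class T, class F>
auto bind_opt(const std::optional<T>& value, F&& f) -> decltype(f(*value))
{
    if (!value) {
        return std::nullopt;
    }
    return std::forward<F>(f)(*value);
}

// Example step that can fail: halves even numbers, rejects odd ones.
std::optional<int> half_if_even(int n)
{
    if (n % 2 != 0) {
        return std::nullopt;
    }
    return n / 2;
}
```

Chaining `bind_opt(bind_opt(x, half_if_even), half_if_even)` needs no explicit checks at the call site: the first failure simply propagates to the end. Whether that reads better than a couple of if statements is, as I say, debatable.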

Hey, let’s have the Either monad too, that has plenty of use cases.

See what I mean? You start by implementing lambdas, and then you slide down the slippery slope towards monads. Except this is C++, so it will be a syntactic jungle compared to Haskell.

A saying by Haskellers is “state is the root of all evil”. I would like to challenge that statement. I think there are two components of this: one must distinguish between what I call “macro” state, and what I call “micro” state.

Being a purely functional language, Haskell does not have “micro” state. I would argue that it does have “macro” state, though. In fact, all programs of any significance have “macro” state.

What do I mean by this? By “micro”, I am referring to the local mutability of data. Ordinary Haskell values cannot be mutated, and so you have to think unnaturally, and turn iterative constructs into recursive constructs.

However, I think micro state need not be so much of a problem, particularly with modern C++. The problem with state is being able to reason about it. In that sense, it’s like gotos. They’re not difficult to reason about just so long as you can reason about them locally. That’s the key!

Here’s the thing. If you write small C++ functions that take consts as arguments, and return a return value, then you have gone a long way to improving code readability. You KNOW that the inputs are not altered. The fact that the function is doing a small amount of mutation internally is not a problem. Subject to the caveat “within reason”. I am claiming that the judicious use of state is like the judicious use of gotos: it can help, rather than hinder, code readability. Of course, you can create the equivalent of pre-70’s Fortran goto nightmares, but why would you want to do that?
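A tiny illustration of the kind of function I have in mind: the input is const, the result comes back as a return value, and the only mutation is a local accumulator that never escapes. Defining the mean of an empty vector as 0 is an arbitrary choice for this sketch.

```cpp
#include <vector>

// The caller's data cannot be altered: xs is taken by const reference.
// The only state being mutated is the local variable sum.
double mean(const std::vector<double>& xs)
{
    if (xs.empty()) {
        return 0.0;   // assumption: the mean of nothing is taken to be 0
    }
    double sum = 0.0;
    for (const double x : xs) {
        sum += x;     // local, easy-to-reason-about mutation
    }
    return sum / static_cast<double>(xs.size());
}
```

From the outside this behaves like a pure function, even though the inside is plain old iteration.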

So that’s micro state. What do I mean by “macro” state? Well, as the name implies, I mean “in the large”. So, for example, a function might need to process a list. The list is subject to the precondition that it is sorted into date order. The question is: has the list been sorted into date order, or has the programmer made a mistake, and the precondition is not met?

In other words, what is the “state” of the input? What is the program’s “state” at any particular point in the code? This is a hard question for a programmer to answer, because to answer that question, he must know about what computations have gone before.

To put it another way: the programmer can’t think locally, he has to reason globally. And that’s a hard thing to do. Even Haskell doesn’t really address this, because it’s not really a language construct. It’s more of an “emergent property” of putting lots of functions together.

Having said that, I think that all is not lost. C++ does have asserts, making it possible to document preconditions as code. They are only checked in debug builds (that is, when NDEBUG is not defined). Contracts are also coming down the pike, which may be a nicer alternative.
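For the sorted-list example, a precondition written as an assert might look like the sketch below. The function name has_event_on is hypothetical, and dates are represented as plain ints (say, days since some epoch) to keep it short.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Precondition: dates is sorted in ascending order.
// The binary search below silently gives wrong answers if it is not.
bool has_event_on(const std::vector<int>& dates, int day)
{
    // Documents the precondition as code; compiled out when NDEBUG is defined.
    assert(std::is_sorted(dates.begin(), dates.end()));
    return std::binary_search(dates.begin(), dates.end(), day);
}
```

The check costs a linear scan in debug builds, which is the price of catching the mistake at the point where it matters.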

That might not be the only mechanism. I hear that promises might be useful.

Perhaps the most promising of them all is the one detailed by Ben in his video: phantom types. Apparently Haskell already has them, which I never knew about. Maybe they should be used much more often. C++ also has enumerated classes (enum class, since C++11). What phantom types buy you is that they “lift” (in an informal way) information from the procedural code into the type space.
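My current understanding of how the sorted-by-date precondition could be lifted into the type system is sketched below. All of the names (DateList, Sorted, Unsorted, make_list, sort_by_date, earliest) are invented for illustration, and dates are plain ints again. This is my own reading of the idea, not code from Ben's talk.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Tag types: they carry no data and exist only in the type system.
struct Unsorted {};
struct Sorted {};

// State is a "phantom" parameter: it appears in no data member.
template <class State>
class DateList {
public:
    const std::vector<int>& dates() const { return dates_; }

private:
    explicit DateList(std::vector<int> dates) : dates_(std::move(dates)) {}
    std::vector<int> dates_;

    // Only these two functions are allowed to construct a DateList.
    friend DateList<Unsorted> make_list(std::vector<int> dates);
    friend DateList<Sorted> sort_by_date(const DateList<Unsorted>& list);
};

DateList<Unsorted> make_list(std::vector<int> dates)
{
    return DateList<Unsorted>(std::move(dates));
}

// The only way to obtain a DateList<Sorted> is to go through the sort.
DateList<Sorted> sort_by_date(const DateList<Unsorted>& list)
{
    std::vector<int> dates = list.dates();
    std::sort(dates.begin(), dates.end());
    return DateList<Sorted>(std::move(dates));
}

// Accepts only a sorted list. Returns 0 for an empty list (an arbitrary
// choice for this sketch).
int earliest(const DateList<Sorted>& list)
{
    return list.dates().empty() ? 0 : list.dates().front();
}
```

Calling `earliest(sort_by_date(make_list({3, 1, 2})))` compiles, whereas `earliest(make_list({3, 1, 2}))` is rejected by the compiler, so the "has it been sorted?" question is answered before the program ever runs.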

I will need to look into these phantom types in more detail.

Time for some lunch.


About mcturra2000

Computer programmer living in Scotland.