I’m working on my rewrite of my accounting program from C into Haskell. I’m tackling the mentally challenging problem of aggregating two lists together.

What I have is a list of equities and a list of equity transactions. I want to create a number of portfolio views, where each portfolio may depend on the type of equity it is, and to what account it belongs.

I first started off by trying to group the transactions by equities, and basically spent some time writing my own groupBy, only to discover that Haskell already provides it. If you aggregate the transactions to a group of equities, you can then perform a reduction on that group by, saying totalling the number of shares in that account, then you have something useful.

Except, I realised that after all that work I did, it wasn’t what I was looking for. Yo may want to aggregate lists together in ways that don’t allow for a simple heirarchy. It is with this in mind that I wrote a collation function.

First, though, I wrote a function called ‘finding’:

finding p lst = foldl f (Nothing, []) lst where f (res, acc) el = if ( (Nothing == res) && (p el) ) then (Just el, acc) else (res, acc ++ [el]) testFinding = finding (== 10) [12, 13, 10, 14, 10, 15] -- => (Just 10,[12,13,14,10,15])

‘finding’ is a cross between ‘first’ and ‘partition’. ‘First’ finds an element in the list, but then discards the list, whilst ‘partition’ returns a tuple of all matching criteria, and non-matching criteria. What I wanted was a tuple of the first matching element in a list, and the remaining list.

Now that I have that convenience function, I can write my ‘collate’ function proper. It needs to be split into two special cases: one where the left list contains precisely one element, and another where it contains multiple elements. Here is the function for the one-element list:

collate p (l:[]) rs = [(Just l, hit)] ++ misses where (hit, miss) = finding (p l) rs misses = map (\m -> (Nothing, Just m)) miss

I have something on the left, so I match it, if possible, with something on the right using a predicate p. I may have elements on the right left over, which implies that I have exhausted the list on the left. So I construct remaining tuples consisting of (Nothing, Just m) for the non-matching rights. Now, it may be that an element that doesn’t match on the left can be ignored, or it could signify an error. The function doesn’t care. That’s for the caller to decide.

Now we need another case for collate, where the list on the left has more than one element:

collate p (l:ls) rs = [(Just l, hit)] ++ (collate p ls misses) where (hit, misses) = finding (p l) rs

This is a recursive call, where we combine the head of the list with anything we’ve found on the right of the list. We then concatenate that with the remainer of the list on the left with the unmatched list on the right.

Here is an example call:

testCollate2 = collate (\l r -> l == r) [10, 11] [12, 13, 11]

This function collates a list of left integers [10, 11] with a list of right integers [12, 13, 11]. The test for ‘sameness’ is simple numerical equals. In practise, the list on the left, and the list on the right will be some kind of structured data, and the predicate would define some kind of relevant key matching. Here is the result of evaluating the function:

[(Just 10,Nothing), (Just 11,Just 11), (Nothing,Just 12), (Nothing,Just 13)]

You see that the 10 on the left doesn’t match on the right. The 11 matches fine. The 12 and the 13 didn’t match on the left.

Phew.