So, I thought I knew my way around a red-black tree; maybe not expertly, but I’ve implemented one a few times now.
The contest analysis for Q3 in GCJ12 R1A states that the problem can be solved in O(N^2 log N) time, but leaves how as an exercise to the reader. I decided to take up this challenge, and came to the conclusion that what I needed was a data structure which takes items that have an order, and can join/split subsets of that order in O(log N) time.
I actually thought I could build such a data structure with O(1) join/split – but then I realised that I needed to associate state with the joined groups, and hence the split operations were becoming O(N), or accessing the state would take O(N) if I started from the wrong node. So I went back to the drawing board. As is often the case when you see a solution with O(1) for some operations and O(N) for others, there is a way to get somewhere in between for all operations, and it often involves trees…
So, if I represented each connected section as a binary tree, I could get from any element to the root of its section, and associate the state with that root, giving O(log N) lookups. Good start. But what about the joins and splits? Could I join a binary search tree with another adjacent one in O(log N) time, or, even harder, could I split a binary search tree at an arbitrary point in O(log N) time? I felt the answer was going to be yes, and not just because the contest analysis suggested the whole problem could be done in O(N^2 log N).
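The idea in miniature (a hedged sketch with hypothetical names, nothing like the eventual array-backed implementation): each element is a node with a parent pointer, each connected section forms a tree, and the per-section state lives only on the root, so a state lookup is just a walk to the root.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SectionNode:
    key: int
    parent: Optional["SectionNode"] = None
    state: object = None  # only meaningful on a root node

def section_root(n: SectionNode) -> SectionNode:
    # Walk up to the root of this element's tree; O(depth), which is
    # O(log N) as long as the trees are kept balanced.
    while n.parent is not None:
        n = n.parent
    return n

def section_state(n: SectionNode):
    # The state for the whole section is stored on its root.
    return section_root(n).state
```

The join/split machinery in the rest of the post is what keeps those trees balanced so the walk stays logarithmic.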
This called for some research, and it didn’t take long to find a couple of papers describing how to do concatenate and split on red-black trees. The first one kept confusing me: it kept adding nodes to the tree, or removing them and not putting them back. (And the PDF had the pages in reverse order!) But the second one was much clearer, and explained that the former was using a variant of red-black trees where only leaf nodes contain values, and how to adapt those algorithms to work with what I consider a more standard red-black tree approach.
So, off to implement I went. For my new data structure I decided to use a single memory allocation containing all the tree nodes, and perform the splits/joins and manage the multiple trees all in the one array. This promised the best performance, something of a target considering I am going to make this part of TMD.Algo’s next release. But… it did mean I had to write my red-black tree logic from scratch, since my other red-black tree implementations all use heap-allocated nodes, pointers and sentinel nodes. This took a while on its own, and since my data structure only has 3 API members (Add, Union, Split) – and I had only written 1 (Add) – it was not very amenable to testing yet…
Union was up next. Union requires the insertion fix-up logic from a standard red-black tree, but is otherwise not actually that hard. First, remove the left-most element of the right tree. Then spend another O(log N) steps finding the black height of both trees, and the black node in the taller tree whose black height matches that of the shorter tree. Then you take that left-most element you removed, make it red, give it the two equal-black-height nodes you identified as children, and add it back into the taller tree at the spot where you just stole a sub-tree to use as a child. (Unless the trees are of equal height, in which case there is nothing to add it back to.) Pretty simple, but you still end up with 3 code paths, and have to handle corner cases like the right tree starting out as a single node, or either of the trees being entirely empty. At this point my code got its first real test case, but it was hardly capable of stressing much. I was probably a couple of hours into writing code at this point.
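Here is a hedged sketch of that join step in Python, using heap-allocated nodes rather than the single-array layout described above, and taking the pivot as an explicit key (i.e. after the left-most element of the right tree has already been removed). Both trees are assumed non-empty; empty-tree and single-node corner cases are left out, and `_fix_insert` is just the standard red-black insertion fix-up.

```python
from dataclasses import dataclass
from typing import Optional

RED, BLACK = "R", "B"

@dataclass
class Node:
    key: int
    color: str = BLACK
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    parent: Optional["Node"] = None

def black_height(t: Optional[Node]) -> int:
    # Black nodes on any root-to-nil path (every path agrees by definition).
    h = 0
    while t is not None:
        h += t.color == BLACK
        t = t.left
    return h

def _rotate(root: Node, x: Node, left: bool) -> Node:
    # Rotate x downwards; returns the (possibly new) tree root.
    y = x.right if left else x.left
    sub = y.left if left else y.right
    if left:
        x.right, y.left = sub, x
    else:
        x.left, y.right = sub, x
    if sub is not None:
        sub.parent = x
    y.parent = x.parent
    if x.parent is None:
        root = y
    elif x is x.parent.left:
        x.parent.left = y
    else:
        x.parent.right = y
    x.parent = y
    return root

def _fix_insert(root: Node, z: Node) -> Node:
    # Standard red-black insertion fix-up, starting at the red node z.
    while z.parent is not None and z.parent.color == RED:
        g = z.parent.parent
        if z.parent is g.left:
            u = g.right
            if u is not None and u.color == RED:
                z.parent.color = u.color = BLACK
                g.color = RED
                z = g
            else:
                if z is z.parent.right:
                    z = z.parent
                    root = _rotate(root, z, left=True)
                z.parent.color = BLACK
                g.color = RED
                root = _rotate(root, g, left=False)
        else:  # mirror image
            u = g.left
            if u is not None and u.color == RED:
                z.parent.color = u.color = BLACK
                g.color = RED
                z = g
            else:
                if z is z.parent.left:
                    z = z.parent
                    root = _rotate(root, z, left=False)
                z.parent.color = BLACK
                g.color = RED
                root = _rotate(root, g, left=True)
    root.color = BLACK  # root-is-black rule; may bump the black height by 1
    return root

def join(left_t: Node, pivot: int, right_t: Node) -> Node:
    """Join two non-empty trees where all of left_t < pivot < all of right_t."""
    hl, hr = black_height(left_t), black_height(right_t)
    if hl == hr:
        p = Node(pivot, BLACK, left=left_t, right=right_t)
        left_t.parent = right_t.parent = p
        return p
    if hl > hr:
        # Walk the right spine of the taller tree down to a BLACK node
        # whose black height equals that of the shorter tree.
        c, h = left_t, hl
        while not (c.color == BLACK and h == hr):
            if c.color == BLACK:
                h -= 1
            c = c.right
        gp = c.parent
        p = Node(pivot, RED, left=c, right=right_t, parent=gp)
        gp.right = p
        c.parent = right_t.parent = p
        root = left_t
    else:  # mirror: walk the left spine of the taller right tree
        c, h = right_t, hr
        while not (c.color == BLACK and h == hl):
            if c.color == BLACK:
                h -= 1
            c = c.left
        gp = c.parent
        p = Node(pivot, RED, left=left_t, right=c, parent=gp)
        gp.left = p
        c.parent = left_t.parent = p
        root = right_t
    return _fix_insert(root, p)
```

The spine walk is where the requirement bites that the attach point must itself be black, not just sit at the right black height; that detail comes back later in the post.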
Split. When I first started this journey, I didn’t expect split to be all that much worse than Union. But it certainly was! The description starts out simple… Walk the tree from the root to the split location. At each node you visit, you are either at the target node, or the target node is to the left or the right. If the target node is to the left, union the right sub-tree with the right split result (which starts as null), and store the current node to use as the pivot for the next union, rather than having to take the left-most element out of the right tree. The same, mirrored, applies if the target node is to the right. And if you are at the target node itself, do both, but then add the current node to the left tree (or the right, if you prefer) rather than storing it as the next pivot.
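The shape of that walk is easier to see on a plain unbalanced BST, where each “union with the accumulated result” is just O(1) pointer surgery through the current node acting as the pivot. This hypothetical sketch deliberately ignores all the red-black recolouring and black-height bookkeeping that the rest of this post is about:

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class BstNode:
    key: int
    left: Optional["BstNode"] = None
    right: Optional["BstNode"] = None

def bst_split(n, k) -> Tuple[Optional[BstNode], Optional[BstNode]]:
    """Split into (keys <= k, keys > k), reusing visited nodes as pivots."""
    if n is None:
        return None, None
    if n.key <= k:
        # Target is to the right: n and its left sub-tree belong to the
        # left result; n itself is the pivot that glues on whatever the
        # recursion peels off n's right sub-tree.
        lo, hi = bst_split(n.right, k)
        n.right = lo
        return n, hi
    else:
        # Mirror image for a target to the left.
        lo, hi = bst_split(n.left, k)
        n.left = hi
        return lo, n

def bst_insert(n, key):
    # Ordinary unbalanced BST insertion, just to build test trees.
    if n is None:
        return BstNode(key)
    if key < n.key:
        n.left = bst_insert(n.left, key)
    else:
        n.right = bst_insert(n.right, key)
    return n
```

In the red-black version each of those pointer splices becomes a proper union, which is what the next few paragraphs wrestle with.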
The devil is of course in the details. For this Split algorithm to be O(log N), the unions have to be performed in O(1) time. But the Union implementation takes O(log N) time?!? Also not mentioned is that you have to handle cases where you’ve stored a node as a pivot, but by the time you would use it the current node has no right sub-tree, or you have no accumulated result tree, or both…
But back to the performance problem – how do we do Union in O(1) time? Well, it gets mentioned that if you augment your tree with black-height information in each node, and keep it updated through all the insertions and so on, you can do Union in time proportional to the difference in the heights of the trees. Because when it comes down to it, the union itself is O(1); the cost is all in finding where to perform it. But keeping track of the black height in the nodes isn’t exactly trivial, so I went with the approach suggested in the paper I was reading, which was to track the relative heights of the items being merged. You can even start by declaring the original root node of the tree to be height 0, since only the relative heights matter and a red-black tree guarantees every path has the same black height.
This is still easier said than done! The black height of the current node is easy: you just decrement by 1 each time you leave a black node. So you know the black height of that right (or left) sub-tree you want to concatenate with the right (or left) result easily enough. But what is the effective black height of the accumulated result, and how can we be sure that the difference in heights is effectively O(1)? It was at this point that I had to do some actual thinking, because it wasn’t expanded upon in the paper. The first sub-tree was easy enough: we knew its black height when we got it, so we just stored it, and the height difference between it and the next sub-tree is at most the number of steps between consecutive sub-trees on the same side. That isn’t O(1), but clearly the sum of these O(steps) costs is at most O(log N), which is the goal. This, however, only works if we can move between one sub-tree and the next in O(steps).
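The decrement rule on its own is tiny. A hedged sketch, with the root arbitrarily declared height 0 since only differences matter:

```python
from dataclasses import dataclass
from typing import Optional

RED, BLACK = "R", "B"

@dataclass
class Node:
    key: int
    color: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def descend(root: Node, key: int):
    """Yield (node, relative black height) walking from root towards key.

    The root is declared height 0. The height drops by 1 each time we
    leave a black node, so any two yielded heights differ by exactly the
    number of black nodes strictly between them on the path."""
    n, h = root, 0
    while n is not None:
        yield n, h
        if n.color == BLACK:
            h -= 1  # leaving a black node costs one level of black height
        n = n.left if key < n.key else n.right
```

With that, the black height of each sub-tree shed during the split walk is known the moment it is shed, at no extra cost.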
So what happens when we union these trees? We know there is a pivot, and we know that the pivot’s black height starts out the same as the shorter tree (and since the trees we are joining come in non-increasing height order, that is always the newer tree, for which the black height is easy to calculate), because the pivot always starts out red. Will this pivot always be on the outside edge of the tree once we have run the insertion fix-up, and will it always keep that same black height? The answers are yes and no respectively. Most scenarios just recolour, but there can be rotations. However, these rotations all occur at the parent or grandparent level, with the parent-level rotation pushing the pivot down the outside edge and the grandparent-level rotation pulling it up the outside edge. The recolouring also occurs at the parent level and above, so you might think the black height won’t change either. However, there is one exception: if the pivot is, or becomes, the root of the tree, it gets forced to black to maintain the root-is-black requirement. This increases its black height by 1. In the worst case this increase doubles the cost of walking to the next join point (1 step becomes 2), so the order of magnitude hasn’t changed.
So, we can use the pivot point, and we can track its black height efficiently. All that remains is to put it all together, handle the black heights in the corner cases where there is no sub-tree, and try not to make too many errors writing almost the same rather complicated logic 4 times.
Now that I had ‘Split’ I could write some real stress tests. I like to have a random (fixed-seed) test which runs a large number of repetitions for cases like this, because manually identifying all the corner cases to write test cases for is harder than writing the code! I started out with 1000 elements and 100k random operations. The first time I ran my test case it crashed on operation 1. (I hadn’t handled splitting something which wasn’t unioned.) It wasn’t long before I realised that 1000 elements was too many to debug all the problems in my code, so I dropped it down to 3. It passed! (Yay! My code could handle some trivial cases!) I stepped up to 4. Fail. Fix, fix, fix. Works. 5, fail. Fix, fix, fix. Works. 6, fail. At 7 the only failure was splitting the perfectly balanced tree. At 8 I found I had implemented the algorithm incorrectly. Not a typo or an off-by-one like all my previous bugs; I had implemented the wrong thing. Specifically, I had missed the requirement that union only works if the nodes at the same black height are both black: you can’t use a red parent of the equal-black-height nodes from the larger tree. I wondered why only my split implementation failed, then I realised that my union walked down to the leaves and back up to find equal height, so stopping early always stopped in the right place. For Split, I was always walking down from the last pivot point, which in this case was red and started with the right black height. I fixed that, and it worked. 9, 10, 20, 30, 60, 130. All works.
One last thing I want to mention is my test assertions. After each operation, union or split, I could trivially assert that the union had joined or the split had split. But a failure at one step might not get noticed until dozens of steps later. So I ran a full suite of compliance tests after each step as well. This significantly slowed down execution, since the checks were O(N^2) while each step was otherwise O(log N), but I tested the following, and every one of them helped me find a bug.
- Each root node was black. (red-black tree requirement 0)
- Each node was in only one tree, and there were no cycles.
- Each red node had only black children. (red-black tree requirement 1)
- The black distance from each node with 0 or 1 children to the root was the same as the black distance from the root to the left-most child. (red-black tree requirement 2)
- At each point in each tree, every node to the left was smaller and every node to the right was larger. (I originally only checked that immediate children satisfied this property, but found my tree mangling was sometimes too subtle to get picked up.)
- For every pair of root nodes, that the left-most edge of one tree wasn’t on the same side of the right-most edge of the other as its root, or vice versa (i.e. the key ranges of distinct trees did not overlap).
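As a sketch, the single-tree checks above might look like the following (hypothetical node layout; the real implementation runs these over an array of nodes spanning many trees, plus the cross-tree checks):

```python
from dataclasses import dataclass
from typing import Optional

RED, BLACK = "R", "B"

@dataclass
class Node:
    key: int
    color: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def check_tree(root: Node) -> None:
    assert root.color == BLACK, "root must be black"

    def rec(n, lo, hi):
        if n is None:
            return 0
        # Order is checked against the whole ancestor range, not just the
        # immediate parent; subtler manglings slip past the weaker test.
        assert lo < n.key < hi, "BST order violated"
        if n.color == RED:
            assert n.left is None or n.left.color == BLACK, "red-red violation"
            assert n.right is None or n.right.color == BLACK, "red-red violation"
        hl = rec(n.left, lo, n.key)
        hr = rec(n.right, n.key, hi)
        assert hl == hr, "unequal black heights"
        return hl + (n.color == BLACK)

    rec(root, float("-inf"), float("inf"))
```

Running a checker like this after every operation turns a latent corruption into an immediate, local failure, which is exactly what made the bugs above findable.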
All in all I think I spent almost 10 hours to get this to work. Certainly glad I wasn’t trying to do it during a programming competition! (And I haven’t even written the solution to the actual problem…)