April 2012 – The Missing Docs .Net

Red-Black Trees, Advanced

So, I thought I knew my way around a red-black tree, maybe not great, but I’ve implemented it a few times now.

The contest analysis for Q3 in GCJ12 R1A states that the question can be performed in O(N^2 log N) time, but leaves how as an exercise to the reader. I decided to take up this challenge, and came to the conclusion that what I needed was a data structure which took items which have an order, and could join/split subsets of that order in log(N) time.

I actually thought I could do such a data structure with O(1) join/split – but then I realised that I needed to associate state with the joined groups, and hence the split operations were becoming O(N) or accessing the state would take O(N) if I started from the wrong node. So I went back to the drawing board. As is often the case when you see a solution with O(1) for some operations and O(N) for others, there is a way to do somewhere in between for all operations and it often involves trees…

So, if I represented each connected section as a binary tree, I could get from any element to the root of its section, and associate the state with that root, giving O(log N) lookups. Good start. But what about the joins and splits? Could I join a binary search tree with another adjacent one in O(log N) time, or even worse, can I split a binary search tree at an arbitrary point in O(log N) time? I felt the answer was going to be yes, and not just because of the Contest Analysis suggestion of a O(N^2 log N) lower bound.

This called for some research, and it didn’t take long to find a couple of papers describing how to do concatenate and split in red-black trees. The first one was confusing me, it kept adding nodes to the tree, or removing them and not putting them back. (And the PDF had the pages in reverse order!) But the second one was much clearer, and explained that the former was using a variant of red-black trees where only leaf nodes contained values, and how to augment these algorithms to work with what I considered a more standard red-black tree approach.

So, off to implement I went. For my new data structure I decided to reuse a single memory allocation containing all the tree nodes, and perform the splits/joins and manage the multiple trees all in the one array. This held promise of best performance, something of a target considering I am going to make this part of TMD.Algo’s next release. But… it did mean I had to write my red-black tree logic from scratch, since my other red-black tree implementations all use heap allocated nodes and pointers and sentinel nodes. This took a while on its own, and since my data structure only has 3 API members (Add, Union, Split) – and I had only written 1 (Add), it was not very favourable to testing yet…

Union was up next. Union requires the Insertion fix-up logic from a standard red-black tree, but is otherwise not actually that hard. First, remove the left-most element of the right tree. Then spend another O(log N) steps to find the black-height of both trees, and the black node on the higher tree which is the same height as the smaller tree. Then you take that left-most element you removed, make it red, and add the two nodes with equal black-height you have identified as children, then add it back in to the taller tree where you just stole a sub-tree to use as a child. (Unless they trees are of equal height, then there is nothing to add it back to.) Pretty simple, but still you end up with 3 code paths, and having to handle corner cases like the right tree being a single node to start with, or either of the trees being entirely empty. At this point my code got its first real test-case, but it was hardly capable of stressing much. I was probably a couple of hours in to writing code at this point.

Split. When I first started this journey, I didn’t expect split to be all that much worse than Union. But it certainly was! The description starts out simple… Walk the tree from the root to the split location. At each node you visit, you are either at the target node, or the target node is to the left or right. If the target node is left, union the right sub-tree with the right split result (starts as null) and store the current node to use as a pivot for the next union rather than having to take the left most element out of the right tree. Similar (but reversed) for if the target node is right. And if the target node is equal, do both, but then add the current node to the left tree (or right if you want) rather than storing it to use as the next pivot.

The devil is of course in the details. In order for this Split algorithm to be O(log N) the unions have to be performed in O(1) time. But the Union implementation takes O(log N) time?!? Also not mentioned is that you have to handle cases where you’ve stored a node as a pivot, but when you get to the point where you would use it the current node has no right sub-tree, or you have no accumulated result tree, or both…

But back to the performance problem – how do we do Union in O(1) time? Well it gets mentioned that if you augment your tree with black height information in each node, and keep it updated through all the insertions and stuff you can do Union in time equal to the difference in the height of the trees. Because when it comes down to it, Union is O(1), you just have to find where to perform the union itself. But keeping track of the black height as part of the nodes isn’t exactly trivial, and so I went with the approach suggested in the paper I was reading, which was to track the relative heights between the items which you are merging. You can even start by saying that the original root node of the tree is height 0, since it is only the relative heights that matter and a red-black tree starts with the property of every path having the same black height.

This is still easier said than done! Black height of the current node, that is easy, you just decrement it by 1 each time you leave a black node. So you know the black height of that right (or left) sub-tree you want to concatenate with the (right or left) result easily enough. But what is the effective black height of the (left or right) result, and how can we be sure that the difference in heights is effectively O(1)? It was at this point that I had to do some actual thinking, because it wasn’t expanded upon in the paper. The first sub-tree was easy enough, we knew its black height when we got it, so we just stored it, and the height difference between it and the next sub-tree is going to be at most equal to the number of steps between sub-trees being on the same side. This isn’t O(1), but clearly the sum of these O(steps) is at most O(log N), which is the goal. This however will only work if we can move between the first sub-tree and the next in O(steps).

So what happens when we union these trees? We know there is a pivot, we know that the pivot’s black height starts as the same as the smaller tree (and the trees that we are joining are in non-increasing height sequence, so that is always the newer tree for which the black-height is easy to calculate), since the pivot is always red to start with. Will this pivot always be on the outside edge of the tree once we have run the insertion fix-up, and will it always have that same black height? The answers to this is yes and no respectively. Most scenarios just recolour, but there can be rotations. However, these rotations all occur at the parent or grand parent level. With the parent level rotating to push the pivot down the outside and the grand parent level to pull it up the outside. The recolouring also occurs at parent levels and above, so you might think the black height won’t change either. However there is one exception, if the pivot is or becomes the root of the tree, it gets forced to black to maintain the root node is black requirement. This increases the black height by 1. This increase by one is worst case going to double the cost of walking to the next join point (1 step becomes 2), so the order of magnitude hasn’t changed.

So, we can use the pivot point, and we can track its black height efficiently. All that remains is to put it all together, handle the black heights in the corner cases where there is no sub-tree and trying not to make too many errors writing almost the same rather complicated logic 4 times.

Now that I had ‘Split’ I could write some real stress tests. I like to have a random (with fixed seed) test which runs a large number of repetitions for cases like this, because manually identifying all the corner cases to write test cases for is harder than writing the code! I started out with 1000 elements and 100k random operations. The first time I ran my test case it crashed on operation 1. (I hadn’t handled splitting something which wasn’t unioned.) and it wasn’t long before I realised that 1000 elements was too many to debug all the problems in my code, so I dropped it down to 3. It passed! (Yay! My code could handle some trivial cases!) I stepped up to 4. Fail. Fix, fix fix. Works, 5, Fail. Fix, fix fix. Works, 6, Fail. At 7 the only failure was for splitting the perfectly balanced tree. At 8 I found I had implemented the algorithm incorrectly. Not a typo or off-by-one or similar like all my previous bugs, I had implemented the wrong thing. Specifically I had missed the requirement that union could only work if the nodes at same black level were both black. You couldn’t use a red parent of the equal black level nodes from the larger tree. I wondered why only my split implementation failed, then I realised that my union walked down to the leaves then walked back up to find equal height, so stopping early was always stopping in the right place. For Split, I was always walking down from the last pivot point, which in this case was red and started with the right black height. I fixed that and it works. 9, 10, 20, 30, 60, 130. All works.

One last thing I want to mention was my test assertions. After each operation, union or split, I could trivially assert that the union had joined or the split has split. But a failure at one step might not get noticed for dozens of steps later. So I ran a full suite of compliance tests after each step as well. Significantly slowed down execution, since they were O(n^2) and each step was O (log N) otherwise, but I tested the following and every one of them helped me find a bug.

Each root node was black. (red black tree requirement 0)
Each node was only in one tree, and no cycles.
Each red node only has black children (red-black tree requirement 1)
The black distance from each node with 0 or 1 children to root is the same as the black distance from root to left most child. (red-black tree requirement 2)
At each point in each tree, every node to the left is smaller and every node to the right is larger. (I originally only checked that immediate children satisfied this property, but found my tree mangling was sometimes too subtle to get picked up.)
For every pair of root nodes, that the left most edge of one tree isn’t on the same side of the right-most edge of the other as its root, or vice versa.

All in all I think I spent almost 10 hours to get this to work. Certainly glad I wasn’t trying to do it during a programming competition! (And I haven’t even written the solution to the actual problem…)

GCJ12 R1A

Now this is more like it… 44th! (Sure the time of this round was pretty bad for all of europe, but still…)

I finished the first 2 problems in about an hour, leaving 90 minutes for the 3rd problem. However, I could not see how to formulate the DP (or greedy) to pass the large input after a good long while thinking, so I decided to just do the brute-force possible small input. Even the small input had its corner cases, but my Fraction class came to my rescue once again and after realising a stupid mistake on my first submission, I got a correct on my second attempt.

Luckily I didn’t make any mistakes in the large input for the first two problems, as failing the second of the two would have dropped my score below the 1000th place cut-off, even with my submission of the 3rd problem small input. (Points were 10,10 15,18 17,30 and the pass cut-off was 53 points in a time of 2h 15 or better.)

Q1) Given that you are typing a k character string, and you have typed m characters, and each of those characters has an individual probability of being correct (which is provided) but every character you type from now on will be correct, what is the expected number of keystrokes to get your password correct if you choose the best strategy. You can press delete 0 or more times to clear some of the existing characters and then fill in the rest and have to press the enter key each time you submit. Alternatively you can press enter immediately and start again.

A1) So the enter immediately scenario is trivial, 2 + k. Its what we need to beat. This leaves k+1 scenarios, with probabilities. We can walk through these easily.

First scenario is delete everything and type it out. Probability of success is 1. Cost is m+k+1 on success. Second scenario is delete all but 1, probability of success is the probability of success for the first letter. Cost is m+k+1-(2) on success and m+k+1-(2)+k+1 on failure. Next scenario is the same, but the number in brackets is 4 (2 more than before) and probability of success is multiple of the first 2 probabilities. Just loop through them all and return the best.

Q2) Given a list of tasks which you get either 2 or 1 points depending on how good your best effort is, and a minimum number of points before you are capable of performing a 1 or 2 point effort (where this minimum is specified individually for every task and point outcome), determine the minimum number of task attempts required to get a 2 point effort on every task.

A2) It took me a little bit of thinking before I worked out exactly how, but this seemed like a problem with a greedy solution. And my solution passed, so that is promising.

The approach is to sort the tasks in order of least to greatest points required to get a 2 point result. Then you take turns performing all tasks you are capable of getting 2 points on, then 1 task that you can get 1 point on. The first part is obvious, the list is sorted we can just work our way up it. But how to choose that 1 task to potentially unlock some more 2 points? The greedy approach is to say that you want to choose the task which is hardest to get 2 points on, its the one you are going to try last, so we maximise the opportunity to get 2 points in a single go. So, just walk the sorted list from the other end, choosing the first task which you haven’t already scored 1 point or more, and are capable of scoring 1 point on. You don’t have to worry about checking whether you can score 2 points, because you already greedily took all of those before you started your search for a potential 1 point task.

Finally you need to terminate this loop if you stop making any progress or finally get 2 points on all tasks. If the former, the problem isn’t possible, and you return as such.

Q3) Given a 2 lane road with cars initially in 1 lane or the other, and having initial speed and initial rear-bumper position relative to some start line, determine the maximum time before a car has to change speed to avoid a collision, presuming instantaneous car lane switches and a car length of 5 metres. Cars can touch bumper-to-bumper without problem, so long as the cars have same speed or the faster car is in front.

A3) So the speeds and positions and car length are all integers, but the answer is not. Using double risks incorrectly deciding that a car cannot switch lanes, but some epsilon usage is probably feasible to get it right. However since I had my Fraction class, I used it to ensure I didn’t have to worry about getting the epislons correct.

The small input size was a maximum of 6 cars, the large was a maximum of 50. In either case I expect the first step is to determine the points in time which are of interest. Sorting the cars by speed, determine the point in time where the front of a faster car meets the back of a slower car. For 6 cars, this is at most 18 points. And at each of these 18 points, there is only 2 options of interest, showing a very plausible brute force search space.

So brute force approach, you walk through the points of interest in time order, at each one there are 2 scenarios. If the two cars are in the same lane, one of them has to swap. If the 2 cars are in different lanes, either they can both swap, or they can stay where they are. This both swap case is important because future seemingly unavoidable collisions might be avoided by trying it. Recursively search swapping car lanes and moving on to next interesting point in time, and undoing the swaps when the recursion returns, is not too complicated at all. The important bit being ensuring that you don’t swap a car when it can’t be swapped. This is easy enough, you just consider each car, move it to the point in time under consideration check if it is in the target lane and would overlap the target car position. This is where I think the greatest risk from using double and epsilons would be, but fractions handle it easily.

The above brute force approach however has a problem. If two events happen simultaneously, what do you do? Indeed one of the sample inputs fails if you don’t handle this. Thinking about how it can fail you come to the conclusion that it is where there are 3 cars which are in bumper alignment at once. In this case the recursion will decide to move the one moveable car, but then further down the recursion, it moves it back, inadvertently returning to where you were, but not reconsidering the scenario. One solution is to change the recursion to handle this, and you probably could, but I found it simpler to instead walk every pair of points of interest up front, and determine if they were at the same time and if so whether they shared the same fast or slow car (I only handled same slow car in my first submission… oops). If you find a match the answer cannot be greater than that time point. So choose the smallest of all matches and cap any returned value from the brute force, to that. This is a bit simpler to implement, and returns the identical result.

So what about the large input? Well, the solution is in caching I am sure, but what is your key for the DP table or memotization lookup? Since we are moving cars between lanes, and there are up to 50 cars, the naive cache ends up with 1225*2^50 slots (1225 being the maximum number of points of interest) and not being very sparse. Just as the competition ended I realised that at any point you only need to consider the cars which are involved in future points of interest. Defining what that actually means, obviously includes both cars at the point, but also up to 2 more cars which might be blocking the ability to switch lanes. But I don’t know if this reduces the search space enough, but it should be very significant at the leaves, and even before the leaves I think the actual state usage will probably be pretty sparse and so it might be possible.

Or maybe a completely different approach is required, I haven’t checked anyone solutions to see.

Contest analysis is up – and they do indeed use a quite different approach for solving the large input. Looks like the trick is to gather the cars together into groups. When two cars are passing, they are locked, they cannot switch lanes. This locking cascades. If you track both events regarding 2 cars travelling at different speeds meet and pass, you can work out when to split or join these groups. When they aren’t in a group, they are free to change at will, so the only way you can get a collision is if the two groups join and they are both fixed, which is to say they started the simulation locked. Fixed is contagious, but when a group size returns down to one it loses its fixed state. So we just need to consider the N^2 events, updating the joined status and returning the time if we get an unsolvable collision. Each update can easily be performed in O(N^2) which with N=50 brings the total running time quite high, but still feasible. It also isn’t too hard to make it perform better than that.

This solution is nice, wish I had thought of it 😛

TMD.Algo 0.0.5.0

So, since its Google code jam time, I figure I’ll do another TMD.Algo release. Still the same licensing as before.

New features: (Beware bugs due to insufficient testing… especially in the first 2)

MaxFlowMinCost algorithm for Graph.
Generic TernarySearch extension for lists.
SortedDictionary2 – a sorted dictionary with ‘near’ lookup support. Finds the element equal or less than the search element and then you can enumerate from that position.
Pattern support. Algorithms for efficiently finding the sum or value at index in a repeating pattern with optional non-repeating start. Uses sequence generation state as a key to detect loops.
Memotizer – a simple function adapter to automatically cache results – not exactly nice to use if the function needs to be recursive…
LookupQueue – amortized O(1) lookup/remove/append queue using a dictionary augmented linked list. Presumes queue elements are all distinct. (Internal implementation uses arrays to avoid the GC and random location dereferencing penalties of a normal linked list.)

Improvements:

Fraction improvements: Truncate to closest integer in direction of 0, explicit cast for integer to fraction, <= and >= operators added, absolute value function.

Bug fixes:

Corner case in reverse comparer (if base comparer returned int.MinValue it would fail).
One of the error messages in LookupHeap had a typo.

Link: http://www.themissingdocs.net/downloads/TMD.Algo.0.0.5.0.zip

TCO12 R2A Update

They made this round official with no changes. So my incorrect code has successfully improved my topcoder rating by 90 points. Yay for me…

TCO12 R2A

Only 1362 out of 2001 qualifiers turned up for this round. (Well above the 1% existence failure that Otinn uses for his odds calculations. So maybe I have >1% chance of getting a t-shirt after all…)

374th before system tests, I was just hoping to get a non-zero score this round. The first question had potential for problematic test cases and lots of successful challenges (the number of red ranked coders on 0 points or lower was pretty scary…), and the rest were too difficult for me to even consider at this late hour. (Although I swear the second question was familiar…)

Waiting on systests … and the announcement comes through that because the challenge phase was broken for the last 3 minutes, the round *might* be cancelled. Which would mean another 2am night back-to-back.

Systests finished, so if this round actually counts I’ll be in 285th. Which in any other year would be a t-shirt placement, but given the fact that round 2 has 3 parts and splitting of t-shirts over the best 350 ranks of all it won’t be good enough. (Unless subsequent rounds are even harder and hence not enough people get past the minimum 0 points requirement…). Only 461 people got a positive score this round (2/5 Aussies, including myself got a positive score).

Q1) Given a list of sets of states for light switches and corresponding states for lamps, but with no knowledge of which light switch actually corresponds to which lamp – determine the minimum number of extra state sets required to correctly identify each lamp. Each switch and its light are in sync, it is a simple wiring. Each individual state on its own will be valid, but there may be no solution which satisfies all states, in which case the answer is -1.

This problem looks like it needs brute force, but with bounds of up to 50 lights and up to 50 state sets, that certainly seems implausible. So I created a 2 dimensional array of whether switch i could be lamp j. If a single state set says a switch i is on and a lamp j is off, obviously switch i cannot connect to lamp j and vice-versa. If two state sets have the same switch state then those switches cannot be in the set of lamps with different lamp states and similarly for switches in different states. The question then becomes, is this enough? And since I passed system tests, the answer might be yes… (I fully expected it to be no though…)

Finally count up how many switches each light could be. Then take the ceiling of the base 2 log of the highest of those counts, since each count corresponds to a set of switches which we know nothing about, and that set of switches can be solved in base 2 log tests, and each of the groups of switches can independently be solved at the same time.

Post Sleep Update 2: A better approach is to consider the pattern of each switch and each lamp. Presuming that there is a solution any given pattern must appear the same number of times amongst switches and lamps. The pattern with the highest count is the same as the switch with the largest count in my version, and from there the code is identical. This approach is quite obviously correct, so is it possible to define a test case which breaks my approach?

The answer is yes. Using an a quick written app to compare the output of the 2 approaches for random inputs I found the following test case (and many others…)

switches = YNYNNNYNNYYNY, YNNNYYNYNYNNY, NYYNYYYYNNYNN
lamps = NYYYNNYYNNYNN, NNYYYNNYYYNNN, YNNNYNYYYNYNY

My code incorrectly believes such test cases to be possible, but the pattern NYN only appears in the lamps not the switches.

This is twice in a row that I have managed to get past the system tests with incorrect code!

Q2) Given a list of items to visit in order on a 1-dimensional line, and the ability to add up to K instant teleports, what is the minimum travel distance to visit the items in the order specified?

A2) No real clue yet… but it isn’t the semi greedy approach of choosing the best location for the first 2 teleport locations, then the best location of each subsequent one… (I wrote that during the comp and it failed one of the samples.)

Looking at one solution the approach appears to be to calculate the minimum distance to travel in the n left most items with up to K teleports, but I’m going to need some sleep before I make sense of it.

Post Sleep Update 1: So the approach is indeed the minimum distance to travel for the n left most items with up to k teleports. More specifically it is the minimum distance travelled in the area to the left of the n left most items where the kth teleport is at the same location as the nth item. In order to solve the question the component required is to be able to answer the question ‘what is the distance spent travelling in the segment a,b if a and b are both teleports. Then you can treat the initial state as having teleports at -infinity and +infinity and add teleports left to right, adding the cost of the segment previous teleport location to new teleport location.

The cost of a segment bounded by teleports only needs to consider the items which are inside that segment. Obviously the path for every item outside that teleport either does not attempt to enter the segment, or skips the segment to get to another item outside the segment, or jumps to whichever end is closest to its target inside the segment. So for each item inside the segment, we need to consider the item before and after it in the sequence. For each of these if the other element is outside the sequence we simply add the distance to closest end of the segment. If the other element is inside the sequence, we ignore the before and only consider the after to avoid double counting. The cost here is either the sum of the costs to the closest endpoints or the direct distance in between the 2 elements, whichever is smaller.

During the competition I had thought that something along these lines would be needed, but only at about 7 minutes left on the clock and I thought the cost for a segment would be much more difficult to calculate.

Q3) Given n rooms connected by a set of 1 direction tunnels which do not form any loops, and the knowledge that some rooms are clear and some rooms might be obstructed and if so could not be entered, determine how many scenarios there are where the number of paths from room 0 to room 1 is even. Up to 50 rooms, and at least all but 32 of those will be listed as clear. (Also: at most 500 tunnels)

A3) Only 2 correct solutions for this during the competition, obviously brute force is too slow, but at first glance the question seems plausible… I’ll take a closer look after some sleep.

GCJ 12 Qualifying Round

So, the 25 hours are up, and at final count about 230 people submitted all the problems (including myself). After the elimination of incorrect ‘large input’ solutions I managed to remain part of the 163 people who got a perfect score in this round.

Over 18000 people got a positive score, so I imagine that the number of competitors was pretty high. 15692 people qualified, so lots of competitors for R1A.

Q1) Given an input string, perform letter substitution to generate another string. (Letter substitution is a bijection and some examples provided.)

A1) So, wow… simplest question description ever? No large input, since well, what could they do? The small difficulty of this question was that the provided samples were missing 2 letters. The first of these letters was provided in the question text, but it was provided as a forward mapping and the actual question asked for a reverse mapping. The final letter can be deduced by the fact that the substitution is a bijection, and hence if you know all but one path, the final path is between the only unused input and output. So, you could have copied the samples into your code and written some moderately smart code to work out the substitution for you … or you could just do what I did and work it out by hand and hard code the mapping, using indexes into a handcrafted string.

Best part of this problem? The answers. I have to wonder how many people submit their outputs without actually checking them. Because if you did, then you missed out! After decoding many of the answers were amusing or interestingly absurd.

Q2) Given a set of numbers which are known to each be the sum of 3 non-negative integers, determine how many of these triplets can have a largest number of at least k. The triplets are further limited that difference between the lowest and highest values of the triplet is at most 2 for ‘n’ of them, and at most 1 for the rest.

A2) The small input limit was that there was at most 3 numbers. This meant that you could brute force every allocation of the difference of 2 and just return whatever was the best answer. The large input was not feasible using this approach, but I doubt that there were many people who would have used the brute force approach anyway…

So, if we look at each triplet, if the difference limit is 1, then the triplet consists only of Floor(sum/3) or Floor(sum/3) + 1. If the sum is 0 modulo 3, they are all the first, otherwise there is one of one kind and two of the other. So we can go through the list determining the triplet for difference limit of 1. If the resultant triplet already has a highest value equal to k or higher, then add it to the result. Otherwise, if the highest is equal to k-1, then maybe we can use a difference of 2 to reach the limit.

Specifically the sum must be at least 2 (as each integer is non-negative) and the modulo must not be 1. If the modulo is 1 then changing to a difference of 2, doesn’t actually increase the highest value, it just increase the number of numbers at the highest value. Accumulate all the cases where switching to 2 would make a difference, then limit it by n, since only at most n of these can be used. Add the result to the total and you are done.

Q3) Determine the number of pairs of numbers n and m where n is less than m, n is greater than or equal to A and m is less than or equal to B, such that n and m have the same number of digits and are rotations of each other. A and B are also guaranteed to have the same number of digits and A is less than or equal to B.

A3) Small input limits B to be at most 1000, which in practice means 3 digits since if B has 4 digits, so does A and hence A equals B and there are no pairs. Large input limits B to be at most 2million, which gives us some cases with 6 digits and a maximum range of 1million numbers to check.

It might seem that the large input will need some fancy tricks to deal with how large it is, but, that is not really the case, brute force approach will run in time unless the implementation is particularly slow. For each number in the range A to B, generate all rotations, if the rotation is greater than the starting number and less than or equal to B, add one to your total.

Generating rotations is not hard, first count the number of digits in A, by dividing by 10 until the answer is 0. (A is guaranteed not to be 0, so you don’t have to worry about that corner case.) To generate a rotation simply store the mod 10, then divide by 10 and add the modulo times 10^length. Repeatedly execute until you are back at the same number you started with. The fact that many rotations may end up starting with 0 is not relevant, since you are going to compare the value against the original number which does not start with 0, and hence all of these will be skipped.

Q4) Given a rectangular grid of square cells where cells are either part of a wall or not, and all edges of a wall are mirrored and you are a 0 size object at the centre of a specific cell which is not a wall, determine how many reflections of yourself can be seen where the light travels at most a distance D and can’t pass through yourself. The outer edge of the grid is guaranteed to all be set. Light bounces mostly as you would expect, but there are two specific cases of interest. Firstly, if light exactly hits an inner corner, where 3 wall cells meet, the light completely reverses direction. Finally if the light hits an outer corner with the light heading towards the 1 filled cell adjacent to that intersection, the light is destroyed. Light can glance a corner just fine, even two diagonally touching corners, it just passes straight on. Grid size is at most 30×30 and D is at most 50.

A4) So, here it is. The question to see if you have some spare time (or are really really good…). The small input had a strange limitation which was that the number of walls would be at most double the width plus double height, minus 4. It took me a while to realise that this meant that only the edges would be filled in, since half the samples violated this condition. (Looking back, at least 1 sample from each question with a large input violated the small input conditions. I’m not sure if this is new this year, but it is actually kind of good…) This restriction is actually quite powerful and would allow for a much simpler solution as I will explain in a minute.

So, how do we determine the solution? Even a simple case like the 3rd sample which is a 2×1 open space with a maximum distance of 8 has 68 reflections. One thing to think about is where the light starts and ends up. It is in the middle of a cell on both occasions. If you think about what happens each time the light reflects off an edge, its like the light keeps travelling but into a mirrored version of the puzzle about that edge. A mirrored puzzle with a mirrored starting position. This mirroring is always aligned at cell edges, so the resultant grid remains aligned with the starting grid and hence in order for the light to travel from start to start through multiple reflections, the initial angle must be one which passes through the centre of two cells in the infinite grid plane. That is an infinite number of scenarios to check, but lucky for us we have a maximum distance D, so we can easily limit this to angles which connect the centre of two cells at most a distance D from the starting position. A simple implementation just considers every cell in the +/-D square.

Now we have to follow the light. This is where the small input becomes much easier. Because it is a simple rectangular room, you can expand it out to cover the entire +/-D square through reflection, populating each square with either nothing or a reflected position of the original starting point. Then simply count how many starting points are within distance D of the original starting point. The double reflection in an inner corner thing plays along nicely with this, and the outer corner light destroy never happens in a simple rectangular room. You don’t even have to actually create the large grid, you can use some tricky divs and modulo to map each potential square straight back to the input data.

Oh, except that doesn’t quite work. This approach counts the same angle multiple times. Think of the 1×1 room with a maximum D of 4. The reflected grid contains a starting point in every cell, but obviously walking a multiple of 45 degree angles passes through an intermediary reflected starting point before it reaches the outermost one, which is disallowed. The solution to this is to instead count the distinct angles. Use a lookup table to check whether you’ve already counted an angle, where the lookup reduces the dx and dy by their gcd (or reduces them to 1 if the other is 0).

Now the large input, that is a whole different scenario… If we try to reflect through each wall and overlay the maps, we get inconsistencies depending on order of reflection, and the outer corner light destroy is a pain. So, we really need to actually simulate the light path. At this point I grew concerned at how much work was involved in writing the solution. I once a very long time a go wrote a wolfenstein engine clone, which involves a technique called raycasting, specifically raycasting onto a square grid. I remembered that being kind of painful to get right, and here we are not just doing raycasting, but having to recast the rays at each reflection – and finally we need to be absolutely perfect in calculating distance or we’ll fail test cases involving angles based off of 3/4/5 triangles.

But I wanted to solve this problem, so I dived in. First up, I realised that the cardinal directions needed to be handled separately. Back when I was implementing my wolfenstein engine clone, I had hacks in place to ensure pure 90 degree angles never actually occurred as you got divide by 0 bugs, this time said hacks would cause the question to fail. Luckily these 90 degree angles are quite easy, just walk cells in the specific direction until you find a wall, determine the number of in between cells, double it, and add 1 (for the 2 halves exiting and entering the first cell) and check if that is too far or not.

So all we have to do now is simulate the non-90 degree angles. First problem to consider is the accuracy problem. My solution to this was to use the fraction class from TMD.Algo. By doing this I ensured perfect accuracy, so long as the numerator and denominator never overflowed. There would be a lot of calculations, but the fraction class repeatedly reduces to minimal form after each operation, so it really depended on what kind of fractions could be found. We were going to be looking at the positions of intersections with grid lines (for walls) and half grid lines (for the starting location), and a slope of dx/dy. Some handwaving and I convinced myself that the fractions would never have a denominator greater than 2*dx*dy after reduction, and square that for pre-reduction. Since D is at most 50, thats about 25million. Numerators shouldn’t get above D +2 times the denominator so long as we abort the simulation sooner rather than later. So numerator should cap out around 1.3billion, easily handled by the 64bit numbers my Fraction class uses internally.

Next problem… This one is kind of subtle, but even with fractions how do we ensure our distance check is perfect? If we calculate the distance to each reflection and sum these, we have to calculate the square root of a fraction, which is not actually very likely to be another fraction, so we’ll be adding together doubles and we’ll lose our perfect accuracy. The solution comes from remembering that we can pretend the light goes in a straight line with the world reflecting about as needed. The length of this line is defined by two numbers, the cumulative absolute dx and dy. After each simulation step we can add how far we’ve moved dx and dy to an accumulator for each, taking the absolute value so it continues to accumulate after each reflection. We can then compare the sum of squares of these accumulators with D^2, an exact comparison (this comparison might actually have triggered the need for 64bit integers in the fraction type now that I think about it).

So, all that remains is to perform the simulation. Given a point and a direction, we need to determine the next possible half integer coordinate. First step is to determine the half-integer coordinate. If we aren’t already on a half-integer (check denominator 1 or 2) then calculate the floor of double the value, and add 1 if we are moving in the positive direction then halve. If we are on a half-integer, we simply add or subtract half depending on whether direction of motion is positive in that dimension. Now we have 2 possible coordinates to move to next, one for x and one for y. We calculate the delta between current and new location for the coordinate calculated then multiply by dx/dy or dy/dx to calculate the movement in the other coordinate. Add this to the current position of that other coordinate and we have the ‘projected’ coordinates. Compare the projected x coordinate for the next y to the next x, and determine which is closer in the direction of motion. Whichever it is, that is the new current location. Update the accumulators at this point. Once the accumulators are updated, verify that we haven’t travelled too far, fail if we are in excess of D. Now check if we’re back where we started. If yes, return success. Finally check if we are at an integral boundary, is denominator 1 for either x or y coordinates. If yes for only 1 of x or y, check to see if we’ve hit a wall by calculating the floor of the coordinates and subtracting 1 from one of those coordinates if we’re heading into a wall in the negative direction. (Fiddly…) If yes for both x and y coordinates, we’ve hit a corner. If the far side of the corner is occupied (similar algorithm to before, but we might have to subtract 1 from both x and y – extra fidly) then we need to work out what happened. We need to look at the other 2 cells adjacent to the corner. If 1 of these is filled, we’ve got a normal wall and reflect (making sure you reflect the right way… Fiddly!). If both are filled, double reflect. If neither are filled return fail for this angle.

Stick the above complicated fiddly code in a loop and presto you are done… (No, I didn’t get it right first time, but it did at least fail on the samples, so I didn’t have to find out by running the small input…) When I ran the small input I ran it in debug mode with the debugger attached. It took 2 minutes, which given the 4 minute time frame for submission was kind of nerve racking. I thought about the large input for a bit and decided it couldn’t be much slower, since I was simulating every half integral coordinate intersection regardless, and whether it reflected or not was a negligible contribution to the processing time, so given that there was 8 minutes for that, I should have been fine. But to be sure I changed to run in release mode. I think the problems for the large input were actually a bit larger D on average, so even in release mode it still took 2 minutes. But that left plenty of time… Still, considering how many corner cases there were I was sceptical that it would pass tests. But it did, so I am happy.

TCO 2012 R1C

Easy problems this week, in fact in what is a rare turn of events I made submissions for every question.

Before systests I was 147th, but I already knew that was going to change, as I had proven one of my solutions wrong already. However the systests did not agree?!? ‘Final’ placing 101st. Even if they were to rerun with a more comprehensive systest I would still be about 140th apparently.

So finally, a round easy enough for me to do at 2am.

Q1) Given a set of guesses where you know exactly one character in the guess is wrong in each case, determine how many possible answers remain. (Up to 50 characters, each in the range ‘0’ to ‘9’. Up to 50 guesses.)

A1) This is a simple question which tried to trick you by making the return value a 64bit integer type. The worst case is actually where you have the same guess made 50 times, and there is 50 characters.

For a single guess you can enumerate all possibilities by substituting each character with the other 9 possibilities. Then the answer is just the size of the intersection of the output for enumerating for each guess. Use a dictionary to count how many times you see each generated possibility, and count how many times the count equals the size of the input – done. Even if you have 50 different guesses at 50 digits each, there is only ~20k entries in your dictionary. Or just perform Set intersections, each set has a maximum of 450 entries, so you could even do it using lists and contains and still not run slow…(not that I recommend getting into that practice!)

I however, made a fatal mistake. I didn’t realise that a 50 digit number can’t be converted to a long until the break before challenge phase… I was converting my generated values into numbers and using them as my dictionary keys rather than creating strings. Its not like the code was any easier my way, I just had less string concat calls, which I have developed a natural aversion too it seems. Despite this fatal mistake I somehow passed system tests!?!

Q2) Given a grid where you can move from grid intersection to intersection at a variable cost, but you can only move right or down, determine the minimum cost to get from top left to bottom right if you change the costs such that every path has the same total cost, but the only change you can make is to increase the cost of an edge. (Grid is 40×40)

A2) Well, this one looks easy at first. Although I can’t fully explain my justification any more… The idea is that the minimum cost path has at least one edge which is not part of any maximum cost path – unless the minimum and maximum are the same. Presuming this possibly unjustified leap of logic, it then simply becomes obvious that if you can work out the maximum path cost at the beginning, that is the same as the maximum path cost at the end. Maximum cost is easy, for each vertex determine the maximum of cost from above or left, where the maximum cost for those locations is already calculated and cached.

Q3) Given a sequence of letters where no letter appears more than 2 times, and there are at most 50 letters, determine the minimum number of swaps to turn it into a palindrome.

A3) So I submitted an answer to this, but I certainly didn’t have any form of proof to go with it. First, determine if the answer can be a palindrome, if there is more than 1 letter which appears only once, it cannot be, otherwise it can (due to the restriction of maximum appearance of a given letter being 2). If the length is odd and the single appearance letter is not in the middle, swap it there from wherever it is and add 1 to running total count.

Then walk from front to back, at each point if its the first time you’ve seen that character, store its position. If it isn’t switch the character with the position inferred from the first location you saw the character (if it isn’t already in the right spot) and add 1 to running total count. If a switch occurs, you need to remember to check the same spot again, since the newly placed character may or may not be in the right spot.

Very unusual for a greedy approach like this to be the solution for the 1000pt question, but it passes. Then again, given my incorrect solution to the first problem which ALSO passed systests, maybe I am giving system tests too much credit 😛

GCJ Qualifying under way

So, I can’t do a write up about the questions until tomorrow (sometime after I sleep post TCO12 R1C), but I did finally finish them. Penalty time of 7:20 (which doesn’t count in qualifying since all you have to do is score 20 out of 100 – and even if all my large inputs fail somehow, I’ll still get 50) – but I had lunch and played computer games for 4 hours or so in between the 3rd and 4th questions. So I think I could have done it in about 3 hours which I think isn’t too bad. At time of writing only 40 people have submitted solutions to all inputs. I think that is kind of low compared to normal, even if it is early days…

New this year was a question with no large input, not that exciting, but still something different.

TCO 2012 R1B

Again a late night, and again a poor performance.

I was faster on the 250pt question this week, but I probably spent more time testing than I needed to. Still 217 is acceptable. Unfortunately, at the prior to systest phase, the top 600 lowest score is 300 and I failed to get the 500pt. So I sit in 750th, hoping for a huge number of failures in the 500pt problem that I struggled with. I definitely saw a few solutions which were greedy in nature in my room, and although I couldn’t find a counter example, greedy is usually high risk. I guess I’ll find out shortly.

Greedy solution passes… but still quite a few failures, enough for me to go up to 646th. Just 11 points from cut-off this time. R1C here I come?

Q1) Given up to 50 objects with lengths between 1 and 50 inclusive, determine the largest k where you can make k rows of length k either by using a single object or 2 objects separated by a gap of 1.

A1) Nice easy problem. The answer is obviously in the range 0-50, and its not too hard to verify possibility, so try each one descending until it works, or you reach 0. To verify, sort the objects, any object larger than k can be discarded, every object equal to k, count that up. Then perform an inward walk of those left checking for sum equal to k-1. If sum equals step both ends inwards and increase your count, if sum smaller move the bottom point up, if the sum is greater move the top point down. Continue until the points meet or cross. If the count is greater or equal to k, then its possible to make the k rows needed.

Q2) Given up to 50 tasks which take time between 1-3600 time units, determine the minimum time to complete the tasks subject to the following strange constraint. You have a single worker, each worker can only do one task. However a single worker can split into 2 workers in time t.

A2) The greedy solution is to sort in descending order and replace the 2 smallest entries with the largest of those two plus the split time. Repeat until there is only one task remaining and return its time. Very simple, but is it correct? The answer is yes.

The path I erroneously took was to try and determine the minimum time to perform the largest i tasks and have k workers left over. However I couldn’t get this to work because my formulation didn’t allow for the other tasks occurring after a split but still concurrently with another task. I actually realised this fairly early on and tried to formulate a reverse search more like the greedy solution, but I got confused, went round in a circle and started working on the same broken approach again. (Way too tired!)

‘Proof’ for greedy solution. You can treat the splitting of workers as creating a tree. Each leaf node of the tree does some work. There are no empty leaf nodes because any empty leaf node could be removed and its parent replaced with the other branch. In an optimal solution the 2 smallest items will be at the deepest level of the tree. As if they are not, then they can be switched with any non-deepest leaf node and the solution either improves or stays the same. Furthermore all deepest levels are reached at the same time point, so you can freely interchange any leaf at the deepest level without changing the total time for that tree. Hence, the two smallest nodes can be made to share a common parent. This pair of two smallest nodes and their parent can be replaced with a single leaf which takes as long as the longest of the two smallest nodes plus the split time. This results in an equivalent problem with an identical solution presuming we have started with an optimal solution for the original problem (I don’t have a strong proof of that, but it appears obvious – decomposing a node can only make it easier to arrange the tasks to achieve an optimum so if we are already optimum, we are also optimum for composition). We can now start all over again, but we have 1 less leaf node. Hence we can repeat until we only have 1 node, and the total cost is now obvious.

Q3) Given two rows of equal length of up to 16 people each with heights between 140 and 190, determine the minimum number of adjacent people swaps such that every member of the second row is larger than its corresponding member in the first row.

A3) Interesting question… the limits of only up to 16 people suggests an exponential search is feasible.

Looking at one of the passing answers, it looks like an exponential search on the minimum number of swaps to make the first k elements correct, looking at every possible subset of the k-1 swap locations. I’m not sure how this accounts for swap ordering… maybe I’ll work it out after some sleep.

So, post sleep revisit… The approach is to determine the minimum number of swaps required to have index i and smaller satisfy the criteria using i out of the available slots. The tricky bit in understanding the solutions is that they use a compact state space. If you use a bit field, with i bits removed, the fact that there are i bits removed indicates that those numbers were placed in the first i slots, in some order. So all bits set means any ordering, so that is where we start. For each set bit, we consider moving that value to the beginning. This will take a number of swaps equal to the distance between the start and the candidate, the rest of the values will retain the same order, and obviously we only do this if moving it results in a satisfiable condition. We now have a bunch of new bit fields to determine the minimum swaps for, with the bit corresponding to the moved index removed. So looking at each new state we can determine which index we are moving to by counting the number of gaps, and we can start again. However when it comes to distance to move to the new spot, we have to account for the fact that in moving an item to the left, the items to its left have all moved right 1. Conveniently we can determine this from the bit field. Each 0 bit corresponds to something which was moved, so if we pretend the 0 bits aren’t there and align right, that’s where the candidates are currently placed. So the count of set bits to the left of the candidate to move gives the number of swaps.

This recursion can be memotized over the bit field to ensure it runs fast enough. Terminal condition is the empty bit field which has a cost of 0. All that remains is to prove this approach correct…

Looking at the recursion from the inner most levels first. Obviously if all are in the right place, the cost to move them into the right place is 0. If all but 1 of the left most spots have been made correct, obviously the rightmost must stay still, and if it is in the correct spot, the cost is 0. If all but 2 of the left most spots are correct, either we swap or we don’t. If we swap the cost is 1, otherwise the cost is 0. The recursion will try both options so there is definitely no loss of generality yet. All but 3. In this scenario if the ideal scenario has the right most moving left 2, there is nothing to gain by swapping the other 2 first, it just increases the cost now and increases the search space, we can leave deciding whether to swap those until after swapping the right most across without any lower cost scenarios lost. Thus demonstrating that we do not lose generality. (Not a proof, but doesn’t look too hard to formalize.)

Now if you’ve been paying close attention you’ll notice I left something out completely. Everything above presumes that you only swap the people in the front row, or only in the back row. I can’t see a good way to demonstrate why this still leads to the correct solution, but intuitively, mixing front and back swaps serves to lose the ability to chain the swaps in order to place items in the correct spot. In order to perform the chaining you end up needing to do the corresponding front swap for a back swap or vice versa (or simply repeat the same swap again, either way it seems obvious that the number of swaps is increasing). In some scenarios mixing the swaps doesn’t make things worse, but it does not provide ways to make things better.