GCJ11 QR Post-Round Analysis – The Missing Docs .Net

So it looks like a pretty big turnout: 10566 qualifiers and 1855 perfect scores, of which I was one, although in some ways I don’t think I deserved to be.

The 4 problems were definitely more than I could have handled in a 2.5 hour competition (not without making fewer huge insane mistakes at least!) – I actually took more than 5 hours to submit my answers – although I had a lot of distractions and took a break of a few hours in the middle. So sure, it took less than 2.5 hours to write up the code, but I cannot ignore the value of that break in allowing the answers to mull over in my head.

Q1) Given starting positions for 2 robots and that they can both move at 1 unit per second, and that it takes 1 second to press a button, and a list of integer button positions and who has to press them in exactly the given order, what is the minimum time to complete the required work?

A1) This puzzle was actually Portal themed – there was mention of possible cake. As usual they provided an example of ‘one way’ to solve a sample optimally. I almost wonder whether I should start to assume that any time they say that, it is an example of a sub-optimal strategy which just happens to be optimal.

Anyway the question itself is a fairly basic simulation using a greedy algorithm. At any given point in time each of the two robots has a target position, which is the next button it needs to push. For a tiny optimization, this only ever changes when a robot pushes a button, but given the constraints even on the large input you could probably have gotten away with recalculating the target every timestep. Then it just comes down to simulating each timestep. If a robot is not at its target, it moves one unit towards it. If it is at its target but it is not yet its turn, it waits. If it is at its target and its button is the next one in the sequence, it pushes the button and doesn’t move. A likely trivial mistake would have been to accidentally let the second robot push its button in the same turn as the first robot, because you’ve updated which button is next to press but haven’t yet ended the current turn. The maximum number of turns is something like 100k for the large input, which is no problem given the 8 minutes you have to submit the answer.
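As a rough illustration of the timestep simulation described above, here is a minimal Python sketch. It is not the code I submitted; the function name `min_seconds`, the `(robot, button)` tuple format and the default starting position of 1 are all assumptions made for the example.

```python
def min_seconds(sequence, start=1):
    """Simulate second by second; `sequence` is a list of (robot, button) pairs.

    `start` is the assumed common starting position of every robot.
    """
    robots = sorted({robot for robot, _ in sequence})
    # Each robot's own list of buttons, in the order it must press them.
    targets = {r: [b for who, b in sequence if who == r] for r in robots}
    next_target = {r: 0 for r in robots}
    position = {r: start for r in robots}

    done = 0      # number of button presses completed so far
    seconds = 0
    while done < len(sequence):
        seconds += 1
        # Decide whose press is due at the START of the turn, so the other
        # robot cannot sneak its press into the same second.
        due_robot = sequence[done][0]
        pressed = False
        for r in robots:
            if next_target[r] == len(targets[r]):
                continue  # this robot has nothing left to do
            goal = targets[r][next_target[r]]
            if position[r] < goal:
                position[r] += 1
            elif position[r] > goal:
                position[r] -= 1
            elif r == due_robot:
                pressed = True          # at the button and it is our turn
                next_target[r] += 1
            # otherwise: at the button but not our turn yet, so wait
        if pressed:
            done += 1
    return seconds
```

Reading `due_robot` once per turn, before either robot acts, is what avoids the double-press mistake mentioned above.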

Q2) Given a list of pairs which if matched consecutively are replaced by a third (which will never be found in the list of pairs), and a list of pairs that if they are ever found anywhere in a list (except if they are instantly replaced by something in the first list) cause the entire list to be cleared. Process a list of input characters adding each one to the output list in turn, subject to these rules, print the output list in a pretty format.

A2) At 82% this problem had the lowest small to large conversion ratio – and it isn’t surprising because the small problem constrains each list of pairs to at most 1 entry per case and the large allows up to 30. This isn’t a performance problem at all, but it does mean that several corner cases just cannot occur in the small input. Writing up my solution for this problem probably took longest of all of them. Although I found it easy to solve in theory, I made copious errors during implementation, several of which would have crept past the small inputs if I hadn’t noticed them in time. (Keeping with the computer gaming theme, this problem is inspired by Magicka.)

So again this is a straightforward simulation. If the list is empty you add the character. If not, first check whether the incoming character and the existing last character are a pair in the first list of pairs. Performance is not a problem, so just walk them all, checking both orders (the scale optimization is to add the pairs to a dictionary of dictionaries, but it isn’t needed here). If a match is found, replace the current last character with the replacement and continue the loop with the next character. If not, it is time to check whether there is a need to clear the entire list.

Again because the scale constraints are so small there is no need to optimize this using dictionaries; you can just walk every entry already in the list and consider it with the incoming character against every provided pair, in both orders. Obviously this is O(n*k), but k is at most 30 in the large input. If you do optimize this scenario you would probably do what I did: keep track of the unique elements in the list built so far, look up the incoming character in a dictionary to find the elements it clashes with, and do a set intersection between the dictionary results and the unique elements in the list. The one thing to be careful of here is that a plain set is not a good implementation for keeping track of the unique elements in the list; you need a count of each unique element. Without a count, the pair transformations which remove the last element can cause you to incorrectly claim that element is no longer in the list. In any case, if you find a match, you clear the list and do not add the incoming character.
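To show why a count is needed rather than a plain set, here is a small Python sketch of the optimized variant. The name `invoke_counted` and the input format (three-letter strings such as "QRI" for a replacement, two-letter strings such as "QW" for a clearing pair) are assumptions for the example, not my original code.

```python
from collections import Counter


def invoke_counted(combos, opposed, elements):
    """Return the final list, tracking how many of each element it holds."""
    combine = {}
    for a, b, result in combos:          # e.g. "QRI": Q + R becomes I
        combine[(a, b)] = result
        combine[(b, a)] = result
    enemies = {}
    for a, b in opposed:                 # e.g. "QW": Q and W clear the list
        enemies.setdefault(a, set()).add(b)
        enemies.setdefault(b, set()).add(a)

    out = []
    present = Counter()                  # element -> occurrences in `out`
    for ch in elements:
        if out and (out[-1], ch) in combine:
            present[out[-1]] -= 1        # the old last element goes away...
            out[-1] = combine[(out[-1], ch)]
            present[out[-1]] += 1        # ...and the replacement arrives
        elif any(present[e] > 0 for e in enemies.get(ch, ())):
            out.clear()
            present.clear()
        else:
            out.append(ch)
            present[ch] += 1
    return out
```

With the input "QQRW" the second Q is consumed by the replacement, but the first Q is still in the list, so the W must clear it. A plain set that simply dropped Q at that point would get this case wrong.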

Finally, if neither transformation nor clearing occurs, add the character. At the end of the simulation loop the nice formatting was trivially achieved using string.Join, but a simple loop with a special case for the first element does the job too.
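Putting the three cases together, the unoptimized simulation could look something like the following Python sketch. The function name `invoke`, the string-based input format and the exact "[A, B]" output format are assumptions for illustration; `", ".join` plays the role that string.Join played in my solution.

```python
def invoke(combos, opposed, elements):
    """Process `elements` one character at a time and format the result."""
    # Store each pair in both orders so a single lookup is enough.
    combine = {}
    for a, b, result in combos:          # e.g. "QFT": Q + F becomes T
        combine[(a, b)] = result
        combine[(b, a)] = result
    clash = set()
    for a, b in opposed:                 # e.g. "QF": Q and F clear the list
        clash.add((a, b))
        clash.add((b, a))

    out = []
    for ch in elements:
        if out and (out[-1], ch) in combine:
            # Replacement takes priority over clearing.
            out[-1] = combine[(out[-1], ch)]
        elif any((existing, ch) in clash for existing in out):
            # Clear everything and do not add the incoming character.
            out.clear()
        else:
            out.append(ch)
    return "[" + ", ".join(out) + "]"
```

For example, `invoke(["EEZ"], ["QE"], "QEEEERA")` walks through a clear, a replacement and several plain appends in a single run.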

Q3) Two boys have a set of items which each have a value. One boy can’t add – any time he adds two numbers he always forgets to carry the 1. He also adds in binary… The other boy however can add just fine and wants to create two piles of these items which the first boy will think have the same value, but hopefully do not. In fact he wants to minimize the value of one non-empty pile, and being a mean boy, give that to the first boy and take the rest for himself. For a given set of values, is it even possible, and if so what is the maximum value the second boy can take for himself.

A3) Looking at the questions, I would say this one gives the first question a run for its money as the easiest question – but the statistics don’t back me up. I was bemused to see people asking the Admins questions about this question during the competition. ‘Does the first boy add the items left to right?’ – ‘Yes’. Addition without carrying the 1 is both commutative and associative, so the order is irrelevant.

First off we note that the first boy thinks A+A=0. This, combined with commutativity and associativity, quickly shows that the sum of all items must be 0 if there are 2 piles which have the same sum. The next leap is to see that A+A is the only way to get 0, that is, A+B=0 only when A=B. Hence, again because of commutativity and associativity, any bi-partition of a set of items with sum 0 gives two piles with equal sums.
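Written out with ⊕ for the first boy’s no-carry addition, the argument is short. The symbols P and Q (the no-carry sums of the two piles) and S (the no-carry sum of all items) are names introduced here purely for the illustration.

```latex
\begin{align*}
P \oplus Q &= S && \text{(commutativity and associativity)} \\
P = Q &\implies S = P \oplus P = 0 \\
S = 0 &\implies P \oplus Q = 0 \implies P = Q && \text{(since } A \oplus B = 0 \text{ only when } A = B\text{)}
\end{align*}
```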

So, from there it is trivial. Second boy gives the single smallest item to the first boy and takes everything else.

Implementation involves checking that the no-carry sum is 0. Afterwards I realized that I could have implemented this using a simple xor on the whole numbers, but I broke the integers into arrays of bools and reimplemented bitwise no-carry sum (xor!) by hand. Once you have checked that the xor of all inputs is 0, sort, sum the top n-1 elements and return that as the result. Simple.
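The simple-xor version I should have written in the first place might look like this Python sketch. The function name `best_split` and the choice of returning `None` for the impossible case are assumptions for the example; taking the total minus the minimum is equivalent to sorting and summing the top n-1 elements.

```python
from functools import reduce
from operator import xor


def best_split(values):
    """Return the most the second boy can keep, or None if no split works."""
    # The no-carry binary sum of all items is just their xor.
    if reduce(xor, values, 0) != 0:
        return None
    # Any split fools the first boy, so give away only the smallest item.
    return sum(values) - min(values)
```

For instance, `best_split([3, 5, 6])` keeps the 5 and the 6, while `best_split([1, 2, 3, 4, 5])` has no valid split because the xor of the values is not 0.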

Q4) Given a permutation of the first N integers, and a single operation which is ‘randomly permute a selected subset’ and acting ideally – what is the expectation value for the number of operations until the list is sorted.

A4) This was by far both the hardest and easiest problem. The sample provided a hint which was both useful and useless at the same time… Random simulation looked liable to time out before convergence, especially for the large input, and presumed you knew what the ideal strategy was.

So, what is the ideal strategy? The sample provided a hint that if you have n disordered elements, acting on each sub-permutation cycle independently appeared a good strategy. (Permutation cycle in disordered elements is the smallest set of entries which can be reordered on their own to end up completely ordered correctly. For example 2341 is a permutation cycle of 1234, as is 2413, but 2143 and 4321 are not as they can be divided into pairs which when switched are in the correct positions. This may not be the correct definition of a permutation cycle, but I can’t think of a better term to use in this discussion…)
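To make the informal definition above concrete, here is a small Python sketch that splits a permutation into its cycles and reports their lengths. The function name `cycle_lengths` and the list-of-ints input format are assumptions for the example; elements already in the correct place are left out of the result.

```python
def cycle_lengths(perm):
    """Return the lengths of the cycles of length 2 or more in `perm`.

    `perm` is a list holding a permutation of 1..n, so perm[i] is the value
    currently sitting at (1-based) position i + 1.
    """
    seen = [False] * len(perm)
    lengths = []
    for start in range(len(perm)):
        if seen[start]:
            continue
        # Follow position -> position of the value found there until we loop.
        length = 0
        i = start
        while not seen[i]:
            seen[i] = True
            i = perm[i] - 1
            length += 1
        if length > 1:
            lengths.append(length)
    return lengths
```

On the examples above, 2341 and 2413 each come out as a single cycle of length 4, while 2143 and 4321 each split into two cycles of length 2.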

Indeed I could see that it seemed reasonably obvious, that not randomizing a subset of a permutation cycle was inefficient. I didn’t prove it, but a trial with small permutation cycles agreed with the hypothesis. So I started down the path of dividing the input into a count of the number of each permutation cycle length, under the assumption that the ideal strategy was to randomize each permutation cycle in turn until it was ordered. At this point I tried some examples – in fact the first example I tried was a cycle of length 3, 312. However due to some incredibly stupid mistakes (I made 3 in quick succession, but only noticed 2) I incorrectly calculated an average of 2.5 operations to order this cycle. I then made a completely mistaken leap from this mistaken value to presume that this was further evidence that working with cycles was better than working at random.

So at this point I started breaking out some serious combinatorics, deriving the formula for the average number of operations for a cycle of length n, assuming the ideal behaviour after each operation was to work with one cycle at a time. I’m not great with combinatorics, and it took me a while to realize that most of the combinations and permutations and factorials in my formula would actually cancel out, and I wouldn’t be needing to calculate the ratio of 3000 digit numbers to 7 significant figures. I then made a mistake doing the cancellations and ended up with something, again, completely wrong. It was obviously wrong though, so I started some trials again, this time on the core piece of my formula, which is ‘how many different ways are there for n randomly permuted entries to have a length k cycle which includes the first element’. I was bemused to find that my trials didn’t depend on k: it was always (n-1)! – or 1/nth of the total permutations. I could now see where I had screwed up my cancellations and ploughed ahead with my formula. I implemented mutually recursive memoization, and hit go. It was fast, and it produced the correct answers for the sample.
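My actual formula isn’t reproduced here, so the following Python sketch is only one plausible reconstruction of such a mutual recursion, built on the 1/n observation above. The names `cycle_cost` and `expected_after_shuffle` are made up for the example, and exact fractions are used instead of floating point so the values can be compared exactly.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def cycle_cost(k):
    """Expected operations to sort one cycle of length k, one cycle at a time."""
    if k <= 1:
        return Fraction(0)               # a fixed point needs no work
    return 1 + expected_after_shuffle(k)  # one shuffle, then clean up


@lru_cache(maxsize=None)
def expected_after_shuffle(n):
    """Expected further operations after n elements were shuffled uniformly."""
    if n <= 1:
        return Fraction(0)
    # The cycle containing the first element has length k with probability
    # 1/n for every k in 1..n, and the other n - k elements are again a
    # uniformly random permutation.
    partial = sum(cycle_cost(k) + expected_after_shuffle(n - k)
                  for k in range(1, n))
    # The k = n term is cycle_cost(n) = 1 + expected_after_shuffle(n), which
    # refers back to the value being computed, so solve for it:
    #   g = (partial + 1 + g) / n   =>   g = (partial + 1) / (n - 1)
    return (partial + 1) / (n - 1)
```

Under these assumptions the recursion evaluates quickly for every cycle length likely to matter, with no huge factorials anywhere.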

At this point I decided to have a look at my memoization tables – I laughed because there, accurate to 7 or more significant figures, was a table which contained ‘n’ at index ‘n’. I had successfully demonstrated that for ‘n’ disordered elements the sum of the times for each subcycle is equal to ‘n’, which is tantalizingly close to suggesting that advanced strategy was irrelevant. The only thing you have to do is make sure you don’t randomize something which is already in the correct place (and make sure you do randomize every member of any given cycle, which comes from letting everything else randomize).

So at this point I ripped out my memoization tables and replaced them with ‘if passed n, return n’. I then realized the obvious: I didn’t actually need almost any logic at all any more. The answer was equal to the number of disordered elements in the input. Counter, for loop, if statement, increment, return result.
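In Python the whole thing collapses to a one-line count. The function name `expected_operations` and the list-of-ints input format are assumptions for the example.

```python
def expected_operations(perm):
    """Count the elements of `perm` (a permutation of 1..n) that are out of place."""
    return sum(1 for index, value in enumerate(perm, start=1) if value != index)
```

So `expected_operations([2, 1, 4, 3])` counts four misplaced elements, and an already sorted list counts none.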

I probably would have solved this problem quickly, if it wasn’t for that mistake of calculating 2.5 as the expectation value for cycle length 3… Instead I wasted almost 3 hours on it alone. 3 hours for 4 lines of code… (ignoring the file parsing code of course!)
