GCJ13 R1A

An interesting new style of question turned up in this round, where you could potentially fail even if you do everything right…

Top 1000 advancing; the cut-off ended up being Q1 small/large plus Q2 small in quick time, or Q3 small as well.

Q1) Determine how many rings on a bullseye can be painted using a specific amount of paint.

A1)  Conveniently the ratio of paint volume to area covered involves pi, so this problem ends up being purely integer.  First off you calculate the area of an annulus with inner radius r and outer radius r+1 – this is pi*(2r+1).  Paint volume per ring is thus 2r+1, and we can get all rings by summing from the initial r, increasing r by 2 each time.  For the small input you can just keep adding until you don’t have enough paint for the new ring and return the count of rings so far.  Large input paint volumes need 64-bit integers, and brute force needs over 1 billion iterations in the worst case – unlikely to run fast enough.
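A minimal sketch of that small-input brute force (in Java; names are mine):

```java
// Count how many rings fit: the i-th ring has inner radius r + 2*i and
// costs 2*(r + 2*i) + 1 units of paint (the pi has cancelled out).
// Fine for the small input; 2*r + 1 would overflow at the large bounds.
static long countRingsBruteForce(long r, long t) {
    long rings = 0;
    long cost = 2 * r + 1;   // cost of the innermost ring
    while (t >= cost) {
        t -= cost;
        cost += 4;           // each subsequent ring costs 4 more
        rings++;
    }
    return rings;
}
```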

So for the large input you need to be a bit smarter.  The sum is a classic arithmetic series total, which has the closed form (a+b)((b-a)/k + 1)/2 – where a is the smallest element, b the largest element and k the element spacing ((b-a)/k + 1 being the number of elements).  First option (which I would suggest is the best choice) is to binary search on the number of rings, using this formula to calculate the sum, being very careful to avoid integer overflow. (So long as your binary search range is selected very carefully the closed form should be able to be made to work with 64-bit integers, as k is 4, b-a is a multiple of 4, and a+b is even – but it is simpler to just use BigInteger.)  Alternatively you can expand the closed form to give a quadratic in terms of the number of elements and the target paint usage, solve for n using double precision floating point, and then check which of the nearby numbers is the actual maximum using the closed form (to avoid floating point error causing the floor to incorrectly give you one less than the answer).
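For the record, the ring costs are 2r+1, 2r+5, …, so the first n rings cost n*(2r + 2n − 1) in total.  A sketch of the binary-search option using BigInteger throughout, which dodges the overflow question entirely (bounds and names are mine):

```java
import java.math.BigInteger;

// Largest n with n * (2r + 2n - 1) <= t, found by binary search.
static long maxRings(BigInteger r, BigInteger t) {
    long lo = 0, hi = 2_000_000_000L;   // comfortably above any possible answer
    while (lo < hi) {
        long mid = (lo + hi + 1) >>> 1; // round up so the loop terminates
        BigInteger n = BigInteger.valueOf(mid);
        BigInteger paintUsed = n.multiply(
                r.add(n).shiftLeft(1).subtract(BigInteger.ONE)); // 2r + 2n - 1
        if (paintUsed.compareTo(t) <= 0) lo = mid;
        else hi = mid - 1;
    }
    return lo;
}
```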

Q2) Given an ordered list of tasks where you can spend your energy to create value, a maximum energy E, a regeneration R between each task, and a value per unit of energy expended V_i for each task – determine the maximum value you can get out of your energy, assuming you start at maximum energy.

A2) For the small input the maximum energy is only 5, so going into any given task there are at most 6 possible energy levels, and with only 10 tasks a true brute force approach is roughly 6^10, which is only tens of millions.  Obviously you are always going to spend all remaining energy on the last task, so it’s really only 6^9 even.  But you could use a cache to get the state space down to ~60 (task, energy) entries.
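A sketch of that cached brute force (my names; memoised on the pair of task index and energy going in):

```java
// v = value per energy for each task, E = max energy, R = regen per gap.
static long best(int task, int energy, int[] v, int E, int R, Long[][] memo) {
    if (task == v.length) return 0;
    if (memo[task][energy] != null) return memo[task][energy];
    long result = 0;
    for (int spend = 0; spend <= energy; spend++) {
        int next = Math.min(E, energy - spend + R);   // regen, capped at E
        result = Math.max(result,
                (long) spend * v[task] + best(task + 1, next, v, E, R, memo));
    }
    return memo[task][energy] = result;
}
// initial call: best(0, E, v, E, R, new Long[v.length][E + 1])
```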

Large input maximum energy is 10^7, maximum value per energy is 10^7 and maximum number of tasks is 10^4.  So even the cached approach is ~10^18 operations. (10^18 is coincidentally also the range of the maximum answer, so 64-bit integers are back in play.)  Looking at the problem more closely it seems that a greedy approach is more likely – you want to use your energy up on the most valuable tasks, unless you could have conserved some of that energy for a later, even more valuable task.

Looking at someone’s solution this appears to be the case.  You keep an array of energy used at each location, starting off by using all the energy you have on the first task, everything you regenerate on the second, and so on – using all the energy you have as fast as you can.  But at each location after the first, you also look backwards 1 by 1 to steal some energy forwards if your value per energy is higher.  If the total energy used between your current steal candidate and yourself is less than or equal to the maximum, you can safely steal it all forwards.  But if it exceeds the maximum, you take what you can, leave the rest – and stop looking further back.  Also, if you ever find an element more valuable than you, you stop going back, because everything prior to that will already have been stolen forwards to that element as much as possible.  This greedy solution looks worst case O(N^2), which is ~10^8 – but the case to trigger that is: first task is most valuable, the rest in ascending order, and maximum energy greater than N^2 times the regen rate – so there probably weren’t too many inputs that bad.  I suspect that an O(N) (or close to it) solution is possible by tracking whether we have pulled all the energy forwards in certain ranges, and thus skipping them during any walks backwards.
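I haven’t re-implemented that exact stolen-energy walk; the sketch below is a related greedy stated the other way around (my formulation, so treat it as an assumption rather than the solution described above): at each task, hold back just enough energy to be back at full when the next strictly more valuable task arrives, and spend everything else now.  Also O(N^2) as written, and it assumes R ≤ E as the problem guarantees.

```java
// Greedy sketch: spend everything except what is needed to be full again
// when the next strictly more valuable task arrives. Names are mine.
static long maxValue(long[] v, long E, long R) {
    long total = 0, cur = E;                  // cur = energy going into task i
    for (int i = 0; i < v.length; i++) {
        long keep = 0;
        for (int j = i + 1; j < v.length; j++)
            if (v[j] > v[i]) {                // next more valuable task
                keep = Math.max(0, E - (j - i) * R);
                break;
            }
        long spend = Math.max(0, cur - keep);
        total += spend * v[i];
        cur = Math.min(E, cur - spend + R);   // regen, capped at max energy
    }
    return total;
}
```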

Q3) Given a set of products, each formed by choosing a random subset of N cards bearing randomly selected one-digit numbers (between 2 and M inclusive) and multiplying them together, take a guess at what the original cards were.

A3) Yes, that’s right – the question doesn’t want you to give the answer, or -1 if it isn’t possible to tell for certain; it wants you to take a guess if you aren’t sure.  It then specifies that you pass if you achieve a certain percentage of correct guesses.  Since that is a bit nasty, there is no large input, just 2 small inputs that you can repeatedly try until you succeed.  Time penalty still applies (which is kind of unfortunate).

Helpfully the first small input is very constrained: only 3 cards, only 4 possible values per card, giving 64 scenarios to consider as to whether they are possible given the presented products.  Then, to maximise your chances if there are multiple possibilities, you choose the ‘best’ one, weighted by the probability that a given choice of cards resulted in the products given – grouping them by sorted order (of which there are only 20 cases).  Also, since you are given 7 products, your probability of being given a product which is maximal for the value of the cards is actually pretty high. (You could run a bunch of simulations to determine, for each possible product, the probability of each original input, then combine them to choose the most likely input.)

Looking at solutions, it appears that even that is more complicated than required.  Success could be achieved by just looking at the products: take the highest power of 3 seen and start your answer with that many 3s, then the highest power of 5 and add that many 5s – then look at the maximum power of 2.  If you only have 1 digit remaining, you know the answer and return.  If you have 2 digits remaining and the highest power of 2 is 8 (i.e. 2^3), add a 4 and a 2, otherwise add two 2s.  Otherwise return three 2s.
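A sketch transcribing that heuristic for the first small (3 cards, values 2 to 5; my code, including the guess of 2 when a lone leftover card never showed up in any product):

```java
// products = the 7 products given for one test case.
static String guessCards(long[] products) {
    int p2 = 0, p3 = 0, p5 = 0;   // highest prime powers seen in any product
    for (long p : products) {
        int c2 = 0, c3 = 0, c5 = 0;
        while (p % 2 == 0) { p /= 2; c2++; }
        while (p % 3 == 0) { p /= 3; c3++; }
        while (p % 5 == 0) { p /= 5; c5++; }
        p2 = Math.max(p2, c2); p3 = Math.max(p3, c3); p5 = Math.max(p5, c5);
    }
    StringBuilder answer = new StringBuilder();
    for (int i = 0; i < p3; i++) answer.append('3');
    for (int i = 0; i < p5; i++) answer.append('5');
    int remaining = 3 - answer.length();
    if (remaining == 1) answer.append(p2 == 2 ? '4' : '2');  // lone 2-or-4 card
    else if (remaining == 2) answer.append(p2 >= 3 ? "42" : "22");
    else answer.append("222");   // catch-all guess from the write-up above;
                                 // hands like 4,4 apparently lose rarely enough
    return answer.toString();
}
```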

The second small input however has 7 possibilities per card and up to 12 cards.  This one really needs some smarts – 7^12 is about 14 billion, so you can’t consider every starting hand, let alone the probabilities of every subset.  Also, since there are only up to 12 products given per case, the probability of being given the maximal product is quite low.  While we can’t consider every starting hand, grouping them by repeats should be much better.  So if you generate only the non-descending hands, calculate the number of arrangements of each using the standard combinatorics (which I would have had to look up…), and then tally the frequency of each possible product (by trying the 2^12 subset multiplications), you can generate a big database which can be reused for each sub-test case.  For each possible answer you calculate the probability by multiplying together the probabilities of each given product under each potential initial scenario group, and whichever answer is best, you return it.

Looking at a solution, it seems I was on the right track.  The only difference I saw was that rather than calculating the actual repeat counts, they calculated the inverse (repeats divided by 12!) and used that as a division factor in the probability rather than a multiplication.  That probably avoids needing to normalize the probability calculations and worry about range errors.

TCO13 R2B

Seems low turnout is normal now – 1214 registered out of a possible 1950.  Six Australians braved the 2am start.

Tough round again – top 50 was determined by having solved the 250 and 500pt problems, with a couple of 250pt-only scores thrown in on the back of strong challenge success rates.  Despite throwing away some points with an obviously wrong challenge of a correct solution, and getting off to a slow start with multiple silly bugs in my 250pt code, I managed a top-400 placing and my TC rating went back over 1700.  But still not in the running for a t-shirt.

Q1) Given 3 infinite arithmetic series which you can freely slide to different integer offsets determine the maximum minimum gap between any 2 numbers amongst the 3 series.  The 3 infinite series are defined by their spacings a,b,c which are all between 1 and 2000 inclusive.

A1) Initial guess was GCD(a,b,c)/3 rounded down – but that doesn’t work for cases like 20,30,40 (which was helpfully one of the given test cases).

As it turns out, the answer is either min(GCD(a,b)/2, GCD(b,c)/2, GCD(a,c)/2) or GCD(a,b,c)/3 (both cases, rounding down) – the determining factor being whether all 3 pairwise gcds are equal or not.  Equal results in the second case, not equal results in the first case.
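In code, that closed form is tiny (a sketch – integer division supplies the rounding down):

```java
// All three pairwise gcds equal -> gcd(a,b,c)/3; otherwise smallest pairwise gcd / 2.
static int maxMinGap(int a, int b, int c) {
    int ab = gcd(a, b), bc = gcd(b, c), ac = gcd(a, c);
    if (ab == bc && bc == ac) return gcd(ab, c) / 3;   // gcd(a,b,c)/3
    return Math.min(ab, Math.min(bc, ac)) / 2;
}

static int gcd(int x, int y) { return y == 0 ? x : gcd(y, x % y); }
```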

But since the problem states that the spacings are between 1 and 2000, you can find the solution with a bit less smarts and a bit more brute force. (As I did.)  Firstly you observe that the offset of one of the sequences doesn’t matter – it can be fixed at offset 0, since in an infinite scenario the origin is meaningless.  Secondly, it is obvious that an offset greater than a sequence’s spacing adds no value, since again the sequences are infinite.

This leaves at most 2000×2000 = 4 million scenarios to evaluate.  So long as we can do so quickly, we are done.  So for a pair of infinite sequences (with spacings s1 and s2) with a relative offset of i – what is the closest approach?  Turns out this isn’t too difficult to work out – it is min(i mod g, g − (i mod g)) where g = gcd(s1, s2).  So calculate this for all 3 pairs and take the minimum, then return the maximum of those across all scenarios.  My implementation (mostly pointlessly) went a bit faster than this by skipping the inner loop if the outer loop gave a pair of sequences which was worse than the current best.  Also the outer loop can be reduced to run from 0 to gcd(a,b) rather than 0 to b.  However, a mistake I made at first was trying to make the same range reduction for the inner loop – which isn’t valid, as it doesn’t fully explore the combinations of modulus.
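A sketch of the plain version of that brute force (my names; the loop reductions described above are omitted for clarity):

```java
static int gcd(int x, int y) { return y == 0 ? x : gcd(y, x % y); }

// Nearest approach of two infinite sequences with spacings s1, s2 at
// relative offset off: depends only on off modulo gcd(s1, s2).
static int closest(int s1, int s2, int off) {
    int g = gcd(s1, s2);
    int m = ((off % g) + g) % g;   // handle negative offsets
    return Math.min(m, g - m);
}

// Sequence a fixed at offset 0; try all offsets of b and c.
static int maxMinGapBruteForce(int a, int b, int c) {
    int best = 0;
    for (int i = 0; i < b; i++)
        for (int j = 0; j < c; j++) {
            int gap = Math.min(closest(a, b, i),
                      Math.min(closest(a, c, j), closest(b, c, j - i)));
            best = Math.max(best, gap);
        }
    return best;
}
```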

Q2) Given a directed graph of up to 50 vertices, where edges come in 3 different flavours (so any given pair of vertices can have up to 3 edges in each direction between them), a game is played.  One person chooses a starting location and announces the flavour of each transition made to move to a new location.  The second person then tries to state with certainty where the first person currently is.  If the first person reaches a dead end, they must declare as such and the game is over (even if there are multiple dead ends).  The game is scored by the number of moves the first player makes before it is over.  Given the graph, and assuming optimal play by player 1, return how long the game is – or -1 if the game is infinitely long.

A2)  So it isn’t too hard to write a solution to this problem (which I did) – it is however much more difficult to write one which isn’t too slow in some corner cases.  I actually tried to construct a large test case to break my solution, and failed – but the system tester did not.  My solution approached the problem from player 2’s perspective.  At the start the ‘possible space’ is the set of all locations.  Each called-out transition maps the possible space to a new possible space.  If that space consists of a single location – the game is over.  If that space is empty, the optimally playing player could not have called that transition out.  A simple dictionary approach lets you store how many moves remain for a given state.  When you first see a state you store -1 – thus if you see a state already on your current path you can immediately return -1, but if not you can replace that -1 with the actual best number of turns possible.  Running cost of this is worst case O(n^2 * 2^n) – which for n=50 is way too slow. (Using bitwise operations like I did it’s O(n * 2^n) 64-bit integer operations, but that is only valid up to n=64, and it’s not a significant improvement anyway.)  But constructing this worst case is non-trivial – so I gave it a shot (and failed).
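A sketch of that possible-space search with the -1 in-progress marker (my names; the dead-end declaration rule is elided, succ[f][u] is assumed to be the bitmask of f-flavoured successors of u, and n ≤ 50 fits a 64-bit mask):

```java
static java.util.Map<Long, Integer> memo = new java.util.HashMap<>();

static int longestGame(long space, long[][] succ) {
    if (Long.bitCount(space) <= 1) return 0;        // player 2 knows the spot
    Integer cached = memo.get(space);
    if (cached != null) return cached;              // -1 here means "on the
    memo.put(space, -1);                            //  current path": a loop
    int best = 0;
    for (int f = 0; f < 3; f++) {
        long next = 0;
        for (int u = 0; u < succ[f].length; u++)
            if ((space >> u & 1) != 0) next |= succ[f][u];
        if (next == 0) continue;                    // no one could call f
        int sub = longestGame(next, succ);
        if (sub == -1) { memo.put(space, -1); return -1; }  // infinite play
        best = Math.max(best, 1 + sub);
    }
    memo.put(space, best);                          // overwrite the marker
    return best;
}
```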

It appears the correct approach is to consider that for player 2 it doesn’t matter whether there are 2 or 22 possible locations – anything greater than one rules out the game ending.  So take any pair of locations, and consider the possible ways to map that pair to another pair, and so on.  If you ever find the same pair again, you’ve created a loop and the game can continue forever; but if your search always terminates (no mapping for a pair, or the only mapping sends both to the same node), it isn’t too much of a stretch to see that the best number of turns found in this search (starting from each possible starting pair) is the same as the best number of turns from the more extensive search.

Q3) Given an initially empty rectangle of size x by y – and the ability to perform 2 operations, each of which selects a sub-rectangle sx by sy and fills 0 or more of the squares in that sub-rectangle – determine how many different final states are possible (modulo a large prime).  x and y are bounded to at most 40.

A3) Only 2 solvers for this problem.  The answers look like serious combinatorics – adding and subtracting numerous cases.  They seem to break it down by the size of the rectangle where changes are made: for each such rectangle size, count how many possibilities there are, and multiply by the number of possible offset locations of that rectangle in the larger rectangle.  That somehow makes the counting of possibilities a bit easier.  It seems that for each of these rectangles you consider the case where the 2 sub-rectangles are fixed to opposite corners – and then subtract off the overlap between the 2 choices of opposite corners.  Then there is some magic to do with the edges of the rectangles, which must be there to avoid some kind of double-counting – it’s all way too complicated for me at 5am…

GCJ13 Qualifying Round

As is typical for GCJ, the qualifying round had a large turnout – ~22000 people made a successful submission, and of those 17059 qualified (at the moment at least).

63 perfect scores – 2 of which were Australian, and I don’t recognize either handle.  Radeye gets the unfortunate distinction of 64th place, having managed to solve all of the hardest problems, but having the large input on the trivial first question fail… (I have to wonder if he was trying the different-language-for-small/large-input thing, as his small input in Perl looks fine at first glance… but it is Perl, so yeah…)

Q1) Do some simple analysis of a board from a 4×4 equivalent of tic-tac-toe where there is a special piece which counts as both X and O. Determine winner/draw/incomplete.

A1) This problem was so trivial they had to point out not to make it harder on yourself by attempting to determine inevitable winner/draw.  Small and large input, but the only difference was the number of test cases. (Importantly, solving this problem completely is not enough to advance to round 1.)

The problem also specifies that the board will be from a valid game, so you don’t have to handle cases where both X and O have won.  Check each row/column/diagonal for either winner – return if found.  Otherwise return draw if board is full, incomplete if not.
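A sketch of that line scan (output strings as per the problem statement; ‘T’ counts for both players, ‘.’ is empty):

```java
static String result(char[][] b) {
    java.util.List<String> lines = new java.util.ArrayList<>();
    for (int i = 0; i < 4; i++) {
        lines.add("" + b[i][0] + b[i][1] + b[i][2] + b[i][3]);   // row i
        lines.add("" + b[0][i] + b[1][i] + b[2][i] + b[3][i]);   // column i
    }
    lines.add("" + b[0][0] + b[1][1] + b[2][2] + b[3][3]);       // diagonals
    lines.add("" + b[0][3] + b[1][2] + b[2][1] + b[3][0]);
    for (String line : lines) {
        if (line.matches("[XT]{4}")) return "X won";
        if (line.matches("[OT]{4}")) return "O won";
    }
    for (char[] row : b)
        for (char c : row)
            if (c == '.') return "Game has not completed";
    return "Draw";
}
```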

Q2) Given a rectangular array of ‘heights’ and a device which reduces all heights in a row or column down to an adjustable setting (leaving anything already at or below it untouched), determine whether the given heights could be obtained if everything started at some equal height greater than or equal to anything in the given heights.  The setting can only be changed between usages, not partway through a row/column ‘trim’.

A2) The important thing to notice here is that for any given square, it doesn’t matter what order the device interacts with it, it only matters what the minimum setting was across all interactions.  So if we can determine what height was used for each row/column – we can just run a simulation and check whether the input matches expectation.

For each row/column – the device cannot have been set any lower than the highest member of that row/column.  And if it was set any higher, the usage definitely could have been ignored entirely.  Therefore you can just determine the maximum value for each row/column – perform that simulation, and check result versus input.  This runs in O(nm) time and additional space.
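A sketch of that O(nm) check (my names):

```java
// Mow each row/column at the height of its tallest target cell, then
// compare the simulated lawn with the target.
static boolean possible(int[][] target) {
    int n = target.length, m = target[0].length;
    int[] rowMax = new int[n], colMax = new int[m];
    for (int i = 0; i < n; i++)
        for (int j = 0; j < m; j++) {
            rowMax[i] = Math.max(rowMax[i], target[i][j]);
            colMax[j] = Math.max(colMax[j], target[i][j]);
        }
    // Each cell ends up at the lower of the two settings that crossed it.
    for (int i = 0; i < n; i++)
        for (int j = 0; j < m; j++)
            if (Math.min(rowMax[i], colMax[j]) != target[i][j]) return false;
    return true;
}
```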

A slower alternative (which I thought of first – and have not verified is actually correct…) is, for each cell, to check that it is the highest value of either its row or its column (or both).  If it is the highest of neither, it cannot possibly work.  The trivial implementation here is O(nm*max(n,m)) – which would still be fast enough (and requires no additional space, if you cared for some reason).  This can also be done in O(nm) by painting cells which are equal to the max of their row or the max of their column in two passes over each row/column.  If all cells end up painted, you have succeeded.  But the boolean for each cell has to be stored somewhere.

Q3) Determine the number of palindromic squares of palindromic numbers there are in the range A to B. Maximum value of B either 1000, 10^14 or 10^100 depending on problem tier.

A3) This is the first time there has been a problem with 2 large inputs, one harder than the other.  This problem was correspondingly worth a very large number of points.

The small input is trivial.  In fact you can do it in your head, since the number of palindromes less than sqrt(1000) is just 11, and of those only 5 are still palindromic when squared.  So just search the set 1, 4, 9, 121, 484 against each given A, B pair to return a count.  I wonder if go-hero.net has a category for the null programming language…  There were only 100 test cases; if you type fast and pre-filled the form a bit, it shouldn’t be too hard to do that in 4 minutes.

The first large input poses a bit more of a challenge.  The first thing to realise is that these numbers are very rare, so we can solve all 10000 test cases by generating the list once and using binary search to find the indexes whose difference gives the count.  Indeed we could even pre-calculate the list and hard-code it in this case – there aren’t that many below 10^14.  An approach which works here is generating all palindromes less than 10^7, squaring each one and checking whether the square is a palindrome.  We can even generate the palindromes below 10^7 by brute force, checking every number to see if it is a palindrome – it should still run in time.
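A sketch of that generation pass (my names):

```java
static boolean isPalindrome(long x) {
    String s = Long.toString(x);
    return new StringBuilder(s).reverse().toString().equals(s);
}

// Brute force every root below 10^7 and keep the squares where both the
// root and the square are palindromes.
static java.util.List<Long> fairAndSquareBelow1e14() {
    java.util.List<Long> result = new java.util.ArrayList<>();
    for (long i = 1; i < 10_000_000L; i++)
        if (isPalindrome(i) && isPalindrome(i * i))
            result.add(i * i);
    return result;
}
```

Per test case, two binary searches over the resulting sorted list give the count in [A, B].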

The second large input is where it gets tricky.  The numbers are still rare enough that you can calculate them all before running any test and keep them all in memory, but generating them becomes a bit more time consuming.  First off, we can’t possibly brute force 10^50 numbers to find palindromes to square, so instead we build the palindromes by taking the 10^25 possible halves and either reflecting after or through the last digit.  2*10^25 however is still way, way too many.  But if you look at the output of such a program for up to 10^8, you can get a pretty confident grasp on some ways to speed things up.  The only case where a digit greater than 2 is used in a palindrome whose square is still a palindrome is 3 itself, giving 9.  So you can special-case that, and you are down to 2*3^25 cases.  This is still too many, but it is a big improvement.  A closer look shows that the digit 2 is only used in very rare circumstances: either it is the first digit and all other digits are zero (or the last digit is 1 and is reflected through rather than after), or it is the last digit, reflected through.  The former gives 1.5 cases per power of 10, a tiny number.  The latter reduces us down to 3*2^25 cases.  This is 100 million cases, each worst case performing a 50-digit by 50-digit multiplication.  This may take minutes even with a good BigInteger implementation – so it is advisable to test beforehand whether it will be fast enough, or even better – pre-compute and store/load the list.
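The reflection step itself is straightforward – a sketch (the digit restrictions on which halves to enumerate are left out):

```java
import java.math.BigInteger;

// Building palindromes from a left half: reflect after the last digit
// (even total length) or through it (odd total length).
static BigInteger[] reflections(String half) {
    String rev = new StringBuilder(half).reverse().toString();
    return new BigInteger[] {
        new BigInteger(half + rev),              // e.g. "12" -> 1221
        new BigInteger(half + rev.substring(1)), // e.g. "12" -> 121
    };
}
```

Each candidate is then squared and the square checked for palindromicity, exactly as in the first large.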

However, we can do better.  A closer look shows that for every palindrome which is still a palindrome when squared, the sum of the squares of its digits is at most 9.  This is why 3 only appears once, and why 2 is so rare.  But it helps us beyond that: out of the 2^25 scenarios from creating patterns of 0s and 1s, we can skip everything which has more than 5 ones – or more specifically, more than 4 ones in the reflected section plus optionally a 1 in the middle.  This gets us closer to 3*25^4, which is clearly going to run in time.  But the complexity of walking permutations like this might be a little bit annoying…

Q4) There is a set of boxes, each of which takes a specific kind of key to open, breaking the key in the process.  Inside the boxes there may be more keys.  You start with a set of keys and a description of all the boxes; return the canonical order to open them all, if possible at all.

A4) The small input wasn’t too bad here – 20 boxes.  For a given subset of boxes being open, the number of keys you have left doesn’t depend on what order you opened them, so you can do a depth first search on the canonical order, ignoring any states you find twice.  This is O(n * 2^n), which for n=20 is perfectly fine – especially with only 25 test cases to perform.
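A sketch of that search (my names – keyType[i] is the key that opens box i, contents[i] the keys inside it; the dead set must be cleared between test cases):

```java
// Keys remaining depend only on which boxes are open, so failed opened-sets
// can be cached. Trying boxes in index order yields the canonical ordering.
static java.util.Set<Integer> dead = new java.util.HashSet<>();

static boolean dfs(int opened, int[] keys, int[] keyType, int[][] contents,
                   java.util.List<Integer> order) {
    int n = keyType.length;
    if (order.size() == n) return true;
    if (dead.contains(opened)) return false;
    for (int i = 0; i < n; i++) {
        if ((opened >> i & 1) != 0 || keys[keyType[i]] == 0) continue;
        keys[keyType[i]]--;                       // key breaks on use
        for (int k : contents[i]) keys[k]++;      // collect the contents
        order.add(i);
        if (dfs(opened | (1 << i), keys, keyType, contents, order)) return true;
        order.remove(order.size() - 1);           // undo and try the next box
        for (int k : contents[i]) keys[k]--;
        keys[keyType[i]]++;
    }
    dead.add(opened);
    return false;
}
```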

Large input however was 200 boxes.  The previous algorithm is no longer any good.  I spent some time thinking about this, but couldn’t come up with a convincing idea.  My best thought was that maybe if you did a reverse depth first search, from the answer back to the start, it might be self-pruning to the point where your search space ends up being much smaller than 2^200.

Looking at other people’s code, it appears that the solution is that there exists a much faster method to determine whether it is still possible to get to the final state from any given state, compared to finding a solution.  This method is then applied every time you try to make a move in the depth first search on canonical order, and you backtrack immediately if the ‘can still finish’ function returns false.

There appear to be two parts to this approach.  First you check that there are enough keys in the given situation to open every chest: add up the contents of every chest plus the starting keys, and ensure the total for each key type is at least the number of chests requiring that type.  Somehow, having checked this makes the following seemingly nonsensical ‘can still finish’ function valid.  Create a directed graph connecting the key type of each chest not yet opened to the key types contained in those unopened chests.  Then, for each key type you hold (ignoring count), traverse the graph with a simultaneous breadth first search.  If you can reach every key type used by an unopened chest, the problem is still solvable.
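A sketch of that test as I understand it (my names, matching the DFS sketch above but with a boolean array since 200 boxes exceed a bitmask):

```java
static boolean canStillFinish(boolean[] opened, int[] keysInHand, int[] keyType,
                              int[][] contents, int numTypes) {
    int n = keyType.length;
    long[] available = new long[numTypes], needed = new long[numTypes];
    for (int t = 0; t < numTypes; t++) available[t] = keysInHand[t];
    for (int i = 0; i < n; i++)
        if (!opened[i]) {
            needed[keyType[i]]++;                    // one key consumed per chest
            for (int k : contents[i]) available[k]++;
        }
    for (int t = 0; t < numTypes; t++)
        if (available[t] < needed[t]) return false;  // not enough keys overall
    // BFS over key types: a key in hand reaches the contents of chests it opens.
    boolean[] reachable = new boolean[numTypes];
    java.util.ArrayDeque<Integer> queue = new java.util.ArrayDeque<>();
    for (int t = 0; t < numTypes; t++)
        if (keysInHand[t] > 0) { reachable[t] = true; queue.add(t); }
    while (!queue.isEmpty()) {
        int t = queue.poll();
        for (int i = 0; i < n; i++)
            if (!opened[i] && keyType[i] == t)
                for (int k : contents[i])
                    if (!reachable[k]) { reachable[k] = true; queue.add(k); }
    }
    for (int i = 0; i < n; i++)
        if (!opened[i] && !reachable[keyType[i]]) return false;
    return true;
}
```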

I actually wonder if some of the solutions I am reading are actually correct, or whether the fact that there were only 25 test cases meant an incorrect solution could easily slip through as if it were truly correct.