TCO10 R2

So I get a t-shirt this year 🙂  not a Google one, but a t-shirt nonetheless.

Round 2 was a very strange round: I thought I was slow on Q1, but apparently I was unusually fast on Q2 – which is very much not how I usually perform in these competitions, even less so with a 2am start time…

As far as I can tell, out of 8 Aussies only 2 of us got t-shirts and are through to round 3.  In fact, despite this not being a Google competition, and hence John Dethridge being eligible to compete, I was the highest-ranked Aussie at 133rd place.  I think that is my personal second-best placing ever in a coding competition. (Although I guess it doesn't really count until it's a placing which gets me eliminated…)  My best ever, I believe, was 115th or so in one of the first Google Code Jams – one of the ones where 100 people were going to be flown to the US to compete on-site…

Q3 had a decent number of submissions, but they did not include me.  Almost all of those submissions failed though, so I have to suspect there is something I am missing about this question (even if I did only work out how to solve it at the start of the intermission – I hadn't been trying very hard, since I felt relatively secure that I would get through on my current score, and with only 30 minutes left I felt my chances of writing up a solution were negligible… and on second thought, I am not sure I actually have solved it…).  One of the 5 people to solve Q3 didn't even get through, because their first 2 answers both failed.

I might as well write up some analysis before I get 3 hours sleep…

Q1) Given a set of places, multilane roads which connect them (each multilane road has the same number of lanes in each direction), and a single snow plow which starts at place 0 and takes 1 minute to traverse a single lane between any 2 places: what is the minimum time to clear all the snow and return to place 0?

A1) I wasted a couple of minutes convincing myself that my very easy solution was correct.

Step 1: verify that every place which has a road is connected to place 0 – I used breadth-first search from place 0 and then checked that the two ends of every road were at places reachable from place 0. If not, return -1.

Step 2: sum the number of lanes and return that value.  This step was the one which bothered me, but I eventually convinced myself that since a depth-first search visits every edge exactly once, and a multilane highway can always be processed with all but 1 lane on the way in and the final lane on the way back, it is indeed this trivial.
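The two steps above fit in a dozen lines. A quick sketch (not my contest code – the `(a, b, lanes)` road representation is an assumption, with `lanes` counting both directions):

```python
from collections import deque

def plow_time(num_places, roads):
    """roads: list of (a, b, lanes) tuples, lanes counting both directions.
    Returns the minutes to clear every lane, or -1 if some road is unreachable."""
    adj = [[] for _ in range(num_places)]
    for a, b, _ in roads:
        adj[a].append(b)
        adj[b].append(a)
    # Step 1: BFS from place 0 to find the reachable places.
    seen = {0}
    q = deque([0])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                q.append(v)
    if any(a not in seen or b not in seen for a, b, _ in roads):
        return -1
    # Step 2: every lane is traversed exactly once, at one minute per lane.
    return sum(lanes for _, _, lanes in roads)
```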

Q2) Given a number less than or equal to 100 million, find the smallest number equal to or greater than that number, which is the sum of 2 numbers made up entirely of odd digits in base 10.

A2) The key to this question is to recognise that there are only about half a million numbers below 100 million which consist solely of odd digits.  100 million itself can be expressed as 1 + 99999999, so there is no need for anything more than 8 digits.

So, generate all odd-digit-only numbers up to and including 99999999, in order.  Then, starting with i and j at the first and last positions in the list, decrement j while the sum of the two entries is still greater than or equal to the input – do not go past it – and record that sum.  Increment i and repeat the j decrement.  When you can't decrement j any further, compare the sum with the best so far and update if better.  Keep repeating this until i is greater than j.
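A sketch of the same idea, using a binary search for the partner instead of the two-pointer sweep described above (equivalent result, slightly easier to get right at 2am):

```python
from bisect import bisect_left

def smallest_odd_digit_sum(n):
    """Smallest value >= n that is the sum of two numbers whose decimal
    digits are all odd (n <= 100,000,000, so 8-digit addends suffice)."""
    nums = []
    layer = [0]
    for _ in range(8):
        layer = [v * 10 + d for v in layer for d in (1, 3, 5, 7, 9)]
        nums.extend(layer)          # ~488,280 numbers in total
    nums.sort()                     # already nearly ordered; sort defensively
    best = None
    for a in nums:
        if best is not None and a + 1 >= best:
            break                   # a plus even the smallest partner can't improve
        i = bisect_left(nums, n - a)   # smallest partner pushing the sum to >= n
        if i < len(nums):
            s = a + nums[i]
            if best is None or s < best:
                best = s
        if best == n:               # cannot do better than hitting n exactly
            break
    return best
```

Note that every odd-digit number is odd, so every achievable sum is even – an odd input always rounds up.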

There is a much more efficient method, where the solution can be found in time proportional to the number of digits of n rather than roughly n/log n, but taking the simpler-to-reason-about-and-code approach was probably the only reason I scored so well.  If they had wanted to make the question harder they could simply have increased the limit from 100 million to 1 billion.  This would have caused the memory usage of the simpler implementation to jump, I believe past the memory limit.

Q3) Given a 'large' grid of squares, and up to 40 of those squares which are 'interesting', determine the minimum number of grid splits required to get each interesting square on its own.  A grid split divides one section of the grid into 2 smaller sections, along either a horizontal or a vertical line.  And 'on its own' means no other squares, interesting or otherwise, may be in its section.

A3) I am still not sure why a greedy approach – for each current section, selecting the edge touched by the most interesting squares and dividing on it – might fail (which is where I spent my first 25 minutes of thought), but a memoization or dynamic programming approach is going to be the right one…

The problem with DP is that it does appear it will run too slowly.  The worst-case number of potential sub-problems is 80x80x80x80, and each of those cases requires processing time, with up to 160 different children for each of the highest-level problems.  2 seconds of processing time is going to be pushing it, and memory usage is going to be close to the limit.

Apparently some very good coding in C++ means the direct DP approach can run in time and in memory.  And after hearing one of the other coders quote 40^5 as their execution time, I think I may have worked out the trick.  If you sort the special squares along an axis, there are places where you get 2 potential edge splits in a row with no special squares between them.  I believe the optimisation is to realise that if you split one of those edges, you can split both and treat it as a single operation, but with a count of 2 rather than 1.  This makes it 41x41x41x41 worst case, with 82 potential children at the highest (every?) level.  I'm not 100% on this optimisation yet, but I'm pretty sure it's the one.

Actually, I have subsequently convinced myself that this optimisation is fine, and the same logic allows us to immediately trim off any outermost edge split locations before we start the DP.  This drops the DP table to 39x39x39x39 worst case with 78 potential children at every level (not just the highest).  There could be some potential to speed it up with a binary search to find a split which is in the current section and walk outwards, but other methods seem to have a higher overhead than the time they save…
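To sanity-check the recurrence itself (not the 40^5 optimisation), here is a deliberately tiny brute force over sub-rectangles: a section containing an interesting square must keep splitting until it is 1x1, a section with no interesting squares needs nothing. Exponential-ish and only for toy grids – the contest DP over compressed coordinates is a different beast:

```python
from functools import lru_cache

def min_splits(width, height, special):
    """Minimum splits so each special cell ends in a 1x1 section.
    Brute-force DP over sub-rectangles; only for tiny grids."""
    cells = set(special)

    @lru_cache(maxsize=None)
    def solve(x1, y1, x2, y2):  # half-open rectangle [x1,x2) x [y1,y2)
        inside = [p for p in cells if x1 <= p[0] < x2 and y1 <= p[1] < y2]
        if not inside:
            return 0                      # empty sections need no more splits
        if x2 - x1 == 1 and y2 - y1 == 1:
            return 0                      # the special square is alone
        best = float('inf')
        for x in range(x1 + 1, x2):       # vertical cuts
            best = min(best, 1 + solve(x1, y1, x, y2) + solve(x, y1, x2, y2))
        for y in range(y1 + 1, y2):       # horizontal cuts
            best = min(best, 1 + solve(x1, y1, x2, y) + solve(x1, y, x2, y2))
        return best

    return solve(0, 0, width, height)
```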

TCO10 R1

So TCO10 round 1 finally came along.  Of course I've been sick all week, and TCO starts their competitions at 2am, so I wasn't feeling in the best of shape for getting through in the top 850.

But I did, at about 750th, entirely due to a successful challenge during the challenge phase.  It's been a long time since I've done one of those, and it's the first time it has made a real difference.  I only got the first question out successfully.  I realised I had been an idiot for about an hour with 2 minutes left on the clock for the second question, and I made some stupid mistakes in my implementation, which I submitted without testing, knowing my score on the first question alone was likely insufficient to get through.  Interestingly, no one challenged it despite the fact it failed even the most basic tests!

It's 4am, but I am a little energised from having snuck through, so I'll write up the first 2 questions.  Question 3 will have to wait – I haven't really looked at it. (Although at first glance I think I would have made better progress on it than Q2 – even if I probably would not have solved it.)

Q1) Given 2 strings of equal length, find the smallest string which you can make them both equal in the minimum number of moves.  A move is changing one letter in either string to the letter above or below (where z can wrap to a and vice versa).  Minimum number of moves has priority over 'smallest' string, smallest being lexicographically earliest under alphabetical ordering.

A1) The examples pretty much covered how to do this, although I imagine they skipped the one tricky corner case.  In any case, the solution is to check each pair of letters in turn between the 2 strings.  If the absolute difference ignoring wrapping is less than 13, add the smaller of the letters to the result; otherwise add the letter a.  I was very concerned that at the 2am start I might have gotten the less-than-13 condition wrong with an off-by-one.  A simple mistake would have been to use less than or equal to 13.  If the non-wrap distance is exactly 13, the wrap distance is also 13, but you have to remember that wrapping always takes you through a, which results in a lexicographically smaller answer.
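Since each position is independent, the per-letter rule above is the whole solution. A sketch, which the test cross-checks against a per-letter brute force over all 26 possible meeting points:

```python
def meet_string(s, t):
    """Lexicographically smallest common target reachable in the minimum
    total number of single-letter wrap-around steps."""
    out = []
    for c1, c2 in zip(s, t):
        d = abs(ord(c1) - ord(c2))
        if d < 13:
            # meeting anywhere between the two letters costs d moves;
            # the smaller letter is the lexicographically best of those
            out.append(min(c1, c2))
        else:
            # wrapping costs 26 - d <= d moves, and the wrap path
            # always passes through 'a'
            out.append('a')
    return ''.join(out)
```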

Q2) Given a computer with 2 registers both initialised to 1, and 2 instructions – one which sets register 2 (Y) to the sum of the 2 registers, and the other which sets register 1 (X) to the sum of the 2 registers – determine the shortest program which sets register 1 to the value R, where R is between 1 and 1 million inclusive.  If there is more than one such program, return the one which is lexicographically smallest, where the instruction which sets register 1 is called 'X' and the one which sets register 2 is called 'Y'. (Just to confuse things, the instructions have the same names as the registers!)

A2) I felt pretty stupid with 2 minutes to go when I realised that the approach I had discounted immediately was the answer.  That approach is trying each value of register 2 less than or equal to R, with register 1 equal to R.

Instead I recognised the similarities with Fibonacci numbers and wasted time thinking the golden ratio might be involved in the solution.  I tried breadth-first generation under the assumption that the answer string would be short, but I soon proved that it would not always be.  Despite that, I somehow went back to golden-ratio thinking…

What I should have realised was the association with GCD: a combination of registers can only be reached if their GCD is 1.  But more importantly, I should have realised that for a given pair X, Y there is only one X', Y' pair which can produce it.  And following this back to X=1, Y=1 (if it does indeed reach that) can be done very quickly, since it is Euclid's algorithm.  So running it 1 million times is acceptable. (This is what I realised with 2 minutes to go…)

The subtlety is the requirement to return the program.  If you try to build up the program for every execution of Euclid's algorithm, you will find that constructing the generated strings consumes more CPU time than Euclid's algorithm itself…  So the trick is to only generate the strings if needed.  Solving the problem in 2 phases is one approach, executing the second phase only for the cases which phase 1 indicated were smallest.  Another is assuming that the shortest answer will have X' and Y' near each other, iterating in an order which tries those first and capping the Euclid recursion at the best depth previously seen.  This doesn't actually seem a guaranteed success, but it did seem to be good enough.
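The two-phase version can be sketched as follows: phase 1 runs the division-accelerated reverse Euclid to get lengths only, phase 2 rebuilds the actual strings just for the minimum-length candidates. (Illustrative only – the full-range loop and string building would need the run-length tricks discussed below to be competitive at R = 1 million.)

```python
from math import gcd

def shortest_program(R):
    def length(x, y):
        # reverse Euclid with division: number of ops to get back to (1, 1)
        steps = 0
        while (x, y) != (1, 1):
            if x > y:
                q = x - 1 if y == 1 else x // y
                steps += q
                x -= q * y
            else:
                q = y - 1 if x == 1 else y // x
                steps += q
                y -= q * x
        return steps

    def program(x, y):
        # same walk, recording the ops; each chunk is a homogeneous run
        parts = []
        while (x, y) != (1, 1):
            if x > y:
                q = x - 1 if y == 1 else x // y
                parts.append('X' * q)
                x -= q * y
            else:
                q = y - 1 if x == 1 else y // x
                parts.append('Y' * q)
                y -= q * x
        return ''.join(reversed(parts))  # ops were collected in reverse time order

    # phase 1: lengths only, for every valid final Y
    best_len, best_ys = None, []
    for y in range(1, R + 1):
        if gcd(R, y) != 1:
            continue  # unreachable register pair
        n = length(R, y)
        if best_len is None or n < best_len:
            best_len, best_ys = n, [y]
        elif n == best_len:
            best_ys.append(y)
    # phase 2: build strings only for the few minimum-length candidates
    return min(program(R, y) for y in best_ys)
```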

The best answer I saw calculated the programs in a single phase, but expressed them as a run-length encoding, allowing the division acceleration of GCD to be maintained.  Better yet, the run-length encoding was a list which, when compared to another list, maintained the lexicographic ordering of the non-encoded strings, so the only point at which a large string might be constructed was when creating the final answer.

Finally, if you were using C++ and 2 pre-allocated 1MB char arrays, it appears you could run in under 2 seconds of CPU time even without the division acceleration.  Sneaky…

Q3) Given a fully connected directed cost graph, determine the maximum profit from tours of the graph which all have the same income (regardless of where each tour goes), all start and end at vertex 0, but cannot visit the same non-zero vertex as each other.

A3) I initially suspected something like enumerating subsets of the destinations, determining the cheapest tour for each, and sticking them in a max-flow graph in a way which ensures no 2 which overlap destination-wise can be selected, then running a max-flow algorithm.  I suspect creating the graph is the bit I would have gotten stuck on – not enough practice.

… Apparently I couldn’t sleep until I solved this one, at least I didn’t feel the need to come and write up the solution straight away.

Create a max-flow/min-cost graph as follows.  For every vertex in the original graph, create 2 vertices: destination-enter and destination-exit.  Between each such pair (except for vertex 0) create an edge with no cost and capacity 1.  For each edge in the original graph, add an edge from the destination-exit of the original edge's start to the destination-enter of the original edge's end, with cost equal to the original cost and a positive integer capacity (might as well make it 1).  Connect vertex 0's enter to the sink with infinite capacity and no cost.  Connect vertex 0's exit to the source with no cost and capacity C.

Run max-flow/min-cost for each C from 0 to the number of vertices minus 1.  Return the best of flow times income minus cost, over all the runs.
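Since it turns out I don't have a min-cost max-flow in TMD.Algo yet, here is a generic successive-shortest-paths skeleton (SPFA variant, nothing clever) that the construction above could be plugged into – a sketch, not tuned for contest limits:

```python
from collections import deque

class MinCostMaxFlow:
    """Successive shortest paths; edges stored as [to, cap, cost, rev_index]."""
    def __init__(self, n):
        self.g = [[] for _ in range(n)]

    def add_edge(self, u, v, cap, cost):
        self.g[u].append([v, cap, cost, len(self.g[v])])
        self.g[v].append([u, 0, -cost, len(self.g[u]) - 1])

    def run(self, s, t):
        flow = cost = 0
        n = len(self.g)
        while True:
            # SPFA (Bellman-Ford with a queue) for the cheapest augmenting path
            dist = [float('inf')] * n
            prev = [None] * n
            inq = [False] * n
            dist[s] = 0
            q = deque([s])
            while q:
                u = q.popleft()
                inq[u] = False
                for i, (v, cap, w, _) in enumerate(self.g[u]):
                    if cap > 0 and dist[u] + w < dist[v]:
                        dist[v] = dist[u] + w
                        prev[v] = (u, i)
                        if not inq[v]:
                            inq[v] = True
                            q.append(v)
            if dist[t] == float('inf'):
                break
            # bottleneck capacity along the found path
            f = float('inf')
            v = t
            while v != s:
                u, i = prev[v]
                f = min(f, self.g[u][i][1])
                v = u
            # apply the augmentation
            v = t
            while v != s:
                u, i = prev[v]
                e = self.g[u][i]
                e[1] -= f
                self.g[e[0]][e[3]][1] += f
                v = u
            flow += f
            cost += f * dist[t]
        return flow, cost
```

Since successive augmentations have non-decreasing cost, an alternative to rerunning for every C is to augment until the marginal path cost exceeds the per-tour income.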

I was happy I managed to solve this without looking at any solutions (at 5am no less); I think that is the first time I've reasoned through a max-flow/min-cost problem from scratch.  Interestingly, I thought I had implemented max-flow/min-cost in TMD.Algo, but it appears not.  I really should get to that!

WordPress 3

So I've updated to WordPress 3 and switched over to their new default style.  I have to say I think it is a huge improvement; the new column width and font size are much more readable.

I do need to work out what I want to use as my header image though…

GCJ10 Round 3

So round 3 was last weekend, and no exceptional circumstances were declared, so just the top 500 went through (which didn't include me).

With only the top 25 going onwards, the questions were always going to be tough; across the 4 questions, getting all the small inputs and 2 of the large inputs in a decent time was sufficient to get through.

Only 370 people of the top 500 got a positive score, and having an initial look through the questions, I doubt I would have done very well – maybe a couple of small inputs out.  One of the 2 remaining Australians got 66th place, quite impressive.

On to the questions I guess, although I'm going to have to refer to the post-contest analysis this time for sure!

Q1) Given a sequence of numbers which are at most D digits in size, determine whether the sequence is predictable if it is governed by S(n) = A*S(n-1)+B mod P, and if so return the next number.  P is prime and less than 10^D; A and B are non-negative.

A1) For the small input D was at most 4, meaning that a brute force over P and A, solving for B, verifying the rest of the numbers fit, and determining whether the set of next numbers was unique, is easily done.  That would have been easy.  For the large input, however, D is at most 6, making a brute force over P and A impossible.  What we need to do is consider each P and use the set of numbers to determine both A and B simultaneously.  If we have 3 numbers, we have 2 simultaneous linear equations: Y=A*X+B mod P and Z=A*Y+B mod P (if we don't have 3 numbers, the sequence is not predictable – for every A there exists a B which gives the result).  If we subtract these equations we get Y-Z = A*(X-Y) mod P, and so long as X-Y is non-zero, since P is prime this equation has a unique solution, which we can solve using extended GCD (which is in TMD.Algo).  This still leaves a few special cases, specifically the X=Y case.  If X=Y then Y will also equal Z, due to symmetry, regardless of A, B and P.  So any case of repeated numbers is predictable.  This includes the 2-number case, despite what I said just before…
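The A and B recovery for a single known P looks like this (the real solution also loops over candidate primes P and checks the predicted next value is unique across them). I've used Fermat's little theorem for the modular inverse, since P is prime, in place of the extended-GCD routine mentioned above:

```python
def predict_next(seq, P):
    """Recover A, B for S(n) = (A*S(n-1) + B) mod P from three terms and
    return the next term.  If X == Y the sequence is constant from here on."""
    X, Y, Z = seq[0], seq[1], seq[2]
    if X == Y:
        return Y                      # repeated value: constant forever
    # Y - Z = A*(X - Y) mod P  =>  A = (Y - Z) * (X - Y)^(-1) mod P
    A = (Y - Z) * pow((X - Y) % P, P - 2, P) % P
    B = (Y - A * X) % P
    # sanity check: the recurrence must explain every given term
    assert all(seq[i + 1] == (A * seq[i] + B) % P for i in range(len(seq) - 1))
    return (A * seq[-1] + B) % P
```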

Looking at the contest analysis, it seems I was on the money here.  So, worst case I would have come 250th… let's see if I can do better…

Q2) Given up to 100 unique integer lengths of fencing, determine the minimum number of fencing pieces required to create a fence with a length of exactly L.  L is between 10^10 and 10^18.

A2) For the small input the maximum length of each piece of fencing was 100.  For the large, that jumped to 100,000.  I think a key point to this question is that L has a very large minimum value.  For the small input I would posit that it can be solved using dynamic programming over the fence lengths to build all totals less than 10000; then for each position within 10k below the target L reachable using only the largest piece, check whether the difference can be covered using the DP table, and take the minimum amongst those options.  When the fence size jumps to 100k, my approach becomes infeasible, because the DP needs to go to 100k squared.  Going to have to check the solutions for this one.

Okay, so I would probably not have gotten the large out, although the intuitive leap does appear within my level.  Take the largest board and use it until we get just short of L (or exactly L).  Then we need to determine how to make the remainder using smaller fences, or the remainder plus 1 largest fence, or plus 2 largest fences…  We can do this by creating a graph with nodes 0 to largest-fence-minus-1, and edges as follows: for each fence length less than the max, add an edge from each node to node+length of weight 1 if node+length < max, or to node+length-max of weight 0 otherwise.  Then, given this potentially 100k-node graph with maybe 10 million edges, find the shortest path from 0 to the remainder (this can be done in O(E) time since the weights are 0 or 1).  The length of that shortest path plus the number of max-length boards is the answer.  This only works because the shortest path will never use more than 100k weight-0 edges, and L is greater than 100k squared.
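A sketch of that residue graph, using plain Dijkstra rather than the O(E) 0-1 BFS (fine at illustration scale). Each weight-1 edge spends an extra small piece; each weight-0 wrap trades a small piece for one fewer max-length board. Note the correctness argument above relies on L being much larger than max(lengths) squared, as guaranteed by the problem:

```python
import heapq

def min_pieces(lengths, L):
    """Fewest pieces summing exactly to L (unlimited copies of each length).
    Assumes L is large relative to max(lengths)**2, as in the problem."""
    m = max(lengths)
    INF = float('inf')
    dist = [INF] * m     # dist[r]: extra pieces needed to shift the residue to r
    dist[0] = 0
    pq = [(0, 0)]
    while pq:
        d, v = heapq.heappop(pq)
        if d > dist[v]:
            continue
        for l in lengths:
            if l == m:
                continue
            # weight 1 normally; weight 0 when wrapping past m (one max board saved)
            u, w = (v + l, 1) if v + l < m else (v + l - m, 0)
            if d + w < dist[u]:
                dist[u] = d + w
                heapq.heappush(pq, (dist[u], u))
    r = L % m
    if dist[r] == INF:
        return None      # remainder unreachable with these lengths
    return dist[r] + L // m
```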

Based on my reading of the answer, I believe my small-input answer is correct, so that would bring me to 21 points, or 187th place.

Q3) Given up to 200 locations at integer coordinates on a (very long) line, each containing 1 or more items, you want to move the items so that there is at most 1 at any spot.  However, your only option for moving them is that if there are 2 or more on a spot, you can move 1 to the left and 1 to the right.

A3) The small input is up to 200 items, the large input up to 100k items.  Even the small input seems tricky here… it would seem to need some kind of strategy.  Thinking about each move, it doesn't change the centre of mass, so if we can find the centre of mass, the number of moves is bounded below by those needed to spread the items out to one per spot about that centre of mass.  That, however, may not be achievable.  Movements at the centre of mass are more useful than other movements, since they are the only ones which actually separate things.  The problem may also be separable into sub-problems which don't interact; in that case the centre of mass of each sub-problem seems important.  The small input I guess can be done via simulation of a strategy of centre-of-mass first, or maybe largest first.  I probably would have just tried one to see if it solved it, since you get answers for the small input immediately.  The large input, however, requires better than straight simulation…  Time to check the answers again…

Well, it turns out strategy isn't needed for the small input, so I would have solved it 'by accident' – the actual series of moves is irrelevant; the final configuration is always the same and takes the same number of moves to reach regardless of strategy.  Apparently this is a 'famous theorem', which probably means there are a few thousand people in the world who know it 😛  With this observation (and one other leap I would have been unlikely to make) the problem becomes quite simple: construct the final answer, and from there determine how many moves were required to get to it.  Constructing the final answer can be done by simply building up the input.  Adding each stack one item at a time, it either goes into an empty spot, or it causes the current end stack to split in 2.  If you keep a stack of segments, the possible scenarios are: the top 2 segments join, the top segment splits, or the top segment splits and the bottom part joins with the one below it.  Each operation is therefore O(1), giving an O(n) simulation to get the final position.  Now the trick to get the number of moves is to take the sum of squares of the original positions and compare it to the sum of squares of the final positions.  The difference, divided by 2, is the number of moves; this can be proven by considering how the sum of squares changes with each move.  I suspect that if I had known the theorem, I would have tried to count the number of moves while simulating building up the stack – each addition adds a number of moves related to the lengths of the splits it causes in a segment.  I doubt I would have solved that in a competitive scenario anyway…
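The sum-of-squares argument is easy to check directly: a move at position p replaces two items at p with one at p-1 and one at p+1, changing the sum of squared positions by (p-1)² + (p+1)² - 2p² = 2 exactly. A naive simulation (small inputs only) alongside the formula:

```python
from collections import Counter

def spread_moves(positions):
    """Moves to reach the stable configuration, by naive simulation: each
    move takes two items off one spot, sending one left and one right.
    Returns (move_count, final_counter)."""
    cnt = Counter(positions)
    moves = 0
    while True:
        spot = next((p for p, c in cnt.items() if c >= 2), None)
        if spot is None:
            return moves, cnt
        cnt[spot] -= 2
        if cnt[spot] == 0:
            del cnt[spot]
        cnt[spot - 1] += 1
        cnt[spot + 1] += 1
        moves += 1

def moves_by_sum_of_squares(initial, final_counter):
    """Each move raises the sum of squared positions by exactly 2,
    so the move count is half the difference."""
    ssq = lambda c: sum(p * p * n for p, n in c.items())
    return (ssq(final_counter) - ssq(Counter(initial))) // 2
```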

I’m willing to give myself the small input though, bringing me to 27 points and 158th in my imaginary competition where I hadn’t flunked in the previous round.

Q4) How many ways can you add 1 or more distinct numbers to give k, subject to a rather complex restriction…  The restriction is: if the numbers are written in base B, the dth digit (counting from the right) of each number must be unique across all the numbers.  Different orderings of the same numbers all count as the same way.

A4) This question strikes me as pure evil, but interestingly enough, slightly more people got the large input out for this than for the previous question…  The small input limits B to between binary and decimal, and k to no more than 100.  The large input raises B to anything up to base 70 and k to up to 10^18…  I think the key to even just the small input is to realise that we can reorganise the digits freely between numbers without changing the sum.  But even with that it seems a bit tricky: DP over sums s less than k, with the number for the largest digit rearrangement being n, also less than k.  Then we need to work out how to multiply the total to get the actual answer.  I think we may need to break the DP down into the number of sums of n numbers which reach s, since I think the reordering count depends on that.  However, the fact that leading 0's don't need to be distinct is a problem.  At this point I give up and look at the solutions.  It's not like I would have had any time left to solve this question in the competition anyway…

Apparently I was overcomplicating things – the small input can be done by simple brute force; the number of partitions of 100 into distinct integers is only a few hundred thousand, so you can just generate them all.

However, I was starting to get on the right track for the large input.  It is a dynamic programming problem, and the order of digits is not relevant within each column, but the needed approach is to build up one digit at a time, using a nested DP to count the number of reorderings for each column.  So you need to keep track of the carry, how many of the numbers are finished, and whether there is a 0.  The zero allows for the termination of a number, but I am not sure why only the fact of a 0, rather than the number of 0's, needs to be known.  It almost makes sense, but I would need to implement it to be sure.

Anyway 0 points for me here so I’m stuck with my theoretical 158th place.

158th would have been close to my best ever placing, which was 115th in the first GCJ when it was still run by TopCoder… (and 100 people were going to be flown to the US), and the depth of field has definitely increased since back then… (In a super-imaginary world where I had made the intuitive leap for the Q2 large input, I would have gotten 51st… ahh, I can dream :P)

Twist’N’Turn Silverlight

On the Kwontom Loop forum, someone asked for Masyu puzzles, so I copied my LoopDeLoop Silverlight game engine from last weekend and spent a day hacking it to behave like Masyu instead.

The result is http://www.themissingdocs.net/Twist.html.

Still a work in progress: Masyu puzzle generation is much more annoying than LoopDeLoop, because a large proportion of possible loops do not have a valid Masyu puzzle associated with them.  So where LoopDeLoop can generate puzzles in sizes like 20×14, Twist struggles to get much further than 8×8.  I will at some point be looking at an alternative puzzle generation strategy, but not tonight.

Update: I’ve fixed the slow puzzle generation using a cool little trick – it now can do 15×15 in a decent time.

GCJ10 R2 Analysis

So, 500th place was the same score as me, just 30 minutes faster.  If I had started with question 3 rather than wasting almost an hour on question 1, I would have gotten through.  Had I submitted anything else successfully, I would have gotten through.

Exactly 2 Australians got through, in 498th and 500th respectively.  I was the third-highest Australian, and given how I was doing in the first round, 615th was a fairly successful result – just disappointed I don't get a t-shirt.

On to the questions…

Q1) Given a set of numbers laid out in a diamond shape (for instance 1 number on the first row, 2 on the second, 1 on the third), determine the smallest diamond shape of numbers which contains the given diamond and is horizontally and vertically symmetric.

A1) This question had a bit of controversy: up until about the 50-minute mark (which was when I switched to Q3) the question text was actually incorrect.  I would claim this as the reason I failed to solve the problem, but almost certainly the real reason was the midnight start, my approaching head cold and general momentary stupidity :P

Possibly the biggest pain with this question is the diamond shape.  It is not a nice layout of numbers to work with.  I spent a few minutes trying to get my brain cells to write code for detecting mirror symmetry in a subset of the diamond before I gave up and rotated the diamond 45 degrees, making it a square.  Even writing the code to do that wasted about 15 minutes, which indicates how well my brain was working.  Working with a square and checking diagonal symmetry was much easier, although I still had a bunch of off-by-one errors to fix.  In the end I did get that working, but while my answer solved the sample, it did not solve the small…

As I realised after the competition, it was because I was completely screwing up my solution.  I was determining 2 mirror symmetries in a subset and mysteriously assuming that could be used as the seed of the smallest doubly symmetric diamond containing the original.  I at least had the requirement that the subset included one of the corners, but this approach is just wrong.  In the end I had all the code I needed to solve the problem, just the wrong algorithm!

I believe the correct solution is to find, for each corner, the largest single mirror-symmetric subset (along the axis which doesn't pass through that corner).  Then find the pair of non-opposite corners with the largest sum; these 2 will be the seeds of the large diamond.  The final answer is 3 times the original diamond size minus the sum of the sizes of that pair's half-symmetric subsets. (Give or take some off-by-ones which I may have missed…)  All in all, given how few points this question was worth, I really should have listened to my first instinct and just ignored it completely.

Final results for the competition have not been announced yet, leaving a tiny hope that I might get through due to the controversy with this problem, but it would be a cheap win which I don't deserve.

Q2) Given 2^p teams competing in a knock-out competition, with each match in the knock-out tree having a specific cost, and a list of the maximum number of games each team is willing to have you miss, determine the minimum cost to attend the required number of games in the worst case. (p is up to 10)

A2) Having up to 1024 teams scared me away from this question quick smart, but it was the one most people actually solved.  In the small data set every cost was 1 dollar; in the large, costs varied between 0 and 100,000.  Even so, overflow was not going to be an issue.  Even now I couldn't immediately see how to solve this, but I began to think it involved a divide and conquer approach… and pretty much as soon as I thought about the question properly for a minute, I had solved it.  Define the recursion to be over the minimum cost to satisfy all leaf restrictions of a given branch, having already bought n matches at some ancestors.  Then for each node in the tree, the result is the minimum of: the cost of this node plus the sum over each child of having bought n+1 matches, and the sum over each child of having bought n matches.  The terminating condition is infinite cost if we fail to satisfy the maximum games missed for a given team.  Finally, the result is the grand final match node with 0 bought already.  This can be done using dynamic programming with an array p by (2^p)-1, using /2 to find the parent and *2, *2+1 to find the children in a packed array tree representation.  Or you could run the DP over an actual binary tree, with each node containing an array, performing a depth-first walk.  Or you could do memoized recursion.  Whatever, it's all easy – and I really should have taken a closer look at this one!
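A memoized-recursion sketch of that DP. My reading of "worst case" is that any team can be driven all the way to the final, so leaf i simply needs at least p - miss[i] bought matches on its root path – that reduction is my assumption, not checked against the official analysis:

```python
from functools import lru_cache

def min_ticket_cost(p, cost, miss):
    """cost: heap-ordered match costs (cost[1] = final, children of j at 2j, 2j+1).
    miss[i]: max games of team i (leaves left to right) you may miss.
    Worst case, team i plays every match on its root path, so that path
    needs at least p - miss[i] bought tickets."""
    leaves = 1 << p
    INF = float('inf')

    @lru_cache(maxsize=None)
    def go(node, bought):
        if node >= leaves:  # leaf: heap indices leaves .. 2*leaves - 1
            return 0 if bought >= p - miss[node - leaves] else INF
        buy = cost[node] + go(2 * node, bought + 1) + go(2 * node + 1, bought + 1)
        skip = go(2 * node, bought) + go(2 * node + 1, bought)
        return min(buy, skip)

    return go(1, 0)  # grand final, nothing bought above it
```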

Q3) Given an 'infinite' grid with cells which are filled in or not – where each filled cell which has a north or west neighbour stays alive, and each empty cell which has both a north and a west neighbour comes alive (simultaneously, so a north-east-pointing diagonal line of cells slowly moves and shrinks to the south-east) – and up to 1000 large, possibly overlapping rectangles which are initially filled in, determine how long before no cells are filled in.

A3) I managed to solve this in almost O(N^2) in the number of rectangles.  I suspect that with the appropriate data structure, checking every pair for 'overlap' could be reduced to O(N log N), but O(N^2) is plenty for this question's large constraints. (The 'almost' in the first sentence is the O(alpha(N)) amortised cost of each disjoint set tracker union operation.)

First we divide the problem into sub-problems.  If two sets of rectangles are well separated, they do not influence each other in any way.  So first of all, what counts as separated?  If 2 rectangles overlap, then presto, that is a guaranteed interaction.  If they don't come close to touching at all (separated by at least one cell on some axis), they are separate.  If they are horizontally or vertically immediately adjacent, they act together.  This leaves diagonally adjacent: corners touching in the north-east/south-west directions count, but the north-west/south-east directions do not.  I used a disjoint set tracker (I love this thing!) to track which act together and which do not.  This part of the problem actually turns out to be the most expensive, since the rest runs in O(N).

Once they are separated, we can treat each section individually and take the maximum over the sub-problems.  So given n rectangles which interact, how long do they last?  It is fairly apparent that no matter how the north-west corners get trimmed at each time step, the south-east growth areas grow first.  So the last cell to go will be the south-east-most part that either grows or is already there.  Growth areas are limited by the south-most edge and east-most edge of all constituent rectangles.  The tricky bit is determining which of the first cells to go will be the start of the cascade which reaches that final location.  Although actually, when it is phrased like that, it becomes somewhat obvious: since the area which dies moves south-east, the most north-west point is going to be the start of the final cascade; all other locations will get stuck against one of the walls belonging to that location.  That may sound convincing, but proving it is probably more complicated.  I managed to convince myself well and truly before I wrote my solution, but my reasoning is not a proper proof, although I am sure one exists.  So, comparing the north-west corners to select the one which lies on the most north-west diagonal, and taking the maximum of the south-east corners, we have the 'effective rectangle' for the group of rectangles which interact.  A single rectangle takes height + width - 1 turns to disappear, and this also works for the effective rectangle.  So we finally return the maximum over each disjoint set, and we're done.  Only 60 people got this out, so I was quite happy.
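Putting the grouping rules and the effective-rectangle argument together looks roughly like this – a sketch of my approach as described above, hand-checked on a few toy cases only, not verified against the official data:

```python
class DSU:
    """Disjoint set tracker with path halving."""
    def __init__(self, n):
        self.p = list(range(n))
    def find(self, x):
        while self.p[x] != x:
            self.p[x] = self.p[self.p[x]]
            x = self.p[x]
        return x
    def union(self, a, b):
        self.p[self.find(a)] = self.find(b)

def last_cell_time(rects):
    """rects: (r1, c1, r2, c2) inclusive.  Groups rectangles that interact
    (overlap, edge-adjacent, or NE/SW corner touch), then applies the
    effective-rectangle rule: lifetime = height + width - 1."""
    n = len(rects)
    dsu = DSU(n)
    for i in range(n):
        ar1, ac1, ar2, ac2 = rects[i]
        for j in range(i + 1, n):
            br1, bc1, br2, bc2 = rects[j]
            rows = ar2 + 1 >= br1 and br2 + 1 >= ar1   # row ranges touch/overlap
            cols = ac2 + 1 >= bc1 and bc2 + 1 >= ac1   # column ranges touch/overlap
            # a pure NW/SE corner touch never interacts
            nwse = ((ar2 + 1 == br1 and ac2 + 1 == bc1) or
                    (br2 + 1 == ar1 and bc2 + 1 == ac1))
            if rows and cols and not nwse:
                dsu.union(i, j)
    groups = {}
    for i in range(n):
        groups.setdefault(dsu.find(i), []).append(rects[i])
    best = 0
    for group in groups.values():
        # NW corner on the most north-west diagonal; SE corner is the max extent
        r1, c1 = min(((r, c) for r, c, _, _ in group), key=lambda p: p[0] + p[1])
        R2 = max(r for _, _, r, _ in group)
        C2 = max(c for _, _, _, c in group)
        best = max(best, (R2 - r1 + 1) + (C2 - c1 + 1) - 1)
    return best
```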

Q4) Given N points which each have a goat tied to them, and M possible locations for a bucket which each goat must be able to reach, determine for each bucket location the minimum area which all goats can reach, by adjusting the goat ropes to whatever lengths you like.

A4) I spent my last 25 minutes doing the small problem set for this.  If I had successfully submitted it, I was through, and I realised that at the time, so the pressure was on.  The small problem set limited N to 2, which is the easiest case of the overlapping circles problem.  When I saw this I thought, cool, that will be simple: I'll just jump on MathWorld, copy the formula into my code and submit it… Which I did.  And it failed.  Oh, I have a bug.  Fix that, still fails.  Oh, what about this special case; fix that (later realising that the problem explicitly excluded it from being possible), still fails.  Hmm, maybe numeric accuracy in the acos function parameters; avoid unneeded square roots… still fails.

What can I say, it was 2am; I couldn't quite activate enough brain cells to realise that the formula from MathWorld was wrong…  Well, not exactly wrong, more insufficiently general.  The MathWorld solution only handled the case where the line through the circle intersection points crosses the segment joining the circle centres.  If the resultant area is made up of 2 circle segments where one of them is more than half of its circle, it produces the wrong result: in that case the area is the sector plus the triangle, rather than the sector minus the triangle.  If I had actually sat down and derived the formula I might have gotten this correct, but in 25 minutes at 2am it wasn't going to happen.
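For the record, the standard sector-sum form handles both cases in one expression, because the acos argument goes negative (and the half-angle obtuse) exactly when the chord lies beyond one of the centres. A sketch, with the obtuse case cross-checked numerically in the test:

```python
from math import acos, sqrt

def lens_area(r1, r2, d):
    """Area of intersection of two circles with radii r1, r2, centres d apart.
    Assumes they properly intersect: |r1 - r2| < d < r1 + r2."""
    a1 = acos((d*d + r1*r1 - r2*r2) / (2*d*r1))  # half-angle at centre 1
    a2 = acos((d*d + r2*r2 - r1*r1) / (2*d*r2))  # half-angle at centre 2
    # Heron-style term for the triangle formed by the centres and a crossing point
    tri = 0.5 * sqrt((-d + r1 + r2) * (d + r1 - r2) * (d - r1 + r2) * (d + r1 + r2))
    return r1*r1*a1 + r2*r2*a2 - tri
```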

For the large problem set, N is up to 5000 and M is up to 1000.  Even if I already had the formula for the area of intersection of a circle and an arbitrary convex polygon written (which is something I should really get around to doing for TMD.Algo), I suspect I would take a long time to write the solution, and even then I think it would be O(N^2 * M) at best, which is very risky time-wise.  I think something like a convex hull algorithm must come into play.  Only 2 people got this problem out, and they came 7th and 27th.  The top 6 got everything out except for this.

I actually think I have seen this question asked on TopCoder before, so it's probably worth adding to TMD.Algo in its entirety.

So that is most likely it for GCJ10 for me this year.  I'll write up the analysis for R3, which will likely contain some interesting questions – they save the best for last…  Next stop is round 1 of TCO10.

GCJ10 R2

Didn't quite make it through – I was happy with my one solution, unhappy that I didn't get any of the smalls out for the other 3, despite trying 2 of them for about an hour and a half.  And of course, the one I didn't try looks like it was the easiest.

Final placement was 615th.  Will write up a full analysis after sleep…