Tilps – Page 9 – The Missing Docs .Net

TCO12 R2A Update

They made this round official with no changes. So my incorrect code has successfully improved my topcoder rating by 90 points. Yay for me…

TCO12 R2A

Only 1362 out of 2001 qualifiers turned up for this round. (Well above the 1% existence failure that Otinn uses for his odds calculations. So maybe I have >1% chance of getting a t-shirt after all…)

374th before system tests, I was just hoping to get a non-zero score this round. The first question had potential for problematic test cases and lots of successful challenges (the number of red ranked coders on 0 points or lower was pretty scary…), and the rest were too difficult for me to even consider at this late hour. (Although I swear the second question was familiar…)

Waiting on systests … and the announcement comes through that because the challenge phase was broken for the last 3 minutes, the round *might* be cancelled. Which would mean another 2am night back-to-back.

Systests finished, so if this round actually counts I’ll be in 285th. Which in any other year would be a t-shirt placement, but given the fact that round 2 has 3 parts and splitting of t-shirts over the best 350 ranks of all it won’t be good enough. (Unless subsequent rounds are even harder and hence not enough people get past the minimum 0 points requirement…). Only 461 people got a positive score this round (2/5 Aussies, including myself got a positive score).

Q1) Given a list of sets of states for light switches and corresponding states for lamps, but with no knowledge of which light switch actually corresponds to which lamp – determine the minimum number of extra state sets required to correctly identify each lamp. Each switch and its light are in sync, it is a simple wiring. Each individual state on its own will be valid, but there may be no solution which satisfies all states, in which case the answer is -1.

This problem looks like it needs brute force, but with bounds of up to 50 lights and up to 50 state sets, that certainly seems implausible. So I created a 2 dimensional array of whether switch i could be lamp j. If a single state set says a switch i is on and a lamp j is off, obviously switch i cannot connect to lamp j and vice-versa. If two state sets have the same switch state then those switches cannot be in the set of lamps with different lamp states and similarly for switches in different states. The question then becomes, is this enough? And since I passed system tests, the answer might be yes… (I fully expected it to be no though…)

Finally count up how many switches each light could be. Then take the ceiling of the base 2 log of the highest of those counts, since each count corresponds to a set of switches which we know nothing about, and that set of switches can be solved in base 2 log tests, and each of the groups of switches can independently be solved at the same time.

Post Sleep Update 2: A better approach is to consider the pattern of each switch and each lamp. Presuming that there is a solution any given pattern must appear the same number of times amongst switches and lamps. The pattern with the highest count is the same as the switch with the largest count in my version, and from there the code is identical. This approach is quite obviously correct, so is it possible to define a test case which breaks my approach?

The answer is yes. Using an a quick written app to compare the output of the 2 approaches for random inputs I found the following test case (and many others…)

switches = YNYNNNYNNYYNY, YNNNYYNYNYNNY, NYYNYYYYNNYNN
lamps = NYYYNNYYNNYNN, NNYYYNNYYYNNN, YNNNYNYYYNYNY

My code incorrectly believes such test cases to be possible, but the pattern NYN only appears in the lamps not the switches.

This is twice in a row that I have managed to get past the system tests with incorrect code!

Q2) Given a list of items to visit in order on a 1-dimensional line, and the ability to add up to K instant teleports, what is the minimum travel distance to visit the items in the order specified?

A2) No real clue yet… but it isn’t the semi greedy approach of choosing the best location for the first 2 teleport locations, then the best location of each subsequent one… (I wrote that during the comp and it failed one of the samples.)

Looking at one solution the approach appears to be to calculate the minimum distance to travel in the n left most items with up to K teleports, but I’m going to need some sleep before I make sense of it.

Post Sleep Update 1: So the approach is indeed the minimum distance to travel for the n left most items with up to k teleports. More specifically it is the minimum distance travelled in the area to the left of the n left most items where the kth teleport is at the same location as the nth item. In order to solve the question the component required is to be able to answer the question ‘what is the distance spent travelling in the segment a,b if a and b are both teleports. Then you can treat the initial state as having teleports at -infinity and +infinity and add teleports left to right, adding the cost of the segment previous teleport location to new teleport location.

The cost of a segment bounded by teleports only needs to consider the items which are inside that segment. Obviously the path for every item outside that teleport either does not attempt to enter the segment, or skips the segment to get to another item outside the segment, or jumps to whichever end is closest to its target inside the segment. So for each item inside the segment, we need to consider the item before and after it in the sequence. For each of these if the other element is outside the sequence we simply add the distance to closest end of the segment. If the other element is inside the sequence, we ignore the before and only consider the after to avoid double counting. The cost here is either the sum of the costs to the closest endpoints or the direct distance in between the 2 elements, whichever is smaller.

During the competition I had thought that something along these lines would be needed, but only at about 7 minutes left on the clock and I thought the cost for a segment would be much more difficult to calculate.

Q3) Given n rooms connected by a set of 1 direction tunnels which do not form any loops, and the knowledge that some rooms are clear and some rooms might be obstructed and if so could not be entered, determine how many scenarios there are where the number of paths from room 0 to room 1 is even. Up to 50 rooms, and at least all but 32 of those will be listed as clear. (Also: at most 500 tunnels)

A3) Only 2 correct solutions for this during the competition, obviously brute force is too slow, but at first glance the question seems plausible… I’ll take a closer look after some sleep.

GCJ 12 Qualifying Round

So, the 25 hours are up, and at final count about 230 people submitted all the problems (including myself). After the elimination of incorrect ‘large input’ solutions I managed to remain part of the 163 people who got a perfect score in this round.

Over 18000 people got a positive score, so I imagine that the number of competitors was pretty high. 15692 people qualified, so lots of competitors for R1A.

Q1) Given an input string, perform letter substitution to generate another string. (Letter substitution is a bijection and some examples provided.)

A1) So, wow… simplest question description ever? No large input, since well, what could they do? The small difficulty of this question was that the provided samples were missing 2 letters. The first of these letters was provided in the question text, but it was provided as a forward mapping and the actual question asked for a reverse mapping. The final letter can be deduced by the fact that the substitution is a bijection, and hence if you know all but one path, the final path is between the only unused input and output. So, you could have copied the samples into your code and written some moderately smart code to work out the substitution for you … or you could just do what I did and work it out by hand and hard code the mapping, using indexes into a handcrafted string.

Best part of this problem? The answers. I have to wonder how many people submit their outputs without actually checking them. Because if you did, then you missed out! After decoding many of the answers were amusing or interestingly absurd.

Q2) Given a set of numbers which are known to each be the sum of 3 non-negative integers, determine how many of these triplets can have a largest number of at least k. The triplets are further limited that difference between the lowest and highest values of the triplet is at most 2 for ‘n’ of them, and at most 1 for the rest.

A2) The small input limit was that there was at most 3 numbers. This meant that you could brute force every allocation of the difference of 2 and just return whatever was the best answer. The large input was not feasible using this approach, but I doubt that there were many people who would have used the brute force approach anyway…

So, if we look at each triplet, if the difference limit is 1, then the triplet consists only of Floor(sum/3) or Floor(sum/3) + 1. If the sum is 0 modulo 3, they are all the first, otherwise there is one of one kind and two of the other. So we can go through the list determining the triplet for difference limit of 1. If the resultant triplet already has a highest value equal to k or higher, then add it to the result. Otherwise, if the highest is equal to k-1, then maybe we can use a difference of 2 to reach the limit.

Specifically the sum must be at least 2 (as each integer is non-negative) and the modulo must not be 1. If the modulo is 1 then changing to a difference of 2, doesn’t actually increase the highest value, it just increase the number of numbers at the highest value. Accumulate all the cases where switching to 2 would make a difference, then limit it by n, since only at most n of these can be used. Add the result to the total and you are done.

Q3) Determine the number of pairs of numbers n and m where n is less than m, n is greater than or equal to A and m is less than or equal to B, such that n and m have the same number of digits and are rotations of each other. A and B are also guaranteed to have the same number of digits and A is less than or equal to B.

A3) Small input limits B to be at most 1000, which in practice means 3 digits since if B has 4 digits, so does A and hence A equals B and there are no pairs. Large input limits B to be at most 2million, which gives us some cases with 6 digits and a maximum range of 1million numbers to check.

It might seem that the large input will need some fancy tricks to deal with how large it is, but, that is not really the case, brute force approach will run in time unless the implementation is particularly slow. For each number in the range A to B, generate all rotations, if the rotation is greater than the starting number and less than or equal to B, add one to your total.

Generating rotations is not hard, first count the number of digits in A, by dividing by 10 until the answer is 0. (A is guaranteed not to be 0, so you don’t have to worry about that corner case.) To generate a rotation simply store the mod 10, then divide by 10 and add the modulo times 10^length. Repeatedly execute until you are back at the same number you started with. The fact that many rotations may end up starting with 0 is not relevant, since you are going to compare the value against the original number which does not start with 0, and hence all of these will be skipped.

Q4) Given a rectangular grid of square cells where cells are either part of a wall or not, and all edges of a wall are mirrored and you are a 0 size object at the centre of a specific cell which is not a wall, determine how many reflections of yourself can be seen where the light travels at most a distance D and can’t pass through yourself. The outer edge of the grid is guaranteed to all be set. Light bounces mostly as you would expect, but there are two specific cases of interest. Firstly, if light exactly hits an inner corner, where 3 wall cells meet, the light completely reverses direction. Finally if the light hits an outer corner with the light heading towards the 1 filled cell adjacent to that intersection, the light is destroyed. Light can glance a corner just fine, even two diagonally touching corners, it just passes straight on. Grid size is at most 30×30 and D is at most 50.

A4) So, here it is. The question to see if you have some spare time (or are really really good…). The small input had a strange limitation which was that the number of walls would be at most double the width plus double height, minus 4. It took me a while to realise that this meant that only the edges would be filled in, since half the samples violated this condition. (Looking back, at least 1 sample from each question with a large input violated the small input conditions. I’m not sure if this is new this year, but it is actually kind of good…) This restriction is actually quite powerful and would allow for a much simpler solution as I will explain in a minute.

So, how do we determine the solution? Even a simple case like the 3rd sample which is a 2×1 open space with a maximum distance of 8 has 68 reflections. One thing to think about is where the light starts and ends up. It is in the middle of a cell on both occasions. If you think about what happens each time the light reflects off an edge, its like the light keeps travelling but into a mirrored version of the puzzle about that edge. A mirrored puzzle with a mirrored starting position. This mirroring is always aligned at cell edges, so the resultant grid remains aligned with the starting grid and hence in order for the light to travel from start to start through multiple reflections, the initial angle must be one which passes through the centre of two cells in the infinite grid plane. That is an infinite number of scenarios to check, but lucky for us we have a maximum distance D, so we can easily limit this to angles which connect the centre of two cells at most a distance D from the starting position. A simple implementation just considers every cell in the +/-D square.

Now we have to follow the light. This is where the small input becomes much easier. Because it is a simple rectangular room, you can expand it out to cover the entire +/-D square through reflection, populating each square with either nothing or a reflected position of the original starting point. Then simply count how many starting points are within distance D of the original starting point. The double reflection in an inner corner thing plays along nicely with this, and the outer corner light destroy never happens in a simple rectangular room. You don’t even have to actually create the large grid, you can use some tricky divs and modulo to map each potential square straight back to the input data.

Oh, except that doesn’t quite work. This approach counts the same angle multiple times. Think of the 1×1 room with a maximum D of 4. The reflected grid contains a starting point in every cell, but obviously walking a multiple of 45 degree angles passes through an intermediary reflected starting point before it reaches the outermost one, which is disallowed. The solution to this is to instead count the distinct angles. Use a lookup table to check whether you’ve already counted an angle, where the lookup reduces the dx and dy by their gcd (or reduces them to 1 if the other is 0).

Now the large input, that is a whole different scenario… If we try to reflect through each wall and overlay the maps, we get inconsistencies depending on order of reflection, and the outer corner light destroy is a pain. So, we really need to actually simulate the light path. At this point I grew concerned at how much work was involved in writing the solution. I once a very long time a go wrote a wolfenstein engine clone, which involves a technique called raycasting, specifically raycasting onto a square grid. I remembered that being kind of painful to get right, and here we are not just doing raycasting, but having to recast the rays at each reflection – and finally we need to be absolutely perfect in calculating distance or we’ll fail test cases involving angles based off of 3/4/5 triangles.

But I wanted to solve this problem, so I dived in. First up, I realised that the cardinal directions needed to be handled separately. Back when I was implementing my wolfenstein engine clone, I had hacks in place to ensure pure 90 degree angles never actually occurred as you got divide by 0 bugs, this time said hacks would cause the question to fail. Luckily these 90 degree angles are quite easy, just walk cells in the specific direction until you find a wall, determine the number of in between cells, double it, and add 1 (for the 2 halves exiting and entering the first cell) and check if that is too far or not.

So all we have to do now is simulate the non-90 degree angles. First problem to consider is the accuracy problem. My solution to this was to use the fraction class from TMD.Algo. By doing this I ensured perfect accuracy, so long as the numerator and denominator never overflowed. There would be a lot of calculations, but the fraction class repeatedly reduces to minimal form after each operation, so it really depended on what kind of fractions could be found. We were going to be looking at the positions of intersections with grid lines (for walls) and half grid lines (for the starting location), and a slope of dx/dy. Some handwaving and I convinced myself that the fractions would never have a denominator greater than 2*dx*dy after reduction, and square that for pre-reduction. Since D is at most 50, thats about 25million. Numerators shouldn’t get above D +2 times the denominator so long as we abort the simulation sooner rather than later. So numerator should cap out around 1.3billion, easily handled by the 64bit numbers my Fraction class uses internally.

Next problem… This one is kind of subtle, but even with fractions how do we ensure our distance check is perfect? If we calculate the distance to each reflection and sum these, we have to calculate the square root of a fraction, which is not actually very likely to be another fraction, so we’ll be adding together doubles and we’ll lose our perfect accuracy. The solution comes from remembering that we can pretend the light goes in a straight line with the world reflecting about as needed. The length of this line is defined by two numbers, the cumulative absolute dx and dy. After each simulation step we can add how far we’ve moved dx and dy to an accumulator for each, taking the absolute value so it continues to accumulate after each reflection. We can then compare the sum of squares of these accumulators with D^2, an exact comparison (this comparison might actually have triggered the need for 64bit integers in the fraction type now that I think about it).

So, all that remains is to perform the simulation. Given a point and a direction, we need to determine the next possible half integer coordinate. First step is to determine the half-integer coordinate. If we aren’t already on a half-integer (check denominator 1 or 2) then calculate the floor of double the value, and add 1 if we are moving in the positive direction then halve. If we are on a half-integer, we simply add or subtract half depending on whether direction of motion is positive in that dimension. Now we have 2 possible coordinates to move to next, one for x and one for y. We calculate the delta between current and new location for the coordinate calculated then multiply by dx/dy or dy/dx to calculate the movement in the other coordinate. Add this to the current position of that other coordinate and we have the ‘projected’ coordinates. Compare the projected x coordinate for the next y to the next x, and determine which is closer in the direction of motion. Whichever it is, that is the new current location. Update the accumulators at this point. Once the accumulators are updated, verify that we haven’t travelled too far, fail if we are in excess of D. Now check if we’re back where we started. If yes, return success. Finally check if we are at an integral boundary, is denominator 1 for either x or y coordinates. If yes for only 1 of x or y, check to see if we’ve hit a wall by calculating the floor of the coordinates and subtracting 1 from one of those coordinates if we’re heading into a wall in the negative direction. (Fiddly…) If yes for both x and y coordinates, we’ve hit a corner. If the far side of the corner is occupied (similar algorithm to before, but we might have to subtract 1 from both x and y – extra fidly) then we need to work out what happened. We need to look at the other 2 cells adjacent to the corner. If 1 of these is filled, we’ve got a normal wall and reflect (making sure you reflect the right way… Fiddly!). If both are filled, double reflect. If neither are filled return fail for this angle.

Stick the above complicated fiddly code in a loop and presto you are done… (No, I didn’t get it right first time, but it did at least fail on the samples, so I didn’t have to find out by running the small input…) When I ran the small input I ran it in debug mode with the debugger attached. It took 2 minutes, which given the 4 minute time frame for submission was kind of nerve racking. I thought about the large input for a bit and decided it couldn’t be much slower, since I was simulating every half integral coordinate intersection regardless, and whether it reflected or not was a negligible contribution to the processing time, so given that there was 8 minutes for that, I should have been fine. But to be sure I changed to run in release mode. I think the problems for the large input were actually a bit larger D on average, so even in release mode it still took 2 minutes. But that left plenty of time… Still, considering how many corner cases there were I was sceptical that it would pass tests. But it did, so I am happy.

TCO 2012 R1C

Easy problems this week, in fact in what is a rare turn of events I made submissions for every question.

Before systests I was 147th, but I already knew that was going to change, as I had proven one of my solutions wrong already. However the systests did not agree?!? ‘Final’ placing 101st. Even if they were to rerun with a more comprehensive systest I would still be about 140th apparently.

So finally, a round easy enough for me to do at 2am.

Q1) Given a set of guesses where you know exactly one character in the guess is wrong in each case, determine how many possible answers remain. (Up to 50 characters, each in the range ‘0’ to ‘9’. Up to 50 guesses.)

A1) This is a simple question which tried to trick you by making the return value a 64bit integer type. The worst case is actually where you have the same guess made 50 times, and there is 50 characters.

For a single guess you can enumerate all possibilities by substituting each character with the other 9 possibilities. Then the answer is just the size of the intersection of the output for enumerating for each guess. Use a dictionary to count how many times you see each generated possibility, and count how many times the count equals the size of the input – done. Even if you have 50 different guesses at 50 digits each, there is only ~20k entries in your dictionary. Or just perform Set intersections, each set has a maximum of 450 entries, so you could even do it using lists and contains and still not run slow…(not that I recommend getting into that practice!)

I however, made a fatal mistake. I didn’t realise that a 50 digit number can’t be converted to a long until the break before challenge phase… I was converting my generated values into numbers and using them as my dictionary keys rather than creating strings. Its not like the code was any easier my way, I just had less string concat calls, which I have developed a natural aversion too it seems. Despite this fatal mistake I somehow passed system tests!?!

Q2) Given a grid where you can move from grid intersection to intersection at a variable cost, but you can only move right or down, determine the minimum cost to get from top left to bottom right if you change the costs such that every path has the same total cost, but the only change you can make is to increase the cost of an edge. (Grid is 40×40)

A2) Well, this one looks easy at first. Although I can’t fully explain my justification any more… The idea is that the minimum cost path has at least one edge which is not part of any maximum cost path – unless the minimum and maximum are the same. Presuming this possibly unjustified leap of logic, it then simply becomes obvious that if you can work out the maximum path cost at the beginning, that is the same as the maximum path cost at the end. Maximum cost is easy, for each vertex determine the maximum of cost from above or left, where the maximum cost for those locations is already calculated and cached.

Q3) Given a sequence of letters where no letter appears more than 2 times, and there are at most 50 letters, determine the minimum number of swaps to turn it into a palindrome.

A3) So I submitted an answer to this, but I certainly didn’t have any form of proof to go with it. First, determine if the answer can be a palindrome, if there is more than 1 letter which appears only once, it cannot be, otherwise it can (due to the restriction of maximum appearance of a given letter being 2). If the length is odd and the single appearance letter is not in the middle, swap it there from wherever it is and add 1 to running total count.

Then walk from front to back, at each point if its the first time you’ve seen that character, store its position. If it isn’t switch the character with the position inferred from the first location you saw the character (if it isn’t already in the right spot) and add 1 to running total count. If a switch occurs, you need to remember to check the same spot again, since the newly placed character may or may not be in the right spot.

Very unusual for a greedy approach like this to be the solution for the 1000pt question, but it passes. Then again, given my incorrect solution to the first problem which ALSO passed systests, maybe I am giving system tests too much credit 😛

GCJ Qualifying under way

So, I can’t do a write up about the questions until tomorrow (sometime after I sleep post TCO12 R1C), but I did finally finish them. Penalty time of 7:20 (which doesn’t count in qualifying since all you have to do is score 20 out of 100 – and even if all my large inputs fail somehow, I’ll still get 50) – but I had lunch and played computer games for 4 hours or so in between the 3rd and 4th questions. So I think I could have done it in about 3 hours which I think isn’t too bad. At time of writing only 40 people have submitted solutions to all inputs. I think that is kind of low compared to normal, even if it is early days…

New this year was a question with no large input, not that exciting, but still something different.

TCO 2012 R1B

Again a late night, and again a poor performance.

I was faster on the 250pt question this week, but I probably spent more time testing than I needed to. Still 217 is acceptable. Unfortunately, at the prior to systest phase, the top 600 lowest score is 300 and I failed to get the 500pt. So I sit in 750th, hoping for a huge number of failures in the 500pt problem that I struggled with. I definitely saw a few solutions which were greedy in nature in my room, and although I couldn’t find a counter example, greedy is usually high risk. I guess I’ll find out shortly.

Greedy solution passes… but still quite a few failures, enough for me to go up to 646th. Just 11 points from cut-off this time. R1C here I come?

Q1) Given up to 50 objects with lengths between 1 and 50 inclusive, determine the largest k where you can make k rows of length k either by using a single object or 2 objects separated by a gap of 1.

A1) Nice easy problem. The answer is obviously in the range 0-50, and its not too hard to verify possibility, so try each one descending until it works, or you reach 0. To verify, sort the objects, any object larger than k can be discarded, every object equal to k, count that up. Then perform an inward walk of those left checking for sum equal to k-1. If sum equals step both ends inwards and increase your count, if sum smaller move the bottom point up, if the sum is greater move the top point down. Continue until the points meet or cross. If the count is greater or equal to k, then its possible to make the k rows needed.

Q2) Given up to 50 tasks which take time between 1-3600 time units, determine the minimum time to complete the tasks subject to the following strange constraint. You have a single worker, each worker can only do one task. However a single worker can split into 2 workers in time t.

A2) The greedy solution is to sort in descending order and replace the 2 smallest entries with the largest of those two plus the split time. Repeat until there is only one task remaining and return its time. Very simple, but is it correct? The answer is yes.

The path I erroneously took was to try and determine the minimum time to perform the largest i tasks and have k workers left over. However I couldn’t get this to work because my formulation didn’t allow for the other tasks occurring after a split but still concurrently with another task. I actually realised this fairly early on and tried to formulate a reverse search more like the greedy solution, but I got confused, went round in a circle and started working on the same broken approach again. (Way too tired!)

‘Proof’ for greedy solution. You can treat the splitting of workers as creating a tree. Each leaf node of the tree does some work. There are no empty leaf nodes because any empty leaf node could be removed and its parent replaced with the other branch. In an optimal solution the 2 smallest items will be at the deepest level of the tree. As if they are not, then they can be switched with any non-deepest leaf node and the solution either improves or stays the same. Furthermore all deepest levels are reached at the same time point, so you can freely interchange any leaf at the deepest level without changing the total time for that tree. Hence, the two smallest nodes can be made to share a common parent. This pair of two smallest nodes and their parent can be replaced with a single leaf which takes as long as the longest of the two smallest nodes plus the split time. This results in an equivalent problem with an identical solution presuming we have started with an optimal solution for the original problem (I don’t have a strong proof of that, but it appears obvious – decomposing a node can only make it easier to arrange the tasks to achieve an optimum so if we are already optimum, we are also optimum for composition). We can now start all over again, but we have 1 less leaf node. Hence we can repeat until we only have 1 node, and the total cost is now obvious.

Q3) Given two rows of equal length of up to 16 people each with heights between 140 and 190, determine the minimum number of adjacent people swaps such that every member of the second row is larger than its corresponding member in the first row.

A3) Interesting question… the limits of only up to 16 people suggests an exponential search is feasible.

Looking at one of the passing answers, it looks like an exponential search on the minimum number of swaps to make the first k elements correct, looking at every possible subset of the k-1 swap locations. I’m not sure how this accounts for swap ordering… maybe I’ll work it out after some sleep.

So, post sleep revisit… The approach is to determine the minimum number of swaps required to have index i and smaller satisfy the criteria using i out of the available slots. The tricky bit in understanding the solutions is that they use a compact state space. If you use a bit field, with i bits removed, the fact that there are i bits removed indicates that those numbers were placed in the first i slots, in some order. So all bits set means any ordering, so that is where we start. For each set bit, we consider moving that value to the beginning. This will take a number of swaps equal to the distance between the start and the candidate, the rest of the values will retain the same order, and obviously we only do this if moving it results in a satisfiable condition. We now have a bunch of new bit fields to determine the minimum swaps for, with the bit corresponding to the moved index removed. So looking at each new state we can determine which index we are moving to by counting the number of gaps, and we can start again. However when it comes to distance to move to the new spot, we have to account for the fact that in moving an item to the left, the items to its left have all moved right 1. Conveniently we can determine this from the bit field. Each 0 bit corresponds to something which was moved, so if we pretend the 0 bits aren’t there and align right, that’s where the candidates are currently placed. So the count of set bits to the left of the candidate to move gives the number of swaps.

This recursion can be memotized over the bit field to ensure it runs fast enough. Terminal condition is the empty bit field which has a cost of 0. All that remains is to prove this approach correct…

Looking at the recursion from the inner most levels first. Obviously if all are in the right place, the cost to move them into the right place is 0. If all but 1 of the left most spots have been made correct, obviously the rightmost must stay still, and if it is in the correct spot, the cost is 0. If all but 2 of the left most spots are correct, either we swap or we don’t. If we swap the cost is 1, otherwise the cost is 0. The recursion will try both options so there is definitely no loss of generality yet. All but 3. In this scenario if the ideal scenario has the right most moving left 2, there is nothing to gain by swapping the other 2 first, it just increases the cost now and increases the search space, we can leave deciding whether to swap those until after swapping the right most across without any lower cost scenarios lost. Thus demonstrating that we do not lose generality. (Not a proof, but doesn’t look too hard to formalize.)

Now if you’ve been paying close attention you’ll notice I left something out completely. Everything above presumes that you only swap the people in the front row, or only in the back row. I can’t see a good way to demonstrate why this still leads to the correct solution, but intuitively, mixing front and back swaps serves to lose the ability to chain the swaps in order to place items in the correct spot. In order to perform the chaining you end up needing to do the corresponding front swap for a back swap or vice versa (or simply repeat the same swap again, either way it seems obvious that the number of swaps is increasing). In some scenarios mixing the swaps doesn’t make things worse, but it does not provide ways to make things better.

Its TopCoder Time!

So I went in topcoder 2012 R1A and wow was I slow… Easy questions, and I only implemented solutions for 2 out of 3 them with 3 minutes to spare.

724th before system tests, cut-off is top 600. After system tests 655th. A single challenge would have gotten me through, and I was staring at an incorrect java solution for the 250 for a while but someone challenged it faster. So I shall be back next week, which will almost certainly be much easier with 600 of the top people going through this week.

Q1) Given a game where points are given out 1, 1, 0.5, 0.5, 0.25, 0.25 … and list of peoples turns which may not be in the correct order, determine who could have possibly won outright (no draws). (Max 8 turns.)

A1) This question caught me badly, I immediately presumed that there was an order of play, and the turns were just shuffled to be confusing. This lead me to a simple solution which was completely wrong, which I ‘corrected’ twice before finally realising (looking at the final sample where 1 person had 3 turns and 1 person only had 1) that I had completely misread the question.

So, throw everything away, start again. Simulate the game for every permutation of the given turns and collect the distinct set of winners. C++ coders used next_permutation to great effect. I just wrote the brute force. Still even if I had have read the question correctly, I wasn’t writing code fast.

Q2) Determine the number of fractions a/b where a < b, a is co-prime with b and a*b equals one of the factorials less than or equal to N!. (Max 250 for N)

A2) Took a moment for my brain to get started here as well, I momentarily went off on a side track of counting the number of fractions less than 1/k which have a*b = m!, but realized pretty quickly that there was no way that was going to help me avoid counting the scenarios where a and b are not co-prime. A bit more thinking and I realized that I could count all co-prime fractions a/b without restriction of a < b much more easily. And since there are no factorials which are square numbers (ignoring 1, which we can) the answer is exactly half that. So how do we count these? Well break k! factorial into its prime factors and their powers, obviously if a, b are co-prime, then no prime factor is shared, so all we need to do is count how many ways we can divide the distinct prime factors into 2 groups (including cases where the groups are empty, since we can always make a or b 1).

Here I had a moment of stupidity. How many ways can you divide N items into 2 groups… well that’s the sum of n choose k for all k, 0 to n – and choose can be implemented using a pascals triangle cache for high speed. Err… Wait a second… or I could have just raised 2 to the power N… But too late, already implemented and tested, so submitted it shall be.

But again, it took me way too long.

Q3) Given a set of light switches which can control multiple lights, and each light is controlled by up to 2 switches and the list of exactly which light each switch toggles, and an initial state of lights, determine the minimum (non-negative?!?) number of switch uses required to turn all lights off. (Max 50 lights)

A3) This problem is so easy, I wish I had of had time to look at it. Since each light is controlled by at most 2 switches, switches can be in 3 states. Useless (because they aren’t used anywhere), Defined (because they are the only switch for each light they switch) and Joined (because either the two switches must be the same or opposite by the connecting light).

Collect all joined switches into clumps. For each clump, try toggle or not for a single switch, the rest of the switches become defined. If all lights turn off, determine how many toggles that matched to and choose the smallest if both toggling and not-toggling works. Add all clumps together, add in the defined and ignore the useless… and you are done.

Well, except I didn’t mention all the failure scenarios. A ‘Defined’ switch could have no solution if the lights it is defined for do not start in the same state. Also neither state for a clump may result in all lights off. Finally there could be a light which is not referenced by any switch and is not already off.

So in the final analysis it looks like if I hadn’t of misread the first question, there is a decent chance I would have saved enough time to get me the 40 points I needed to get through, and potentially even have been making good progress on the 3rd question when time ran out.

.Net 4.5 Beta

So its out, I’ve got it installed, but I’m only just starting to do a comparison between the surface area of the developer preview and the beta.

A few tidbits I found before I fell asleep…

Looks like Linq to SQL is getting support for geometry types and more functions.

DataReader – GetDbNullAsync, GetFieldValueAsync (interesting, I guess large columns can still be loading while we know we’ve got a new row), GetStream/GetTextReader (a way to get at large columns without consuming all available memory?)

Async BulkCopy (makes sense!)

SQLException has a GetClientConnectionId – not sure in what scenario you might get confused about which connection a sql exception comes from, so I guess this is something other than what it seems at first glance.

GetHResult is on Exception now…

Some of the new http stuff had non-async methods removed rather than provide both. (Interesting, I wonder if that will be the pattern going forwards…)

Maybe some TLS1.1/1.2 support in SslStream.

UnsafeOnCompleted on TaskAwaiter – (Makes me think of critical finalizer… wonder if its even vaguely related…)

GCLatencyMode.SustainedLowLatency – sounds exciting if it lives up to its name… (Apparently there is something called .Net 4.0.3 beta which already contains this…)

AsyncWait on SemaphoreSlim … could be interesting…

WPF gets the concept of ‘InactiveSelection’… wonder what that means…

(And thats enough for now…)

Treemap Maker

So I’ve added a new little utility to the web site. Its under the default software page, not one of the sub-pages.

This utility is a bit like Sequoia View, but not actually as good right now. Only advantage it has over Sequoia is that it can do file count based layout as well as file size based layout. I find this of interest when backing up a system as the raw file count can often slow down the process as much as large data, so this lets me target where to trim/create archives.

It has left click to zoom in one level (even down to the one file taking up your entire screen level) and right click to undo a zoom. The layout algorithm is one I found on-line, something called Pivot-by-Middle. does a decent job and wasn’t too complex to implement. It has a black rectangle to indicate the top-level directory your mouse is currently over, which is very expensive to draw in the current implementation, so it will lag behind if you have a large number of rectangles in the window. It also shows whatever file your mouse is over up the top.

The colouring is one of the big steps backwards compared to Sequoia – I use a random colour blending approach, where the colour of a square is the blending of the colour of the file and all of its parent directories, and all colours are initialized at random. Its better than nothing, but that is about all I can say for it…

Caveats: It ignores all errors while trying to walk the file/directory hierarchy – so if you don’t have read access to a large directory, it won’t tell you (it won’t know) and your graph may not be as interesting as it first appears.

Expression is Interesting

I was playing about with Expression trees today for practically the first time and I wrote something like this.

object a = objectDictionary["somekey"];
Expression constExpr = Expression.Constant(a);

Eventually this expression went into a bunch more expressions and finally compiled into a delegate.

It wasn’t until afterwards that I went… wait… what? I had just passed a reference type to be a ‘constant’ in a delegate and it wasn’t even a string. Surely it will fail when I run it. But no, it works just fine.

I may not have written a lot of IL, but I did write a parser for the output of ildasm, so I like to pretend that I know a bit about what you can do in IL and I certainly couldn’t see any way you could embed a reference into some IL. And it certainly can’t be capturing a, since that rewrites the parent method and Expressions can’t do that…

So I turned to the trusty internet and found ILVisualizer, which I suspect I will be keeping on hand at all times from now on… and lo, the compiled delegate is closed, with the this parameter bound to an instance of CompilerServices.Closure. This type has an object array property called Constants, where my trusty reference has been stored and the generated delegate loads this and accesses the first element.

In part what I find interesting about this is that you can’t trivially write a lambda to do the same thing, since that will capture the variable, and subsequent changes will affect the lambda. Here we are guaranteed to get the value of the reference at the point in time the expression is constructed.