June 2012 – The Missing Docs .Net

My first non-competing round of the season… will see how many of these I bother to write up – if previous years is anything to go by, not many!

Q1) Given a set of tasks which each have a time to complete, and percentage chance of failure, determine the order which minimises the expected time to completion if failure of any task means you have to restart your order from scratch.

A1) This was a sneaky question, not insanely difficult, but they set the restrictions on the small input that the time to complete of every task was identical. In that scenario the solution is trivial, you sort by most likely to fail to least likely to fail (using a stable sort, or secondarily sorting based on original index).

The large input isn’t much more complicated, the only question is what is your sort operator look like when the times of the tasks are not identical. Unfortunately my initial guess would have been length of task times the chance of failure, when the correct answer is length of time divided by chance of success. Fraction class works here, or you can define your sort operator as a.Time*(100-b.FailChance) < b.Time*(100-a.FailChance). Pretty sure I’ve seen this question somewhere before…

Q2) Given a hexagonally shaped hexagonal grid and a series of ‘moves’ which fill in hexagons, determine the first point in time that either 2 corners are connected, 3 edges (which don’t include the corners) are connected, or a ring is formed. (A ring is a connected loop of hexagons where there is at least 1 empty hexagon inside.)

A2) The large input has a hex grid with side lengths of up to 3000, and 10k moves, so the implementation needs to be reasonably efficient. Having written LoopDeLoop, I’ve had opportunity to consider similar problems in the past, so I think I had a chance of solving this problem if I had of been in the round. We can solve the 3 sub-problems independently, but there will be a common theme.

Corners. if we use a union-find data structure and associate properties with each set representative (specifically the number of corners the set touches), we can do this in linear time. Each move is handled easily Each surrounding position is either empty or a member of a set. Gather the number of corners for the distinct sets, union them all together, add in the new position and set the number of corners equal to the total (adding 1 if the move is in a corner). If total exceeds 1 we are done.

Edges. Almost identical to the corners, only instead of summing, we use a 6bit number to represent the edges touched, and use bitwise or to combine them. And rather then checking total we do a pop-count.

Loops. Here is the tricky bit. Again we can use the union-find data structure, but this time rather than aiming to union things together and track status, we are looking to see if the current move has 2 neighbours which are already in the same set. There is also the possibility that the move doesn’t create a loop, but just fills in the edge of a ‘blob’. If precisely 2 of the neighbours are the same set, and they aren’t touching, we have definitely created a loop. If 3 of the neighbours are the same set and they aren’t ‘sequential’, we’ve created a loop. If 4, same applies. If 5 or more there isn’t anyway to not have them sequential (and specifically 6 can never happen since it is already a loop). Edge and corner placements don’t really change this logic, although you certainly can’t go all the way to 6.

Plenty of opportunity to screw up, but overall I think I had a better chance here than with the large input of the first question…

Q3) Given a delivery fee per delivery, and cost of types of food, and length of time until stale, how long can you eat delivered food given a certain amount of money.

A3) For a given number of days to survive on a single delivery, the answer is fairly obvious. For each day take the cheapest food which would not be stale.

The problem comes in deciding how many days to go between deliveries. Obviously if a type of food exceeds the cost of the cheapest food plus delivery fee, that type of food will never be selected, but since the delivery fee is averaged over multiple days the crossover point could be lower. It wouldn’t be too hard to do an infinite money scenario maximum efficiency but for a finite amount of money the problem is trickier.

The large input is huge, 10^18 money and up to 10^18 stale time. But the number of food types is much smaller, only 200. I looked up some solutions, and the one which I think most matches where my thinking was heading is assuming that the solution consists of d day cycles or d-1 cycles to make up the total number of days achievable. Binary search to find the maximum number of days, and ternary search inside that to find the minimum cost to complete that many days. If minimum cost is below the total amount of money, binary search goes up, otherwise it goes down.

Q4) Given a sequence of characters, some of which can be substituted with numbers, determine the length of the shortest string which contains every possible length k consecutive subsequence of the original sequence or any of its substituted variations.

A4) Large input has k up to 500, small input is only 2. Small input we can build the list of distinct pairs, including substitutions. Then you have to work out how short you can make the joining. For each distinct pair you have to have at least 1 character in the output string, but do you need 2? For each unique character in the input we can count how many pairs start with it, and how many end with it. The difference (if positive) is how many times you can’t reuse that character. Finally even if you could theoretically reuse everything, there is always the first character, so there has to be something which can’t be reused. (I read a solution because I wanted to finish writing this blog post before 1am… but I don’t think this is too hard to come up with…)

The large input, I have no clue! Unlike with the size 2 input, you can have multiple different length overlaps to save you characters. (Not to mention that 500 character sequence can have 2^500 versions using substitutions…) Only 1 person solved this during the competition, so I don’t feel to bad… Turns out he used a mincost/maxflow, although I certainly am not going to read the pages of code to try and understand how mincost/maxflow solves this problem…

Low attendance for this round, out of 1900 potentials, less than 1100 turned up. Apparently some people were having difficulty reaching the topcoder site just before it started so they delayed the start by 10minutes. I lost connection twice, once during debugging the first question and again during the challenge phase (I think it was my own modem even, not top-coder). But I don’t think it affected my performance significantly.

Writer solution for the 900pt problem was wrong… so it took a very long time to get any results…

346th before system tests – but I missed a corner case in my answer to question 1 and ended up with 0 points. Which dropped me down to 525th. I wasn’t going to advance or get a t-shirt in either case, so not a large drama.

Q1) Given a fully connected directed graph where you can change 1 edge value, determine the longest path taken by a greedy algorithm which always chooses to move to the closest remaining destination, starting at 0 with tie breaking based on order of input.

A1) I made my code more complicated than it needed to be, so my submission time was a bit slow (but apparently not complicated enough). Brute force is O(kN^4) (k being the maximum range of each edge value), which is too slow. However it can easily be seen that there are only a few interesting values to change an edge to. First is the maximum – this either eliminates the edge from consideration, or increases the maximum distance. Next is the highest value which makes the edge be selected. If the edge is already selected, this may still be higher than its current value. So at each point you can choose the second shortest path if the edge being modified is under consideration and try the value equal or immediately below, whichever is the highest value which makes it selected.

There are a few corner cases, making maximum doesn’t rule the edge out of consideration if there all other edges are higher index and maximum length and you can’t have an edge with length less than 1, so you can’t always make an edge selected either.

Specific case I missed was when forcing the maximum, the scenario where it increases the maximum distance rather than forcing a different option doesn’t just include where it is already maximum or it is the only option. It also covers the case where all other options are higher index and also maximum length.

Q2) Given a large number of points in a 2 dimensional plane where no x or y co-ordinates are in common, determine how many ways you can select 3 points such that if x coordinates are in order 1 2 3 the y coordinates are in order 1 3 2.

A2) The number of points is up to 30k, which rules out O(N^2) algorithms. Evidently the solution is O(N log N). I don’t think I’ve seen a problem like this one before, and the solution is kind of interesting. I certainly wasn’t going to guess it during the comp though… As a bonus, I completely misread the question and thought the x and y coordinates were in the same order (despite the question having a sentence explicitly calling out the fact that they were not…)

First do some range reduction, sort the x-coordinates and re-number them to be consecutive integers from 0. Then switch to sorting by the y coordinates.

Now we need to actually solve the problem… So we consider each point in reverse y order, first checking to see how many points already added have higher x then the current. Then we add the current point. At any point in time all points already added have greater y, so greater x is the criterion of interest. Then we add the new point.

The first trick being that we need to answer the question regarding the number of points already added with higher x, quickly. For this we use a range tree. Since we renumbered the x-coordinates, we know the full range, and can do a trivial sideways binary tree to cover up to the next power of 2 greater than the range. When an element is added, we increase the value of each segment which contains it. When we need to know how many elements are we look into the tree, summing any right branches whenever we take the left.

The second trick being all I’ve said so far only helps with choosing 2 points, how do we choose 3? Well we can take the number of values with greater x and do a choose 2, but while we know they have greater y values and x values, we don’t know if they actually are ordered correctly themselves. So we need to keep track of how many pairings greater than a given are the wrong order. Since the wrong order is that they are sorted, and we know how to count the number of elements greater, this is not too hard. Just have another segment tree, but rather than giving unit weight to each number as it is added, weight it by the number of entries strictly greater then itself at the time it was added. Thus the total of these is a count of all pairings in sorted order, which we can subtract from the choose 2 to get the correct answer.

Don’t think I’ve seen a question involving a segment tree before at all even – I’ll have to keep them in mind in future.

Q3) Given a list of heights, and the rule that at each time step positive heights shift 1 unit of height to the next entry in the list (circular so last moves to first), determine the new list of heights after a very long time.

A3) I think I should have put more effort into this question, because it certainly seems easier than Q2… The solution is to simulate it, but jump forward in time by the largest step you can at any point.

At any point if everything is 0 or negative, you are done. If everything is positive, you are done. If everything is 0 or 1, you can rotate the solution to finish. If everything is 0 or greater, a few simple simulation steps will either make everything positive, or just 0 and 1.

So the only interesting scenarios are the ones which involve a mix of negative and positive numbers. In these scenarios, you can jump ahead, so long as you don’t fill any holes or clear any mountains, since that changes the dynamics of the scenario.

We can rule out a few corner cases by forcing step by step simulation if the number of turns remaining is smallish. Also if there are any 0’s between a mountain and a hole, some manual simulation will take care of that. Then, for each hole we follow left to find the left-most contiguously positive position, and skip ahead a number of steps equal to the minimum of what fills the hole or reduces the height to 0. The minimum of all these minimums is the number of steps to simulate, which can be done just by subtracting the number of turns off of the left-mountains and adding them to the connected holes.

I think the worst case runtime of this algorithm is something like O(N^3) – finding the connected mountains is linear, but the manual simulation sections might be O(N^2). However since one mountain or hole is eliminated in each step, there shouldn’t be more than O(N) such sections.

Month: June 2012

GCJ12 R3

TCO12 R2C