TCO13 R2A (take 2) – The Missing Docs .Net

Attendance not that high again, 1279 registered competitors for this rescheduled round. But at least the servers seem to be more stable tonight (even if I am no less sleepy).

6 Australians this time. I guess one of them had difficulty connecting last night.

Tough round – I only submitted a solution for Q1, and I had low confidence, since it was a greedy solution and I could see the vague potential for non-greedy answers. During challenge phase, 40% of submitted solutions to Q1 in my room were successfully challenged, including mine. I was too tired to identify bugs in other peoples code however, which left me with 0 points.

Which given the round difficulty was 560th before systests. This raised to equal 441st in the final tally.

Only 2 successful solutions to Q3, and both of those people skipped Q2. Only ~30 solutions to Q2. Cut-off to advance was a fast time on Q1 and a successful challenge. Looks like John Dethridge and nabbftw might be in with a t-shirt chance from this round with a decent Q1 time – and EvgeniSergeev managed to get a successful challenge, the other 3 Aussies (including myself) left on 0.

Too tired to do any more write up now – I’ll try and come back to it later. But I do at least know why my Q1 answer failed.

Q1) Given 2 equal length sequences of letters, determine the correlated sub-sequences which when appended give the maximal result.

A1) I have managed to fashion a ‘brute-force’ solution which passes system tests, but I’m not 100% happy that it is actually correct. But I will present it here anyway.

The first thing I did was trim the first sequence so that it was in non-ascending order – performing the matching trims in the second sequence. It seems trivial enough to me that if your first subsequence ever increases, you could have removed the previous lower characters to get a better value. The problem then seems to become a question of how best to ‘cross the append’.

I identified a few different ways that you could cross the append, and brute forced them. Each way basically starts at some offset in the first sequence and tries to cross at some letter value. Each letter which gets in the way (is lower than the desired cross value) is removed as it is encountered – and as each letter is removed we check to see if we’ve found a better solution. However there is one exception, once you have crossed over, if a letter is in the ‘way’ it may correspond to a letter in the first sequence with value higher than your starting index, in which case removing it can never be a win. These letters are just left in place and hope they don’t make the solution worse. Every combination of offset and possible cross letter value is tried and we return the best we can find.

The way I handle this exception lowers my confidence in my solution a lot – but it works well enough for system tests.

A better solution is dynamic programming. The DP is ‘best solution of length n using the last k characters’. Initialized with length 0 being empty, each iteration we check the kth last character to see if it makes a better solution to length n+1 (than the previous length n+1 solution, if one exists) by pre-pending it to length n. (Proof of correctness of this still isn’t entirely obvious, I suspect a similar solution to ‘best solution of length n using the first k characters’ would fail – since an obviously correct solution would need to apply arbitrary subsets to jump from multiple iterations before to the new iteration, rather than being based purely off the previous iteration. Something to do with prefixing vs postfixing would seem to be the key.)

Q2) Given a square grid, where each cell can be any value 0 through 9, and up to 10 specified values, determine the number of scenarios which exist where each row and column adds to the same number modulo 10. (Answer given modulo large prime.) The grid may be large, up to 1000×1000.

A2) This problem boils down to 2 cases. Firstly the width of the grid may be larger than the number of given clues. In this case the answer is 10^((n-1)*(n-1)-k + 1) – where n is the width of the square grid and k is the number of specified values. Since the number of specified clues is less than the width of the grid, there exists at least 1 row and column which is completely independent of the specified values, which can be used to guarantee whatever desired modulo for each row/column.

This leaves the case where we don’t have an independent row and column. But we do have a maximum grid size of 10. We can break this down further. First if any row or column is fully specified, we know the target modulus. If there are 2 target modulus that don’t match, there is no solutions, and we can immediately return 0. If there is no target modulus, we try and see how many solutions there are for each potential target modulus and add them all together.

Looking at someone else’s solution I see that from here we can break the problem down in to independent subsets. If a cell is not specified, than that row/column pairing are not independent as they are linked by the choice to be made. But if the value is specified, then you can just subtract that value from the target for that row/column and they remain independent. A disjoint set tracker can be used to identify these independent sets of rows/columns, and the number of possibilities for each independent set can be multiplied together to get a result. Once we have an independent set of rows and columns, there is either no solutions or lots of solutions. If there is lots of solutions, the number of solutions is much like the original calculation, 10^((r-1)*(c-1)-k), where r is the number of rows in the independent subset, c the columns, and k the number of fixed values in those rows/columns. So all that remains is to identify when there are no solutions. This appears to be possible to do by looking at the fixed values in each row for columns which are independent, and for each column for rows which are independent. If the sum of each of these don’t have the same modulus, there is no way to set the values. (How you derive that leap of logic, I am unsure… but it sounds vaguely familiar from magic squares – sum of the sums each have to be equal.)

I wondered if a simpler solution is for a given target modulus, go to each value, set it to 0, perform any cascades forced by the modulus, and for each value you manage to set to 0, that is an extra power of 10. If you ever get to a contradiction, its not possible. To check, I wrote this up, and it passes system tests. Written correctly it is O(N^4) and with N at most 10 this easily runs in time. (The more complicated algorithm above appears to be O(N^3).)

Q3) Given 2 numbers A and B, determine the number of unique values X^Y where 1 <= X <= A and 1 <= Y <= B. A and B are both potentially large.

A3) I think I almost had a better chance of solving this problem than Q2. Although that isn’t saying much given how tired I was. We start with A*B scenarios, and start subtracting off duplications. With each of A and B being up to 1 billion, obviously duplications need to be removed in batches. The first trivial set of duplications is X=1, there are B-1 of these. Now we only have to consider X > 1. We need to identify scenarios of X^Y = N^M. Cases where N is a power of X aren’t too bad, but there are cases like 8^2=4^3 which are far more problematic.

So, taken from someone else’s solution… For each value between 2 and the sqrt of A inclusive, consider every power not greater than A. For every number between 2 and A inclusive not in this set there are B options. There is also the 1 extra for the value 1. What remains is how many contributions count from the others, numbers which have powers less than or equal to A. If we take each unique non-power in the range 2 to sqrt of (A) inclusive, and consider how many powers of it exist which are less than or equal to A we can work out how many unique values we can reach using an exponent of at most B. Each power can add B, but then we have to remove the overlaps. It almost looks like a standard add everything, remove pairwise overlaps, add back in the triplets, removing the 4’s, etc. But since the power could go up to 30, there are a lot of groups of 15, so it needs to be done a bit smarter. I can’t really work out how to describe the solution here properly, so if you are interested I guess you can go read a solution.

Leave a Reply Cancel reply