# GCJ 2016 R3

Unlike Round 2 where I think I would have struggled to make the top 500, this round I think I might have done much better if I had been competing. Possibly even top 120.

Advancing to the world finals was definitely beyond me, that would have required solving the large of the problem worth the most points as well as the other problems I consider within my reach, which I struggle to even comprehend how the solution verifier could work…

Q1) Given a sequence of moods which you can either make a request or submission against, and the constraint that you can only submit your last unsubmitted request, determine the maximum score you can get if you get 10 points for requesting in the same mood as a submission, and 5 points otherwise.

A1) The actual problem describes the possibility to get 0 points, but that would require you to request the wrong thing for the current mood and then submit it later also against the wrong mood.  Any such scenario can trivially be improved by changing your request to not be the wrong thing, since the later submission would then also not be the wrong match, so you get 10 more points.

So, this problem is a bit deceptive, but a little playing around with scenarios should give you a good guess that greedy is the way forward.  If the input sequence of moods has equal two in a row, you’ll do a request submit to get the 10 points.  Having removed those, you might now have moods which are neighbouring and equal, so you can align those up and get 10 points there as well.  Keep repeating until there is no pairings left.  The remains is a simply alternating sequence, there is no way to get 10 points in a simply alternating sequence regardless of the moves you perform.  So just take each pair as it comes and get 5 points each.

Having discovered this greedy solution, the only problem remaining is that the large is up to 20k, so an O(N^2) solution is going to be too slow…

As it turns out, its possible to do in linear time.  The simplest approach I have for doing this is a bit unusual.  It uses a linked list!

Convert the input in to a linked list, then while you have a current node and and a next node, if they are equal, remove those nodes and leave current point to either just before current or just after next if there is nothing before current.  If they aren’t equal, advance one.  Every time you remove a pair add 5 points to your total.  Then add 5 points for half the size of the original input.

A more complicated approach which doesn’t technically require a linked list, would instead involve an array of starts.  When you find a pair you repeat outwards to find the largest simple chunk which can be matched up from the inside out.  Then you set the start array for the last member of that chunk to point to the first.  Then you move on.  Whenever you finish creating a chunk, check if its left neighbour has a start value.  If so use that value to conceptually O(1) merge this chunk with its neighbour and start considering whether the ones outside that can be made in to a pair.  If so, keep going and write a start value once you get stuck again.  This basically simulates the process of the linked list algorithm…

Q2) Determine the fraction of strings which contain certain words where the strings are generated by all the possible ways to select nodes of a forest such that you only ever select a child after its parent and each node has a single character label.

A2) This question was probably the one I was least likely to get out even though I could solve it.  It just looks too hard…

In practice you just need a small number of scenarios to convince yourself that a simple answer is in fact correct.

Because the required accuracy is only 3 parts in 100, randomly generating 10000 of the strings should be plenty.  (The exact detail escapes me, but I seem to recall that if you require an accuracy of 1 part in x for something that has a specific random chance, you need to run x^2 simulations.)

Given the input size is only up to 100, O(N^2) should be ‘okay’ to generate 10k strings if you are quick about it.  The trick is just how to ensure that your randomly selected strings are representative.

To work this out, consider a simple case where one node is on its own, and 10 others are in a chain.  There are 11 possible outputs, for each of the different possible locations of inserting the one node in the chain.  If you were to select with equal likelihood from the available options at any point, half of your generate strings will start with the label of the one node on its own, when it should only be ~9%.  The next option I considered was randomly selecting a node and adding it and all of its parents.  However if we do that the probability of the one node on its own being the last value is far higher than 9%.

The third option I considered was, weighting each available node to select from, by additionally including the count of all of its children.  This means the first selection is 10 parts the chain, and 1 part the node on its own.  Which gets us the correct percentage.  Following all the way through the options shows it generates things correctly for this scenario.  Which is promising.  We already know it also generates the right values for a forest entirely of single nodes.  So that is two data points in its favour…  I tried one more scenario to convince myself.  A single node and a simple 3 node tree.  The tree has 2 orders, and the single node has 4 insertion points, this gives 8 possible scenarios.  Of those 8 scenarios 2 start with the single node.  Again the weighted selection works 3:1 corresponds to 8:2.  And walking through the rest of the scenarios shows the percentages work out.

Good thing about this question is that its just a high valued small input, so if this weighting thing is wrong, we can find out pretty quickly…  but its just fine as it happens.

So calculate the weights of the tree, then add the roots, randomly select based on total weight, remove that node, add its children, repeat.  This is O(N^2).

A slightly nicer to implement solution is to recognise that the weights are effectively a place holder for allowing random selection of any node and then putting the deepest not yet placed parent of that node instead of the selected node.  This way you don’t have to do any tree constructing.  Just have a boolean array representing what has been selected so far, generate a random number the size of the remnants and walk to find the nth not selected item.  Then walk its parents while they exist and aren’t selected.  Finally select that node and then repeat the process until all nodes are selected.

Q3) Determine the smallest maximum jump size to get from one asteroid to a second asteroid, if there are a bunch of asteroids moving at linear speeds and you can jump between any two at any moment so long as you don’t stay anywhere more than S seconds.

A3) This problem reminds me of another I’ve done before, but that was in 1 dimension, this is in three…

However, the small input is easy, since all the linear speeds are zero.  Therefore there is no point to waiting anywhere.  Therefore it just becomes a question of for a given maximum jump size x, is there a path between asteroid 0 and asteroid 1 then solving for the smallest maximum jump size x.  Once a maximum jump size is set, the problem reduces to a simple search (DFS is one option).  To solve for smallest, you can do a binary search.

Contest analysis isn’t up yet, and I’ve not ready anyone else’s solutions, so I don’t know how to do the large.  I suspect it has a similar structure.  Once you have a set maximum jump size you can determine the time ranges that a given edge is open or closed.  This appears to cause a combinatoric explosion if you clone each node based time range that a given combination of edges are open/closed with edges between the clones depending on the size of S, but maybe you don’t need to, you can instead just have a separate clone per status of a single edge, connected to clones representing different arrival time ranges and the separate arrival and departure clones are connected depending on the times and size of S…  I don’t quite think it works though.

Q4) Given a simple single bit single register computer and two programs that run simultaneously made of three atomic instructions, set 0, set 1 and print register, determine a program pair which can potentially print any of the ‘good’ values, but never prints the ‘bad’ value.

A4) Again I don’t know how to solve the large (at least I don’t think I do…), but the small is surprisingly trivial.

For the small input the bad value is always a sequence of ones.  So as long as the bad value isn’t also a good value (which is something to just check for in general…) one option is to write a program which can print any sequence of X digits except for all ones.  One way to do this, is to have one program that continuously sets the register to 0, then prints the register, x times.  Then the second program consists of exactly X-1 calls to set the register to 1.  Because of arbitrary interleaving, this program pair can print anything except for X 1’s.  Since the first program always sets to 0 before printing, there would need to be an interleave before every print, but there are only X-1 instructions from the second program to interleave.

This is suggestive of possible approaches for the large, but the open question in my mind is whether they cover all possible inputs correctly, or just a subset.  For instance an input where the bad value is all zeros, can be solved using an inverted version of the program.  If the bad value has one one digit, and all the good values have either more or all less than one one digit similar constructions work.  In general I can solve if the number of set bits is either strictly greater or strictly less than the bad value, but not otherwise.  For specific cases if there is only one good value, I can solve that too…  If there are multiple good values and they have x set bits in common, but the bad value ‘further away’ from the commonality then the good values or doesn’t have all of those values set. That can be solved too.  Or inverted for x unset bits in common.

In general the things I can solve are where one program does all the printing, and the second program is a sequence of ones or zeros of some length.  Its not clear to me that such a program pair is always the solution if there is such a solution, but it doesn’t seem unlikely…