Q1) Given a set of unit steps in N dimensional space, each in a specific direction of a specific dimension, determine if the path ever self intersects. N is up to 1 billion and there are up to 50 steps.
A1) As I was reading this question I was thinking, this is kind of trivial, even for question 1. Then I saw the 1 billion. However, that only shocked me for a moment, since 50 steps means only at most 50 dimensions are actually moved along…
So anyway, maintain a dictionary of locations. First walk the input to see which dimensions are actually used, then map those to the first k integers. Then a location is simply an array of length k, and array equality can be used for the dictionary. (Or don’t bother with the dictionary, only 50 entries…). Perform the steps, if any location equals any previous location return invalid otherwise return valid.
Q2) How many ways can a connected graph (with edge lengths) be reduced to a tree such that the shortest distance from all nodes to 0 is the same for both the graph and the tree. Graph vertex count is up to 50. Return result modulo 1 billion and 7.
A2) So first step, calculate the shortest path for all nodes to node 0. Choose your favourite shortest path algorithm… Then we sort the nodes from closest to furthest. Then we consider each node in that order, creating a running product (with modulo after each multiplication and using long long numbers to avoid overflow) of the number of options each new node has to connect in to the previous nodes in a ways which equals the previously determined shortest path. Presto, done.
Q3) Given a tree with n nodes, and a connected graph also with n nodes, how many ways can the tree be overlaid on the connected graph such that every edge in the tree matches an edge in the graph? n is at most 14. Return result modulo 1 billion and 7.
A3) I swear I’ve seen this problem before… no idea where though. I also thought maybe I would be able to solve it, but I eventually gave up and looked at an answer.
Step 1, determine a dfs order for the tree. Then considering each possible starting position in the graph for node 0 of the tree, memotize over the dfs options in the graph which match the tree. Memotization is an array of dfs position (source destination pair, but since its a tree, can reduce that to an edge or null for leaves – since the answer for a properly matched leaf is always 1), the set of nodes in the graph unused, and the node in the graph which the source is being matched to. At each recursion, unless it is a leaf position in which case the answer is 1 as long as there are no available graph nodes, sum over each partition of the available graph nodes, where one of those partitions is connected to the node currently being matched to the source. Each element of the sum is the product of the number of ways the first partition works with a given mapping for destination times how many times the rest of the partition works for the next edge of the dfs from the currently being matched node.
The above on its own is going to be slow, too slow. The inner loops for the top level recursion is 14*2^14 and the size of the number of potential positions where the loop has to be run is also quite high. A major speed boost is to only consider partitions which have exactly the right number of nodes to satisfy the dfs branch being considered. The number of nodes down a given dfs branch can be precalculated, and a popcount lookup table for every bitset will make the comparison fast. Alternatively rather than walking every subset of the bitset and checking its popcount, the patterned combination walk bit tricks function can be used. This reduces the inner loop to a worst case of 14*Choose(14,7), but far more importantly makes use of the memotization much sparser rather than being quite dense.
Hopefully next time I see this question I can remember all of that…