TCO10 R4

So, as predicted by the odds, I am out ;). I did at least score some points, so I came 125th. The question structure didn’t favour me: an easy first question jumping to a hard second question. A fast time on the first question was sufficient to get through, or else a couple of challenges and a terrible time. However, the first problem was so easy that challenges were hard to come by, and the second problem’s input constraints were so opaque that constructing a failing test case was next to impossible.

My time for the first problem was terrible – I had the problem solved in the first minute, but a plus sign where I wanted a minus sign sent me down a warren of verifying things by hand, and half an hour quickly vanished. That said, I probably would not have gotten through even if I hadn’t screwed up so badly; I would have had to be fast. Although with an extra 20 minutes I might have actually made a serious attempt at the second question.

For a while I thought maybe I could do the third question, thinking it was a well-known computational geometry problem for which I could just grab an algorithm from a textbook. However, the closest related problem in the textbook told me the running time would be too slow.

Q1) Given a set of bank accounts with known balances, where each dollar gets you one ticket in a weekly lottery draw for a fixed prize, what is the expected value of bank account 0 after n weeks? Limits: n up to 1000, winnings per week up to 1000, up to 50 bank accounts, each with a starting balance of up to 1 million.

A1) I implemented this as a DP over the probability of a given number of wins after a given number of weeks. Note that the total number of tickets in a given week stays the same regardless of who won in previous weeks, so only bank account 0 and the initial total of the other bank accounts are of interest. Once the DP is formed, simply multiply each probability in the final week by the account balance for that many wins, and sum.

As it turns out, the problem can be solved without the DP: the expectation value after one week can be used directly to generate the expectation value for the next week, reducing O(N^2) down to O(N).
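A minimal sketch of that O(N) recurrence (function name and signature are my own, not the actual problem's): since the total money in play grows by exactly the weekly winnings no matter who wins, the total is deterministic, and linearity of expectation gives E_next = E + W·E/total.

```python
def expected_balance(b0, others_total, winnings, weeks):
    """Expected balance of account 0 after the given number of weeks.

    Each dollar is one ticket, so account 0 wins a given week with
    probability balance/total, and the total money is deterministic:
    it grows by exactly `winnings` every week regardless of the winner.
    """
    e = float(b0)
    total = b0 + others_total
    for _ in range(weeks):
        e += winnings * e / total  # E[win probability] = E[balance] / total
        total += winnings
    return e
```

For example, with two accounts of 1 dollar each and a 2-dollar prize, two weeks gives 3.0, which matches enumerating the four outcomes by hand.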

Q2) Given a sequence of numbers formed by multiplying the digits of each of up to 50 consecutive integers, determine the first of those integers, assuming the sequence refers to the earliest such match. Limits: the input numbers are less than 1 billion, and the output is less than 2^63.

A2) After the competition I realised that this problem is not as hard as it looks. I really might have worked it out if I had made a serious attempt (and had that extra half hour :P). If the input is all 0s you can quickly return 10, 100, or 1000 depending on the input length. In fact, brute forcing possible answers of 1–1000 takes care of some corner cases. Otherwise: find the first non-zero number, brute force its last 3 digits, and solve for the rest by trying as many 9s, then 8s, then 7s, down to 2s, going from right to left; then check against all the other numbers in the input to see if it is right. Return the smallest result. For kicks I wrote this up and it failed test cases – but only because of a mistake in my code; fixed, it now passes system tests. My first thought was to brute force 2 digits (well, my real first thoughts were more complicated than that), but an input length > 12 allows some control over the 3rd digit – a number followed by 12 0s indicates that the last 3 digits were 9s, for instance. You can look at the input set and break it down into all sorts of combinations, but in the end, simply brute forcing all last-3-digit scenarios is easier…
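A rough sketch of the easy parts – the digit-product function and the brute force over small candidate starts (helper names are mine, and the constructive right-to-left search described above is not included):

```python
def digit_product(x):
    """Product of the decimal digits of x."""
    p = 1
    while x:
        p *= x % 10
        x //= 10
    return p

def find_small_start(seq, limit=1000):
    """Brute force candidate starting integers 1..limit against the sequence.

    Handles the small corner cases; larger answers need the constructive
    search over trailing digits described in the post.
    """
    for start in range(1, limit + 1):
        if all(digit_product(start + i) == v for i, v in enumerate(seq)):
            return start
    return None  # fall through to the constructive search
```

For instance the digit products of 37..41 are 21, 24, 27, 0, 4, and 37 is the earliest integer whose run matches that sequence.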

As an aside, I was left wondering what checks the challenge system used, since the input limitations are significant. Other than 0, no two consecutive numbers in the input can be equal. No input number can have a prime factor greater than 7. At least 5 of the inputs must be 0, and after the last consecutive 0, every 10th number must be a 0. Two consecutive non-zero numbers must be increasing, and the difference must divide them both.

Q3) Given up to 1200 points, where no 3 are collinear and no 2 are coincident, determine the number of pairwise intersections among the line segments joining every pair of points (equivalently, counted over all choices of 4 points).

A3) Brute force is obviously no good. A look on Wikipedia indicates that even the best algorithm for enumerating all intersections of the complete set of line segments on n points runs in time on the order of the number of intersections, and if the 1200 points are in convex position the number of intersections is well over N^3 (close to N^4, even, I think).

Therefore we can’t enumerate the actual intersections; we have to count them in batches. Specifically, the time needed must be about O(1) per line segment; O(log n) would be sufficient.

I had a quick look at one solution, and it considers each point, and sorts the other points by angle relative to that point, then for each of those segments in order, mystically calculates the intersection count really quickly. … going to have to read it again after some sleep.

Edit: Post sleep update.

I had a look at some more solutions. The general approach seems to be to assume everything crosses. Then, for every point, sort the other points around it by angle. If you can get 180-degree separation in the sorting, none of the points on the left will form crosses with the points on the right, so you subtract those scenarios away. However, there is over-counting which you have to adjust for, and the over-counting adjustment looks like the hardest bit.
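Here is my own reconstruction of that counting scheme (not any competitor's code). A 4-subset contributes exactly one crossing when it is in convex position, so the answer is C(n,4) minus, for each point p, the number of triangles of other points that contain p; a triangle avoids p exactly when its three vertices fit in an open half-plane through p, which the sort-by-angle plus two-pointer trick counts in O(n log n) per point:

```python
from math import atan2, pi

def comb(a, b):
    """Binomial coefficient, 0 when a < b."""
    if a < b or b < 0:
        return 0
    r = 1
    for i in range(b):
        r = r * (a - i) // (i + 1)
    return r

def count_crossings(pts):
    """Crossings among all segments = number of 4-subsets in convex position.

    Assumes no 3 points collinear, as the problem guarantees. O(n^2 log n).
    """
    n = len(pts)
    inside = 0  # sum over p of (# triangles of other points containing p)
    for i, (px, py) in enumerate(pts):
        angs = sorted(atan2(y - py, x - px)
                      for j, (x, y) in enumerate(pts) if j != i)
        m = n - 1
        # A triangle avoids p iff its vertices fit in an open half-plane
        # through p; count each such triangle once, at its first vertex
        # in angular order (the other two lie within the next 180 degrees).
        avoid = 0
        k = 0
        for j in range(m):
            k = max(k, j + 1)
            while k < j + m and (angs[k % m] - angs[j]) % (2 * pi) < pi:
                k += 1
            avoid += comb(k - j - 1, 2)
        inside += comb(m, 3) - avoid
    return comb(n, 4) - inside
```

As a sanity check: 4 points of a square give 1 crossing, and adding a point inside the square kills exactly the 4-subsets whose triangle contains it.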

TC SRM 468

Q1) Given a dictionary, a mapping of numbers to sets of letters, and an input sequence consisting of numbers, ‘try alternate’ button pushes, and spaces, determine the output. Assume the dictionary is processed in alphabetical order.

A1) Easy enough: make a dictionary from number sequences to the list of words matching each, sort each list, then process the input to produce the output…
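A toy sketch of that idea – the real problem's button semantics may well differ, and the keymap/press encodings here (`'*'` for ‘try alternate’, `' '` for accept) are entirely my own assumptions:

```python
from collections import defaultdict

def predictive_text(words, keymap, presses):
    """keymap: digit -> string of letters. presses: a string of digits,
    '*' (try alternate) and ' ' (accept word) -- encodings assumed, not
    from the actual problem statement."""
    to_digit = {ch: d for d, letters in keymap.items() for ch in letters}
    by_sig = defaultdict(list)
    for w in sorted(words):  # alphabetical order, per the statement
        by_sig[''.join(to_digit[c] for c in w)].append(w)
    out = []
    digits, alt = '', 0
    for p in presses + ' ':  # trailing space flushes the last word
        if p == ' ':
            if digits:
                out.append(by_sig[digits][alt])
            digits, alt = '', 0
        elif p == '*':
            alt += 1  # step to the next alphabetical candidate
        else:
            digits += p
    return ' '.join(out)
```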

Q2) Given a graph with a line of nodes, each connected to the next, determine the minimum time to get from the first node to the last, subject to some special conditions. Each edge has two time options for its travel. The first is always available, but the second can only be used if either fewer than k contiguous sections have already used the second option, or you used the second option on the previous edge. The number of nodes n is up to 400 thousand, and k is up to n or 40, whichever is lower. The first and second times for each edge are generated using a pseudo-random number generator with specified parameters.

A2) While writing my question text above I may have made this question even easier through my phrasing. It’s a DP problem with 3 parameters: the number of contiguous second-option sections so far, a boolean for whether the last edge used the second option, and how many nodes have been passed. Very easy; the only problem is that the default DP table requires 64 MB of RAM. Luckily, every entry only depends on the entries at one less node passed. Lucky because that means we can get rid of the memory problem, and also because if it depended on all of them the processing time would have been excessive.
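A sketch of that DP with the rolling array (the signature and edge-time representation are mine; the real problem generates the times pseudo-randomly rather than taking lists):

```python
def min_time(first, second, k):
    """first[i], second[i]: the two travel times for edge i.

    State: (number of second-option runs started, whether the last edge
    used the second option) -> minimum time. Only the previous node's
    states are kept, which removes the memory problem.
    """
    INF = float('inf')
    cur = [[INF, INF] for _ in range(k + 1)]
    cur[0][0] = 0
    for i in range(len(first)):
        nxt = [[INF, INF] for _ in range(k + 1)]
        for s in range(k + 1):
            for last in (0, 1):
                t = cur[s][last]
                if t == INF:
                    continue
                # Option 1 is always available; it ends any contiguous run.
                nxt[s][0] = min(nxt[s][0], t + first[i])
                if last:  # continue the current second-option run
                    nxt[s][1] = min(nxt[s][1], t + second[i])
                elif s < k:  # start a new run, if fewer than k so far
                    nxt[s + 1][1] = min(nxt[s + 1][1], t + second[i])
        cur = nxt
    return min(min(row) for row in cur)
```

With k = 1 and cheap second options on every edge, a single contiguous run covers the whole path; with k = 0 the second option is never usable.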

Q3) Given a k-partite graph where each ‘layer’ only connects to its neighbours (arranging the k layers in a circle), determine the maximum independent set. Limits: at most 100 vertices, k between 10 and 50.

A3) When I first saw this problem I was shocked: a question 3 which maybe I can actually do. (The original question isn’t phrased as given, but it was obvious this was the real question.) Unfortunately I pretty quickly came to the realisation that the graph itself had no limitations guaranteeing it was bipartite. If k was even I was good to go, but when k was odd I had a problem. I thought about it for quite a while, and along the way I had the idea that I could try all the possibilities for one of the layers and handle the rest as if it were a bipartite graph. Unfortunately I threw this thought away because I didn’t remember that k was at least 10, and so I decided that the worst case would be 33 vertices in the smallest layer, and 2^33 scenarios is too many to consider.

With k at least 10 (or we might as well say 11, since even k is trivial) and a maximum of 100 vertices, the worst case is when the 100 vertices are distributed in layers of 10 or 9. Select one of the layers with only 9 vertices and try all 2^9 scenarios. For a given layer, if we are trialling a vertex as being in the independent set, we remove it and all of its immediate neighbours; if not, we just remove it. This removes the entire layer, leaving us with a bipartite graph, which we can solve via the dual maximum matching problem using simple max-flow on a graph where all capacities are 1. Add in the number of ‘independent’ vertices in the current trial, choose the maximum over all trials, and done. Probably one of the easier question 3s I’ve seen recently.
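The bipartite piece rests on König's theorem: in a bipartite graph, maximum independent set = vertices − maximum matching. A minimal augmenting-path matcher (equivalent to the unit-capacity max-flow mentioned above, and simpler to sketch) might look like:

```python
def max_matching(adj, n_right):
    """adj[u] lists the right-side neighbours of left vertex u.
    Returns the maximum matching size via repeated augmenting-path search."""
    match_r = [-1] * n_right  # right vertex -> matched left vertex, or -1

    def augment(u, seen):
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                # v is free, or its partner can be rematched elsewhere
                if match_r[v] == -1 or augment(match_r[v], seen):
                    match_r[v] = u
                    return True
        return False

    return sum(augment(u, set()) for u in range(len(adj)))
```

So for each of the 2^9 trials, the independent set size of the remaining bipartite graph is its vertex count minus `max_matching(...)`.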

TC SRM 469

Q1) Given n rows of m seats and a list of specific taken seats, return how many ways 2 people can sit next to each other. Limits: n and m up to 1 billion, and the number of taken seats up to 50.

A1) Very, very easy. Use long long to avoid overflow. For each row with taken seats, get the sorted taken seats in that row, then for each gap greater than 1 add gap−1 to the total. Finally, add the number of empty rows times m−1.
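A quick sketch of that counting (I'm assuming 1-indexed (row, seat) pairs; Python integers make the overflow concern moot):

```python
from collections import defaultdict

def adjacent_pairs(n, m, taken):
    """Number of adjacent free-seat pairs. taken: list of 1-indexed (row, seat)."""
    rows = defaultdict(list)
    for r, c in taken:
        rows[r].append(c)
    # every fully empty row contributes m - 1 adjacent pairs
    total = (n - len(rows)) * (m - 1)
    for cols in rows.values():
        cols.sort()
        bounds = [0] + cols + [m + 1]  # sentinels just outside the row
        for a, b in zip(bounds, bounds[1:]):
            gap = b - a - 1  # free seats strictly between two taken ones
            if gap > 1:
                total += gap - 1
        total = total  # (each free run of length L gives L - 1 pairs)
    return total
```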

Not my favourite kind of question, simply because I am not an uber-speed code writer.

Q2) Given a counter which ticks down every minute, and a set of k items which each take a length of time to process but increase the counter by j at a specific time within that length, determine the maximum number of items which can be processed completely before the counter drops below 0. If there are multiple ways to achieve this, return the order which prefers lower-numbered items first. Limits: k up to 20.

A2) Each item has a minimum requirement and a net effect. If there is an item with a non-negative net effect whose minimum requirement we can meet, we want to use it. Once only negative net effects remain, the counter only ever decreases, so anything whose minimum requirement we cannot currently meet will never become usable. However, the order of the remaining items is non-trivial.

With a limit on k of only 20, every possible subset is a valid consideration if we can process each one quickly. This provided a hint, but I needed to look at a solution for the rest. If we divide the input into start and end sections, we can determine which items of the end section can be placed immediately after the start section. This lets us create a DP over subsets: the maximum number which can be processed from a set which appears after everything not in the set (regardless of whether everything not in the set can actually all be processed before it). That final condition may sound crazy, but once you get to the full set, the not-in set becomes the empty set and the condition holds trivially, so the rest doesn’t matter.

Once the DP is done, you just have to do the ‘reverse dp walk’ to construct the output.

Q3) Given a list of items and 2 queues to put them in, where each item has a processing time which depends on which queue it is in, and after processing it is placed into the other queue (unless it has already been processed by that queue), determine how many different ways there are to place the items into the 2 queues subject to 2 conditions: 1) the items must be added to the queues in the same order as the input; 2) neither queue may ever be empty before the other queue has finished its initial list. Limits: up to 47 items, and each processing time is between 1 and 20.

A3) No clue – the processing time limit seems suspiciously small though. Maybe a dp on the number of solutions with an initial queue time of n?  Let me look at a solution.

Rather than solving the problem as given, solve a different one: the number of ways for queue 1 to run out before queue 2 is done with its initial list. Then switch queues and use the same logic again. Subtract both from the number of possible distributions.

So, how to count the number of ways for it to run out. Each time we put something onto queue 1 it becomes harder to run out; each time we put something onto queue 2 it becomes easier. Some analysis is required to work out the DP. The definition of failure is: (total queue 1 time) + (total time of the first i−1 items from queue 2, processed at queue 1 speed) − (total time of the first i items from queue 2, processed at queue 2 speed) < 0 for some i. For each possible distribution we want the minimum of that left-hand side over all i; if that minimum is less than 0, it is a fail. As each new item is added, this function changes. If we add one to queue 1, every left-hand side increases by the processing time, thus also increasing the minimum. If we add one to queue 2, the inequality doesn’t change except for the new i = (new length of queue 2) case; in that case the left-hand side equals the previous all-processed scenario minus the new addition, which becomes a candidate new minimum, so we take the minimum of that and the old minimum. The only problem is that we need to know that previous scenario as well. So ‘previous scenario’ and ‘minimum’ are 2 parameters of the DP, the 3rd being the number of items enqueued.

Now that we have this new parameter we need to know how it changes. If we add to the first queue, it goes up exactly as the minimum does, by the queue 1 processing time. If an item is added to the second queue, that causes a subtraction of the queue 2 processing time and an addition of the queue 1 processing time.

The potential ranges of ‘previous scenario’ and ‘minimum’ are pretty large, so memory is an issue if the entire DP is kept in memory. Helpfully, each case only depends on the previous number of items enqueued, and the final summation only involves the full list, so we can throw away memory as we go. (To avoid garbage-collection issues, alternate between two arrays rather than repeatedly allocating.) Additionally, with the inequality range being roughly ±900, that is close to 200 million DP locations filled without any optimisation. In C++ this is probably reasonable, but in Java further trimming helped. For instance, minimums only ever get smaller and we are only interested in the negative ones, so treating all positive minimums as 0 saves half the state space. Furthermore, the live area of the DP only grows by ±20 in each dimension per step; taking this into account saves another factor of ~3 over blindly checking every spot.

Again out of my range, but useful to sit down and understand how it works.

TC SRM 470

Q1) Given a line of n (up to 51) rooms separated by gates of up to 16 different types, a game is to be played. One of the (non-end) rooms is the ‘target’ room, which both player 1 and player 2 want to reach. Player 1 starts in room 0, and player 2 starts in room n−1. The players (starting with player 1) take turns removing all gates of a chosen type. At any time, if a player can reach the target room, they have won; if both can reach it, the game is a draw. Assuming optimal play, return how quickly player 1 can win (if they can), or how quickly player 2 can win (if they can), or ‘draw’.

A1) This is a simple memoized recursion, using the bitset of removed gate types as its state. At each stage of the recursion, first check whether the given bitset is a P1 win, P2 win or draw. If it is one of these 3 states, return 1, −1 or 0 – swapping the first two if the popcount of the bitset is odd (and cache the result, of course). If it is not a win for either player and not a draw, recurse on each bit not in the bitset, and select in preference order: the negative number closest to 0, then 0, then the largest of the recursion options, and return that (after caching it).

While this approach is fast enough (50·2^16 worst case), a greedy implementation actually works. If you divide the gate types into left, right, or both relative to the target room, then player 1 wins if left + both <= right, player 2 wins if right + both < left, and all other cases are draws. The number of turns is 2·(left + both) − 1 in the former case and 2·(right + both) in the latter. This gives an O(n) solution, which works regardless of the number of gate types.

Q2) Given two parallel rows of n dots, where some specific dots in row 1 are already connected to specific dots in row 2, determine the expected number of pairwise crossings formed if each remaining dot in row 1 is joined to a random remaining dot in row 2. In a valid randomly generated situation, every dot in both rows has exactly 1 line connecting it to a dot in the other row. Limits: n up to 10000, and the count of pre-specified joins up to 50.

A2) Summing the expectation value of crossing over each pair of dots in row 1 gives the answer. However, with n = 10k, n^2 is not fast enough. We can divide the problem into 3 cases.

  1. 2 non-specified dots.  Expectation value for all of these pairs is exactly 0.5, so we can sum all of these quickly. (O(1) time even.)
  2. 2 specified dots.  1 or 0, we know whether they cross.  The 50*49/2 pairs to consider is easily executed.
  3. 1 specified dot and 1 non-specified dot. At 50·10k this could be brute forced, with the probability defined by the ratio of non-specified locations to the left/right of the specified location in row 2, flipped if the specified dot in row 1 is to the left instead of the right of the non-specified dot. However, every non-specified dot on a given side has the same ratio, depending only on whether it is left or right of the specified dot in row 1. Therefore, by knowing how many non-specified dots there are to the left/right of each end of each specified pair, this can be performed much faster.
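Case 1 deserves a sanity check: two random free dots cross with probability exactly 1/2, so the all-free contribution is C(m,2)/2 – which is just the expected number of inversions of a random permutation. A tiny brute force confirms this (my own sketch, not contest code):

```python
from itertools import permutations

def crossings(perm):
    """Crossing pairs when dot i in row 1 is joined to dot perm[i] in row 2.
    Two joins cross exactly when they form an inversion."""
    n = len(perm)
    return sum(1
               for i in range(n)
               for j in range(i + 1, n)
               if (perm[i] - perm[j]) * (i - j) < 0)

def expected_all_free(m):
    # each of the C(m, 2) pairs crosses with probability exactly 1/2
    return m * (m - 1) / 4
```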

Q3) Given a map, up to 50×50, with up to 4 pairs of towns which each want a road built between them, determine the minimum cost to build the (up to 4) required roads. Roads are free to build and can go anywhere, except through a rock. Each rock is a side-to-side connected set of square tiles, the whole of which can be reverted to normal ground for a specific cost. There are up to 62 rocks, and no 2 rocks have the same cost. No rock overlaps any town, and no towns overlap each other.

A3) This is a tough question.  No one solved it in the competition.

A single pair of towns is trivial: create a directed graph whose nodes are connected sets of map tiles of the same type, with side-to-side adjacency defining the edges. The incoming edges to a rock all cost that rock's removal cost, and the outgoing edges are free. Then calculate the shortest path from town to town.

Handling 4 town pairs is tricky, because you have to consider the case where the cost of a given rock is shared among more than 1 road. Looking at the solution write-up, the suggested approach is a DP over location and a mask indicating the towns you are trying to join through this location and the towns which are already joined. For each mask and position, you consider each possible way of breaking the mask into 2 submasks. The cost for that combination is the sum of the 2 submask costs, minus the cost of getting to this node if both submasks have one or more lone towns out of the town pairs. The reason for the subtraction is that if both submasks have a lone town, then both have already incurred the cost of destroying this rock to get here, so we need to refund it once.

So far so good; however, we have not yet handled the cost between nodes. For a given mask and location, we need to minimise its cost by considering that we could come from another location with the same mask. If the mask has only complete pairs, it is already fully connected, so no road is being built and we minimise freely across everything; this lets us combine any road with another when we build the mask up to a higher bit count. If it does have a lone town, we have to pay the shortest-path cost between the two vertices (if it has multiple lone towns, we only have to pay the cost once!). This second case requires n^2 steps to ensure minimisation is complete. (The former can be performed in n steps, but it isn’t a significant performance boost.)

The final problem is that n^2 is rather expensive, considering we have to do it for most of the up-to-256 masks, and a particularly nasty arrangement of rocks can result in ~800 nodes in the graph. However, excluding the empty-space nodes, there can be at most 70 nodes (8 towns and 62 rocks). Every empty-space node can be eliminated by connecting its incoming edges to its outgoing edges; since its incoming edges always have 0 cost, you basically just throw the node away. The number of edges increases significantly in some scenarios, but all-pairs shortest path still easily runs in time with the smaller vertex count, and the overall algorithm benefits hugely from it.

An interesting algorithm, certainly beyond my abilities to create on the fly.

TC SRM 471

This SRM was cancelled due to some issues, but the questions are still there, so here I go…

Q1) Determine the largest prime less than or equal to n which, when divided by 2 (rounding down each time) D−1 times, produces D primes in total. Limits: n up to 10 million and D up to 10.

A1) Generating all primes less than or equal to n is easy; the only thing to be careful of is that if your programming language of choice stores bool in 4 bytes, you’ll exceed the memory limit – so use bytes or a bit array if needed. Once the primes are acquired, start from n and keep walking down; when you find a prime, check the dividing sequence, and if it works, return the prime you started with. If you get below 2^D, break and return −1.
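A sketch of that, using a `bytearray` sieve to keep memory at roughly one byte per number (function name mine):

```python
def largest_chain_prime(n, d):
    """Largest prime p <= n such that p, p//2, p//4, ... yields d primes."""
    # Sieve of Eratosthenes in a bytearray: ~n bytes, not 4n.
    is_prime = bytearray([1]) * (n + 1)
    is_prime[0:2] = b'\x00\x00'
    for i in range(2, int(n ** 0.5) + 1):
        if is_prime[i]:
            is_prime[i * i :: i] = bytearray((n - i * i) // i + 1)
    # Walk down from n; anything below 2^d cannot survive d-1 halvings.
    for p in range(n, (1 << d) - 1, -1):
        x, ok = p, True
        for _ in range(d):
            if not is_prime[x]:
                ok = False
                break
            x //= 2
        if ok:
            return p
    return -1
```

For example, with n = 100 and d = 2 the answer is 83 (83 and 41 are both prime; 97 fails because 48 is not).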

Q2) Given a directed graph whose edges each have a positive length if they exist (the reverse edge need not have the same length as its matching forward edge), determine the shortest path (if one exists) from vertex 0 to vertex n−1 such that no sum of consecutive edges is 0 mod 13. Limits: n up to 25.

A2) This is a DP over the graph: the shortest path to reach vertex i, where the residues mod 13 of the sums of each suffix of the edges taken so far are represented as a bitset. From each state you generate more by considering each edge out of vertex i which does not itself have length 0 mod 13. The new bitset for the destination is the current bitset rotated by the edge length mod 13, unioned with the bit for the edge length mod 13. If the rotation causes the 0th bit to be set, do not add the new state. If we do add it, we take the minimum of the current total length plus the edge length and any existing path found to the new vertex with the new bitset.

Because bit 0 is not set before the rotation, and after the rotation that unset bit occupies the position of the new edge length mod 13, every step adds exactly one new bit. This means the depth of the search is at most 13, and that the appropriate DP order is by number of set bits. By using the DP, we ensure that rather than a worst case of ~24^13 it is at most 2^13·24, which is much more reasonable. Once the DP is generated, we walk all bitsets for the destination vertex and choose the smallest.
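The rotate-and-union step on the 13-bit mask might look like this (a sketch; bit r of the mask means "some suffix of the current path sums to r mod 13"):

```python
def step(mask, edge_len):
    """Advance the suffix-sum residue mask along an edge.

    Returns the new mask, or None if some suffix of the extended path
    would sum to 0 mod 13 (i.e. the move is forbidden).
    """
    r = edge_len % 13
    # every existing suffix sum grows by r: rotate the 13-bit mask left by r
    rotated = ((mask << r) | (mask >> (13 - r))) & 0x1FFF
    new_mask = rotated | (1 << r)  # the new edge alone is also a suffix
    return None if new_mask & 1 else new_mask
```

Starting from an empty mask, an edge of length 1 gives residue set {1}; another edge of length 1 gives {1, 2}; and a residue of 12 followed by an edge of length 1 wraps into bit 0 and is rejected.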

As an aside, since all cases where the 0th bit is set are discarded, you could alternatively use a 12-bit bitset, but you incur extra shifts since the rotate operation still needs to occur on a 13-bit representation.

Q3) Given up to 50 line segments which have integer dx/dy/dz, what is the maximum distance (squared, so the answer is an integer) between the start and end that you can achieve by joining them end to end such that the created path does not self-intersect?

A3) I managed to get some ideas for this but wasn’t sure if I was on the right track, so I took a look at some solutions. Probably the most important bit, which I did get, was that the self-intersection restriction is irrelevant: any self-intersecting arrangement can be made non-self-intersecting, and at the same time larger, by fairly simple rearrangement of the connections. The other important point is that the order of the edges is irrelevant; it only matters which way around you choose to connect each one onto the current line. (Still, with up to 2^50 combinations, you don’t want to try them all…) Most of the solutions I saw were along the lines of ‘choose a random direction, and orient each edge so that it has a positive or 0 dot product with that direction’, repeated while the clock says less than 1.8 seconds have passed, hoping the best one found is the winner. The solution in my head before I started looking was similar, only rather than random directions it tried each edge direction; I gather this would not have been sufficient, as some other solutions did that but then threw in extra random trials, or tried directions perpendicular to pairs of edges and added some random trials as well. I found 2 solutions which did not use a random number generator, but I haven’t actually copy-pasted them and run them through the system tests to ensure they are right…

Option 1: first, join all parallel edges into a single edge with components equal to the sums of theirs. Then, for every pair of edges, calculate the cross product, and use dot products with it to orient each edge. However, some edges will have a 0 dot product. Maximise these in 4 secondary directions (again using dot products): the positive and negative cross products of the original cross product with each of the first two edges. One of these maximisation attempts will be largest; return it.

Option 2: consider each pair of edges and calculate the cross product. Perform the straight dot-product optimisation as before, but don’t bother optimising the 0-dot-product cases. Instead, also try ‘jittering’ each endpoint ±1 in each direction to create 26 variants, and repeat, considering each pair of edges including the jittered versions. The jittering tilts the resulting cross products, sort of simulating the different secondary optimisations applied to the 0-dot-product cases. Seems kind of dodgy to me, but then so does option 1.

The only bit that I didn’t really get was why cross products are used to choose the direction of primary maximisation. All I can see for sure is that they result in a lot more directions being tried; you would think trying each pairwise sum would achieve the same goal. Not sure on this. With the SRM cancelled, I guess there probably isn’t an official write-up for me to check, either. Note that if every line segment were in the same plane, the cross-product approach alone would be useless, since it only produces 1 direction and every edge has a 0 dot product with it.

TC SRM 472

Q1) 2 people take turns removing powers of 4 from a number which starts at n, and each player tries to be the last person to remove a number; determine who succeeds. n is up to 1 billion.

A1) This problem seems to be one of ‘solve by inspection’. Simulate the first few thousand values of n using a DP and you’ll quickly see that the results form a regular pattern: 2 out of every 5 return player 2, the rest player 1. Mathematical induction can be used to prove that the pattern continues, by noticing that all powers of 4 are either 1 or 4 mod 5. In the end the code boils down to: k = n mod 5; return (k == 2 || k == 0) ? “player 2” : “player 1”;
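Both halves of that – the closed-form answer and the brute-force DP used to spot it – are short enough to sketch side by side:

```python
def winner(n):
    # the pattern found by inspection: n mod 5 in {0, 2} loses for player 1
    return "player 2" if n % 5 in (0, 2) else "player 1"

def winner_brute(n):
    """Direct DP: a position wins if some power-of-4 move leaves a loss."""
    wins = [False] * (n + 1)  # wins[0] = False: no move available, you lose
    for m in range(1, n + 1):
        p = 1
        while p <= m:
            if not wins[m - p]:
                wins[m] = True
                break
            p *= 4
    return "player 1" if wins[n] else "player 2"
```

Checking the two against each other over a few thousand values is exactly the ‘solve by inspection’ step described above.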

Q2) Given n cards which each have a number on the front and back, such that the numbers 1 to n each occur exactly once on the fronts and exactly once on the backs, determine the number of different lists of n numbers which can be formed by arranging the cards in any order with either face up. n is up to 50, and the answer should be returned modulo 10^9 + 7.

A2) This seemed a tricky question, and inspection of the round statistics showed I was not alone in finding it hard. The most important trick is to realise that if each card is considered as a link between the number on its front and the number on its back, and occurrences of the same number are linked together, the full set of cards divides into cycles which are each independent of the others. If a cycle consists of a single card, it has no options, only possible locations in the final list. If it is longer, it has both a number of options dependent on the cycle length and a number of ways it can be interspersed into the overall list. The answer becomes the product, over cycles, of the number of ways that cycle's positions can be chosen from n reduced by the lengths of cycles already considered, times the number of ways a pattern can be made within the cycle itself.

The number of ways a pattern can be made in a cycle can be calculated several ways; DP is one option, although it seems a little tricky. An analytical approach is easier. Each number on the front of the cards in a cycle can be present 0, 1 or 2 times. Each 2 implies a 0, so it helps to consider each possible count of 2s separately. The tricky bit is the number of ways to select which numbers are used when there are k doubles; once you have that, you can boost it to the full answer by multiplying by the number of orderings, which is n!/2^k when there are k 2s. While each 2 implies a 0, it doesn’t say exactly where that 0 is, other than somewhere before the next 2… So we reserve 2k spots for the 2s and 0s, and then they must alternate, which gives 2 options. So mathematically it is C(n, 2k)·2 – except in the case k=0 there is only 1 option, so we must drop the factor of 2, giving C(n, 2k)·(k==0 ? 1 : 2).

And those are all the components; all that remains is to sum things up, multiply it all out, and perform a lot of modulo operations everywhere. (Pre-calculate the choose function using DP as usual.)
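The ‘choose function via DP’ is the usual Pascal's-triangle table under the modulus:

```python
MOD = 10**9 + 7

def binomials(n):
    """C[i][j] = (i choose j) mod MOD, built via Pascal's rule
    C(i, j) = C(i-1, j-1) + C(i-1, j)."""
    C = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        C[i][0] = 1
        for j in range(1, i + 1):
            C[i][j] = (C[i - 1][j - 1] + C[i - 1][j]) % MOD
    return C
```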

Q3) Consider an h × w rectangle of tiles, where each tile is coloured with one of 4 colours. Determine the number of ways to make every tile a different colour from its (up to 8) neighbours while changing at most k tiles. Limits: h and w up to 50, k up to h·w.

A3) After problem 2, I’m surprised there were even 12 correct solutions to this one. First up, there are a couple of special cases: if the height or width is 1, simple dynamic programming provides the solution. In fact, the rest of the approach requires height and width greater than 2, so that case has to be handled separately first.

For height and width greater than 2, consider each possible set of 4 colours for the top-left 2×2 corner. Playing around with the possibilities which satisfy the different-colour-to-neighbours requirement, you discover that every second row is identical, or every second column is identical, or both. So assume the rows alternate: each column after the first 2 has 2 choices of pattern. Then assume the columns alternate: each row after the first 2 has 2 choices of pattern. This gives us 2 DPs (very similar – in fact you can do a diagonal flip of the input and use the same DP for both). In each DP you have the number of ways to make the first n rows (columns) work using j changes. Then sum all values of j from 0 to k for the all-rows case, and presto, you have the number of ways for column (row) alternation for a given 2×2 top-left starting point. Sum the column and row cases together and… wait a second. As mentioned earlier, there is a case where both rows and columns alternate. For a given 2×2 starting point there is only one such case, but if it can be achieved in k changes or fewer, we’ve counted it twice – so check that quickly and subtract it off. Sum over all 2×2 starting points and presto, you are done. Not an easy problem, but I almost wonder if it was actually easier than question 2.

TC SRM 473

Q1) Given a sequence of step, turn left and turn right commands, which are repeated infinitely, determine whether you will stay in a constrained (but not specified) area, or drift off into the distance.  Maximum command sequence length is 50.

A1) There are 4 scenarios.  1. You come back to where you started. 2. You end up facing the opposite way you started, and so the second run you end up back where you started. 3. You end up facing 90 degrees to where you started, in which case after 4 runs you trace a ‘square’ and end up where you started. 4. You end up facing the same direction as where you started, but not where you started, in this case you drift away.  In any case, simulate 4 runs and if you are back where you started, you don’t drift.  Case 1 will hit the same place 4 times, case 2 will come back twice, case 3 will come back once, but all 3 will be at the starting position after 4 runs.
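A minimal sketch of that simulation in Python (the ‘S’/‘L’/‘R’ command encoding is my assumption, not the actual problem’s input format): run the sequence 4 times and check whether you are back at the origin.

```python
def stays_bounded(commands):
    """Simulate four repetitions of the command string; the walker stays
    bounded iff it is back at the origin after the four full runs."""
    x, y = 0, 0
    dx, dy = 0, 1            # start facing 'north'
    for _ in range(4):
        for c in commands:
            if c == 'S':
                x, y = x + dx, y + dy
            elif c == 'L':
                dx, dy = -dy, dx   # rotate direction 90 degrees left
            else:                  # 'R'
                dx, dy = dy, -dx   # rotate direction 90 degrees right
    return (x, y) == (0, 0)
```

For example, “step then turn right” traces out a square over 4 runs and is bounded, while a lone “step” drifts north forever.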

Q2) Given n points equally distributed around the edge of a circle, and a ‘pseudo randomly’ generated subset of size k of those points which are marked, determine how many right angled triangles can be created using only the marked points. n is up to 1 million but k is limited to 100 thousand.  Also the input is guaranteed not to result in an answer greater than 2^64.

A2) Seems kind of tough for a second question at first glance.  First we need to identify how to create right triangles.  If n is even this is easy: every point corresponds to a mirror point directly across the circle, and any such mirror pair forms a diameter which can serve as a hypotenuse.  If n is odd, no right triangles can be created… (A right angled triangle has its hypotenuse as a diameter of its circumscribed circle, and since 2 different circles can only have at most 2 crossing points, not 3, that circle must be the given one – and with n odd, no 2 of the points are diametrically opposite.) However, even with O(1) dictionary lookup, considering every pair of marked points to determine whether either of the matching mirror locations is marked is way too much.  So we need a solution which comes from considering the points one at a time.  The trick is in the logic we used to get rid of all the cases where n is odd.  The hypotenuse must be a diameter, so from each marked point we can determine immediately the location across the diameter and verify that it is a marked point.  If it is a marked point we then have a second trick: every single marked point other than the 2 consumed will make a valid right triangle with that diameter.  So for every marked point in one half of the circle, if the matching point exists, add k-2 to the sum.  Not so hard after all… (Although implementing the pseudo randomness itself had risks…)
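The counting loop can be sketched in Python like this (function name and input representation are mine):

```python
def count_right_triangles(n, marked):
    """n points on a circle, marked is a set of indices.  Each marked
    diametric pair forms a hypotenuse, and any other marked point
    completes a right triangle with it.  Restricting p to one half of
    the circle counts each diameter exactly once."""
    if n % 2 == 1:
        return 0              # no diameters through 2 points exist
    k = len(marked)
    half = n // 2
    total = 0
    for p in marked:
        if p < half and (p + half) in marked:
            total += k - 2    # every other marked point completes a triangle
    return total
```

As a sanity check, 4 marked points on a square give 4 right triangles: 2 diameters, each completed by either of the 2 remaining points.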

Q3) Given an n x m board and k varieties of pieces, each with its own count, determine how many ways exist to lay out the pieces such that every row and column has at most 1 variety. n and m have a maximum of 30 and k has a maximum of 10. Return result modulo 1 billion and 9.

A3) Hmm, as usual Q3 seems beyond me. Even just trying to generate canonical arrangements which can then be boosted to the actual answer by applying combinatorial multipliers seems difficult.  After thinking about it for a while I gave up.   Turns out the canonical arrangement approach is no good; you need to boost as you go along.

From a solution.  Define a dynamic program over the number of ways to populate the board using i rows and j columns to contain the first l varieties.  This is equal to the number of ways to populate the board with l-1 varieties in i-di rows and j-dj columns, times (n-i+di) choose di (choosing which of the unused rows the new variety occupies), times (m-j+dj) choose dj, times the number of ways to arrange the count of variety l in di rows and dj columns.  That second arrangement count can be done using dynamic programming up front, or memoized, since it will be repeated a lot.  Firstly the count of variety l has to fit into the area di*dj, and must be greater than or equal to both di and dj, in order for there to be an arrangement. I would have thought x*y choose c would have been the answer at this point, but I would have been wrong, as it doesn’t exclude the cases where a row or column is left entirely empty. So we subtract off the number of ways to arrange c into each possible scenario of fewer rows or columns, each multiplied by the number of ways to choose those rows and columns out of the full x and y sizes.
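That inner arrangement count can be sketched with a small memoized inclusion-exclusion in Python (names are mine; `exact(x, y, c)` is the number of placements of c identical pieces on an x-by-y grid using every row and every column):

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def exact(x, y, c):
    """Placements of c identical pieces on an x-by-y grid such that every
    one of the x rows and y columns holds at least one piece."""
    if c > x * y or c < max(x, y):
        return 0
    total = comb(x * y, c)
    # subtract placements that actually occupy fewer rows or columns,
    # weighted by which rows/columns they occupy
    for i in range(1, x + 1):
        for j in range(1, y + 1):
            if (i, j) != (x, y):
                total -= comb(x, i) * comb(y, j) * exact(i, j, c)
    return total
```

For instance, 2 pieces on a 2×2 grid touching both rows and both columns: only the 2 diagonals qualify. (The real solution would of course carry the modulus through.)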

Not as bad as I first thought, but still pretty tricky.

TC SRM 474

Q1) Given a sequence of unit steps in N dimensional space, each in a specific direction along a specific dimension, determine if the path ever self intersects.  N is up to 1 billion and there are up to 50 steps.

A1) As I was reading this question I was thinking, this is kind of trivial, even for question 1.  Then I saw the 1 billion.  However, that only shocked me for a moment, since 50 steps means only at most 50 dimensions are actually moved along…

So anyway, maintain a dictionary of locations.  First walk the input to see which dimensions are actually used, then map those to the first k integers.  Then a location is simply an array of length k, and array equality can be used for the dictionary. (Or don’t bother with the dictionary – there are only 50 entries…)  Perform the steps; if any location equals any previous location return invalid, otherwise return valid.
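A sketch in Python (the step encoding as (dimension, delta) pairs is my assumption):

```python
def path_self_intersects(steps):
    """steps: list of (dimension, delta) with delta = +1 or -1.
    Only dimensions that actually appear matter, so remap them to
    0..k-1, then walk, recording every visited location."""
    dims = {d: i for i, d in enumerate(sorted({d for d, _ in steps}))}
    pos = [0] * len(dims)
    seen = {tuple(pos)}       # the origin counts as visited
    for d, delta in steps:
        pos[dims[d]] += delta
        t = tuple(pos)
        if t in seen:
            return True
        seen.add(t)
    return False
```

Tuples make the locations hashable, so a plain set does the dictionary’s job.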

Q2) How many ways can a connected graph (with edge lengths) be reduced to a tree such that the shortest distance from every node to node 0 is the same for both the graph and the tree? Graph vertex count is up to 50.  Return result modulo 1 billion and 7.

A2) So first step: calculate the shortest path from all nodes to node 0.  Choose your favourite shortest path algorithm… Then we sort the nodes from closest to furthest.  Then we consider each node in that order, creating a running product (with modulo after each multiplication, and using long long numbers to avoid overflow) of the number of options each new node has to connect in to the previous nodes in a way which equals the previously determined shortest path.  Presto, done.
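A sketch of that in Python, using Dijkstra for the shortest paths (names are mine):

```python
import heapq

def count_shortest_path_trees(n, edges, mod=10**9 + 7):
    """edges: list of undirected (u, v, w).  Count spanning trees that
    preserve every node's shortest distance to node 0: the product over
    nodes of the number of edges that can serve as that node's parent."""
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    dist = [float('inf')] * n
    dist[0] = 0
    pq = [(0, 0)]                      # Dijkstra from node 0
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))
    result = 1
    for v in range(1, n):
        # edges (u, v) with dist[u] + w == dist[v] are valid parents of v
        options = sum(1 for u, w in adj[v] if dist[u] + w == dist[v])
        result = result * options % mod
    return result
```

For a 4-node diamond (0-1, 0-2, 1-3, 2-3, all weight 1), node 3 has 2 valid parents and everything else has 1, giving 2 trees.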

Q3) Given a tree with n nodes, and a connected graph also with n nodes, how many ways can the tree be overlaid on the connected graph such that every edge in the tree matches an edge in the graph?  n is at most 14. Return result modulo 1 billion and 7.

A3) I swear I’ve seen this problem before… no idea where though.  I also thought maybe I would be able to solve it, but I eventually gave up and looked at an answer.

Step 1, determine a dfs order for the tree.  Then, considering each possible starting position in the graph for node 0 of the tree, memoize over the dfs options in the graph which match the tree.  The memoization key is the dfs position (a source-destination pair, but since it’s a tree this can be reduced to an edge, or null for leaves – the answer for a properly matched leaf is always 1), the set of nodes in the graph unused, and the node in the graph which the source is being matched to.  At each recursion, unless it is a leaf position (in which case the answer is 1 as long as there are no available graph nodes), sum over each partition of the available graph nodes where one of those partitions is connected to the node currently being matched to the source. Each element of the sum is the product of the number of ways the first partition works with a given mapping for the destination, times how many ways the rest of the partition works for the next edge of the dfs from the currently matched node.

The above on its own is going to be slow – too slow.  The inner loop for the top level recursion is 14*2^14, and the number of potential positions where the loop has to be run is also quite high.  A major speed boost is to only consider partitions which have exactly the right number of nodes to satisfy the dfs branch being considered.  The number of nodes down a given dfs branch can be precalculated, and a popcount lookup table for every bitset will make the comparison fast.  Alternatively, rather than walking every subset of the bitset and checking its popcount, the patterned combination walk bit tricks function can be used.  This reduces the inner loop to a worst case of 14*Choose(14,7), but far more importantly it makes use of the memoization much sparser rather than quite dense.
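That patterned combination walk is what’s usually called Gosper’s hack: given one bitmask of popcount k, a handful of bit operations produce the next-largest mask with the same popcount.  A Python sketch (names are mine):

```python
def subsets_of_size(n, k):
    """Enumerate all k-element subsets of an n-bit universe as bitmasks,
    in increasing numeric order, via Gosper's hack."""
    if k == 0:
        yield 0
        return
    x = (1 << k) - 1              # smallest mask with k bits set
    while x < (1 << n):
        yield x
        c = x & -x                # lowest set bit
        r = x + c                 # carry ripples over the low block of ones
        x = (((r ^ x) >> 2) // c) | r   # redistribute the displaced ones
    
```

For n=14 this walks only the Choose(14, k) masks of the right size instead of all 2^14 subsets.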

Hopefully next time I see this question I can remember all of that…

TC SRM 475

Another past SRM.

Q1) Given a list of n spots, where each spot is one of 3 types, and k markers which are initially randomly placed such that no 2 markers are on the same spot, follow the rules where markers in the leftmost spot are moved right, markers in the 2 rightmost spots are moved left, and otherwise markers on type 0 are moved left, markers on type 1 move right, and markers on type 2 move left on the first move but otherwise move back to where they came from in the previous turn.  At the end of each turn the rightmost spot is removed from play, and any spot with 2 markers is cleared.  Determine the ‘expectation value’ for the number of markers left when there are only 2 spots left. n is up to 17, as is k.

A1) This is a question of details.  The important realisation is that the expectation value can be determined by performing every possible starting scenario.  The worst case is n of 17 and k of 9, which has 24 thousand possible starting points.  Each simulation is O(n^2).  After that it is a matter of implementing the simulation correctly.  Type 2 moves would be tricky, but the requirement to remove all cases with 2 markers on the same spot means a simple bool array ‘camefromright’ (initialised to false) is sufficient to make that work.

Q2) You start with 1 pair of an animal which is 1 year old. At age 2 the animals can breed to produce a new pair which will be age 0.  All pairs which are age 2 or higher will breed.  In some years after breeding half of the entire population of pairs will ‘leave’ never to be considered in the problem again.  If the number of pairs in the population is odd, the number of pairs which leave is half rounding upwards.  The specific years are provided in a list which has maximum length 50.  How many pairs of animals will there be after breeding (and possibly leaving) for n years?  n is up to 10 million so the numbers may be very large and the results should be returned modulo 1 billion and 9.

A2) This is the Fibonacci sequence with a twist.  The leaving years are the tricky part.  With a maximum of 10 million, you don’t even have to implement accelerated Fibonacci using matrix powering.  Leaving years aren’t too tricky – they remove all of the old population and half of the 1 year olds, leaving all the babies.  The halving can be done modulo 1 billion and 9 by multiplying by 500 million and 5, the modular inverse of 2.  The problem is the case where the number of pairs is odd.  Because of the modulo 1 billion and 9 (which is odd) you don’t know which numbers are odd or even once they overflow.  So the real trick to the problem is knowing when something is odd.  The first thought is to perform the same set of actions modulo 2 as well.  However there is no inverse of a number which is not co-prime with the modulus, so when we divide by 2, our modulo-2 track is broken.

So it would seem we are at an impasse.  I certainly would have been.  However there is a trick (which I learnt from reading solutions :P), if you use modulo 4, you can perform a literal divide by 2 and only the high bit of the modulo will be broken. The low bit will still be accurate.  With up to 50 leaving years, modulo 2^51 would be sufficient to allow us 50 divide by 2’s and still have an accurate low bit.  Nice trick.

You can take that trick one step further: in programming languages where integer overflow is defined to wrap (which is not necessarily true in C++), addition and subtraction are implicitly modulo 2^(the number of bits in the number), so you can skip the explicit modulo entirely and just use 64-bit numbers for working out whether the numbers are even or odd.
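A toy demonstration of the parity trick in Python (not the SRM’s actual recurrence – just ‘triple’ and floor-halve steps, with names of my choosing), keeping a 64-bit wrapping shadow alongside the modular value and checking it against exact bignum arithmetic:

```python
M = 10**9 + 9
INV2 = (M + 1) // 2          # 500000005, the modular inverse of 2
MASK = (1 << 64) - 1         # 64-bit wrap-around shadow

def demo(ops, start=1):
    """Apply 'triple' / 'halve' steps.  'halve' keeps floor(x/2).  The
    shadow's low bit stays a correct parity through ~60 halvings, which
    is what lets us subtract the odd bit before the modular halving."""
    exact = start            # ground truth via Python bignums
    mod = start % M
    low = start & MASK
    for op in ops:
        if op == 'triple':
            exact *= 3
            mod = mod * 3 % M
            low = low * 3 & MASK
        else:                # 'halve'
            odd = low & 1    # parity read off the shadow, not off mod M
            exact //= 2
            mod = (mod - odd) % M * INV2 % M
            low = ((low - odd) & MASK) >> 1   # literal halve; top bit lost
    assert exact % M == mod  # sanity check against the ground truth
    return mod
```

Each literal halve sacrifices one high bit of the shadow, so 64 bits comfortably cover the 50 leaving years.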

Q3) There are up to 50 people, who answer up to 50 questions.  Each question has a point value, and some questions have been marked and others have not.  For questions which have been marked we know whether each person is correct or not, otherwise we only know if they answered the question or not.  We are going to select k people from the top n scores once we finish marking.  Equal scores are broken by order of person in the input.  Looking forward from this intermediate stage, how many different possible groups of k people are there?

A3) I didn’t get far on this question before giving up and reading solutions.  My favourite approach starts off by caching the choose function for up to 50×50.  Then rephrase the problem as ‘how many different groups of k people are there where person i has the lowest maximum possible score’, and sum over each possible i.  These sub-problems can be resolved by considering that people with a higher minimum score than that lowest maximum possible score ‘must’ be in the top n scores, and people with lower maximum possible scores cannot be (due to the requirement that i is the lowest maximum possible score); the rest are entirely optional – we can toggle them in and out in any combination.  Then, so long as we don’t have more than n-1 ‘must be in’ people, and we don’t have fewer than n-1 musts and optionals combined, we can consider each partition between musts and optionals to make up the group of k, up until we choose so many optionals that the top-n criterion would be exceeded.  This is the sum of the product of 2 choose functions: 1 for choosing from the optionals, and the other for choosing from the ‘musts’.

In case the above is not convincing enough, a few pointers to make it clearer why it works.  Firstly ‘i’ being the lowest maximum possible score ensures there is no overlap between considered scenarios.  Then the fact that optionals can be independently toggled in and out means we aren’t counting cases which are impossible in the simple product of choose functions.

TC SRM 476

So I didn’t go in this single round match, but as part of my lead up to my being knocked out in round 4 I am going to write up 1 or maybe 2 past SRM’s a day as a bit of practice.

Q1) Given fixed and variable costs for each element, where the variable cost is proportional to the number of elements selected, what is the maximum number of elements which can be selected with a limit on the maximum total cost? Up to 50 elements to select from.

A1) I was a bit disappointed at the time it took me to work out the solution to this; it was at least a minute… With only up to 50 elements to select from there are only a maximum of 50 answers, so you can check whether each answer is possible.  You could even binary search to find the switch point between possible and not-possible if you wanted extra efficiency… which is completely unnecessary.  To check whether an answer is possible, calculate the total cost for each element assuming that there are n elements selected, and sum the smallest n costs to check whether that is less than the maximum cost.
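A sketch of that check in Python (names are mine):

```python
def max_selectable(fixed, variable, budget):
    """For a candidate count c, element i costs fixed[i] + c * variable[i];
    c is feasible when the c cheapest such costs fit in the budget.
    Return the largest feasible count."""
    best = 0
    for c in range(1, len(fixed) + 1):
        costs = sorted(f + c * v for f, v in zip(fixed, variable))
        if sum(costs[:c]) <= budget:
            best = c
    return best
```

At 50 elements the O(n^2 log n) linear scan is instant, which is why the binary search really is unnecessary.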

Q2) Given a directed graph connecting n nodes, what is the probability you can visit all nodes connected from node 0, starting at node 0, if when you arrive at any given node only a random subset of size k of the available exiting edges is available for selection, and you only visit nodes which are connected to from node 0 and never visit a node twice? Assume you know the entire graph, and when presented with multiple options you select the one which maximises your chance of success.  n is up to 36, k is up to n, and the textual space-separated list of outgoing destinations for each node is up to 36 characters long.

A2) I had to look up the answer to this because I was stumped.  However it turned out that the solution was one that I had considered and thrown away, because I mistakenly read the condition for the list of destinations from each node as being up to 36 entries, not up to 36 characters.  This was the killer point, because 36 characters can only hold at most 15 distinct space-separated numbers when 1 is the minimum number.  With only at most 16 nodes of interest (those 15 plus node 0), 2^15 * 16 is an easily manageable state space, and another factor of 15 on top of that is an easily manageable running time.  What I am describing is a memoized recursion calculating the probability of success, given having visited some set of the required nodes and currently being on a specific node.  Using the probability returned from the recursion for each possible destination choice, you choose the highest probability.  However this has to be weighted by the chance of being able to select that choice; add in the second highest probability weighted by the chance of not being able to select the first choice, the third by the chance of neither of the first 2, etc., until all choices are accounted for or the probability of none of the top choices being available is 0.  Chances can be calculated using basic combinatorics… and assuming some caching up front, should be O(1).
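That combinatorial weighting can be written down directly.  A Python sketch (names are mine) of one node’s expected success probability, given the recursion’s success probabilities sorted best-first and a uniformly random k-subset of the m outgoing edges on offer:

```python
from math import comb

def expected_success(probs, k):
    """probs: success probabilities of the m choices, best first.  Choice
    i is the best one offered exactly when i is in the random k-subset
    and none of 0..i-1 are, which happens for comb(m-1-i, k-1) of the
    comb(m, k) equally likely subsets."""
    m = len(probs)
    total_subsets = comb(m, k)
    return sum(comb(m - 1 - i, k - 1) * p
               for i, p in enumerate(probs)) / total_subsets
```

As a check: with 2 edges, k=1, and success probabilities 1 and 0, each edge is offered half the time, so the answer is 0.5.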

Q3) Given a bi-directed graph (each directed edge always has its opposite edge present) with at most 1 cycle (when considered as an undirected graph), where each edge has a non-zero capacity, determine the minimum total capacity increase to ensure that a capacity-n flow exists to node 0, regardless of the arrangement of sources (for a total of n flow) feeding in to nodes other than 0. Maximum number of vertexes is 50, n is up to 100000. Maximum number of edges is 100 (50 pairs).

A3) I actually think I stood a better chance with this question than question 2.  I am pretty certain that the arrangement of sources can be ignored; we just have to ensure the graph can handle a flow of n from any single node to node 0.  At that point I think I would implement a maximum flow algorithm, then find ‘least number of constrictors’ paths and augment them until the number of constrictors increases, then re-find.  Of course you can bail out as soon as you reach your target.  A least-number-of-constrictors path is like a shortest path search where edges with spare capacity have weight 0 and those without have weight 1.  The low number of edges makes this fast.  Also, since there is at most 1 cycle there are at most 2 paths between any 2 nodes, so implementing a proper maximum flow algorithm is overkill.  The 2 paths may have common sections, but other than that limitation, the maximum flow is the sum of the maximum flows for the 2 arcs of the cycle.  Even the least-constrictor search can probably be optimised due to the limit of a single cycle.  You can even reuse the modified graph as you move on to each other potential starting node; many of them will then pass the maximum flow test straight away.  I think the running time is a worst case of about O(V * E^2) in an ideal implementation.