{"id":536,"date":"2012-05-06T01:38:54","date_gmt":"2012-05-06T01:38:54","guid":{"rendered":"http:\/\/www.themissingdocs.net\/wordpress\/?p=536"},"modified":"2012-05-06T01:38:54","modified_gmt":"2012-05-06T01:38:54","slug":"gcj12-r1b","status":"publish","type":"post","link":"https:\/\/www.themissingdocs.net\/?p=536","title":{"rendered":"GCJ12 R1B"},"content":{"rendered":"<p>So, I didn&#8217;t have to do this round, but looking at the scores it was very competitive.\u00a0 Probably harder to advance than R1A.\u00a0 Looking at the problems I think there were all theoretically in my grasp, except for Q3 large input, I might have missed the key insight, and probably wouldn&#8217;t have been game to try a solution based on random numbers.<\/p>\n<p><strong>Q1) If each item receives 2 scores &#8211; 1 fixed, and 1 a variable fraction of the total of the fixed scores, what is the minimum fraction for each item to guarantee that regardless of how the other items receive their fractional scores, that item will not have the distinctly lowest total.<\/strong><\/p>\n<p>A1)\u00a0 I see a couple of options here.\u00a0 But first some analysis of the question.<\/p>\n<p>If we give a specific fraction to an item, what is the worst case competition for it?\u00a0 Giving points to everything lower than it until they all equal the new value, and still having a non-zero fraction left. (Since that non-zero fraction can be divided amongst all equals to push them above.)\u00a0 So, if we have the items sorted by their fixed values, we can go from smallest to largest, skipping the current item under consideration, subtracting from the available percentage as needed until we get to an item which is above the target sum.\u00a0 This is O(N).<\/p>\n<p>The above suggests we can use a floating point number binary search to find the threshold, giving about a 20*O(N^2) to get sufficient accuracy for all of them. This would be fast enough given the maximum constraint of N=200.<\/p>\n<p>A more direct approach solve numerically.\u00a0 In between each step from one distinct fixed score to the next, the problem is a linear equation.\u00a0 If the answer for the linear equation lies outside the range, you move to the next step.\u00a0 Again this is trivially O(N), for each item, making O(N^2).\u00a0 In fact we can do a binary search on these segments, giving O(log N) for a total of O(N log N) for that component, but only if you can change the segments from one scenario to the next in O(log N) time or better as simply constructing the segments is O(N).\u00a0 If you construct the segments for every item, then the segments for each individual scenario is either the same with the counts above a given point reduced by 1 or two adjacent segments are joined.\u00a0 This later scenario can actually be ignored and treated like the first, the two linear equations\u00a0 will be the same, but testing them both is no significant loss.\u00a0 It is easy to embed the former logic into the linear equation, so we can indeed reuse the same segment list, giving us our O(N log N) running time.<\/p>\n<p><strong>Q2) Given a rectangular grid of cells where each cell has a floor and ceiling height, and a water level, and a requirement that to get from one cell to the next you can&#8217;t pass through an edge without at least a 50cm gap (either current ceiling\/destination floor\/water or current floor\/water\/destination ceiling or destination ceiling\/floor) you want to get to the south east corner of the grid as fast as possible once the water starts dropping.\u00a0 Every second at some future point the water level will drop 10cm.\u00a0 It takes 1 second to move from cell to cell if there is 20cm of water above the ground in the current cell, otherwise it takes 10 seconds.\u00a0 Starting from wherever is currently reachable from the north west corner of the grid, what is this shortest path? (No diagonal movement.)<\/strong><\/p>\n<p>A2) Ahh, interesting problem.\u00a0 Especially since the large input makes the grid up to 100&#215;100 in size.\u00a0 Suggests a shortest path algorithm, but the edge cost changes as a function of arrival time.\u00a0 Lets do some more thinking.<\/p>\n<p>Firstly, it should be apparent that we never go back the way we came once we start.\u00a0 More specifically because the best travel times only ever get worse the later we arrive, there is never any benefit to being somewhere at a later point in time then the earliest arrival.\u00a0 This suggests shortest path will be fine, with some small modifications.\u00a0 Finally there are no negative travel times, so we don&#8217;t have to worry about that complexity when it comes to shortest paths.<\/p>\n<p>We can reach anywhere reachable from the start at time 0, so simply depth first search to determine what is reachable and initialise a grid with 0 at those locations and maxvalue at all others.\u00a0 Add all reachable locations to an updateable priority queue.\u00a0 Now we perform an earliest first search, at each location we determine how long to reach each of its 4 neighbours relative to the earliest arrival time for that cell which will either be wait+1 second or wait + 10seconds, where wait is how long until the water drops low enough to make it passable.\u00a0 Wait could be 0 seconds if already passable, or infinite if it will never be passable.\u00a0 If the new arrival time for the destination is lower than the current value, update the priority queue.<\/p>\n<p>The code to calculate the wait isn&#8217;t exactly complicated, but it does have a number of criterion and would be a likely place to make a mistake.\u00a0 It is also where you have to handle not leaving the edges of the grid.<\/p>\n<p>This gives us the all points earliest arrival time, including the south east cell, which now contains the answer.\u00a0 So what is the running time?\u00a0 Initial search for starting points is O(N*M) or O(V) (V is total cells, or &#8216;vertexes&#8217; in the graph.)\u00a0 From there we run an earliest first search using a priority queue.\u00a0 Lets assume we use the LookupHeap from TMD.Algo, this has O(log N) add\/remove of any element.\u00a0 The number of priority queue updates\/adds\/removes will be at most O(E).\u00a0 E is the number of edges, which has an exclusive upper bound of 4V, effectively making it O(V) priority queue manipulations for a total of O(V log V).\u00a0 Despite V being 10k this is quite reasonable.\u00a0 The other component of running time the determination of wait, is easily O(1) per edge, so it is not significant.<\/p>\n<p><strong>Q3) Given a set of distinct positive integers, determine if there are 2 distinct subsets which have the same sum.<\/strong><\/p>\n<p>A3)\u00a0 Pretty simple problem statement, but the large input has 500 integers each up to 10^12 in size.\u00a0 Small input is more reasonable, 20 integers with 10^5 each.\u00a0 Only up to 10 test cases in both scenarios, so plenty of cpu time available for each test case.<\/p>\n<p>Small input approach.\u00a0 The most(?) obvious brute force is going to be too slow, dividing into each possible set of 2 groups, then determining the intersection of every possible sum of those groups is something like 2^40.\u00a0 However almost as obvious an approach is to break it into 3 groups, if the sum of all elements in group 1 equals all elements in group 2, we&#8217;ve found the solution.\u00a0 Group 3 is ignored.\u00a0 This can have a running time of O(3^N).\u00a0 Which is quite high, but given only 10 test cases, and 4minutes to submit, it is plausible if optimised. Important part of the implementation is to not calculate the 2 sums directly for every scenario, instead, mutate them as the scenarios change, adding and subtracting.\u00a0 This ensures the implementation isn&#8217;t O(N*3^N) which would almost certainly be too slow.<\/p>\n<p>Large input approach&#8230; hrmm.\u00a0 Well caching on the small input implementation gives us O(N*distinct number of sums) which doesn&#8217;t immediately give us anything, but we know the range of the sums which gives us an easy upper bound of O(N*(kN)^2) where k is the maximum value.\u00a0 For the large input this is a considerable improvement over the O(3^N), but still longer than life of universe type processing times expected&#8230;\u00a0 But it does tell us that as we get large inputs, the probability of there being a common sum approaches 1, maybe even is 1 by pigeon hole principle.\u00a0 Maybe this will be useful.<\/p>\n<p>So I looked up an answer to get a view into this question and I found a very important detail which I overlooked.\u00a0 If you have 2 sets which are not equal which have the same sum, you can remove all the elements in common to get 2 new sets which are distinct and have the same sum!\u00a0 Now we just have to find 2 distinct sets with the same sum.\u00a0 This insight also allows the small input to be done in O(2^N) by using O(2^N) space as well as time.\u00a0 It also lets us prove that the probability of common sum is 1.\u00a0 2^500 is greater than 500*10^12, so there are at least 2 different subsets which have the same sum, and hence 2 disjoint non-empty subsets with same sum by removing any common elements.<\/p>\n<p>Here is where the certainty comes in, if we do random guesses, and store the sums in memory until we find a collision we can get a very high likelihood of passing tests just by making enough guesses.\u00a0 Birthday paradox means that despite the potential 10^14 different sums, the probability of collision is quite high with only 10million repetitions, and extremely high at 100million. (And even higher considering the difficulty of constructing a set of numbers which evenly spreads sums across the entire range.)<\/p>\n<p>Another approach I saw in solutions with less random in it&#8230;\u00a0 If we sort the input then break it into sections of length 20, then determine every subset sum, then choose the 2 closest sums we end up with 25 sets of distinct subsets where we can use the sum difference as individual values to calculate every subset sum.\u00a0 If we find a collision then we can construct the subset.\u00a0 The idea here appears to be that we&#8217;ll end up working with a much smaller size numbers by finding the smallest difference in each section, so the probability of 0 sum is increased significantly in the second set of summations.\u00a0 Again this really seems to be a probabilistic approach, but apparently good enough that you don&#8217;t need to randomise the data.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>So, I didn&#8217;t have to do this round, but looking at the scores it was very competitive.\u00a0 Probably harder to advance than R1A.\u00a0 Looking at the problems I think there were all theoretically in my grasp, except for Q3 large input, I might have missed the key insight, and probably wouldn&#8217;t have been game to &hellip; <a href=\"https:\/\/www.themissingdocs.net\/?p=536\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">GCJ12 R1B<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3],"tags":[],"class_list":["post-536","post","type-post","status-publish","format-standard","hentry","category-code-competitions"],"_links":{"self":[{"href":"https:\/\/www.themissingdocs.net\/index.php?rest_route=\/wp\/v2\/posts\/536","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.themissingdocs.net\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.themissingdocs.net\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.themissingdocs.net\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.themissingdocs.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=536"}],"version-history":[{"count":0,"href":"https:\/\/www.themissingdocs.net\/index.php?rest_route=\/wp\/v2\/posts\/536\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.themissingdocs.net\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=536"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.themissingdocs.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=536"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.themissingdocs.net\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=536"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}