{"id":801,"date":"2015-06-13T08:42:21","date_gmt":"2015-06-13T08:42:21","guid":{"rendered":"http:\/\/www.themissingdocs.net\/wordpress\/?p=801"},"modified":"2015-06-13T08:42:21","modified_gmt":"2015-06-13T08:42:21","slug":"gcj15-distributed-practice-round","status":"publish","type":"post","link":"https:\/\/www.themissingdocs.net\/?p=801","title":{"rendered":"GCJ15 Distributed Practice Round"},"content":{"rendered":"<p>So, I think everyone qualified for round 3 was eligible to do the practice round.\u00a0 202 positive scores, 14 perfect scores.\u00a0 I hope everyone who intends to participate in the distributed online round had a go in the practice round, as the distributed format is new and quite different &#8211; I think people who spent a good long time on the practice round will have a big advantage.<\/p>\n<p>Even with 50 hours to think things over, and a testrun feature which lets you check if your solution is too slow on the large input, for test cases of your own design, pass rates for large input were quite low.\u00a0 2 of the quests had less than 50% pass rates.<\/p>\n<p>The top 14 perfect scores was 12 c++ and 2 java. No python.\u00a0 Small sample size, but given the ratio of language use in the group that advanced to round 3, its absence is slightly suspicious.<\/p>\n<p><strong>Q1) Find the largest prefix\/postfix sum for a list of numbers if the prefix\/postfix are not allowed to overlap. 5&#215;10^8 members in the list.<\/strong><\/p>\n<p>A1) On a single machine this problem boils down to what is the largest drop.\u00a0 Once you&#8217;ve found the largest drop (by calculating running sum and keeping track of the previous maxima vs the current value) the answer is the sum of all numbers &#8211; the largest drop (which will be negative, giving you a bigger total).<\/p>\n<p>Splitting this to multiple machines isn&#8217;t too difficult.\u00a0 Give each machine a subsequence of approximately equal length, calculate the minima, the maxima,\u00a0 the &#8216;local&#8217; biggest drop and the total.\u00a0 These can all be calculated in a single pass of O(N) time.\u00a0 Send all these values back to the primary machine and exit.<\/p>\n<p>The global biggest drop is either the largest of the local biggest drops, or it is the difference between one of the maxima&#8217;s and a subsequent minima.\u00a0 First the maxima&#8217;s and minima&#8217;s need to be corrected since they are local, so add the running total of the previous totals to each local minima and maxima.\u00a0 Then calculate the biggest drop, and finally subtract that from the overall total of totals.\u00a0 These steps are O(Nodes) so will easily run in time.\u00a0 Remember to use 64 bit integers everywhere, as the numbers can sum to 5*10^17.<\/p>\n<p><strong>Q2) Calculate who, if anyone, is a majority in a vote. 10^9 voters and potentially 10^9 candidates.<\/strong><\/p>\n<p>A2) So I have 2 approaches for this problem, which have different scaling functions, but are surprisingly similar in run time for the problem size.<\/p>\n<p>The &#8216;cool&#8217; solution is to count the frequency of each of the 32 bits in the candidate vote identifier over all votes.\u00a0 If there is a majority, the set bits of the majority candidate will have over half the vote, and the unset bits will not.\u00a0 So simply send the totals to central server and accumulate and compare to number of votes.\u00a0 This lets you create the bit pattern of the majority, if there is a majority.\u00a0 Send this majority holder back out to everyone, who then counts the frequency of exactly that candidate.\u00a0 Send these new counts to central server, accumulate and check if its a majority.<\/p>\n<p>This solution is linear, but requires two passes.\u00a0 Given how long it takes to call GetVote you will lose 500ms of your time just doing that if you don&#8217;t store the values, and if you do the store the values you should be careful you don&#8217;t run low on memory, but it will save you 250ms&#8230;\u00a0 Each value also has 32 bit extract&#8217;s to perform to potentially increment the 32 counters.<\/p>\n<p>The alternative solution is to sort the segments you assign to each server.\u00a0 This is O(N log N), but with at most 10million entries, the log N is only a factor of 24.\u00a0 You must be sure your sort function is in place, because otherwise you will run out of memory.\u00a0 Having sorted them you take every entry which got more than 20% of the vote (which is easily found now that they are sorted).\u00a0 There are at most 4 of these, send their identifiers to central server.\u00a0 Also send the size of the threshold you used to select them, which is 20% of the number assigned to this node.<\/p>\n<p>The central server now has up to 400 candidates, which it can easily total and sort.\u00a0 Any total which when added to the sum of thresholds sent back (which should be close to 20% of the overall, but might be slightly off due to rounding) gets a majority is a candidate.\u00a0 There is at most 3 of these as they each have to have at least 30% of the vote.\u00a0 Send these back and do a second pass counting totals for each of the 3.\u00a0 Finally the majority holder either appears, or does not.<\/p>\n<p>This solution has the advantage that rather than doing 32 operations per entry to start and 1 more to find the majority candidate total, it does 24 (for the sort), 1 to count run lengths and then 3 more for the totals.\u00a0 In practice there is more overhead, sorting operations are compare and swap where as bit extract and increment is cheaper &#8211; memory locality of the operations is also much better in the first solution.\u00a0It also has the advantage of potentially not doing the second pass.\u00a0 If you get a straight majority or none get to 30%, the second phase can be skipped.\u00a0 This bonus has an interesting correlation &#8211; it is more difficult to construct a fast running input which has slow sorting time and also has 30% of votes going to one person.<\/p>\n<p><strong>Q3) Find the shortest distance between two specific nodes in a circular doubly linked list.\u00a0 10^9 nodes.<\/strong><\/p>\n<p>A3) So the single machine version is you just walk left from start until you find the other target, compare the number of steps to the total number of nodes and you are done.\u00a0 The distributed version is a bit more awkward&#8230;\u00a0 you need to break the input set into sections, but given the nodes are in random order, how do you do such a thing?<\/p>\n<p>One option is to select evenly spaced nodes and call your section done if you ever see another node which is x mod k.\u00a0 This is however fragile, its not hard to construct inputs which means regardless of your choice of x and k, one of your nodes is going to do a lot more work than the rest.<\/p>\n<p>Solution is to use random nodes instead, that keeps you safer.\u00a0 To be extra safe, choose 10 random nodes per server, that way if one of its segments is double length, that only adds 10% extra running time.\u00a0 However you now have to compare every value you get while walking against 1k potential terminals.\u00a0 This is way too slow.\u00a0 One option is to sort the selected nodes and binary search.\u00a0 This has potential to run fast enough.\u00a0 Another option is to not use truly randomly selected nodes.\u00a0 Instead divide the nodes into groups of k, and for each group select a random terminal position.\u00a0 Then you can check if you are terminal by checking node value mod k vs random_nodes[nod value \/ k] mod k.<\/p>\n<p>Potential little tricks for avoiding overhead.\u00a0 Assume each machine runs the same version of libc &#8211; rather than communicating the random node selection from central server to each server, use a hard coded seed to rand and calculate it on each server.<\/p>\n<p>In any case, each node returns the length of its segment, the start, stop nodes, and whether it saw either or both of the start and end nodes.\u00a0 The central server can then easily and quickly construct the path from start to end node.<\/p>\n<p><strong>Q4) Determine whether it is possible to partition a set of numbers into two equal halves.\u00a0 52 numbers, each of which have a large range of possible values.<\/strong><\/p>\n<p>A4) Unlike all the other questions, this doesn&#8217;t bombard us with lots of data.\u00a0 But the question is NP-Complete, so it doesn&#8217;t really have to.\u00a0 Wikipedia (see subset-sum problem) tells us that it can be solved in O(2^(N\/2)) time, which is actually almost really close to fast enough for a single machine &#8211; but it would use more than the allotted memory limit.<\/p>\n<p>The simple fix is to use 64 of the 100 machines, and tell each to &#8216;fix&#8217; 6 of the numbers as either left or right, then run subset sum on the remaining numbers with an adjusted target based on the &#8216;fixed&#8217; values.\u00a0 This saves a factor of 8 in memory usage and running time, which is just about enough.\u00a0 Each machine returns whether it finds the partition, and central server thus has the answer.<\/p>\n<p>This only has Sqrt(Nodes) scaling, unlike the other problems which each have linear scaling &#8211; but getting linear scaling for this problem seems likely to be quite a bit more difficult&#8230;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>So, I think everyone qualified for round 3 was eligible to do the practice round.\u00a0 202 positive scores, 14 perfect scores.\u00a0 I hope everyone who intends to participate in the distributed online round had a go in the practice round, as the distributed format is new and quite different &#8211; I think people who spent &hellip; <a href=\"https:\/\/www.themissingdocs.net\/?p=801\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">GCJ15 Distributed Practice Round<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3],"tags":[],"class_list":["post-801","post","type-post","status-publish","format-standard","hentry","category-code-competitions"],"_links":{"self":[{"href":"https:\/\/www.themissingdocs.net\/index.php?rest_route=\/wp\/v2\/posts\/801","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.themissingdocs.net\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.themissingdocs.net\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.themissingdocs.net\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.themissingdocs.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=801"}],"version-history":[{"count":0,"href":"https:\/\/www.themissingdocs.net\/index.php?rest_route=\/wp\/v2\/posts\/801\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.themissingdocs.net\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=801"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.themissingdocs.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=801"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.themissingdocs.net\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=801"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}