Tilps – Page 7 – The Missing Docs .Net

GCJ13

Not that I am going to be eligible to compete – but for anyone else out there who can – GCJ13 registration opens tomorrow – competition starts in April.

TCO13 R1C

Looks like there was a big turn out again, all 2000 registration slots used. 12 Aussies seems the biggest turn out so far – with 4 advancing. (I think that gives 7 Aussies in to round 2.)

1000pt obviously wasn’t as easy as last week, but a fast time on the 250pt wasn’t enough to get through again, so the 500pt can’t have been too difficult.

Q1) Given a sequence length, maximum step size, start and end values, determine the maximum possible sequence element.

A1) Appears pretty straight forward, the key thing to realise is that the answer will either be a multiple of the step size above the start value, or the end value (or both). With a maximum sequence length of 1million, you can even try both with brute force without running out of time (so long as for any given reached value, you calculate the number of steps back to the other end using division rather than repeated subtraction). More efficient methods include an inclusive binary search on the range max(start, end) to max(start, end)+(length*maxstep) and checking feasibility by using division to calculate the minimum number of steps required to get there from each and and comparing the sum to the sequence length. There is also probably a closed form formula involving the difference between the start and end values which does the problem in O(1) – but I haven’t bothered to derive it.

In any case you have to remember to handle a maximum step size of 0, which breaks all the division calculations. So treat it as a trivial special case and return start before running any of the above code.

Q2) Given a set of integers which are each formed from the sum of an unknown number of non-negative integer sub-components, determine the smallest number for which it is guaranteed that there is no more than k sub-components which are greater. Number of integers provided will be <50, but the values and k itself may be up to 1 billion.

A2) This is a binary search problem – since as you increase the comparison value, the maximum number of sub-components greater than it never increases. The answer will always be in the range 0 to 1 billion and 1, and each binary search step needs only perform a division calculation for each input value.

Specifically if we are considering whether n is a feasible solution, we divide each sum value by n+1 (since they have to score greater), and take the integer division result as the maximum number of sub-components that there could be which are greater. Add all of these answers together, and if it is no more than k then it is a feasible solution. Binary search until you find the minimal feasible solution. (As usual, this binary search is technically implemented to find a value between true and false, so you need one which returns the first value above when equality is not found.)

Q3) Given an NxN grid and assuming uniform random allocation of a pair of cells (not coincident), determine the expectation value for the number of cells of the grid ‘selected’ using a pattern which is centred on each of the randomly allocated cells. The pattern being up to 9 cells relative to the centre of (0,0), (a,b), (b,a), (-a, b), (-b, a), (-a,-b), (-b,-a), (a,-b), (b,-a). If the pattern falls off the edge of the grid, the corresponding member of the pattern is ignored. If any two pattern elements are for the same cell, a given cell can only be selected once. N, a and b are all no more than 1 billion.

A3) So yes, this is definitely harder than last week. With such a large value for N, obviously brute force is out of the question.

It isn’t too difficult to calculate the expectation value for a single randomly allocated centre, using N, a and b there is fairly trivial formulas for what area of the grid the centre can be in, compared to the total area to allow each of the 9 points to be valid, then those fractions can be added together, taking care that the areas are up to 10^18 in size, so even if you take advantage that each fraction has the same base you may even overflow a 64bit integer unless it is unsigned when you add them together. Also you need to handle the case where a equals b, and not double count the pattern elements which are then equal. Double this gives a first approximation for the expectation value for 2 randomly selected centres, we just need to subtract off the expectation of coincident pattern elements. And handle the fact that the 2 pieces will never be placed in the same place.

There are 3 cases of coincident pattern elements. The centres are coincident, which has an easily calculated chance of 1 in the area size, and cancels out an entire pattern contribution – the expectation value of which we just calculated before. In fact since centres co-incident is not allowed, we need to cancel out 2 entire pattern contributions. The remaining 2 cases are more complicated – first is centre is coincident on a non-centre element – this cancels out exactly 2 cells (I think), but the probability is related to a subset of the area divided by the area squared (considered for each of the up to 8 different directions). Finally we have to deal with the case of non-coincident centres where the pattern elements match. This appears to have 2 sub-cases, straight and bent. In straight, the line between the centres passes through the overlap cell, and there is exactly 1 cancelled out cell. Again the probability is a subset of the area divided by area squared (this time reduced by 2a, 2b or 2b, 2a – for up to 8 different relative arrangements). In bent, the centres end up a+b apart horizontally/vertically or a+b, a+b apart diagonally. In the diagonal case, exactly 2 elements are cancelled out, and each of the 4 cases can be handled like straight scenario. The horizontally/vertically cases are another question though – this time you have to consider up to 3 different scenarios. Because each of the 2 elements to be cancelled out may not actually fit on the NxN grid. But you can calculate an area where the ‘left/top’ element exists to be cancelled out, and another with the ‘right/bottom’ element exists to be cancelled out – and use those as individual 1 cancelled out cell cases like the straight scenario.

Once all these adjustments are made, I think a further adjustment of area/(area-1) needs to be made to compensate for the fact that co-incident centres could never be selected in the first place.

All in all, it isn’t an impossible problem if you’ve done some expectation value calculations before – but it is so fiddly, so many cases to code up, and so many chances to overflow/lose precision. And correctly handling the no-coincident of centres is a bit tricky.

Looking at some solutions, it appears that it is possible to avoid special casing this so much, by considering each pair of elements from the patterns. For the pair to be co-incident the centres have to be in a certain place relative to each other, which combined with the overlap position gives you the subtraction component as a fraction of area squared. Each of these single subtraction components all combined, should give the same sum as all the subtraction scenarios I mentioned above.

Socket.ReceiveAsync – broken or broken?

I ran into another .Net runtime bug today. Socket.ReceiveAsync. This method promises to improve performance compared to using Socket.BeginReceive/EndReceive and, I would say it manages that. However, if you are receiving data into continuously moving buffer, ReceiveAsync is dangerous. Specifically if you call BufferList setter or SetBuffer after each receive before starting the next, it returns seemingly random values for the number of bytes transferred compared to what is written in the actual buffer.

After many hours of reflection and debugging I think I may finally understand what is going wrong. It appears that ReceiveAsync is ‘too fast’, an asynchronous completion can be triggered before ReceiveAsync exits, presumably while WSARecv is still running. Then when BufferList gets set it modifies stuff that WSARecv is still using, causing much confusion and something goes horribly wrong…

I think BeginReceive/EndReceive don’t suffer from this problem because they construct a new set of parameters for WSARecv every time rather than attempting to reuse stuff to avoid memory allocations, so there is no potential for a still active WSARecv to mess around with the data about to be given to the next WSARecv. Either that or EndReceive ends up waiting in what would be the failure case for ReceiveAsync – but the reflected code doesn’t seem to indicate that.

The ‘solution’ I have for ReceiveAsync relies on waiting to reset BufferList until a ManualResetEventSlim gets set when the previous call to ReceiveAsync returns. This solves the problems, but even ManualResetEventSlim is expensive enough that it seems the performance gains are lost.

Hacky Ugly Example Code: (Note that removing the comment marks on the lines regarding the ManualResetEventSlim ensures this program never prints anything.)

using System;
using System.Net;
using System.Net.Sockets;
using System.Threading;

namespace ConsoleApplication8
{
    class Program
    {
        static void Main(string[] args)
        {
            TcpListener listener = new TcpListener(IPAddress.Any, 5678);
            listener.Start();
            client = new TcpClient("localhost", 5678);
            var server = listener.AcceptTcpClient();
            ThreadPool.QueueUserWorkItem(callback =>
                {
                    byte[] bufferInner = new byte[8192];
                    while (true)
                    {
                        try
                        {
                            int bytes = server.Client.Receive(bufferInner);
                            if (bytes == 0)
                                return;
                            server.Client.Send(bufferInner, 0, bytes, SocketFlags.None);
                        }
                        catch
                        {
                            break;
                        }
                    }

                });
            ThreadPool.QueueUserWorkItem(callback =>
                {
                    byte[] bufferInner = new byte[24];
                    for (int i = 0; i < bufferInner.Length; i++)
                        bufferInner[i] = (byte) (i+1);
                    for (int i = 0; i < 10000000; i++)
                        client.Client.Send(bufferInner);
                });
            asyncEventArgs1 = new SocketAsyncEventArgs();
            asyncEventArgs1.Completed += AsyncEventArgsOnCompleted;
            asyncEventArgs2 = new SocketAsyncEventArgs();
            asyncEventArgs2.Completed += AsyncEventArgsOnCompleted;
            buffer = new byte[bigBufferLength];
            BeginReceive();
            while (true)
            {
                Console.ReadKey();
            }
        }

        private const int bigBufferLength = 1024*1024;

        private static void BeginReceive()
        {
            if (asyncEventArgs == asyncEventArgs1)
                asyncEventArgs = asyncEventArgs2;
            else
                asyncEventArgs = asyncEventArgs2;
            int length = rnd.Next(2) == 0 ? 8192 : 100;
            prevLastWrapped = lastWrapped;
            lastWrapped = false;
            if (lastWriteIndex + length <= bigBufferLength)
                asyncEventArgs.BufferList = new ArraySegment<byte>[]
                    {new ArraySegment<byte>(buffer, lastWriteIndex, length),};
            else
            {
                asyncEventArgs.BufferList = new ArraySegment<byte>[]
                    {
                        new ArraySegment<byte>(buffer, lastWriteIndex, bigBufferLength - lastWriteIndex),
                        new ArraySegment<byte>(buffer, 0, lastWriteIndex + length - bigBufferLength),
                    };
                lastWrapped = true;
            }
            Clear(length);
            prevLastWriteLength = lastWriteLength;
            lastWriteLength = length;
            prevWasSync = wasSync;
            wasSync = false;
            int newVal = Interlocked.Decrement(ref counter);
            if (newVal != -1)
            {
                Console.WriteLine("BeginReceive called wrong number of times. {0}", newVal);
                throw new Exception("Fail!");
            }
            //callerExited.Reset();
            if (!client.Client.ReceiveAsync(asyncEventArgs))
            {
                wasSync = true;
                //callerExited.Set();
                AsyncEventArgsOnCompleted(null, asyncEventArgs);
            }
            else
            {
                //callerExited.Set();
            }
        }
        //private static ManualResetEventSlim callerExited = new ManualResetEventSlim(false);

        private static bool lastWrapped = false;
        private static bool prevLastWrapped = false;

        private static bool wasSync = false;
        private static bool prevWasSync = false;

        private static volatile int counter = 0;

        private static void Clear(int length)
        {
            if (lastWriteIndex + length <= bigBufferLength)
                Array.Clear(buffer, lastWriteIndex, length);
            else
            {
                Array.Clear(buffer, lastWriteIndex, bigBufferLength - lastWriteIndex);
                Array.Clear(buffer, 0, lastWriteIndex + length - bigBufferLength);                
            }
        }

        private static void AsyncEventArgsOnCompleted(object sender, SocketAsyncEventArgs socketAsyncEventArgs)
        {
            int newVal = Interlocked.Increment(ref counter);
            if (newVal != 0)
            {
                Console.WriteLine("OnCompleted called wrong number of times. {0}", newVal);
                throw new Exception("Fail!");
            }

            if (socketAsyncEventArgs.SocketError != SocketError.Success)
                return;
            if (socketAsyncEventArgs.BytesTransferred == 0)
                return;
            if (socketAsyncEventArgs.BytesTransferred > lastWriteLength)
            {
                Console.WriteLine("Receive too long, Sync:{0}:{1} LastWrapped:{2}:{3} Length:{4}:{5} ClaimedLength:{6}", wasSync, prevWasSync, lastWrapped, prevLastWrapped, lastWriteLength, prevLastWriteLength, socketAsyncEventArgs.BytesTransferred);
            }
            if (buffer[(socketAsyncEventArgs.BytesTransferred-1 + lastWriteIndex) % bigBufferLength] == 0)
            {
                Console.WriteLine("Receive too short-quickcheck , Sync:{0}:{1} LastWrapped:{2}:{3} Length:{4}:{5}", wasSync, prevWasSync, lastWrapped, prevLastWrapped, lastWriteLength, prevLastWriteLength);
            }
            //for (int i = 0; i < socketAsyncEventArgs.BytesTransferred; i++)
            //{
            //    if (buffer[(i + lastWriteIndex) % bigBufferLength] == 0)
            //    {
            //        Console.WriteLine("Receive too short , Sync:{0}:{1} LastWrapped:{2}:{3} Length:{4}:{5} ActualLength:{6}", wasSync, prevWasSync, lastWrapped, prevLastWrapped, lastWriteLength, prevLastWriteLength, i);
            //        break;
            //    }
            //}
            lastWriteIndex = (lastWriteIndex + socketAsyncEventArgs.BytesTransferred)%bigBufferLength;
            //callerExited.Wait();
            BeginReceive();
        }

        private static SocketAsyncEventArgs asyncEventArgs;
        private static SocketAsyncEventArgs asyncEventArgs1;
        private static SocketAsyncEventArgs asyncEventArgs2;
        private static byte[] buffer;
        private static int lastWriteIndex;
        private static int lastWriteLength;
        private static int prevLastWriteLength = 0;
        private static Random rnd =new Random();
        private static TcpClient client;
    }
}

TCO13 R1B

As previously mentioned I didn’t have to get up at 4am to do this round, but a quick look at the final scoreboard it appears the questions were significantly easier. Almost half of the advancers solved the 1000pt question correctly. The 600th place cut-off required solving the 250pt problem in decent time and a successful challenge as well.

In this competitive environment only 1 Australian advanced, 2 more falling just below the cut-off having solved the 250pt quickly.

Q1) Given an even number of integers, the need to be made into pairs. These pairs can have each member added together to create a value of that pair. Determine the minimum difference between the lowest and highest values in such a set of pairs. Between 2 and 50 starting integers and numbers are all in the range 1 to 1000.

A1) Without justification – the simple approach is to sort the inputs and pair them from outside to inside. Then just return the difference between the largest and smallest pairs created this way. This is also correct, but proving it appears not so trivial. (Certainly in the competition this would have put me at risk, not being able to prove it usually delays my submission.)

Q2) Given a rectangular array which is partially populated, and the ability to clear up to R consecutive rows at a time or C consecutive columns at a time, determine the minimum number of ‘clears’ required to clear the array completely. Rectangular array is maximum 15×15.

A2) So the first key point to this problem is that the maximum size is 15. This means a brute force on every subset of rows, or columns (but not both) is plausible. The next key point is that the order in which you perform your clears does not matter, if 2 clears overlap the common cells will be cleared regardless of order. So you can do all your row clears before your cell clears, without loss of generality.

This leads to a fairly simple approach – try every subset of rows (including empty) and then use a greedy clearing to determine the minimum number of moves by columns to clear what remains, and also to determine the minimum number of moves to implement the original subset of rows selected. (Don’t worry if your greedy approach clears excess to the selected subset, taking the minimum number of moves is more important. The actual subset that you erase is a better solution anyway.)

Simple runtime is O(2^(Min(w,h)) * w*h) – each subset requires 2 greedy clearing passes, which cost proportional to the size of the full array in the worst case. This runtime chooses the smallest dimension, but in practice the code is simpler if you just standardise on rows or columns, it will run fast enough regardless. An ideal solution might be to implement it faster by using a sparse array representation much like the one used in the Dancing Links algorithm. But that certainly wasn’t required here.

Q3) Given a dictionary of distinct words and the following rules, determine the minimum number of remaining words you can end up with. There are 2 rules. First rule is that if any time you get the same word duplicated, the duplicate and the word are removed. Second rule is that you can replace any word with the same word with an even length prefix reversed.

A3) The key point for this problem is that the even length prefix reversal lets you create almost any anagram out of an even length word. For odd length words you get almost any anagram where the last character hasn’t moved. Secondly it doesn’t matter what order you remove pairs that match along the way, because there are no dead ends in the reordering and if a can be made to match with b and c, b and c can be made to match.

Almost any anagram isn’t every anagram. Specifically it is any anagram of the original pairs, where each letter in a pair itself can be in any order. The pairs never get split up in the process, so you can’t reorder 1234 into 1324. We can create a canonical form for matchable groups in this subset of anagrams by sorting each pair of letters, then sorting the resultant pairs. For example the word ‘word’ would first be rearranged to ‘owdr’ then secondly to ‘drow’ to given the canonical order. (FIFO is its own canonical ordering under these rules…)

So for each word, if it is even length, replace it with the word with its characters in canonical order. If it is odd length, do the same but leave the last character in place. Then do a unique count of the result words. If the count for a unique word is even, all pairs can be removed, if the unique count is odd, you have to keep 1 copy of that word. So simply count those odd ones up and you have your answer.

This was pretty easy for a 1000pt problem (certainly it seems easier than the 500pt problem) – but identifying the way to canonicalise the available subset of anagrams isn’t totally trivial, so it might stump a few people.

TCO13 R1A

So pre challenge phase I was on 531st – with the feeling that things could go either way for me – the 500pt had the opportunity for lots of failures because of floating point error margins, which I was confident I had taken care of – but on the other hand, a small missed optimisation I noticed with 15seconds to go, left me uncertain whether it was possible to write a degenerative test case which would cause a time-out.

After challenge I was 480th, with a bit more confidence having seen someone successfully challenge 2 others but not my solution, using what I am pretty sure would have been a time-out attempt.

My solution to the 500pt did fail system tests, but in-spite of that my final placing was 471st. There were a very large number of failures for the 500pt problem. In my case it was a degenerative performance test case, as I expected.

471st gets me through to round 2, no need to slog through R1B and/or R1C. R2A/B/C happen after daylight savings change, so I think they will be at 2am instead of 4am. I am not sure which is worse really…

The 250pt question was easy, and really so was the 500pt – but it took me over half an hour to realise the obvious aspect of the 500pt problem I was missing. I blame having to code at 4:30am… The 1000pt was not – but still there was ~80 submissions for it.

Q1) Given a set of numbers in the range 0-9, what is the minimum cost (where it costs 1 per integer distance a number is moved by) to ensure all the numbers are within 1 of each other.

A1) The limited range made brute force trivial – just try pair of adjacent numbers (0,1), (1,2) … (8,9) and calculate the cost to move each number below to the bottom number and each number above to the higher number. I managed to score 240 out of 250 points here, which is pretty high for me – but it was a pretty easy problem… Competition advancement cut-off was 234 points – so I didn’t want to be too much slower though!

Q2) Given a set of ‘gaps’ which start and end on integer boundaries (but are exclusive of the boundaries themselves) determine the smallest number where all multiples of that number do not fall into any of these gaps.

A2) I saw several solutions to this, and many of them failed. Most of them used floating point numbers for doing most of their code, which was a big risk due to cumulative errors. Others failed (like mine) because the code didn’t scale well enough.

The key point which took me far too long to realise was that any candidate solution can be improved so long as none of the multiples lie on the end point of a gap. Otherwise a microscopic reduction in jump size will still leave every multiple not hitting a gap.

This leads to the solution that the number is a divisor of one or more of the end values of a gap. So for each end value we check each plausible divisor to see if it works for the entire set of gaps. You can set an upper bound for the divisor by calculating the floor of right edge divided by width – but if the width is 1, this upper bound doesn’t help much. Checking the divisors I kept everything in integers, basically doing simple fraction math rather than using floating point. However I didn’t notice that my cost of checking whether a divisor worked or not wasn’t O(N) where N is the number of gaps but was instead O(D) where D is the maximum distance from the start to a right gap edge. Since the upper bound for D was 30k and N was 50 – this was a significant performance penalty. The algorithm is O(N^2D) if implemented correctly, but mine was instead O(ND^2). O(N^2D) was fast enough to pass tests. I saw an O(ND log ND) implementation which passed, but I don’t actually understand why it works…

The mistake I made which caused the O(D) performance when checking if a specific length worked was that I incremented the multiple counter by 1 while the current location was in front of the current gap. If the first gap was very late, and the gap size under consideration was 1, it would tight loop increment that counter all the way up. I needed to either use a simple division short circuit to accelerate to the greatest multiple before the current gap, or instead use a modulus check to see if the jump size would end up in the gap rather than walk my way up at all.

Q3) Given a rectangular grid which is cyclic in nature (left edge joins to right edge, and top edge joins to bottom edge) where each cell in the grid specifies a direction (either left, right, up or down), determine the minimum number of cell changes required such that every cell is part of a cyclic path which comes back to itself.

A3) This problem is a tough one. (The system tester thought so as well, it refused to rate anyone on this problem the first time round :P) I heard one competitor saying it was a case of min cost max flow – and I think I see how that could be the case, but min cost/max flow is a very complicated algorithm to try and implement during the competition.

The first level of analysis on this problem is pretty straight forward – if every cell has another cell pointing to it, then every cell is in a cycle. Correspondingly every cell which has multiple cells pointing to it, has to have changes occur to reduce it to one. The problem is how to minimise the number of changes required to get each case of excess inputs redirected to a location of insufficient inputs.

A bit of thought has failed to help me construct the appropriate min cost max flow graph to solve the problem, but it is after 7am – it is time to get some more sleep…

Post sleep – I can see how to formulate this as min cost circulation problems. Each node is duplicated, with a minimum and maximum flow of 1from copy 1 to copy 2 (and a cost of 0) – then each copy 2 connects to copy 1 of its neighbour with maximum flow of 1 and a cost of 1 if it isn’t the neighbour the node currently points at. One special node is added, post source which connects to every copy 2 with maximum flow of 1 and cost of 0. Source connects to post-source with maximum flow equal to target cycle count and cost of 0. All copy 2’s connect to sink with maximum flow of one and cost of 0.

The problem is then repeated for each target cycle count between 1 and V/2 – and the minimum of minimum costs is the result.

The problem is the standard min-cost max-flow algorithms don’t support minimum flows. One solution in this case where minimum flow equals maximum flow is to make it ‘cheaper’ to go through every minimum flow edge then any other path that doesn’t include them. To do this we change the edges with minimum flow to have no minimum flow but a negative cost greater than the number of edges in the original graph – this ensures that minimum cost will pass through every one of these edges. In this case the shortest augmenting path algorithm has to be one which handles negative cost edges.

One refinement – not sure if I’ve considered this before or I am just starting to understand better for the first time – but in this case you can do all of the V/2 problems at once by setting the target cycles to infinite and keeping track of the cost after each augmenting flow. Since each augment is minimum cost and increases flow by 1, it will pass through every cycle count and their associated minimum cost.

Surprise top coder…

So it appears I wasn’t paying enough attention to my email back in the middle of last year. Unlike usual when top coder announces it is opening registrations a couple of months before the contest starts, this time it decided to open registrations the middle of the year before!

So I got a reminder email today, registration is open and surprise – round 1A is this weekend. Guess I will need to quickly muster my ability to stay awake and conscious at 3am in the morning once more. I should probably start trying to do these competitions in c++, but with this little notice I think I will be using c# again this year.

Since I had been surprised by the sudden appearance of the top coder open – I went and double checked – GCJ13 registrations don’t open for another 3 and a half weeks. Not that I will be eligible to compete this time!

Finding a gap in an ordered sequence of distinct integers presented in random order

Unrelated to my recent interviews at Google, I saw the following interview style example question.

If you are given a random sequence of at most 4 billion 32bit integers find a 32bit integer that is not used. (There must be one, why?) Answer this in 2 parts – if you have unlimited memory, and if you have limited memory but plenty of disk storage options.

This problem presents itself to me as a case of the pigeon hole principle and the associated ‘counting sort’. 32bit integers have 2^32 options, which is a bit less than 4.3 billion – hence by the pigeon hole principle, 4 billion pigeons cannot fill 4.3 billion pigeon holes. Unlimited memory means you can allocate the 16GiB needed for a proper counting sort – but since we don’t actually care about repetition counts in answering the question of what number is missing – you can instead allocate a 512MiB bitset and simply mark every observed number as a 1 bit, then search said bitset for a 0 bit. And if your disk seek times are not a huge disaster, the bit set can be stored on disk instead. This solution is O(N) in a sense, if you ignore the cost of allocating the bitset. Obviously the limitation on size of input originally technically makes it O(1) with a massive constant – but that is a pretty non-useful viewpoint.

Usually I would have probably stopped there and not considered the question further (Except to mention that an O(N log N) solution can be done using sequential disk access and in this case log N is at most 32, so sequential read/write of integers vs random access at the 1 bit size is almost certainly going make the O(N log N) solution perform better in practice) – but for some reason the fact that we have the fixed ‘constant’ time of allocating the bitset was sticking with me. This morning I came up with an O(N) time algorithm to find a gap in the ordered sequence of integers without repetition presented in random order – with no 32bit restriction. (And as a bonus, it acts like the O(N log N) algorithm in many respects, so a restricted memory implementation would get the performance advantages of sequential access to disk.) Obviously the original problem doesn’t have the restriction of no repetition and removing duplicates in O(N) uses hashing – which gives random access again and non O(N) behaviour if hashing is poor – but I think the resultant problem is still interesting.

The algorithm isn’t very complicated and probably of limited use, but I don’t often come up with something I haven’t seen before. It uses as its basis the O(N) running time minimum, maximum and median algorithms. The O(N) median algorithm has a fairly high running time constant of implementation – but it is only needed to be used if you want guaranteed worst case time bounds – see at the bottom for how to adjust the algorithm to give randomised expected O(N) time with a much lower running time constant.

The algorithm is recursive with inputs of ‘integer array’ and min/max of range to find the gap in. In the original problem min/max would have been 0 and 2^32-1 (for unsigned integers), but if you are explicitly interested in gaps you can start with setting min/max to the results of running minimum and maximum on the input array.

From there the trivial algorithm is as follows – if min of range is greater than max of range – return ‘no gaps’. If input array is empty – return min of range. Otherwise find min/max/median of input array – if min > min of range – return min of range – if max < max of range – return max of range. Otherwise filter input array to values between min and median (exclusive) or median and max (exclusive), which ever range is has lowest density – if equal choose one. Then recurse using this filtered array and its corresponding exclusive range.

Performance analysis is trivial – each recursion reduces the number of elements considered by half (since the median is the halfway point), and each step takes time proportional to the size of the current array – leading to the N+ N/2 + N/4 … worst case which gives O(N).

Correctness depends on there being no duplicates – obviously whenever it returns something other than ‘no gaps’ it has found a gap, but it is the no duplicates which lets us say that it will only return ‘no gaps’ if there are no gaps. Because there are no duplicates, any given range in the sorted sequence either has no gaps (which gives a density of 1) or it has gaps (which gives a density less than 1). When we divide the sequence into the groups smaller/larger than the median and calculate the density of each – all we really care about is whether the density is 1 or not. Since the no gaps case always has a higher density than the gaps case, we will always recurse into a side which contains gaps – unless both densities are 1. (Indeed trivial extension of the original algorithm above is that if both densities are 1, return no gaps rather than recursing.) If we do recurse into a side which has no gaps, we will keep recursing until the input contains 2 or less consecutive numbers – the next recurse will then consist of an empty array and a max less than min – which is the case we return no gaps.

The choosing of lower density is simply an average time performance improvement – if the input set is biased, the lower density section is more likely to terminate in less recursions. Technically that step can just be ‘choose a side with density which is not 1’.

As an addendum – this algorithm is a bit slow if guaranteed worst case O(N) is not required (because of the cost of the guaranteed O(N) median finding algorithm) – you can drop the median requirement entirely, and just choose a random element as the pivot to filter the array. This still gives expected running time of O(N) based on the randomisation – but it can degenerate – and in the worst degenerative scenario you will need to deal with an extra case of a 0 length filtered array with 0 width range – which should be treated as if it has a density of 1.

The not so Random random

Just a quick one here – while I was working on some graphs for a suggestion I made for LOTRO. I discovered a peculiar anomaly in Random.NextDouble.

Plotting a graph of the length of runs where NextDouble returns greater or equal to 0.01, there is a significant dip at a run length of about 53. Switching to RNGCryptoServiceProvider makes this dip disappear and the dip is clearly reproducible even with sample sizes in the hundreds of millions.

Quite bizarre – although not quite as bizare as this deviation mentioned at stackoverflow.

Enums in dictionary keys.

I was doing some random reflection walking to remind myself whether using a struct as a dictionary key triggered any boxing operations using the default equality comparer. And if you implement IEquatable<T> it still executes a box statement in IL, but my attempts at creating an equality comparer which trivially does not execute boxing operations isn’t any faster, so it seems the optimiser can properly eliminate explicit null checks in generics where the type parameter ends up being a struct and remove the box operation as dead code. But that is not what this post is about.

Along the way in my reflection I saw that the default equality comparer special cases enums. I immediately thought this was great, because I had recently discovered that GetHashCode called on an enum is ‘expensive’, it is *much* cheaper if you cast it to the underlying type and call GetHashCode on that instead. (Specific case was an enum member of a type which overrode equality, so the default equality comparer special case didn’t apply.) So I had a look at the reflected code for this special case enum equality comparer and found an ‘interesting’ thing.

The implementation called this thing called JitHelper.UnsafeEnumCast<T> to cast its parameters to int. What makes it interesting is that the ‘implementation’ of this method is ‘throw new InvalidOperationException()’. Wait what…

Now obviously this method implementation isn’t actually called by EnumEqualityComparer, using enum’s as dictionary keys works just fine. So it seems that the JIT magically substitutes a cast if T is an enum type, and only uses the implementation as a default fallback scenario.

The things they do to get around the limited type constraints in generics…

And this led me to look closer at JitHelper. Here I find JitHelper.UnsafeCast<T>…

Have you ever wanted to cast Generic<Specific> to Generic<T> when you know at runtime that T is Specific? It seems that this little piece of magic lets you do that. (It is apparently used in async stuff to allow for cached result tasks to be returned for a bunch of basic types from a generic result task building function. For bonus points I now know that true, false, 0 in a dozen types, null and the integers -1 through 9(?!?), each have their own special cached task instances to be shared…)

BBR – an update

I made a small update to this early this morning – the logic for colouring the heated pieces has changed to use a trivial tone mapping HDR compression rather than my custom hack that I had in place previously. This means the colours look more vivid. It also changed the game play slight as colder items are more easily seen than they were before. (Specifically the visibility threshold has dropped from about 1800K to 1000K.) So I am using this as an excuse to try a new set of default parameters out.

Maybe next weekend I will take a look at adding radiation heat transfer to the game. However I do think it will need to be a small component compared to conduction or the gameplay will be significantly impaired. I mainly want really hot pieces to cause non-touching but nearby cold pieces to start to glow, only at short range. I do not want the top layer of locked in pieces to go cold while you try to drop a piece down on them. (Or the dropping piece to go cold before it even gets to the bottom.)