March 2013 – The Missing Docs .Net

TCO13 R2A (take 2)

Attendance not that high again, 1279 registered competitors for this rescheduled round. But at least the servers seem to be more stable tonight (even if I am no less sleepy).

6 Australians this time. I guess one of them had difficulty connecting last night.

Tough round – I only submitted a solution for Q1, and I had low confidence, since it was a greedy solution and I could see the vague potential for non-greedy answers. During challenge phase, 40% of submitted solutions to Q1 in my room were successfully challenged, including mine. I was too tired to identify bugs in other peoples code however, which left me with 0 points.

Which given the round difficulty was 560th before systests. This raised to equal 441st in the final tally.

Only 2 successful solutions to Q3, and both of those people skipped Q2. Only ~30 solutions to Q2. Cut-off to advance was a fast time on Q1 and a successful challenge. Looks like John Dethridge and nabbftw might be in with a t-shirt chance from this round with a decent Q1 time – and EvgeniSergeev managed to get a successful challenge, the other 3 Aussies (including myself) left on 0.

Too tired to do any more write up now – I’ll try and come back to it later. But I do at least know why my Q1 answer failed.

Q1) Given 2 equal length sequences of letters, determine the correlated sub-sequences which when appended give the maximal result.

A1) I have managed to fashion a ‘brute-force’ solution which passes system tests, but I’m not 100% happy that it is actually correct. But I will present it here anyway.

The first thing I did was trim the first sequence so that it was in non-ascending order – performing the matching trims in the second sequence. It seems trivial enough to me that if your first subsequence ever increases, you could have removed the previous lower characters to get a better value. The problem then seems to become a question of how best to ‘cross the append’.

I identified a few different ways that you could cross the append, and brute forced them. Each way basically starts at some offset in the first sequence and tries to cross at some letter value. Each letter which gets in the way (is lower than the desired cross value) is removed as it is encountered – and as each letter is removed we check to see if we’ve found a better solution. However there is one exception, once you have crossed over, if a letter is in the ‘way’ it may correspond to a letter in the first sequence with value higher than your starting index, in which case removing it can never be a win. These letters are just left in place and hope they don’t make the solution worse. Every combination of offset and possible cross letter value is tried and we return the best we can find.

The way I handle this exception lowers my confidence in my solution a lot – but it works well enough for system tests.

A better solution is dynamic programming. The DP is ‘best solution of length n using the last k characters’. Initialized with length 0 being empty, each iteration we check the kth last character to see if it makes a better solution to length n+1 (than the previous length n+1 solution, if one exists) by pre-pending it to length n. (Proof of correctness of this still isn’t entirely obvious, I suspect a similar solution to ‘best solution of length n using the first k characters’ would fail – since an obviously correct solution would need to apply arbitrary subsets to jump from multiple iterations before to the new iteration, rather than being based purely off the previous iteration. Something to do with prefixing vs postfixing would seem to be the key.)

Q2) Given a square grid, where each cell can be any value 0 through 9, and up to 10 specified values, determine the number of scenarios which exist where each row and column adds to the same number modulo 10. (Answer given modulo large prime.) The grid may be large, up to 1000×1000.

A2) This problem boils down to 2 cases. Firstly the width of the grid may be larger than the number of given clues. In this case the answer is 10^((n-1)*(n-1)-k + 1) – where n is the width of the square grid and k is the number of specified values. Since the number of specified clues is less than the width of the grid, there exists at least 1 row and column which is completely independent of the specified values, which can be used to guarantee whatever desired modulo for each row/column.

This leaves the case where we don’t have an independent row and column. But we do have a maximum grid size of 10. We can break this down further. First if any row or column is fully specified, we know the target modulus. If there are 2 target modulus that don’t match, there is no solutions, and we can immediately return 0. If there is no target modulus, we try and see how many solutions there are for each potential target modulus and add them all together.

Looking at someone else’s solution I see that from here we can break the problem down in to independent subsets. If a cell is not specified, than that row/column pairing are not independent as they are linked by the choice to be made. But if the value is specified, then you can just subtract that value from the target for that row/column and they remain independent. A disjoint set tracker can be used to identify these independent sets of rows/columns, and the number of possibilities for each independent set can be multiplied together to get a result. Once we have an independent set of rows and columns, there is either no solutions or lots of solutions. If there is lots of solutions, the number of solutions is much like the original calculation, 10^((r-1)*(c-1)-k), where r is the number of rows in the independent subset, c the columns, and k the number of fixed values in those rows/columns. So all that remains is to identify when there are no solutions. This appears to be possible to do by looking at the fixed values in each row for columns which are independent, and for each column for rows which are independent. If the sum of each of these don’t have the same modulus, there is no way to set the values. (How you derive that leap of logic, I am unsure… but it sounds vaguely familiar from magic squares – sum of the sums each have to be equal.)

I wondered if a simpler solution is for a given target modulus, go to each value, set it to 0, perform any cascades forced by the modulus, and for each value you manage to set to 0, that is an extra power of 10. If you ever get to a contradiction, its not possible. To check, I wrote this up, and it passes system tests. Written correctly it is O(N^4) and with N at most 10 this easily runs in time. (The more complicated algorithm above appears to be O(N^3).)

Q3) Given 2 numbers A and B, determine the number of unique values X^Y where 1 <= X <= A and 1 <= Y <= B. A and B are both potentially large.

A3) I think I almost had a better chance of solving this problem than Q2. Although that isn’t saying much given how tired I was. We start with A*B scenarios, and start subtracting off duplications. With each of A and B being up to 1 billion, obviously duplications need to be removed in batches. The first trivial set of duplications is X=1, there are B-1 of these. Now we only have to consider X > 1. We need to identify scenarios of X^Y = N^M. Cases where N is a power of X aren’t too bad, but there are cases like 8^2=4^3 which are far more problematic.

So, taken from someone else’s solution… For each value between 2 and the sqrt of A inclusive, consider every power not greater than A. For every number between 2 and A inclusive not in this set there are B options. There is also the 1 extra for the value 1. What remains is how many contributions count from the others, numbers which have powers less than or equal to A. If we take each unique non-power in the range 2 to sqrt of (A) inclusive, and consider how many powers of it exist which are less than or equal to A we can work out how many unique values we can reach using an exponent of at most B. Each power can add B, but then we have to remove the overlaps. It almost looks like a standard add everything, remove pairwise overlaps, add back in the triplets, removing the 4’s, etc. But since the power could go up to 30, there are a lot of groups of 15, so it needs to be done a bit smarter. I can’t really work out how to describe the solution here properly, so if you are interested I guess you can go read a solution.

TCO13 R2A

So, up at 1am to prepare for the 3am start. Quite tired, but will see how I go…

2000 potential competitors for round 2, only 1260 turned up. 5 Australians battling the 3am start time. (Assuming they are east coast…)

3:10am – round starts, no one has been assigned rooms yet 😛

3:19am – round is cancelled 🙁

Update: Round is rescheduled for 3am tonight …

GCJ13

Not that I am going to be eligible to compete – but for anyone else out there who can – GCJ13 registration opens tomorrow – competition starts in April.

TCO13 R1C

Looks like there was a big turn out again, all 2000 registration slots used. 12 Aussies seems the biggest turn out so far – with 4 advancing. (I think that gives 7 Aussies in to round 2.)

1000pt obviously wasn’t as easy as last week, but a fast time on the 250pt wasn’t enough to get through again, so the 500pt can’t have been too difficult.

Q1) Given a sequence length, maximum step size, start and end values, determine the maximum possible sequence element.

A1) Appears pretty straight forward, the key thing to realise is that the answer will either be a multiple of the step size above the start value, or the end value (or both). With a maximum sequence length of 1million, you can even try both with brute force without running out of time (so long as for any given reached value, you calculate the number of steps back to the other end using division rather than repeated subtraction). More efficient methods include an inclusive binary search on the range max(start, end) to max(start, end)+(length*maxstep) and checking feasibility by using division to calculate the minimum number of steps required to get there from each and and comparing the sum to the sequence length. There is also probably a closed form formula involving the difference between the start and end values which does the problem in O(1) – but I haven’t bothered to derive it.

In any case you have to remember to handle a maximum step size of 0, which breaks all the division calculations. So treat it as a trivial special case and return start before running any of the above code.

Q2) Given a set of integers which are each formed from the sum of an unknown number of non-negative integer sub-components, determine the smallest number for which it is guaranteed that there is no more than k sub-components which are greater. Number of integers provided will be <50, but the values and k itself may be up to 1 billion.

A2) This is a binary search problem – since as you increase the comparison value, the maximum number of sub-components greater than it never increases. The answer will always be in the range 0 to 1 billion and 1, and each binary search step needs only perform a division calculation for each input value.

Specifically if we are considering whether n is a feasible solution, we divide each sum value by n+1 (since they have to score greater), and take the integer division result as the maximum number of sub-components that there could be which are greater. Add all of these answers together, and if it is no more than k then it is a feasible solution. Binary search until you find the minimal feasible solution. (As usual, this binary search is technically implemented to find a value between true and false, so you need one which returns the first value above when equality is not found.)

Q3) Given an NxN grid and assuming uniform random allocation of a pair of cells (not coincident), determine the expectation value for the number of cells of the grid ‘selected’ using a pattern which is centred on each of the randomly allocated cells. The pattern being up to 9 cells relative to the centre of (0,0), (a,b), (b,a), (-a, b), (-b, a), (-a,-b), (-b,-a), (a,-b), (b,-a). If the pattern falls off the edge of the grid, the corresponding member of the pattern is ignored. If any two pattern elements are for the same cell, a given cell can only be selected once. N, a and b are all no more than 1 billion.

A3) So yes, this is definitely harder than last week. With such a large value for N, obviously brute force is out of the question.

It isn’t too difficult to calculate the expectation value for a single randomly allocated centre, using N, a and b there is fairly trivial formulas for what area of the grid the centre can be in, compared to the total area to allow each of the 9 points to be valid, then those fractions can be added together, taking care that the areas are up to 10^18 in size, so even if you take advantage that each fraction has the same base you may even overflow a 64bit integer unless it is unsigned when you add them together. Also you need to handle the case where a equals b, and not double count the pattern elements which are then equal. Double this gives a first approximation for the expectation value for 2 randomly selected centres, we just need to subtract off the expectation of coincident pattern elements. And handle the fact that the 2 pieces will never be placed in the same place.

There are 3 cases of coincident pattern elements. The centres are coincident, which has an easily calculated chance of 1 in the area size, and cancels out an entire pattern contribution – the expectation value of which we just calculated before. In fact since centres co-incident is not allowed, we need to cancel out 2 entire pattern contributions. The remaining 2 cases are more complicated – first is centre is coincident on a non-centre element – this cancels out exactly 2 cells (I think), but the probability is related to a subset of the area divided by the area squared (considered for each of the up to 8 different directions). Finally we have to deal with the case of non-coincident centres where the pattern elements match. This appears to have 2 sub-cases, straight and bent. In straight, the line between the centres passes through the overlap cell, and there is exactly 1 cancelled out cell. Again the probability is a subset of the area divided by area squared (this time reduced by 2a, 2b or 2b, 2a – for up to 8 different relative arrangements). In bent, the centres end up a+b apart horizontally/vertically or a+b, a+b apart diagonally. In the diagonal case, exactly 2 elements are cancelled out, and each of the 4 cases can be handled like straight scenario. The horizontally/vertically cases are another question though – this time you have to consider up to 3 different scenarios. Because each of the 2 elements to be cancelled out may not actually fit on the NxN grid. But you can calculate an area where the ‘left/top’ element exists to be cancelled out, and another with the ‘right/bottom’ element exists to be cancelled out – and use those as individual 1 cancelled out cell cases like the straight scenario.

Once all these adjustments are made, I think a further adjustment of area/(area-1) needs to be made to compensate for the fact that co-incident centres could never be selected in the first place.

All in all, it isn’t an impossible problem if you’ve done some expectation value calculations before – but it is so fiddly, so many cases to code up, and so many chances to overflow/lose precision. And correctly handling the no-coincident of centres is a bit tricky.

Looking at some solutions, it appears that it is possible to avoid special casing this so much, by considering each pair of elements from the patterns. For the pair to be co-incident the centres have to be in a certain place relative to each other, which combined with the overlap position gives you the subtraction component as a fraction of area squared. Each of these single subtraction components all combined, should give the same sum as all the subtraction scenarios I mentioned above.

Socket.ReceiveAsync – broken or broken?

I ran into another .Net runtime bug today. Socket.ReceiveAsync. This method promises to improve performance compared to using Socket.BeginReceive/EndReceive and, I would say it manages that. However, if you are receiving data into continuously moving buffer, ReceiveAsync is dangerous. Specifically if you call BufferList setter or SetBuffer after each receive before starting the next, it returns seemingly random values for the number of bytes transferred compared to what is written in the actual buffer.

After many hours of reflection and debugging I think I may finally understand what is going wrong. It appears that ReceiveAsync is ‘too fast’, an asynchronous completion can be triggered before ReceiveAsync exits, presumably while WSARecv is still running. Then when BufferList gets set it modifies stuff that WSARecv is still using, causing much confusion and something goes horribly wrong…

I think BeginReceive/EndReceive don’t suffer from this problem because they construct a new set of parameters for WSARecv every time rather than attempting to reuse stuff to avoid memory allocations, so there is no potential for a still active WSARecv to mess around with the data about to be given to the next WSARecv. Either that or EndReceive ends up waiting in what would be the failure case for ReceiveAsync – but the reflected code doesn’t seem to indicate that.

The ‘solution’ I have for ReceiveAsync relies on waiting to reset BufferList until a ManualResetEventSlim gets set when the previous call to ReceiveAsync returns. This solves the problems, but even ManualResetEventSlim is expensive enough that it seems the performance gains are lost.

Hacky Ugly Example Code: (Note that removing the comment marks on the lines regarding the ManualResetEventSlim ensures this program never prints anything.)

using System;
using System.Net;
using System.Net.Sockets;
using System.Threading;

namespace ConsoleApplication8
{
    class Program
    {
        static void Main(string[] args)
        {
            TcpListener listener = new TcpListener(IPAddress.Any, 5678);
            listener.Start();
            client = new TcpClient("localhost", 5678);
            var server = listener.AcceptTcpClient();
            ThreadPool.QueueUserWorkItem(callback =>
                {
                    byte[] bufferInner = new byte[8192];
                    while (true)
                    {
                        try
                        {
                            int bytes = server.Client.Receive(bufferInner);
                            if (bytes == 0)
                                return;
                            server.Client.Send(bufferInner, 0, bytes, SocketFlags.None);
                        }
                        catch
                        {
                            break;
                        }
                    }

                });
            ThreadPool.QueueUserWorkItem(callback =>
                {
                    byte[] bufferInner = new byte[24];
                    for (int i = 0; i < bufferInner.Length; i++)
                        bufferInner[i] = (byte) (i+1);
                    for (int i = 0; i < 10000000; i++)
                        client.Client.Send(bufferInner);
                });
            asyncEventArgs1 = new SocketAsyncEventArgs();
            asyncEventArgs1.Completed += AsyncEventArgsOnCompleted;
            asyncEventArgs2 = new SocketAsyncEventArgs();
            asyncEventArgs2.Completed += AsyncEventArgsOnCompleted;
            buffer = new byte[bigBufferLength];
            BeginReceive();
            while (true)
            {
                Console.ReadKey();
            }
        }

        private const int bigBufferLength = 1024*1024;

        private static void BeginReceive()
        {
            if (asyncEventArgs == asyncEventArgs1)
                asyncEventArgs = asyncEventArgs2;
            else
                asyncEventArgs = asyncEventArgs2;
            int length = rnd.Next(2) == 0 ? 8192 : 100;
            prevLastWrapped = lastWrapped;
            lastWrapped = false;
            if (lastWriteIndex + length <= bigBufferLength)
                asyncEventArgs.BufferList = new ArraySegment<byte>[]
                    {new ArraySegment<byte>(buffer, lastWriteIndex, length),};
            else
            {
                asyncEventArgs.BufferList = new ArraySegment<byte>[]
                    {
                        new ArraySegment<byte>(buffer, lastWriteIndex, bigBufferLength - lastWriteIndex),
                        new ArraySegment<byte>(buffer, 0, lastWriteIndex + length - bigBufferLength),
                    };
                lastWrapped = true;
            }
            Clear(length);
            prevLastWriteLength = lastWriteLength;
            lastWriteLength = length;
            prevWasSync = wasSync;
            wasSync = false;
            int newVal = Interlocked.Decrement(ref counter);
            if (newVal != -1)
            {
                Console.WriteLine("BeginReceive called wrong number of times. {0}", newVal);
                throw new Exception("Fail!");
            }
            //callerExited.Reset();
            if (!client.Client.ReceiveAsync(asyncEventArgs))
            {
                wasSync = true;
                //callerExited.Set();
                AsyncEventArgsOnCompleted(null, asyncEventArgs);
            }
            else
            {
                //callerExited.Set();
            }
        }
        //private static ManualResetEventSlim callerExited = new ManualResetEventSlim(false);

        private static bool lastWrapped = false;
        private static bool prevLastWrapped = false;

        private static bool wasSync = false;
        private static bool prevWasSync = false;

        private static volatile int counter = 0;

        private static void Clear(int length)
        {
            if (lastWriteIndex + length <= bigBufferLength)
                Array.Clear(buffer, lastWriteIndex, length);
            else
            {
                Array.Clear(buffer, lastWriteIndex, bigBufferLength - lastWriteIndex);
                Array.Clear(buffer, 0, lastWriteIndex + length - bigBufferLength);                
            }
        }

        private static void AsyncEventArgsOnCompleted(object sender, SocketAsyncEventArgs socketAsyncEventArgs)
        {
            int newVal = Interlocked.Increment(ref counter);
            if (newVal != 0)
            {
                Console.WriteLine("OnCompleted called wrong number of times. {0}", newVal);
                throw new Exception("Fail!");
            }

            if (socketAsyncEventArgs.SocketError != SocketError.Success)
                return;
            if (socketAsyncEventArgs.BytesTransferred == 0)
                return;
            if (socketAsyncEventArgs.BytesTransferred > lastWriteLength)
            {
                Console.WriteLine("Receive too long, Sync:{0}:{1} LastWrapped:{2}:{3} Length:{4}:{5} ClaimedLength:{6}", wasSync, prevWasSync, lastWrapped, prevLastWrapped, lastWriteLength, prevLastWriteLength, socketAsyncEventArgs.BytesTransferred);
            }
            if (buffer[(socketAsyncEventArgs.BytesTransferred-1 + lastWriteIndex) % bigBufferLength] == 0)
            {
                Console.WriteLine("Receive too short-quickcheck , Sync:{0}:{1} LastWrapped:{2}:{3} Length:{4}:{5}", wasSync, prevWasSync, lastWrapped, prevLastWrapped, lastWriteLength, prevLastWriteLength);
            }
            //for (int i = 0; i < socketAsyncEventArgs.BytesTransferred; i++)
            //{
            //    if (buffer[(i + lastWriteIndex) % bigBufferLength] == 0)
            //    {
            //        Console.WriteLine("Receive too short , Sync:{0}:{1} LastWrapped:{2}:{3} Length:{4}:{5} ActualLength:{6}", wasSync, prevWasSync, lastWrapped, prevLastWrapped, lastWriteLength, prevLastWriteLength, i);
            //        break;
            //    }
            //}
            lastWriteIndex = (lastWriteIndex + socketAsyncEventArgs.BytesTransferred)%bigBufferLength;
            //callerExited.Wait();
            BeginReceive();
        }

        private static SocketAsyncEventArgs asyncEventArgs;
        private static SocketAsyncEventArgs asyncEventArgs1;
        private static SocketAsyncEventArgs asyncEventArgs2;
        private static byte[] buffer;
        private static int lastWriteIndex;
        private static int lastWriteLength;
        private static int prevLastWriteLength = 0;
        private static Random rnd =new Random();
        private static TcpClient client;
    }
}

TCO13 R1B

As previously mentioned I didn’t have to get up at 4am to do this round, but a quick look at the final scoreboard it appears the questions were significantly easier. Almost half of the advancers solved the 1000pt question correctly. The 600th place cut-off required solving the 250pt problem in decent time and a successful challenge as well.

In this competitive environment only 1 Australian advanced, 2 more falling just below the cut-off having solved the 250pt quickly.

Q1) Given an even number of integers, the need to be made into pairs. These pairs can have each member added together to create a value of that pair. Determine the minimum difference between the lowest and highest values in such a set of pairs. Between 2 and 50 starting integers and numbers are all in the range 1 to 1000.

A1) Without justification – the simple approach is to sort the inputs and pair them from outside to inside. Then just return the difference between the largest and smallest pairs created this way. This is also correct, but proving it appears not so trivial. (Certainly in the competition this would have put me at risk, not being able to prove it usually delays my submission.)

Q2) Given a rectangular array which is partially populated, and the ability to clear up to R consecutive rows at a time or C consecutive columns at a time, determine the minimum number of ‘clears’ required to clear the array completely. Rectangular array is maximum 15×15.

A2) So the first key point to this problem is that the maximum size is 15. This means a brute force on every subset of rows, or columns (but not both) is plausible. The next key point is that the order in which you perform your clears does not matter, if 2 clears overlap the common cells will be cleared regardless of order. So you can do all your row clears before your cell clears, without loss of generality.

This leads to a fairly simple approach – try every subset of rows (including empty) and then use a greedy clearing to determine the minimum number of moves by columns to clear what remains, and also to determine the minimum number of moves to implement the original subset of rows selected. (Don’t worry if your greedy approach clears excess to the selected subset, taking the minimum number of moves is more important. The actual subset that you erase is a better solution anyway.)

Simple runtime is O(2^(Min(w,h)) * w*h) – each subset requires 2 greedy clearing passes, which cost proportional to the size of the full array in the worst case. This runtime chooses the smallest dimension, but in practice the code is simpler if you just standardise on rows or columns, it will run fast enough regardless. An ideal solution might be to implement it faster by using a sparse array representation much like the one used in the Dancing Links algorithm. But that certainly wasn’t required here.

Q3) Given a dictionary of distinct words and the following rules, determine the minimum number of remaining words you can end up with. There are 2 rules. First rule is that if any time you get the same word duplicated, the duplicate and the word are removed. Second rule is that you can replace any word with the same word with an even length prefix reversed.

A3) The key point for this problem is that the even length prefix reversal lets you create almost any anagram out of an even length word. For odd length words you get almost any anagram where the last character hasn’t moved. Secondly it doesn’t matter what order you remove pairs that match along the way, because there are no dead ends in the reordering and if a can be made to match with b and c, b and c can be made to match.

Almost any anagram isn’t every anagram. Specifically it is any anagram of the original pairs, where each letter in a pair itself can be in any order. The pairs never get split up in the process, so you can’t reorder 1234 into 1324. We can create a canonical form for matchable groups in this subset of anagrams by sorting each pair of letters, then sorting the resultant pairs. For example the word ‘word’ would first be rearranged to ‘owdr’ then secondly to ‘drow’ to given the canonical order. (FIFO is its own canonical ordering under these rules…)

So for each word, if it is even length, replace it with the word with its characters in canonical order. If it is odd length, do the same but leave the last character in place. Then do a unique count of the result words. If the count for a unique word is even, all pairs can be removed, if the unique count is odd, you have to keep 1 copy of that word. So simply count those odd ones up and you have your answer.

This was pretty easy for a 1000pt problem (certainly it seems easier than the 500pt problem) – but identifying the way to canonicalise the available subset of anagrams isn’t totally trivial, so it might stump a few people.