Algorithms Weekly by Petr Mitrichev: A 3-person week

Facebook Hacker Cup 2019 Final in Dublin was the main event of the Oct 21 - Oct 27 week (problems, results, top 5 on the left, results announcement video, recap video, analysis). Gennady was so much faster solving the first few problems that he could afford to test his solution for the last one extensively, giving myself a glimmer of hope as I was the first to finish all solutions. It turned out to be just that, a glimmer of hope :) Congratulations to Gennady!

Codeforces Round 596 happened a day later (problems, results, top 5 on the left, analysis). Benq ended up winning the round after he found 10 successful challenges and wxhtxdy's solution for F exceeded the time limit. There was a bit of luck involved as not all rooms even had so many incorrect submissions available for challenging, but still finding and executing those challenges is no small feat. Congratulations to Benq!

I have ruined my contest by trying to squeeze in a solution for E that was just too slow. I got into this very bad state where I thought repeatedly that after one more small optimization it will work, while having no proof that it will be the case. My solution had a 3ⁿ term in its complexity, while the correct one only has a 2ⁿ :(

Here is the problem: you have n<=16 positive integers such that their sum does not exceed 2000, and another integer k, 2<=k<=2000. All n integers are not divisible by k. In one step, we can erase two numbers and instead write their sum divided by k repeatedly until the result is no longer divisible by k. For example, if k=2 and we have numbers 5 and 19, we can erase them and instead write 3 as 3=(5+19)/2/2/2 and 3 is not divisible by 2. Our goal is to end up with having just the number 1. Can you find a way to do that without iterating over 3ⁿ of something?

In my previous summary, I have mentioned a bunch of problems. The first one came from Codeforces: you are given two strings of even length, a and b, consisting of characters 0 and 1. You start with the string a. In one step, you can choose any prefix of the current string that has even length and reverse it. Your goal is to obtain the string b in at most n+1 steps. You don't need to minimize the number of steps. The length of each string is even and at most 4000.

Since we only ever reverse prefixes of even length, it's natural to see what happens to blocks of two consecutive characters. Blocks of 00 and 11 stay the same, 01 turns into a 10 and vice versa. Therefore, in order for this transformation to be possible at all, we need the number of 00 blocks to be the same between the two strings, and the number of 11 blocks to be the same between the two strings (and therefore the total number of remaining blocks, which are 01 and 10, will also be the same).

It turns out that this is both necessary and sufficient for a transformation to exist. One way to do it is to keep converting the string a into the string b from right to left. Consider the rightmost block of b, for example it's 00. Then we can find the same block anywhere in a, reverse the prefix ending with this block so that it moves to the first position, and then reverse the entire string a so that it moves to the end, and then we can forget about the last two characters of a and repeat the process.

We have handled two characters in two reversals, so it looks like we will finish in at most n steps. However, there is one tricky case: when the block we need is a 01, and a does not have any 01s, just 10s (or vice versa). In this case we might need an extra reversal in the middle to turn one into the other when it's in the first position of a, so we will end up spending 3 reversals per 2 characters, potentially using up to 1.5n steps which is too much.

The official solution modifies this process a bit to make sure we need at most one such extra reversal — check it out. I have tried to come up with something in that vein but could not make it work, so I've used a randomized approach: I've noticed that in some cases we can only spend 1 reversal per 2 characters, namely when the required block is already in the beginning of a (in reversed form). If the number of times we spend 1 reversal is the same or at most one less than the number of times we spend 3 reversals, then we're good. And intuitively the "1 reversal" situation is much more likely to happen as we just need the right type of block in the beginning, while for the "3 reversals" situation to happen we need some type of block to not exist at all. Since we have quite some freedom in choosing which block to move to the last place at each step, making this choice randomly, and then starting from scratch in case we exceed n+1 steps turned out to be good enough.

The second problem was from TopCoder: you are given a number n between 1 and 1000, and need to construct any tree with n vertices that has at least 2.62ⁿ colorings of the following kind: each vertex is colored red, green or blue in such a way that the distance between any two red vertices is at least 3 (there are no constraints on green and blue vertices).

My solution, which ended up being the only one that passed the system tests, was very simple: we start with a tree that looks like a chain. Then we repeatedly make random changes to the tree, keeping the change if the number of colorings increases, and reverting the change if it decreases, until we reach the required number of colorings.

This still requires to solve the sub-problem of counting the number of colorings, but it is solved with a relatively standard dynamic programming algorithm.

Since there were only 1000 possible inputs, I've ran my solution on all of them and was sure that it works when submitting it. And then I ended up resubmitting it with just seconds left in the round :) When running it on the 1000 inputs for the first time, I've noticed that it takes some time to converge to the required number of colorings for inputs under ~700, but the solution becomes instant for inputs larger than that. Somehow I did not think this was bad at the time, but of course it was. The reason for this was that the number of colorings overflowed the double floating-point type, so it was immediately finding a tree with "positive infinity" colorings :)

Finally, I have also mentioned an Open Cup problem: there is a hidden array a with at most 250 elements, and you need to determine its contents after asking at most 30 queries. Initially, you just know that size of a and that each element of a is an integer between 1 and 10⁹, inclusive. In one query, you can either ask for the value of a particular element of a, or choose a set of indices of elements of a, and get back the multiset of k*(k-1)/2 absolute values of pairwise differences between elements with those indices (where k is the number of indices you've chosen). Note that for a query of the second type you get back a multiset, in other words there is no order and you don't know which difference corresponds to which pair.

The first crucial trick that our team came up with is the following: if we send two queries of the second type for a set S and for a set S+{x}, and then find the difference between the returned multisets, then you get a set of absolute values of differences between x and all elements of S.

Now, if we somehow magically knew an element x which is the global minimum or maximum, we could solve the problem in the following way: let's repeat the above trick 8 times, on the i-th try asking for differences between x and a set of indices which have the i-th bit set to 1.

Since we have less than 2⁸ elements, and all differences will be unique thanks to x being the global minimum or maximum, this will allow us to uniquely identify the element corresponding to each difference by looking at whether each bit in its index is 0 or 1. The difference corresponding to all bits being 0 will never appear in our output, but we can just start the indexes from 1 to avoid this issue.

Now we just need to find the value of x and whether it is the minimum or the maximum, which we can do with two queries of the first type, and all other elements can be obtained by adding (when x is the minimum) or subtracting (when x is the maximum) the corresponding difference. We have spent only 8*2+2=18 queries in this part.

How to find the global maximum or minimum using the remaining 12 queries? It turns out that we can craft a binary search using the presence of the overall maximum absolute difference as our test. First, we send a query of the second type for all indices, and find the overall maximum absolute difference — it will be the difference between the global maximum and the global minimum.

Then, suppose we have identified a subset S which contains the global maximum, the global minimum, or both (initially S will be equal to all indices), and let T be its complement (initially empty). Let's split S into two equal parts A and B. Let's ask send a query of the second type for the set A+T. In case this query returns the overall maximum absolute difference, then A contains at least one of the global maximum and minimum (because otherwise T would have to contain both, which is a contradiction). In case it does not, then B must contain at least one of them. So we can reduce our set of candidates in half using one query of the second type, and therefore find our global maximum or minimum in 1+8=9 queries, solving the entire problem in 27 queries.

This problem was one of the few cases where we have actually used the whole Open Cup 3-person team to discuss it and make incremental progress until all pieces of the puzzle fell together. I found the "uncertain" binary search quite cool in particular, as we manage to look for one of the two elements without knowing which one of the two it is, and without even knowing if the remaining subset contains only one of the elements or both :)

Thanks for reading, and check back for more!

Algorithms Weekly by Petr Mitrichev

Wednesday, December 25, 2019

A 3-person week

No comments:

Post a Comment