Algorithms Weekly by Petr Mitrichev: An odd knapsack week

Pinely Round 3 on Codeforces took place on Saturday (problems, results, top 5 on the left, analysis). Quite a few people got the first 8 problems correctly, some with more than half of the contest time left, but the last two problems proved to be quite tough to crack. Only zh0ukangyang got problem I right, and only maroonrk got problem H right, therefore earning the first two places. Congratulations!

Quite interestingly, this time there was no contestant who successfuly solved H or I after skipping one of the easier problems (so rainboy sadly scored 0 :(), which goes to show that such strategy looks amazing when it works, but also carries huge risks :)

As for my contest, I wasted way too much time on F and G, and therefore did not spend any meaningful amount of time on H and I.

In problem F, when trying to generalize my solution from F1 to also solve F2 I got a quadratic-time solution that seemed like it could be sped up by using fast convolution; however, I could not immediately see how, and the number of accepted solutions indicated that there is something simpler there. In the end, I got the worst of both worlds: I did not find the simpler solution, but I also did not pull myself together to write down the quadratic formula explicitly on paper. When I did this after the end of the round, I have quickly upsolved it using fast convolution since we were summing up a product of various factorials and inverse factorials of amounts that depend either on u, or v, or on u+v (where u and v are the nested loop variables that yield quadratic running time).

In problem G, I wasted about half an hour when my initial solution got TLE, even though its complexity was O(n). It constructed a suffix array instead of using the z-function though, so I guess the lesson learned here is that for practical purposes the O(n) suffix array construction should be treated as O(nlogn), as I assume the constraints were so high (n=10⁷) precisely to cut off O(nlogn) solutions. In the end I was able to replace the suffix arrays usage with z-function.

Having read maroonrk's solution for H (spoilers, do not click if you want to try solving yourself first!), I regretted a lot that I did not have time left to try solving it as the problem was amazing. Here is the problem statement: you are given an array of size 3*10⁵. In one operation, you can take any segment of this array of even length, and swap the 1st and 2nd numbers in that segment, the 3rd and 4th numbers, and so on. You need to sort the array in at most 10⁶ operations. You do not need to minimize the number of operations. Can you see a way?

In my previous summary, I have mentioned an AtCoder problem: you are given up to 100000 positive integers A_i, each up to 10⁹. Your goal is to split each A_i into two non-negative integer parts A_i=B_i+C_i so that there is no way to choose exactly one of B_i or C_i for each i such that the sum of chosen numbers is equal to exactly half of the sum of all A_i (that sum is guaranteed to be even).

When solving this problem, the first observation I made is that it is in fact hard, likely impossible, to check in a reasonable time if a given split into B_i or C_i satisfies the requirements or not, as it requires solving a pretty large instance of the knapsack problem. This may be the reason that the problem does not require to print a certificate, just a Yes/No answer.

This naturally leads to the following question: for which splits into B_i or C_i we can reasonably easily prove that achieving exactly half is impossible? To make such proof easier, it makes sense to split all even numbers exactly in half such that B_i=C_i: then we know for sure those numbers' contribution to the sum, and there are fewer possibilities to check. However, if all numbers are even and we do this for all numbers, then it would be possible to achieve exactly half of the total sum (in fact, it would be impossible to achieve anything else :)). But then we can do this even split for all numbers except one, and for one number (say, A₁) we set B₁=0 and C₁=A₁. Then we get exactly half from all other numbers, but if we choose B₁ then the sum is slightly less than exactly half of the total, and if we choose C₁ it is greater. Therefore we have solved the problem for the case where all A_i are even (the answer is always Yes).

What can we do about odd numbers? They cannot be split exactly in half, but we can try to build on the above construction: let us split all odd numbers almost in half, such that B_i+1=C_i, and split one number, the biggest one (assume we reorder the numbers and it is A₁), as B₁=0 and C₁=A₁. Now if the amount of odd numbers is less than A₁, then we still cannot achieve exactly half, because if we choose B₁, even taking C_i from all odd numbers will still leave us short of half of the total, and if we choose C₁, we overshoot. There is a slight complication that happens when A₁ is odd, as then we should not count it towards the amount of odd numbers we split almost in half; however, since the total amount of odd numbers is always even (because the sum is even), this does not affect our comparison and we can still compare if A₁ is strictly greater than the total amount of odd numbers.

This criterion was my first submission, however, it got WA. As I did not have any immediate ideas for other situations where achieving exactly half is clearly impossible, I implemented a brute force solution and stress-tested it against this one. The smallest counterexample it produced was: 1, 3, 3, 3. In this case we set all B_i=0 and C_i=A_i and there is no way to achieve the required sum of 5 from some subset of 1, 3, 3, 3. The first idea after seeing this was that divisbility by 3 is somehow a factor; however, quite quickly I realized that we can slightly generalize the construction from the first submission above: we take all odd numbers, sort them, and split them into two parts of odd size. In the part containing the smaller numbers, we set B_i+1=C_i, and in the part containing the bigger numbers, we set B_i+D=C_i, where D is the smallest of those bigger numbers. Now if the size of the part with smaller numbers is less than D, then we always fall short of half of the total if we choose more B_i's than C_i's in the part with the bigger odd numbers, and we always overshoot otherwise.

This solution passed the stress-test against the brute force for small constraints, therefore I submitted it and it got accepted. I did not bother proving it formally since the stress-test was proof enough, but the intuition is somewhat clear: now we say No only if there are at least two odd numbers up to 1, at least four odd numbers up to 3, at least six odd numbers up to 5, and so on until we run out of odd numbers, and the total amount of odd numbers is at least the biggest number. I did not write down all details, but the following method likely works to achieve exactly half in this case: we first go through all even numbers, and then through all odd numbers in decreasing order. If the sum we accumulated so far is bigger than half of total of the numbers processed so far, we take the smaller one of B_i and C_i, otherwise the bigger one. We can now prove by induction that after processing all odd numbers except the x smallest ones, the current sum differs from half of all processed numbers by at most (x+1)/2, which means that in the end it is exactly equal.

Thanks for reading, and check back next week!

Algorithms Weekly by Petr Mitrichev

Sunday, December 24, 2023

An odd knapsack week

2 comments: