Sunday, December 17, 2023

A three-step week

The 2nd Universal Cup Stage 13: Shenyang took place last week, but its results were not public when I wrote the last summary (problemsresults, top 5 on the left, analysis). The first two places in this round coincide with the first two places in the overall Universal Cup standings, and they were also the only teams to solve 12 problems. So I guess one could say this round went exactly as expected :) Congratulations to USA1 and HoMaMaOvO!

This round used the problemset from an ICPC regional contest, and the best team from that contest is only on place 23 in the scoreboard with 9 problems solved, which underscores how the Univesal Cup gathers the best teams in the world.

The 2nd Universal Cup Stage 14: Southeastern Europe took place this Saturday (problems, results, top 5 on the left, overall standings, analysis). Team HoMaMaOvO got the second place just like last week, but the winner was different: team 03 Slimes got 11 problems solved at just 1:43 into the contest, and therefore had all the time in the world to solve the 12th. Congratulations on the win!

This round has also used the problemset from an ICPC regional contest, but this time the best team from the onsite round placed a bit worse — at place 36, with 9 problems solved.

Finally, AtCoder Grand Contest 065 wrapped up this week (problems, results, top 5 on the left, overall standings, analysis). There was a huge gap in difficulty and in scores between the first four problems and the last two, therefore in this round it could actually be a very good strategy to start with one of the two difficult problems to be able to properly estimate how many easier problems one can squeeze in the remaining time. mulgokizary and newbiedmy executed this strategy successfully to place 3rd and 4th, well done! Of course, it's even better if one can solve the four easier problems and one difficult one, as zhoukangyang and ecnerwala did :) Congratulations to them as well!

The round went quite well for me, in a large part thanks to the fact that I was able to quickly find this page for problem D, and this paper for problem F. However, while the implementation for D is pretty straightforward with the linked formula, one still needs to make a few more steps on top of the linked paper to get F, and I managed to get stuck in those steps: by the end of the round, my solution returned 125405280 instead of 128792160 for n=10 :(

While solving problem C in this round, I followed a quite typical pattern, at least for AtCoder problems: come up with a reasonably simple sufficient but not clearly necessary condition, implement it, submit, get WA, implement a stupid solution and run a stress test, find a case where the solutions differ, come with another, also reasonably simple sufficient but not clearly necessary condition, implement it, run stress test with larger cases, find a bug in the implementation, fix it, pass stress test, submit, get AC :) I think seeing the diffs found by stress test was instrumental for me to discover the correct solution idea. For those who solved this problem during the round or want to upsolve now, were you able to do it without the stress test?

Here's that problem's statement, aptly titled "Avoid Half Sum": you are given up to 100000 positive integers Ai, each up to 109. Your goal is to split each Ai into two non-negative integer parts Ai=Bi+Ci so that there is no way to choose exactly one of Bi or Ci for each i such that the sum of chosen numbers is equal to exactly half of the sum of all Ai.

This round has seemingly concluded (even though a man can hope: maybe one more AGC in 2023 will pop up as a Christmas present? :)) the AtCoder Race Ranking 2023 (top 14 on the left). Therefore it seems that I have barely missed qualifying to the World Tour Finals in Japan. Amazingly, the cutoff for the top 12 did not change at all in this round, as Um_nik has kept his final qualifying place while being outside of the top 30 in the round. It means that even the fourth place would have been enough for me to qualify. Not making a wrong attempt on C (or convincing Gennady to skip this round to increase my chances) would have gotten me the fifth place, but to get fourth I really had to solve either E or F. Well, I will try harder next year, and huge congratulations to the 12 finalists!

In my previous summary I have mentioned a Hacker Cup problem: two players are playing a nim-like game, starting with two piles of stones. In one move, a player first chooses one of the remaining non-empty piles, let's say this pile has k stones. Then they can take between Ak and Bk stones from this pile, and they also must create a new pile with Ck stones (1 <= Ak <= Bk <= k, 0 <= Ck < k). Since 1 <= Ak and Ck < k, this game will eventually terminate, and the player unable to make a move loses the game. Your goal is to find for each size (between 1 and n) of the first of the two initial piles the smallest size of the second initial pile that leads to a losing position, and print the sum of those n sizes. n is up to 2 million.

The first step in solving this problem is pretty straightforward. As the games on each pile are independent, we can use the Sprague-Grundy theorem, therefore we just need to find the nimber for a pile of size k for each k. Denoting this nimber as Nk, from the game rules we get that Nk=mex(NiNCk over all i between k-Bk and k-Ak).

So we need some data structure that can find mex on a range, with the added twist that all numbers on the range are first xored with some constant. Finding things on a range is typically done with a segment tree, but to find mex even without the xor-constant complexity would require to propagate a lot of information along the tree.

The key step to progress further in solving this problem is to actually forget about the ranges for now, and focus on the xor-constant part. Suppose we just have a static set of numbers, and need to answer questions: what is the mex of all those numbers xored with a given constant? In this case it is reasonably clear what to do: we need to determine the mex bit-by-bit from the highest bit to the lowest bit. Suppose we want to find the k-th bit having already found out that the answer is equal to r for bits higher than k, in other words we know that the answer is in range [r,r+2k+1), and need to tell if it is in range [r,r+2k) or  [r+2k,r+2k+1). Because bitwise xor is applied independently to high and low bits, we simply need to know if there is at least one number missing in our set from the range [rs,rs+2k), where s is the bits k and higher from our constant. And finding if a number is missing on a range can be done with a balanced tree or again with a segment tree. Note that even though we forgot about the ranges, the ranges have reappeared: instead of ranges on k, we now have ranges on Nk.

Now let us reintroduce the ranges on k. First, let us consider only half-ranges: suppose Ak=1 for all k. Then in the above bit-by-bit solution we need to find out if there is at least one number missing from a given range on a suffix of Nk. This can be done by modifying the segment tree approach: let us use a segment tree that, instead of just remembering if a certain number has appeared or not, will remember is rightmost appearance. Then we can find the minimum of those appearances on the needed range, and compare it to k-Bk. In fact, since all ranges of nimbers that we query are aligned with the powers of two, each query will exactly correspond to one of the nodes in the segment tree, and therefore can be done in O(1) (but an update still needs to touch O(log(n)) nodes).

What to do about the other side of the range on k, in other words when Ak>1? Here comes another relatively standard trick: since we only look at indices up to k-Ak, we could have executed this query when we were processing k'=k-Ak+1, and at that moment this query would be a half-range with only the left boundary, which we can handle using the procedure described above. So we would like to already compute Nk when processing k'=k-Ak+1, however we cannot do that since we might not know NCk at that point yet if Ck>k-Ak. This naturally points us towards persistent data structures: we can modify our segment tree to be able to not just query what is the minimum on a range, but to also to query what was the minimum on a range at any previous state of the data structure, in particular when k'=k-Ak+1.

There are several standard ways to do it, one of which is to actually store a tree as a set of immutable nodes with each node pointing to children, and every time we need to change the value in a node we would actually clone the node with the new value instead, together with its path to the root. This way we only create O(log(n)) additional nodes per operation, so the total memory usage is still acceptable at O(n*log(n)), but now since all nodes are immutable we can simply query any old root of the tree to get the minimum on a range at a point in the past.

I think this problem is educational since it has three steps of "unwrapping the present", as we first solve an easier version of the problem and then gradually add back the full complexity. Each particular step is more or less a well-known trick, but one still needs to find which simplifcation of the problem to tackle first, and for that it is vital for those well-known tricks to really be "in RAM", as well as to have a good intuition about what is not solvable at all, so that one can explore many directions and find the correct three-step path. If one has to think for half an hour to solve each particular step, there is really no chance to find the correct sequence of three steps in time, as there will necessarily be other promising directions that won't lead anywhere but waste a lot of solving time.

Thanks for reading, and check back next week!

No comments:

Post a Comment