Algorithms Weekly by Petr Mitrichev: A lifting week

The Universal Cup held its Stage 2: Paris during the Oct 6 - Oct 12 week (problems, results, top 5 on the left). This was one of the very rare occassions where the winner seemingly was not a veteran team, but a team THU1 consisting of university students. They were already ahead of the usual suspects on 12 problems by penalty time, and solving the last problem 10 minutes before the end was the icing on the cake. Congratulations!

Codeforces Round 1058 wrapped up the week on Sunday (problems, results, top 5 on the left, analysis, my screencast). As the custom goes these days, I have started well on the easier problems but then got stuck, trying to make progress in either of the three remaining problems, and actually being close to solving D2 at the end, but not quite.

Nobody was able to solve everything this time, but the top 5 shows that many different strategies were competitive. In the end solving the problems fast from left to right was the winning recipe for Benq. Congratulations!

In my previous summary, I have mentioned another Codeforces problem: you are given n<=250000 numbers which are written in sorted order. You need to process q<=250000 queries, each query is a segment [l, r] of the numbers. For each query, you need to find the maximum amount of numbers within the segment [l, r] that can be taken in such a way that for each three taken numbers the difference between the largest and the smallest of them is more than z. The number z is the same for all queries.

The first step is solving the problem for a single [l, r] query. It is not hard to convince oneself (if we want to have a formal proof, we can use the exchange argument) that we can do it greedily: take the first two numbers in the segment a_l and a_l+1, then take the first number at position l+2 or higher that is bigger than a_l+z, then take the first number to the right of that that is bigger than a_l+1+z, and so on.

Now we need to learn to simulate this process fast so that we can process many queries with different values of l within the time limit. The key observation is that the greedy solution can be split into two almost independent chains: the odd-numbered values are a_l, then the first number a_k that is bigger than a_l+z, then the first number that is bigger than a_k+z, and so on; and the even-numbered values are a_l+1, the first number that is bigger that a_l+1+z, and so on. So a solution could count the numbers in those two chains independently and add the results up.

How do we count the numbers in one of those chains, starting with a given position a_l and until we pass a_r? This can be done using a standard algorithm called binary lifting: we can precompute where do we end up after we do 2ⁱ jumps from position l for each l and i. Then to answer an [l, r] query we first try to make 2ⁱ jumps with the highest possible i, if we overshoot a_r then we skip this step otherwise we take it, and then do the 2^i-1 step, and so on. Since the true answer has a binary representation, we will find it in just O(log n) steps per query, and the precomputation takes O(n*log n).

However, this approach is not the full solution since the two chains are only "almost" independent. More specifically, it can happen that the arrive at the same position, and then one of the chains will instead take the next consecutive number as described above.

It turns out that for a given two starting positions l and l+1, we can again use binary lifting to find out when this happens! If two chains are at the same position at a certain moment, they will also be at the same position after any number of additional jumps. It means that we can still use the binary lifting idea of trying a large 2ⁱ jump first for both chains, and if we are in the same position then we roll it back and try 2^i-1, and so on.

Note that it is not necessarily true that the two chains will reach the same number at the same step, though. For example, it might happen that a_l+1>a_l+z, and then the first chain will have a_l+1 at the second position. However, because the jump function is monotonic, the second chain will always be at the same position or to the right of it for the same step, and the first chain will always be at the same position or to the right of it when one step ahead of the second chain. So we just need to check those two possibilities (a collision at the same step, or a collision where the first chain is one step ahead) in the binary lifting condition.

This way we can precompute for all n options for the starting position l, where, and after how many jumps, do the chains starting from a_l and a_l+1 collide. This precomputation takes O(n*log n). Note that after the two chains collide, we now have the collision position k, and the two chains continue from a_k and a_k+1, so we have exactly the same situation as in the beginning.

Now we can consider a new type of jump, from a_l to a_k as described above, and do binary lifting on those big jumps! This time we need to precompute two things for each l and i: where do we end up after 2ⁱ big jumps, and how many small jumps does that entail.

And finally to correctly answer an [l, r] query, we first do the binary lifting steps using big jumps from a_l, and as soon as the next big jump would overshoot a_r, we do the binary lifting steps using small jumps for the remainder.

The solution was ultimately about applying a standard algorithm three times, not about coming up with a novel approach. However, I still find it quite beautiful how the same idea just keeps on giving.

Thanks for reading, and check back for last week's summary!

Algorithms Weekly by Petr Mitrichev

Monday, October 20, 2025

A lifting week

3 comments: