Saturday, August 2, 2025

A focus week

There were no contests that I want to mention last week, which allows us to focus on the two problems I mentioned in the previous summary. Some blog meta: I have been trying to write this post for almost a week now, of course not full time but whenever I had nothing more important to do. However, I wanted to solve both problems myself to be able to share my own insights and not just rehash the editorials, and it seems so hard for me to focus on solving problems outside of a running contest these days! (one can't help but mention that focusing during a running contest is also a struggle, of course :))

The first problem was from the AtCoder World Tour Finals 2025: you are given n<=100 days and m movies, the i-th movie running from day li to day ri (all pairs (li, ri) are different). Find the sum of the following values: for each of the 2^m sets of movies, what is the maximum number of different movies from that set one can watch, if one watches at most one movie per day?

During the broadcast I got some ideas from listening to Riku and reading the contestants' solutions. However, it turns out that even with those ideas I could not solve this problem after at least 5 hours of thinking (probably more since I did not count so precisely), so I actually checked the editorial to complete the below solution.

The first idea that I had (and which Riku confirmed was on the right track) was to choose a particular algorithm for finding the maximum for a particular set (out of 2^m possible) of movies. This is a classical problem of course, and the typical greedy approach is to sort the movies by end day, process them in this order, and for each movie that still has a free day in its interval, watch it on the earliest such day. This works because when we process the movies in increasing order of end day, an early free day is less flexible than a late free day (every movie that can be watched on the early one can also be watched on the late one), so it makes sense to assign the least flexible suitable day to each movie.
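To make the greedy concrete, here is a minimal sketch in Python (representing movies as (l, r) pairs over 1-based days is my own choice, not the problem's input format):

```python
def max_movies(movies, n_days):
    # Classic greedy: process movies in increasing order of end day and
    # watch each one on the earliest day that is still free in [l, r].
    free = set(range(1, n_days + 1))  # days with no movie watched yet
    count = 0
    for l, r in sorted(movies, key=lambda m: m[1]):
        day = min((d for d in free if l <= d <= r), default=None)
        if day is not None:  # skip the movie if its interval is full
            free.remove(day)
            count += 1
    return count
```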

Now, instead of just summing those maximums over all 2^m sets abstractly, we will sum the values as computed by the above greedy. This is of course the same number, but we can now regroup the sum, for example by looking at how often we watch a movie on a particular day.
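For small m, this sum can be checked directly by brute force, which is handy as a reference while developing the faster counting (this uses the max_movies sketch above):

```python
def total_over_subsets(movies, n_days):
    # Brute force over all 2^m subsets; only feasible for small m.
    m = len(movies)
    total = 0
    for mask in range(1 << m):
        chosen = [movies[i] for i in range(m) if mask >> i & 1]
        total += max_movies(chosen, n_days)
    return total
```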

More precisely, let us look at how often we do not watch a movie on a particular day a. It turns out that the problem neatly separates into two independent subproblems: if day a stays free throughout the above greedy, then a movie that starts on or before day a is never watched after day a (the earliest free day in its interval is at most a, so the greedy would pick day a or an earlier one instead). Therefore the movies that start on or before day a only affect days before a, and the movies that start after day a only affect days after a. The actual greedy processes movies that start before and after a in some interleaved order, but that does not matter, since they do not interact with each other as long as a is free.

A problem splitting into two independent parts naturally suggests dynamic programming. The subproblem happening after day a in this case is completely identical to the original problem, which is very nice for the dynamic programming.

However, what happens before day a has an additional constraint: we must never watch a movie on day a. And actually, to avoid double-counting we should likely choose a as the first day where we do not watch a movie, so the subproblem now is: for a given segment of days [l, r], and for movies that start in this segment, how many subsets are there that lead to us watching a movie on all days in this segment except the last day r?

This subproblem is not the same as the original problem, so we need to come up with a different way to count this. First, I considered possibilities to get this number by subtraction. If we know the answer for segments [l, r'] for r'<r, it is relatively easy to use subtraction to find the number of subsets where we watch a movie for all days between l and r-1, but I could not figure out a way to split this number into those where we also watch a movie on day r and those where we don't.

Then I tried to come up with a similar dynamic programming argument. A natural question to ask is: what is the last day on the segment [l, r] that gets filled by the above greedy? Suppose this day is d (I will use d rather than m, which already denotes the number of movies). Now it would seem that we can solve the same kind of subproblem for the segment [d+1, r] and multiply its answer by the answer for the part to the left of d. However, I think such multiplication leads to double-counting, since it matters which of the two parts (filling the last spot in the left part, or filling the last spot in the right part) finishes before the other. This is where I got stuck, and could not make further progress.

After checking the editorial, it turns out that the idea that I was missing was more or less the following: since we need to rely on the order in which the greedy processes the movies to avoid double-counting, we need to add the movie's position in the greedy order (by end day) to the dynamic programming state, instead of trying to have states based on days only!

So for each k, l, r we count the number of subsets of the movies that start in the segment [l, r] and are among the first k movies in the greedy order, such that the greedy fills the entire segment [l, r] except its right boundary. When going from k to k+1 we now have a concrete movie that we are processing, and the above approach of iterating over the day d that is the last one to be filled on the segment now works: we can check whether this concrete movie covers that day, and since day d was still free before the (k+1)-th movie was processed, the subproblems before and after day d can be treated independently and their answers multiplied.
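To make this concrete, below is my attempt at reconstructing the core transition in Python. This is my own reading of the idea, not the editorial's code: the outer layer that combines these counts over the first free day is omitted, and the edge cases (in particular the "chosen but skipped" branch) may well be handled differently in the editorial.

```python
def count_fillings(movies, n):
    # cnt[l][r] = number of subsets of the movies processed so far that
    # start in [l, r] and make the greedy fill [l, r-1], leaving r free.
    movies = sorted(movies, key=lambda m: m[1])  # greedy order: by end day
    cnt = [[1 if l == r else 0 for r in range(n + 2)] for l in range(n + 2)]
    for lk, rk in movies:  # step from k to k+1 with the (k+1)-th movie
        new = [row[:] for row in cnt]  # the "movie not chosen" contribution
        for l in range(1, n + 1):
            for r in range(l + 1, n + 1):
                if not (l <= lk <= r):
                    continue  # the movie does not belong to this segment
                if rk < r:
                    # chosen but skipped: when it is processed (last), all
                    # of [l, r-1] is already filled and it cannot reach r
                    new[l][r] += cnt[l][r]
                # chosen and filling day d, the last day to be filled:
                # just before that, d and r are the only free days, so the
                # parts [l, d] and [d+1, r] are independent subproblems
                for d in range(max(l, lk), min(r - 1, rk) + 1):
                    new[l][r] += cnt[l][d] * cnt[d + 1][r]
        cnt = new
    return cnt
```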

This means that I was on the right track (which makes sense, given Riku's hints), and still could not see the last step of the solution for many hours. Any advice on how to avoid getting stuck here? Maybe there is a sub-step that can be made first?

The second problem was from the EGOI 2025: there are n<=50 players playing bingo. Each player will have a card with exactly 12 numbers, each between 1 and 50. Then, all numbers from 1 to 50 will be read out in some order, and the first player to have all 12 of their numbers read out will be declared the winner. Can you design the contents of the players' cards in such a way that there will never be a tie?

It seems surprising at first that this is possible at all beyond the trivial cases where the players' cards do not intersect. On the other hand, we have only 50 possible inputs (one for each value of n). So my first instinct was to implement some kind of randomized hill climbing.

In order to do that, we need to learn to check if a tie is possible for a particular set of cards. To do this, let us think about how a tie can happen. Given a pair of cards x, y that have a common number u, reading this number would result in a tie between x and y if all other numbers from x and y have already been read, and no other card has won yet. So if we never have a tie, it means that for all such x, y, u there must be another card z that is a subset of the union of x and y and does not have u.
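This criterion translates directly into a checker; a sketch, with cards represented as Python sets (my representation choice):

```python
from itertools import combinations

def tie_possible(cards):
    # cards: list of sets of numbers. For every pair x, y and common
    # number u, some card z must fit inside (x | y) - {u}: such a z
    # finishes strictly before u is read, so x and y are never first.
    # Note that x and y both contain u, so they can never serve as z.
    for x, y in combinations(cards, 2):
        for u in x & y:
            pool = (x | y) - {u}
            if not any(z <= pool for z in cards):
                return True  # read pool in any order, then u: x and y tie
    return False
```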

This leads to the following hill climbing approach: while we have a possible tie, we need to either modify x or y to not contain u, or modify some other card to become such z. We can do one of those things at random, and keep repeating this until we have no possible ties.

Unfortunately, this does not seem to ever converge for any n>=5, so we need to try something else. My next idea was to try coming up (on paper) with some non-trivial set of cards without ties. And I quickly found a somewhat natural one: consider 13 cards, each with all numbers between 1 and 13 except one. Then as soon as 12 of the numbers between 1 and 13 have been read, exactly one of those cards will win, so we never have a tie.
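This construction is easy to feed to the checker above as a sanity check:

```python
# The n = 13 construction: card i has all numbers from 1 to 13 except i.
cards13 = [set(range(1, 14)) - {i} for i in range(1, 14)]
assert not tie_possible(cards13)
```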

This only solves the problem for n=13, but we can use several disjoint copies of it to solve n=26 and n=39, and also augment it with cards whose 12 numbers do not appear on any other card to also solve n=14, n=15, n=16, n=27, n=28.

The next idea is to try to generalize this a bit more. For example, we can build a similar set of cards, each with all numbers between 1 and 7 except one. Those cards satisfy the no-tie criterion, but they have 6 numbers each instead of 12. However, we can now replace each number x with two numbers x and x+7, and obtain a valid set. So we have 7 cards with numbers between 1 and 14 and no ties, solving the problem for n=7, n=7+7, n=13+7 and so on. We can use other divisors of 12 in the same fashion to also solve n=5 with numbers up to 15, n=4 with numbers up to 16, and n=3 with numbers up to 18. Combining those actually solves 33 out of 50 test cases, yielding 66 points.
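Here is a sketch of this family of constructions (I expand each base number into a contiguous block rather than the x, x+7 pairing from the text; either works, since only the grouping matters):

```python
def block_cards(d, offset=0):
    # d must divide 12: take d+1 cards over base numbers 1..d+1, card i
    # missing base number i, then expand each base number into 12 // d
    # concrete numbers so that every card has exactly 12 numbers.
    b = 12 // d
    expand = lambda x: {offset + (x - 1) * b + j for j in range(1, b + 1)}
    return [set().union(*(expand(x) for x in range(1, d + 2) if x != i))
            for i in range(1, d + 2)]

# d = 6 gives 7 cards over numbers 1..14; disjoint copies are combined
# by shifting the offset, e.g. block_cards(12, 0) + block_cards(12, 13).
assert not tie_possible(block_cards(6))
```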

It turns out that we can generalize this pattern even more: instead of having all numbers except one on a card, we can have all numbers except two, or all numbers except three, and so on, taking one card for every such excluded subset. The first card to be completed is then the one missing exactly the numbers that have not been read yet, and it is unique, so there is still never a tie. This gets us n equal to binomial coefficients. And it turns out that combining copies of those constructions allows us to solve all 50 test cases. The most difficult is n=47, which actually needs all numbers from 1 to 50 (so the two 50s in the problem statement, the maximum n and the maximum number on a card, are in fact the same only by coincidence).
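A sketch of the general construction, again verified with tie_possible from above (the block expansion and parameter names are mine):

```python
from itertools import combinations

def comb_cards(k, t, offset=0):
    # One card for every t-subset of the base numbers 1..k, containing
    # everything except that subset, so n = C(k, t). Each card has k - t
    # base numbers, so (k - t) must divide 12 and every base number is
    # expanded into a block of 12 // (k - t) concrete numbers.
    b = 12 // (k - t)
    expand = lambda x: {offset + (x - 1) * b + j for j in range(1, b + 1)}
    return [set().union(*(expand(x) for x in range(1, k + 1)
                          if x not in missing))
            for missing in combinations(range(1, k + 1), t)]

# For example, k = 8, t = 2 gives C(8, 2) = 28 cards within 1..16.
assert not tie_possible(comb_cards(8, 2))
```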

Before solving this problem, I was hoping that there would be some computer experimentation involved, and indeed, even though the constructions for individual combination numbers can and likely will be invented on paper, checking if combining their copies always fits within 50 is much easier to do with a quick knapsack implementation. So insisting on solving the problem fully on paper might backfire as one might search for an even more general construction without realizing the current one is already enough.  
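For completeness, here is roughly what that knapsack check might look like (the enumeration of building blocks and the exact bookkeeping are my assumptions):

```python
from math import comb

def all_blocks(limit=50):
    # Each valid (k, t) with s = k - t dividing 12 yields a building block
    # of comb(k, t) players costing k * (12 // s) distinct numbers;
    # t = 0 is the trivial standalone card (1 player, 12 numbers).
    blocks = set()
    for s in (1, 2, 3, 4, 6, 12):  # s = card size in base numbers
        for t in range(0, limit):
            cost = (s + t) * (12 // s)
            if cost > limit:
                break
            blocks.add((comb(s + t, t), cost))
    return sorted(blocks)

def min_numbers(n_target, blocks):
    # Unbounded knapsack: disjoint copies of blocks cannot tie with each
    # other (no shared numbers), and their player counts and number
    # ranges simply add up, so we minimize the total numbers used.
    INF = float('inf')
    best = [0] + [INF] * n_target
    for players in range(1, n_target + 1):
        for p, cost in blocks:
            if p <= players:
                best[players] = min(best[players], best[players - p] + cost)
    return best[n_target]

blocks = all_blocks()
print(max(min_numbers(n, blocks) for n in range(1, 51)))
# Expected to print 50 (attained at n = 47), matching the analysis above.
```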

I checked the editorial after getting 100 points with the above solution, and was very surprised to see that the solution there is exactly the same. It seems that the problem gives us so much freedom that there must be other approaches that work, potentially involving even more implementation/experimentation. Do you know one?

Thanks for reading, and check back tomorrow for this week's summary!
