Sunday, May 5, 2024

A notorious week

Codeforces Round 942 was the first event of this week (problems, results, top 5 on the left, analysis). tourist has returned to the top of the scoreboard, and also to the top of the rating list: congratulations! It is also great to see that jqdai0815 keeps participating actively and getting great results after taking a long break during the pandemic. Who knows, maybe even I still have a chance to return to the top 10 :)

On Saturday, we hosted the online mirror of the Helvetic Coding Contest 2024 (problems, results, top 5 on the left, onsite results, analysis). This is originally an onsite competition that took place in Lausanne in April. It is run entirely by a group of student volunteers at EPFL, who take care of everything: finding sponsors, advertising the event, planning the onsite events, setting up the onsite competition environment, organizing meals, and of course preparing the problems. Seeing this volunteer-driven contest appear out of nowhere after a five-year hiatus really inspires me and makes me proud of the competitive programming community. That is why I submitted two problems this year (A and D). I hope you liked them!

But I also need to mention that this year one of the problems was unfortunately the same as Problem 5 of the November Lunchtime 2021 at Codechef. As you can see, quite a few competitors took part both in that round and in this week's mirror, and there could be many more who have upsolved it or heard about it from a friend, which of course made the contest worse, even though it was unrated. Below I am trying to share my understanding of what happened, but note that my goal is not to absolve myself of the responsibility: I think it was a failure, and I apologize for it.

As is often the case, the reason for this seems to be bad communication along a chain of people acting with good intent. The original author (by the way, nice problem!) clearly knew that this problem had been used for the Lunchtime when sending it to one of the HC2 organizers, but expected the HC2 organizers to make their own judgement about how appropriate it was to use it in the onsite event. They likely did not even know that a mirror might happen.

But then, as the information about this problem propagated from one HC2 organizer to another through a couple more hops, this fact was lost, with various people thinking that this problem had only been used for a private contest with a few participants, or that it had been prepared but rejected and never used at all.

What probably made the matter worse is that different people have different perceptions of what is OK (should we never give the same problem to two contests? Or maybe it is OK if the set of participants is not intersecting and it was not available publicly after the first one? Or maybe it is OK if the first occurrence happened long ago? Or maybe it is OK if the second round is unrated?), and this perception affects what information about the problem they decide to communicate to other people.

Moreover, people often expect other people to have the same perception of the situation, and therefore treat the lack of communication as information as well. As a result, the people organizing the mirror (such as myself) did not try to figure out more information about the origins of this problem even though we knew from Polygon that it was prepared a bit earlier, assuming that other organizers who are closer to the beginning of this communication chain know the problem better and therefore are in a better position to judge if it is appropriate to use, so if they say nothing, all is good. But this line of reasoning fails to account for the fact that the people closer to the beginning of the communication chain have a different context (might not even be aware that there's a public mirror, for example) and different perceptions of what is OK.

So here is my takeaway from all this: more communication is always better when preparing a contest! I will try to keep this in mind when preparing future rounds, hopefully including the Helvetic Coding Contest 2025.

In my previous summary, I have mentioned a Codeforces problem. You are given two integers n and k. Your goal is to create a (multi-)set of positive integers such that among its sub(-multi-)sets we can find ones which sum to any integer between 1 and n, except k. n is at most 10^6, and the set you create must have at most 25 elements.

The first part of the solution is somewhat clear/standard: we need to be able to represent all numbers between 1 and k-1, but not the number k. For this, we can take all powers of two less than k: 1, 2, 4, ..., 2^i, such that 2^i < k <= 2^(i+1), but then in order to not overshoot k we should replace 2^i with k-2^i: then the sum of all numbers is k-1, and clearly all numbers between 1 and k-1 are still representable.
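Here is a minimal Python sketch of this first phase together with a brute-force check. The names first_phase and subset_sums are mine and purely illustrative, and I am assuming that for k=1 no numbers are needed at all, since there is nothing below k to represent:

```python
def first_phase(k):
    """Numbers whose subset sums are exactly 0..k-1 (illustrative helper)."""
    if k == 1:
        return []  # assumption: nothing below k needs to be represented
    parts = []
    power = 1
    # Collect 1, 2, 4, ... while the next power of two is still below k.
    while power * 2 < k:
        parts.append(power)
        power *= 2
    # Now power < k <= 2 * power: take k - power in place of power itself.
    parts.append(k - power)
    return parts


def subset_sums(parts):
    """All subset sums of parts, computed by brute force."""
    sums = {0}
    for x in parts:
        sums |= {s + x for s in sums}
    return sums


print(first_phase(5))                       # [1, 2, 1]
print(sorted(subset_sums(first_phase(5))))  # [0, 1, 2, 3, 4]
```

The last number never exceeds the power of two it replaces, so the range of representable sums stays contiguous and ends exactly at k-1.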

Then, as long as all other numbers that we take into our set are at least k+1, k will still not be representable. But how do we cover all numbers between k+1 and n? After trying unsuccessfully for some time to come up with something concrete on paper, I decided to just run a dynamic programming that remembers which numbers are representable, and to repeatedly take the smallest non-representable number (other than k itself). It is not obvious why this will produce at most 25 elements, but it is very easy to try.
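This experiment can be sketched in a few lines of Python. This is not my contest code, just one possible way to write it: it assumes the first-phase numbers are passed in as base, uses a Python integer as a bitset of representable sums, and the name greedy_extend is made up for illustration:

```python
def greedy_extend(n, k, base):
    """Greedily extend base so that every sum in 1..n except k is representable.

    Assumes base already covers every sum in 1..k-1 and does not produce k.
    """
    full = (1 << (n + 1)) - 1  # mask with bits 0..n set
    reach = 1                  # bit s is set <=> sum s is representable
    for x in base:
        reach = (reach | (reach << x)) & full
    chosen = list(base)
    while True:
        # Sums in 0..n that are not representable, ignoring the forbidden k.
        missing = full & ~reach & ~(1 << k)
        if missing == 0:
            return chosen
        smallest = (missing & -missing).bit_length() - 1  # lowest set bit
        chosen.append(smallest)
        reach = (reach | (reach << smallest)) & full


print(*greedy_extend(10**6, 5, [1, 2, 1]))
```

Every number added this way is larger than k, so k itself never becomes representable.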

Here is the output of this approach for n=1000000, k=5:
1 2 1 6 11 16 32 64 128 256 512 1024 2048 4096 8192 16384 32768 65536 131072 262144 524288

And for n=1000000, k=6:
1 2 2 7 13 19 38 76 152 304 608 1216 2432 4864 9728 19456 38912 77824 155648 311296 622592

Now we can notice a pattern: the numbers we add in the second phase are k+1, 2k+1, 3k+1, 2(3k+1), 4(3k+1), 8(3k+1), ... We can now both be more confident that it will always fit within 25 elements, and also try to prove that this pattern always works. Or just submit.
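Written out directly, the pattern could look like the sketch below. The function name second_phase is hypothetical, and stopping at the first number that exceeds n is my assumption about how the pattern should be cut off (it matches what the greedy does in the two outputs above):

```python
def second_phase(n, k):
    """Second-phase numbers following the observed pattern, cut off at n."""
    # The three irregular terms: k+1, 2k+1, 3k+1.
    parts = [x for x in (k + 1, 2 * k + 1, 3 * k + 1) if x <= n]
    # Then keep doubling 3k+1 for as long as the result does not exceed n.
    x = 2 * (3 * k + 1)
    while x <= n:
        parts.append(x)
        x *= 2
    return parts


# Prepending the first-phase numbers for k=5 gives 21 numbers in total.
print(*([1, 2, 1] + second_phase(10**6, 5)))
```

For k=5 and k=6 this reproduces the second halves of the two outputs above, starting from 6 and 7 respectively.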

My submission in the actual contest is more complex than that, and even includes some randomization :) The reason is that I had a bug in my implementation of the simple dynamic programming, which made me think that it sometimes produces more than 25 elements. Adding randomization helped fit under 25 but did not fix the bug, and after fixing the bug I did not check whether randomization was still needed.

Thanks for reading, and check back next week!