Wednesday, July 24, 2024

A 39 vs 17 week

Codeforces Round 959 sponsored by NEAR was the main event of last week (problems, results, top 5 on the left, analysis). AWTF participants have probably already returned home and were ready to fill the top five places in this round, but Egor has managed to solve everything and squeeze in the middle of their party. Congratulations to Egor on doing this and to tourist on winning the round!

In my previous summary, I have mentioned two problems. The first one came from AWTF: you are given n<=250000 slimes on a number line, each with weight 1. You need to choose k of them, all others will be removed. Then, they will start moving: at each moment in time, every slime moves with velocity r-l, where r is the total weight of all slimes to the right of it, and l is the total weight of all slimes to the left of it. When r-l is positive, it moves to the right, and when it is negative, it moves to the left. Since this rule pushes the slimes towards each other, sometimes they will meet. When two or more slimes meet, they merge into one slime with weight equal to their total weight, which continues to move according to the above rule. Eventually, all slimes will merge into one stationary slime, suppose this happens at time t. What is the maximum value of t, given that you can choose the k slimes to use freely?

Even though the problem does not ask for it, it is actually super helpful to think where will the final big slime be located. For me, this would have probably been the hardest part of solving this problem. Why should one think about the final position if the problem only asks about the final time?..

After studying a few examples, one can notice that the final position is always equal to the average of the starting positions of the slimes. Having noticed this, it is relatively easy to prove: consider the function sum(ai*wi) where ai is the position of the i-th slime, and wi is its weight, and consider two slimes (ai,wi) and (aj,wj). The first one contributes -wi to the velocity of the second one, while the second one contributes wj to the velocity of the first one. Therefore together they contribute -wi*wj+wj*wi=0 to the velocity of value sum(ai*wi), therefore that sum stays constant. And it also does not change when two slimes merge, therefore it is always constant and has the same value for the final big slime.

Now is the time for the next cool observation, this one is a bit more logical though. Consider the last two slimes that merge. They split the original slimes into two parts. What happens to the weighted averages of the positions of those two parts sum(ai*wi)/sum(wi)? By the same argument as above, influences within each of the parts on that average cancel out. The influences of the parts on each other do not cancel out though, but we can easily compute them and find out that those two averages are moving towards each other with constant velocity equal to the total weight, in other words k.

Therefore if we know which two parts form the two final slimes we can find the answer using a simple formula: (sum(ai*wi)/sum(wi)-sum(aj*wj)/sum(wj))/k, where j iterates over the slimes in the first part, and i iterates over the slimes in the second part.

And here comes the final leap of faith, also quite logical: the answer can actually be found by finding the maximum of that amount over all ways to choose a prefix and a suffix as the two parts. This can be proven by noticing that the difference between averages starts decreasing slower than k if some slimes merge across the part boundary, therefore the number we compute is both a lower bound and an upper bound on the actual answer.

We have not yet dealt to the fact that we have to choose k slimes out of n, but seeing the above formula it is pretty clear that we should simply take a prefix of all available slimes for the sum over j, and a suffix of all available slimes for the sum over i. Now all pieces of the puzzle fit together very well, and we have a full solution.

The second problem I mentioned came from Universal Cup: you are given a string of length n<=500000. We choose one its arbitrary non-empty substring and erase it from this string, in other words we concatenate a prefix and a suffix of this string with total length less than n. There are n*(n+1)/2 ways to do it, but some of them may lead to equal strings. How many distinct strings can we get?

A natural question to ask is: how can two such strings be the same? If we align two different ways to erase a substring of the same length that lead to the same result, we get the following two representations of the same string:

abcd=aebd

where in one case we delete the substring c, and in the other case we delete the substring e to obtain the result abd. We can notice that such pairs of equal prefix+suffix concatenations are in a 1:1 correspondence with pairs of equal substrings within the string (two occurrences of the substring b in the above example). It is not true though that our answer is simply the number of distinct substrings, as we have merely proven that the sum of c*(c-1)/2 over all repeat quantities c of equal substrings is the same as the sum of d*(d-1)/2 over all repeat quantities d of equal prefix+suffix, but that does not mean that the sum of c is the same as the sum of d.

However, this makes one think that probably the suffix data structures, such as the suffix tree, array or automaton, will help solve this problem in the same way they help count the number of distinct substrings. This turns out to be a false path, as the actual solution is much simpler!

Consider the middle part of the above equation (bc=eb), and let us write out an example with longer strings for clarity:

abacaba
abacaba

We can notice that the following are also valid ways to obtain the same string:

abacaba
abacaba
abacaba
abacaba

So the structure here is simpler than in counting distinct substrings. In order to count only distinct prefix+suffix strings, let us count only canonical respresentations, for example those where the prefix is the longest. The criteria for when we cannot make the prefix even longer is evident from the above example: the next character after the prefix (the one that is removed) must be different from the first character of the suffix, if any. Therefore the answer is simply equal to the number of pairs of characters in the given string that differ, plus n to account for the representations where the suffix is completely empty. 

Thanks for reading, and check back next week!

Sunday, July 14, 2024

An ecnerwala week

AWTF24 was the main event of this week. I have mentioned its results in the previous post, so I want to use this one to discuss its problems briefly.

Problems A and B both had short solutions and were quick to implement, but required one to come up with a beautiful idea somehow. When Riku was explaining their solutions during the broadcast, I was amazed but could not understand how to come up with them :) One way to do it, at least for problem A, was to actually start by assuming the problem is beautiful, and to try coming up with some beautiful lower or upper bound for the answer, which can turn out to be the answer.

To test if you can walk this path, here is the problem statement: you are given n<=250000 slimes on a number line, each with weight 1. You need to choose k of them, all others will be removed. Then, they will start moving: at each moment in time, every slime moves with velocity r-l, where r is the total weight of all slimes to the right of it, and l is the total weight of all slimes to the left of it. When r-l is positive, it moves to the right, and when it is negative, it moves to the left. Since this rule pushes the slimes towards each other, sometimes they will meet. When two or more slimes meet, they merge into one slime with weight equal to their total weight, which continues to move according to the above rule. Eventually, all slimes will merge into one stationary slime, suppose this happens at time t. What is the maximum value of t, given that you can choose the k slimes to use freely?

Problem D turned out to be equivalent to an old Codeforces problem applied to the inverse permutation. Most of this week's finalists have participated in that round or upsolved it, so it was not too unfair. The top two contestants ecnerwala and zhoukangyang did solve the Codeforces problem back in 2022, but did not remember it, and implemented the solution to D from scratch (even though of course having solved the old problem might have helped come up with the correct idea here). ksun48 and heno239 in places 3 and 4 did copy-paste their code from 2022.

Problems C and E involved a bit more code and effort to figure out all details, but one could make gradual progress towards a solution when solving them, instead of having to pull a beautiful idea out of thin air. Were I actually participating in this round, I would most likely spend the most time on, and maybe even solve those two problems.

Overall, this was a very nice round, and I'm looking forward to more AGCs in 2024 to try my hand at more amazing problems!

On the next day after the AWTF, the 3rd Universal Cup Stage 4: Hongō took place (problems, results, top 5 on the left). 8 out of 9 participants from the first three teams (congrats!), which coincidentally are also the first three teams in the season ranking, were in Tokyo, so Riku and Makoto have organized an onsite version of this stage at the AtCoder office. Solving a 5-hour contest with your team in person instead of online is already much more fun, but having other teams in the room and discussing the problems with them right after the round is even better. I guess I'm still yearning for the Open Cup days :)

I was solving this round together with Mikhail and Makoto. Thanks a lot Makoto for briefly coming out of retirement to participate, it was great fun solving this round together, and I can confirm that you're still very strong! Maybe we can have another go next year.

Problem N was not very difficult (I spent at least half an hour without any success, explained the problem to Makoto and he solved it almost immediately), but still enjoyable: you are given a string of length n<=500000. We choose one its arbitrary non-empty substring and erase it from this string, in other words we concatenate a prefix and a suffix of this string with total length less than n. There are n*(n+1)/2 ways to do it, but some of them may lead to equal strings. How many distinct strings can we get?

Thanks for reading, and check back next week.

Saturday, July 13, 2024

AWTF24 contest

AtCoder World Tour Finals 2024 took place today (problems, results, top 5 on the left, broadcast recordinganalysis). This was my first 5-hour commentating experience, and I enjoyed it a lot! How did it look from your side? Please share your improvement suggestions, just for the remote chance that I do not qualify again :)

This time the contestants had an opportunity to share their thoughts on stream (similar to the "confession booth" concept from some chess tournaments recently), and while not everybody used it, it was great fun and great insight to listen to those who did (for example; did somebody maybe gather all timestamps?). I hope this practice gets expanded and improved at future contests!

ecnerwala also tried to share his thoughts before the contest even started, but unfortunately the stream had not started as well at that point and therefore his words were not recorded. Nevertheless, maybe this helped him get into the right mood and solve problem E quickly, which was key to his win. Congratulations to him and to zhoukangyang and ksun48 who also won prizes!

Thanks for reading, and check back tomorrow for this week's summary.

Thursday, July 11, 2024

AWTF24 pre-contest

On Wednesday, the contestants were gathering in the hotel. The contestants from Europe and America hat some very long flights behind them, so there was not much appetite for activities. Therefore we played some board games in the hotel lobby in between short excursions to get some Japanese food. We did not actually meet most of the contestants from Asia — maybe the reason was that they actually had more energy for exploring Tokyo and did not hang around in the hotel :)

The games of choice (well, those were the only ones I brought so there was not that much choice...) were Cascadia and (Level 1 H-Group) Hanabi. It turns out that the synergies of the H-Group conventions are not so obvious at level 1, so probably next time we introduce somebody to them we should start at least with level 3.

We also got to witness the AtCoder admins printing the logos on the official t-shirts, as it turned out that the shop where one can print arbitrary content on a t-shirt in a self-service manner happened to be on the lower floors of the hotel building. Even though this is not much different from a normal printer, seeing how one can slightly adjust the image and then get an actual t-shirt with this image in a couple of minutes was quite impressive.

Today was a free day for the contestants, who have ventured a bit more into the city having rested from their travels. It was still funny with the timezone differences and jetlag, as the same meal was breakfast for me, lunch for the locals, and dinner for the contestants from America. Some contestants warmed up their problem solving capabilities by doing escape rooms, while others opted for actually solving old competitive programming problems for some last-ditch practice.

Tomorrow is the big day! The overall setup is similar to the last year, but with just one contest: 5 problems for 5 hours, the penalty time is computed as the time of the last successful submission (not the usual ICPC sum) plus 5 minutes for each incorrect submission. You can find more details in Riku's post. And of course tune in to see my and Riku's commentary on the live broadcast which will start at the same time as the contest, 12:30 Tokyo time, and last for 5 hours.

All 12 finalists are very strong, so it is hard to predict who will come out on top. zhoukangyang won 4 out of the last 6 AGCs, tourist has a lot of experience winning those contests, and jiangly has won the AWTF last year  — I guess we can keep an eye for those three, but anything can happen really.

Thanks for reading, and tune in tomorrow!

UPD: the live broadcast link has been updated.

Tuesday, July 9, 2024

A 熱中症予防 week

There were no contests that I'd like to mention last week, so I can get straight to the new format of this blog for the coming week: a travel blog! I am going to the AtCoder World Tour Finals 2024 in Tokyo. I did not manage to qualify this time, placing 14th when I needed to be in top 12, so I am going as a spectator and as a co-commentator for the stream, together with Riku, the main admin of AtCoder.

For a second year running, Tokyo welcomes the participants with an extreme weather warning in Japanese, this time for extreme heat. Please all take care, and focus on playing board games in the air conditioned hotel!

Speaking for 5 hours while 12 contestants are facing, to put it mildly, challenging problems is also not a walk in the park. Please help us by suggesting topics that we should discuss or things we should do on stream in comments!

In my previous summary, I have mentioned a Universal Cup problem: first, we draw the vertices of a regular n-gon (n<=10000) on the plane. Then we repeat the following process m times (m<=30000): take any 3 already drawn points (either the original n-gon vertices, or ones drawn during the previous steps) Ai, Bi, Ci, and draw the point Ai+Bi-Ci (the fourth vertex of a parallelogram). Finally, we need to handle multiple queries of the form: given k drawn points (the sum of k over all queries <=30000), do they form the vertices of a regular k-gon in some order?

We can get points that are really close to a regular k-gon but not exactly one in this problem, and no feasible floating point precision is enough. Therefore we need to solve it using only integer computations. Nevertheless, let us first consider how a floating-point solution might work.

We can imagine that the points lie on the complex plane, and the initial n points are ei/n. Drawing a new point corresponds to computing Ai+Bi-Ci using the complex number arithmetic. There are many ways to check if k computed points form a regular k-gon, here is one: we need to check that the k points, in some order, can be represented as x, xy, xy2, ..., xyk-1, where y is such that yk=1 and no smaller power of y is equal to 1. Note that this order does not have to be the clockwise/counter-clockwise order of the vertices: multiplying by y can also represent a jump by any coprime number of edges, and this criterion will still be valid. Also note that we can actually pick any of the vertices as x and such y will exist, moreover there will be φ(k) different values of y that work for each vertex. So one way to find y is to just try a few other vertices z of the polygon, let y=z/x, and check if the criterion is satisfied. Since φ(k) is not too small compared to k, we should find a valid y after a few attempts, let's say at most 50. Of course, we could've just said that y=ei/k, but you will see below that the y=z/x approach leads to an interesting question that I want to discuss.

If we denote the "base" point as u=e2π/n, then all other initial points are powers of u, all computed points are polynomials of u, and the checks we are making boil down to whether a certain polynomial of u with integer coefficients is equal to 0 or not (even though we also use division, checking if poly1(u)/poly2(u)=poly3(u)/poly4(u) is the same as checking if poly1(u)*poly4(u)-poly2(u)*poly3(u)=0). We could try to maintain said polynomials with integer coefficients, but since the degrees would be on the order of n, and the coefficients themselves could be huge, this is not really feasible within the time limit. 

Here comes the key idea: instead of complex numbers and u=e2π/n, let us do the same computations using integers modulo a prime p and using such u that the order of u modulo p is equal to n. Such u exists if and only if p=sn+1 for some s, so we can search for a random big prime number of this form, which can be found quickly since all arithmetic progressions with coprime first element and difference have a lot of prime numbers.

This actually works very nicely and allowed us to get this problem accepted. However, why does this actually work? The order of u is the same in our two models (complex numbers and modulo p), so every polynomial of the form ut-1 is equal to 0 in both or in neither model. However, this does not guarantee the same for an arbitrary polynomial with integer coefficients, or does it?

Of course it is not true for an arbitrary polynomial. For example, the polynomial p is equal to 0 modulo p, but not equal to 0 in complex numbers. However, we can deal with this by picking the modulo p at random, and potentially also checking several moduli in case the judges actually create testcases against many potential moduli. So the real question is: are there polynomials for which being equal to 0 differs between the complex numbers and computations modulo p for all or a significant fraction of all possible values of p?

Here we need to bring in some maths. When two polynomials with rational coefficients are equal to 0 for the given u their greatest common divisor also has rational coefficients and is also equal to 0 for the given u, which means that there must exist a minimal polynomial such that a polynomial with rational coefficients is equal to 0 for the given u if an only if the minimal polynomial divides it. Such minimal polynomial for our u is called the n-th cyclotomic polynomial Φn(u).

Now, consider the equality un-1=Φn(u)*gn(u) (where gn(u) is just the result of dividing un-1 by Φn(u)). This equality is true in rational numbers, so it is also true modulo p where there is no division by p in it, so for almost all p. The left-hand side is 0 modulo p because of our choice of u, so either Φn(u) or gn(u) must be 0. However, from the structure of the cyclotomic polynomials we know that gn(u) is a product of cyclotomic polynomials of smaller order, so if it was equal to 0, it would mean that the identity ut-1=0 would hold for some t<n, which contradicts our choice of u. So we know that Φn(u)=0 modulo p, which means that every polynomial with integer coefficients that is equal to 0 for the given complex u will also be equal to 0 for the given u modulo p. So we have proven one of the two required implications.

Now let us tackle the the opposite implication. Consider a polynomial h(u) with integer coefficients that is equal to 0 for all or a significant fraction of all possible values of p (with the corresponding u). If Φn(u) divides this polynomial (as a polynomial with rational coefficients), then it is also equal to 0 for the given complex u, as needed. If Φn(u) does not divide it, then we can find the greatest common divisor of Φn(u) and h(u), again doing computations using polynomials with rational coefficients. Since Φn(u) is irreducible over polynomials with rational coefficients, this greatest common divisor will be 1, so we have 1=Φn(u)*a(u)+h(u)*b(u). The right side involves a finite number of different integers in denominators, so this equality will also hold modulo p for all p except those dividing one of the denominators, in other words for almost all p. But since both Φn(u) and h(u) are equal to 0 for all or a significant fraction of all possible values of p, this means that 1 is equal to 0 for all or a significant fraction of all possible values of p, which is a contradiction. Therefore we have also proven the opposite implication and this solution does in fact work.

There are still a few things I do not fully understand about this setup. One is the following: it turns out that when n is odd, we can actually construct a regular 2n-gon (roughly speaking using the fact that -1 helps generate the other n points; there was such example in the samples for n=3, k=6), so k does not have to divide n. In this case, the number y that we find as part of solving the problem must have order 2n modulo p. However, note that in general it is not even guaranteed that there is any number with order 2n modulo p, as we only choose p in such a way that there is a number with order n. Since we compute y=z/x, we can do this computation for any p where we can compute z and x. So it seems that the above also proves that for almost all primes p if there is a number of odd order n modulo p, there is also a number of order 2n modulo p. This observation is in fact true for a straightforward reason: almost all primes are odd, so there is an even number p-1 of nonzero remainders, therefore there is a number of order 2, and we can multiply the number of odd order n by the number of order 2 to get a number of order 2n. Still, I can't get rid of the feeling that I might be missing something here. Any comments?

The second thing I don't fully understand is whether we truly need a full understanding of the structure of cyclotomic polynomials to prove that Φn(u)=0 modulo p. It feels that maybe there is an easier way to explain this that does not require so much knowledge?

Thanks for reading, and check back for more AWTF updates!

Monday, June 10, 2024

A code week

Universal Cup started its 3rd season this Saturday with Stage 1: St Petersburg (problems, results, top 5 on the left, analysis). Team 03 Slimes was in the lead for a long time, then paused a bit and gave everyone a chance to catch up, only to solve 3 more problems in the last hour and claim the clear first place. And all that even though our team stole one of their usual team members and they had to participate with just 2. Well done!

Solving problem B was very satisfying, and I think describing its solution next week will help me understand the underlying math better, so here is the problem statement: first, we draw the vertices of a regular n-gon (n<=10000) on the plane. Then we repeat the following process m times (m<=30000): take any 3 already drawn points (either the original n-gon vertices, or ones drawn during the previous steps) Ai, Bi, Ci, and draw the point Ai+Bi-Ci (the fourth vertex of a parallelogram). Finally, we need to handle multiple queries of the form: given k drawn points (the sum of k over all queries <=30000), do they form the vertices of a regular k-gon in some order?

Initially the problem seems quite straightforward as we can just simulate the process and check the required condition using some simple geometric formulas. However, it turns out that we can get points that are really close to a regular k-gon but not exactly one in this problem, and no feasible floating point precision is enough. So how to solve this using only integer computations?

Codeforces Global Round 26 wrapped up the week on Sunday (problems, results, top 5 on the left, analysis). One of the highlights of the round was an amazing successful experiment, but there was also a pretty close fight for the first place, to the point that jiangly needed just one successful hack to win, which he unsuccessfully attempted. I wanted to check if he had a chance, in other words if there were any failed system tests in jiangly's room, but I could not find a way to find which of the 646 available rooms is it. Does anybody know how to find the room of another contestant? Apparently this link was staring me in the face, it is shown at the top of the user's submission history, which appears after a double-click in the corresponding standings cell. And it turns out that there were no failed system tests in jiangly's room, which means that his only chance for the required +100 could lie in some anti-hash tech.

In any case, tourist held on to his first place, and increased the gap at the top of the rating list. Interestingly, the top 3 in this round coincided with the top 3 by rating after Benq did find his successful hack. Congratulations to tourist, jiangly and Benq!

Thanks for reading, and check back next week.

Friday, May 24, 2024

A return@week

Kotlin Heroes Episode 10 on Codeforces was the main event of last week (problems, results, top 5 on the left, analysis). Only solutions in Kotlin programming language could be submitted. Long gone are the days of IDEA's Java-to-Kotlin converter, at least nobody in the top 20 seems to use it, and I think the Kotlin Heroes series is quite a success in introducing people to Kotlin and getting them to like it. At the same time, Kotlin knowledge is still not at 100% in the community, so other translation tools are on the rise :)

The top 2 were the same as in the previous edition, and they were the only participants to solve all problems. However, this time arvindf232 was faster and earned a well-deserved victory. Congratulations to arvindf232 and tourist on the great performance!

Tourist's wrong submissions on E and F have seriously damaged his winning chances. The wrong submission on E (link, compare to correct) stems from a wrong understanding of an advanced language feature, return@repeat, which turned out to mean continue and not break :) And the wrong submission on F (link, compare to correct) stems from not using a prewritten code library, more specifically a modint class. The winner's submission to the same problem does use prewritten code to deal with modular arithmetics, but interestingly it is not done through operator overloading, but rather through defining new infix operations that also refer to a global variable, leading to weird-looking code like ret[paths] = ret[paths] mp (count mm FACT.choose(options, k-1)), where "mp" and "mm" stand for modular addition and multiplication. arvindf232 or others reading this, what is the reason for this approach?

Thanks for reading, and check back next week!