Algorithms Weekly by Petr Mitrichev: January 2024

Wednesday, January 31, 2024

A stable denominator week

TopCoder returned last week from another long break with SRM 852 (problems, results, top 5 on the left). The 1000-pointer was about counting k-th roots of a specific permutation, and it took the winner SSRS_ just 3.5 minutes since they reused their submission for a more general problem about counting k-th roots of any permutation. More generally this problem did not present as much of a challenge for the top participants as the 500-pointer, which saw many solutions fail and therefore offered a lot of challenge opportunities. Of the three contestants who managed to solve everything, kotatsugame was behind after the coding phase but managed to recover to the second place thanks to 200 challenge points. Congratulations to all three on the great performance!

The 2nd Universal Cup. Stage 20: Ōokayama followed, as usual, on Saturday (problems, results, top 5 on the left, analysis). 16 problems in a round is still a lot even by today's standards, but this still did not stop team HoMaMaOvO from solving all of them with 6 minutes to spare :) Well done! This is their 7th win of the season compared to 9 for USA1, and they are definitely still in contention for the overall first place.

Finally, Codeforces Round 921 wrapped up the competitive week also on Saturday (problems, results, top 5 on the left, analysis). Having dealt with the four easier problems in the first hour, I've decided to focus on problem F since there it seemed that we just need to solve the n=3 case and the rest will easily follow, while problem E gave some generating function vibes :) Unfortunately even the n=3 case in F turned out too hard for me. I have even implemented a brute force that tried different random strategies and it still could not solve n=3. The reason for that was probably the fact that the strategies I tried were non-adaptive: they asked the same questions irrespective of the answers.

Implementing a brute force over adaptive strategies seemed harder, so I've decided to give E another chance. I've realized it feels difficult because the number of choices we have always changes, therefore we are multplying probabilities with different denominators and it's not clear how to do the summation over different trajectories nicely. But then I remembered I already had this feeling in the past, and writing about this in my blog :) So I tried searching for [divide by the sum site:blog.mitrichev.ch], and for some reason this search returned what I was looking for on the first page. I've re-read that solution, and remembered the key trick: even if the overall number of choices is different every time, since all choices have the same probability at each step, the relative probability for a particular subset of choices of fixed size will always be the same. In the linked problem it was just 2 options each with probability 50%, while in problem E this time, if we focus on whether we visit a particular pair (x,y) or not, the number of choices affecting this outcome is always x+y, no matter the current state, so we can compute the probability of such visit happening very nicely.

There was still some work to do to improve the complexity from O(n²) to O(n*logn), and luckily I managed to do it before the end of the round. This was of course still not enough to compete with jiangly and others who focused on problem E earlier. Congratulations on the win!

Thanks for reading, and check back next week.

Sunday, January 21, 2024

A Frobenius week

The 2nd Universal Cup Stage 19: Estonia was the only event of this week (problems, results, top 5 on the left, analysis). Team 03 Slimes, who are a distant third in the overall standings, won their second stage of the season in an impressive fashion, beating the top two teams in the overall standings by two problems. Judging by the +32, some squeezing was involved, potentially of an approach that was not intended to pass — but that is also an important skill in algorthmic competitions, so well done! I am also not sure who actually was on the team this time, as Mingyang Deng is also listed as part of MIT CopyPaste on the same scoreboard.

Possibly rejuvenated by the news that the 2022 ICPC World Finals seem to be finally happening in April, Harbour.Space P+P+P followed in second place — congratulations as well! (jk, they actually wrote this contest as part of the Osijek camp back in September, so they were still practicing towards the original dates).

Finally, this week an anonymous LGM published a very nice result that drops a logarithmic factor from matrix exponentiation. Of course, theoretically we already know ways to drop much more from the asymptotic complexity, but all of those are not really beneficial in practice on the time limits prevalent in the algorithmic competitions. This result, however, did allow to slash the fastest time to find a power of a 200x200 matrix roughly 3x (compared to the straightforward binary exponentiation method; the judge also has some fast submissions from November 2023 that start with "vector<ll> f=a.poly();", so possibly some other version or precursor of this result?..

I guess now it will be a bit harder to separate wrong submissions when preparing problems, on the other hand the squeezing toolkit just got a bump :)

Thanks for reading, and check back next week!

Friday, January 19, 2024

A HoMaMaOvO week

The 2nd Universal Cup. Stage 18: Dolgoprudny was the only event of last week (problems, results, top 5 on the left, analysis). Team Hailiang FLS + RDFZ: Anonymous were the first to 6 problems at 1:58 into the contest, followed by Team HoMaMaOvO at 2:15. The remaining problems were much harder, and the teams were probably faced with tough choices between pursuing multiple problems in parallel and focusing on one problem to get it accepted. Both teams submitted 3 problems in the remaining time, and Anonymous were the first to get one of them accepted, but then HoMaMaOvO overtook them by getting two in the last hour. Congratulations to both teams!

In my previous summary, I have mentioned a Codeforces problem: consider a rooted tree where each node either has a leaf, or has exactly two children. For each node with two children, one of the edges to the children has weight 0, while the other has weight 1. We run a depth-first traversal of this tree starting from the root, and for each leaf we print the sum of weights on the path from the root to this leaf. Now we are given just the printed sequence and not the tree, and need to answer if there exists a tree as described above that could lead to this sequence being printed? For example, 2 1 0 1 1 can be printed, but 1 0 2 1 3 can't. The length of the given sequence is up to 2*10⁵.

The key observation in this problem is to imagine that we are building our tree step by step. Initially, we have only the root, which is also a leaf, so the sequence printed will be just 0. Now we add two children to the root, using edges with weights 0 and 1, and the depth-first search traverses them in some order. Depending on this order, the sequence we get will become either 0 1 or 1 0. We continue by repeatedly taking a leaf and adding two children to it, and the sequence always changes in the following manner: if the number corresponding to this leaf was x, then we insert x+1 into this sequence either directly to the left or directly to the right of it. And we can apply this operation to any number in the sequence, so we just need to check if the sequence we are given can be obtained starting from just the number 0 using this operation repeatedly.

Now let us look at the process in reverse, in other words we will repeatedly remove numbers that are adjacent to a number smaller by 1 until only 0 is left. What should we remove first? All numbers in this sequence that are adjacent to a number smaller by 1 are candidates. Now consider what happens when we remove number x that was adjacent to x-1: now x-1 becomes adjacent to some other number y. When y=x-2, the number x-1 become a candidate. When y=x, the number y becomes a candidate. In all other cases, no new candidates appear. This means that the new appearing candidates are never bigger than the number being removed.

Therefore we can safely remove any of the candidates with the highest value: we will never get candidates with even higher value in the future because of the previous argument, therefore those numbers themselves will never be useful to support removal of another number. So we can just keep removing one of the remaining highest values while it is possible, and we either reach a single 0, in which case we have found a solution, or we get stuck when all remaining highest values cannot be removed, in which case there is no solution.

Thanks for reading, and check back for more!

Sunday, January 7, 2024

A 1:1 week

For the third consecutive week, the competitive programming scene consisted of a Universal Cup round, and a Codeforces round, both on Saturday. The Universal Cup round was called Stage 17: Jinan (problems, results, top 5 on the left, analysis). The usual suspects occupied the first two places (well done!), but quite unusually the scores of two teams that participated in the original ICPC regional, both from Peking University according to Google Translate, were enough for 3rd and 4th. I guess we need to pay attention to Peking University at the upcoming World Finals indeed :)

The Codeforces round was called Hello 2024 (problems, results, top 5 on the left, analysis). I started relatively quickly, and was in 2nd place after solving problem F using the excellent AtCoder library segment tree template that helps focus on defining the monoid in question (search for "Data op" function in my solution). This has left me with 3 problems to solve: problem E, where the contour of a quadratic dynamic programming solution was more or less clear after reading the problem, and problems G and H, which looked very interesting to solve but where I did not have any good idea quickly. I've decided to implement E before thinking more about G and H — after all, how long can figuring out the details and implementing a relatively straightforward quadratic dynamic programming take? It turns out, almost two hours :(

ecnerwala did not get stuck on problem E, and therefore had plenty of time for the two harder problems, and he needed that time as he solved G with just 7 minutes remaining in the round and claimed the first place. Congratulations!

Let me highlight problem D from this round: consider a rooted tree where each node either has a leaf, or has exactly two children. For each node with two children, one of the edges to the children has weight 0, while the other has weight 1. We run a depth-first traversal of this tree starting from the root, and for each leaf we print the sum of weights on the path from the root to this leaf. Now we are given just the printed sequence and not the tree, and need to answer if there exists a tree as described above that could lead to this sequence being printed? For example, 2 1 0 1 1 can be printed, but 1 0 2 1 3 can't. The length of the given sequence is up to 2*10⁵.

In my previous summary, I have mentioned a Universal Cup problem: your program is executed twice. In the first run, you are given a graph with n=1000 (note that it is not n<=1000, but exactly n=1000) vertices, and 2000<=m<=5000 edges. It is guaranteed that this graph was generated randomly by adding a uniformly random non-existing edge m times. You can make at most 5 changes to this graph, where one change is adding an edge between two vertices not connected by an edge, or removing an edge. The first run of your program ends there. The graph with your changes applied is then shuffled (randomly permute vertices, edges, and ends of an edge), and is given as input to the second run of your program. The trick is that your program is not told whether it is the first run or the second run, and instead needs to detect it itself. In other words, you need to apply such changes in the first run that you'd be able to detect that the graph was changed from a random one in the second run.

As this comment suggests, there are so many things one can try in this problem. When I solved it earlier today, I've decided to go for the most general approach: let us choose some hash of a graph that is reasonably fast to compute, and that does not change when we shuffle the graph. Now we say we are in the second run if this hash modulo 10000 is equal to 0, and otherwise we just keep trying random edges to add to make this hash modulo 10000 equal to 0. This will require 10000 iterations on average to find, which should be fast enough as each iteration is O(m), and this will incorrectly work on the first run with probability one in 10000, which is tolerable as the problem likely has on the order of 100 testcases.

As for the concrete hash function, I've used the (generally incorrect) idea of the hash described in "Problem 2" in this post with 10 iterations, but using the polynomial combination method from "Problem 4" to quickly combine the values of adjacent nodes since it avoids sorting. Most likely any not extremely bad hash would work since the given graphs are random. This was accepted from the first attempt, I did not even have to do any hyperparameter tuning.

Finally, Mike Mirzayanov has decided to tackle a long-standing Codeforces issue: people using multiple accounts (and maybe also the same account being used by multiple people?). Of course, it escalated rather quickly :) As for me, the social aspect — competing against other people, seeing their progress over the years, learning what type of problems they solve best, talking to them on the forums, and so on — is very important in competitive programming, and it complements the problem solving part very nicely. I think people violating the 1:1 property of the person-to-account mapping do ruin the social aspect, especially if they compete at the highest level (in this case there were two accounts in the top 10 by rating), and therefore I support Mike's decision to act. It is also quite sad that it has come to this at all — competitive programming depends on participants' integrity in a big way (it is very hard to detect if one consults with others during the round, for example), and seeing top-level participants violate one rule that it hard to enforce does not lend confidence that other rules that are hard to enforce are still standing.

Thanks for reading, and check back next week!