Sunday, December 31, 2023

A run-twice week

The 2nd Universal Cup Stage 15: Macau took place last week, but its results were not public when I wrote the last summary (problems, results, top 5 on the left, analysis). Similar to the previous stages, this one seems to have originated as an ICPC regional contest, but this time the top onsite team got quite a high place, 11th, on the Universal Cup scoreboard (still with 9 problems, which seems to be a universal constant), so I guess we need to keep an eye on the Peking University team at one of the upcoming World Finals :) Congratulations to USA1 and HoMaMaOvO, the clear top 2 in the overall standings as well, on solving everything!

The 2nd Universal Cup Stage 16: Run Twice followed this Saturday (problems, results, top 5 on the left, analysis). The round featured 11 problems in the relatively new run-twice format (announcement), which I think is an awesome idea that extends the boundaries of what an algorithmic contest problem can be. The new format did not bring a new winner, as team HoMaMaOvO solved everything with an hour to go. Well done!

I did not participate in this round, but I checked out the problems because the topic was so interesting, and I'd like to highlight problem F: your program is executed twice. In the first run, you are given a graph with n=1000 (note that it is not n<=1000, but exactly n=1000) vertices, and 2000<=m<=5000 edges. It is guaranteed that this graph was generated randomly by adding a uniformly random non-existing edge m times. You can make at most 5 changes to this graph, where one change is either adding an edge between two vertices not connected by an edge, or removing an existing edge. The first run of your program ends there. The graph with your changes applied is then shuffled (the vertices and edges are randomly permuted, as are the two endpoints of each edge), and given as input to the second run of your program. The trick is that your program is not told whether it is the first run or the second run, and instead needs to detect it itself. In other words, you need to apply such changes in the first run that you'd be able to detect that the graph differs from a random one in the second run. Do you see a way?

Continuing another great Open Cup tradition, the Universal Cup is holding a Prime Contest this week and the next. Given the larger number of Universal Cup rounds, the Prime Contest appears practically infinite, with problem numbers going up to 167; nevertheless, 36 out of 39 problems have already been solved, and even more amazingly, 26 of those were solved by the same team! There are still 5 days left and 3 problems have not yet been solved, so maybe this is your chance, but remember that those problems are not for the faint-hearted :) You need a Universal Cup login to participate, and I believe you can get one via the registration form.

Codeforces Good Bye 2023 wrapped up the competitive year (problems, results, top 5 on the left). The round was not received well (1, 2, 3), but nevertheless congratulations to ksun48 on being the only one to solve everything and therefore getting a clear first place!

Thanks for reading, and check back in 2024.

Sunday, December 24, 2023

An odd knapsack week

Pinely Round 3 on Codeforces took place on Saturday (problems, results, top 5 on the left, analysis). Quite a few people solved the first 8 problems, some with more than half of the contest time left, but the last two problems proved to be quite tough to crack. Only zh0ukangyang got problem I right, and only maroonrk got problem H right, earning them the first two places. Congratulations!

Quite interestingly, this time there was no contestant who successfully solved H or I after skipping one of the easier problems (so rainboy sadly scored 0 :(), which goes to show that such a strategy looks amazing when it works, but also carries huge risks :)

As for my contest, I wasted way too much time on F and G, and therefore did not spend any meaningful amount of time on H and I.

In problem F, when trying to generalize my solution from F1 to also solve F2, I got a quadratic-time solution that seemed like it could be sped up by using fast convolution; however, I could not immediately see how, and the number of accepted solutions indicated that there was something simpler there. In the end, I got the worst of both worlds: I did not find the simpler solution, but I also did not pull myself together to write down the quadratic formula explicitly on paper. When I did this after the end of the round, I quickly upsolved the problem using fast convolution, since we were summing up a product of various factorials and inverse factorials of amounts that depend either on u, on v, or on u+v (where u and v are the nested loop variables that yield the quadratic running time).
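To make the shape of that speedup concrete, here is a toy sketch (the functions f, g, h below are made up; in the actual problem they were products of factorials and inverse factorials): whenever each term factors into pieces depending only on u, only on v, and only on u+v, the double loop collapses into a single convolution.

```python
def convolution(f, g):
    """(f * g)[w] = sum of f[u] * g[v] over u + v = w. Written naively here;
    this is exactly what FFT/NTT computes in O(n log n)."""
    h = [0] * (len(f) + len(g) - 1)
    for u, fu in enumerate(f):
        for v, gv in enumerate(g):
            h[u + v] += fu * gv
    return h

def quadratic(f, g, h):
    # the original O(n^2) double loop: sum of f[u] * g[v] * h[u + v]
    return sum(f[u] * g[v] * h[u + v]
               for u in range(len(f)) for v in range(len(g)))

def fast(f, g, h):
    # group the terms by w = u + v: one convolution plus one dot product
    return sum(hw * cw for hw, cw in zip(h, convolution(f, g)))

f, g = [1, 2, 3], [4, 5]
h = [1, 1, 2, 6]   # must cover all indices up to len(f) + len(g) - 2
assert quadratic(f, g, h) == fast(f, g, h)
```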

In problem G, I wasted about half an hour when my initial solution got TLE, even though its complexity was O(n). It constructed a suffix array instead of using the z-function though, so I guess the lesson learned here is that for practical purposes the O(n) suffix array construction should be treated as O(n log n), as I assume the constraints were so high (n=10^7) precisely to cut off O(n log n) solutions. In the end I was able to replace the suffix array usage with the z-function.
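For reference, since the z-function saved the day: a standard linear-time implementation (z[i] is the length of the longest common prefix of s and s[i:]):

```python
def z_function(s):
    """z[i] = length of the longest common prefix of s and s[i:]; O(n) total."""
    n = len(s)
    z = [0] * n
    if n:
        z[0] = n
    l = r = 0                       # the rightmost known match [l, r)
    for i in range(1, n):
        if i < r:
            z[i] = min(r - i, z[i - l])
        while i + z[i] < n and s[z[i]] == s[i + z[i]]:
            z[i] += 1
        if i + z[i] > r:
            l, r = i, i + z[i]
    return z

assert z_function("aabaab") == [6, 1, 0, 3, 1, 0]
```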

Having read maroonrk's solution for H (spoilers, do not click if you want to try solving yourself first!), I regretted a lot that I did not have time left to try solving it as the problem was amazing. Here is the problem statement: you are given an array of size 3*10^5. In one operation, you can take any segment of this array of even length, and swap the 1st and 2nd numbers in that segment, the 3rd and 4th numbers, and so on. You need to sort the array in at most 10^6 operations. You do not need to minimize the number of operations. Can you see a way?

In my previous summary, I have mentioned an AtCoder problem: you are given up to 100000 positive integers Ai, each up to 10^9. Your goal is to split each Ai into two non-negative integer parts Ai=Bi+Ci so that there is no way to choose exactly one of Bi and Ci for each i such that the sum of the chosen numbers is equal to exactly half of the sum of all Ai (that sum is guaranteed to be even).

When solving this problem, the first observation I made is that it is in fact hard, likely impossible, to check in reasonable time whether a given split into Bi and Ci satisfies the requirements, as it requires solving a pretty large instance of the knapsack problem. This may be the reason that the problem does not require printing a certificate, just a Yes/No answer.

This naturally leads to the following question: for which splits into Bi and Ci can we reasonably easily prove that achieving exactly half is impossible? To make such a proof easier, it makes sense to split all even numbers exactly in half so that Bi=Ci: then we know for sure those numbers' contribution to the sum, and there are fewer possibilities to check. However, if all numbers are even and we do this for all of them, then it would be possible to achieve exactly half of the total sum (in fact, it would be impossible to achieve anything else :)). But then we can do this even split for all numbers except one, and for that number (say, A1) we set B1=0 and C1=A1. Then we get exactly half from all other numbers, but if we choose B1 then the sum is slightly less than exactly half of the total, and if we choose C1 it is greater. Therefore we have solved the problem for the case where all Ai are even (the answer is always Yes).

What can we do about odd numbers? They cannot be split exactly in half, but we can try to build on the above construction: let us split all odd numbers almost in half, such that Bi+1=Ci, and split one number, the biggest one (assume we reorder the numbers and it is A1), as B1=0 and C1=A1. Now if the amount of odd numbers is less than A1, then we still cannot achieve exactly half, because if we choose B1, even taking Ci from all odd numbers will still leave us short of half of the total, and if we choose C1, we overshoot. There is a slight complication that happens when A1 is odd, as then we should not count it towards the amount of odd numbers we split almost in half; however, since the total amount of odd numbers is always even (because the sum is even), this does not affect our comparison and we can still compare if A1 is strictly greater than the total amount of odd numbers.

This criterion was my first submission; however, it got WA. As I did not have any immediate ideas for other situations where achieving exactly half is clearly impossible, I implemented a brute force solution and stress-tested it against this one. The smallest counterexample it produced was: 1, 3, 3, 3. In this case we set all Bi=0 and Ci=Ai, and there is no way to achieve the required sum of 5 from some subset of 1, 3, 3, 3. The first idea after seeing this was that divisibility by 3 is somehow a factor; however, quite quickly I realized that we can slightly generalize the construction from the first submission above: we take all odd numbers, sort them, and split them into two parts of odd size. In the part containing the smaller numbers, we set Bi+1=Ci, and in the part containing the bigger numbers, we set Bi+D=Ci, where D is the smallest of those bigger numbers. Now if the size of the part with the smaller numbers is less than D, then we always fall short of half of the total if we choose more Bi's than Ci's in the part with the bigger odd numbers, and we always overshoot otherwise.

This solution passed the stress-test against the brute force for small constraints, therefore I submitted it and it got accepted. I did not bother proving it formally since the stress-test was proof enough, but the intuition is somewhat clear: now we say No only if there are at least two odd numbers up to 1, at least four odd numbers up to 3, at least six odd numbers up to 5, and so on until we run out of odd numbers, and the total amount of odd numbers is at least the biggest number. I did not write down all the details, but the following method likely works to achieve exactly half in this case: we first go through all even numbers, and then through all odd numbers in decreasing order. If the sum we have accumulated so far is bigger than half of the total of the numbers processed so far, we take the smaller one of Bi and Ci, otherwise the bigger one. We can then prove by induction that after processing all odd numbers bigger than some x, the current sum differs from half of the processed total by at most (x+1)/2, which means that in the end it is exactly equal.
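For concreteness, here is my transcription of that criterion as code; treat it as a sketch of the paragraph above rather than a reference solution:

```python
def can_avoid_half(a):
    """Yes iff we can split each a[i] = b[i] + c[i] so that no choice of
    exactly one part per i hits half of the total (total assumed even)."""
    odds = sorted(x for x in a if x % 2 == 1)
    if len(odds) < max(a):
        return True    # the construction above with the biggest number works
    # "at least two odd numbers up to 1, at least four up to 3, ...":
    # if any of those conditions fails, a two-part split again works
    return any(odds[2 * k - 1] > 2 * k - 1
               for k in range(1, len(odds) // 2 + 1))
```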

Thanks for reading, and check back next week!

Sunday, December 17, 2023

A three-step week

The 2nd Universal Cup Stage 13: Shenyang took place last week, but its results were not public when I wrote the last summary (problems, results, top 5 on the left, analysis). The first two places in this round coincide with the first two places in the overall Universal Cup standings, and they were also the only teams to solve 12 problems. So I guess one could say this round went exactly as expected :) Congratulations to USA1 and HoMaMaOvO!

This round used the problemset from an ICPC regional contest, and the best team from that contest is only in 23rd place on the scoreboard with 9 problems solved, which underscores how the Universal Cup gathers the best teams in the world.

The 2nd Universal Cup Stage 14: Southeastern Europe took place this Saturday (problems, results, top 5 on the left, overall standings, analysis). Team HoMaMaOvO got the second place just like last week, but the winner was different: team 03 Slimes had 11 problems solved at just 1:43 into the contest, and therefore had all the time in the world to solve the 12th. Congratulations on the win!

This round also used the problemset from an ICPC regional contest, but this time the best team from the onsite round placed a bit worse, in 36th place, with 9 problems solved.

Finally, AtCoder Grand Contest 065 wrapped up this week (problems, results, top 5 on the left, overall standings, analysis). There was a huge gap in difficulty and in scores between the first four problems and the last two, therefore in this round it could actually be a very good strategy to start with one of the two difficult problems, to be able to properly estimate how many easier problems one could squeeze into the remaining time. mulgokizary and newbiedmy executed this strategy successfully to place 3rd and 4th, well done! Of course, it's even better if one can solve the four easier problems and one difficult one, as zhoukangyang and ecnerwala did :) Congratulations to them as well!

The round went quite well for me, in large part thanks to the fact that I was able to quickly find this page for problem D, and this paper for problem F. However, while the implementation for D is pretty straightforward with the linked formula, one still needs to make a few more steps on top of the linked paper to get F, and I managed to get stuck in those steps: by the end of the round, my solution returned 125405280 instead of 128792160 for n=10 :(

While solving problem C in this round, I followed a quite typical pattern, at least for AtCoder problems: come up with a reasonably simple sufficient but not clearly necessary condition, implement it, submit, get WA, implement a stupid solution and run a stress test, find a case where the solutions differ, come up with another, also reasonably simple, sufficient but not clearly necessary condition, implement it, run a stress test with larger cases, find a bug in the implementation, fix it, pass the stress test, submit, get AC :) I think seeing the diffs found by the stress test was instrumental for me to discover the correct solution idea. For those who solved this problem during the round or want to upsolve now: were you able to do it without the stress test?

Here's that problem's statement, aptly titled "Avoid Half Sum": you are given up to 100000 positive integers Ai, each up to 10^9. Your goal is to split each Ai into two non-negative integer parts Ai=Bi+Ci so that there is no way to choose exactly one of Bi and Ci for each i such that the sum of the chosen numbers is equal to exactly half of the sum of all Ai.

This round has seemingly concluded the AtCoder Race Ranking 2023 (top 14 on the left), even though a man can hope: maybe one more AGC in 2023 will pop up as a Christmas present? :) Therefore it seems that I have barely missed qualifying for the World Tour Finals in Japan. Amazingly, the cutoff for the top 12 did not change at all in this round, as Um_nik kept his final qualifying place while being outside of the top 30 in the round. It means that even fourth place would have been enough for me to qualify. Not making a wrong attempt on C (or convincing Gennady to skip this round to increase my chances) would have gotten me fifth place, but to get fourth I really had to solve either E or F. Well, I will try harder next year, and huge congratulations to the 12 finalists!

In my previous summary I have mentioned a Hacker Cup problem: two players are playing a nim-like game, starting with two piles of stones. In one move, a player first chooses one of the remaining non-empty piles, let's say this pile has k stones. Then they can take between Ak and Bk stones from this pile, and they also must create a new pile with Ck stones (1 <= Ak <= Bk <= k, 0 <= Ck < k). Since 1 <= Ak and Ck < k, this game will eventually terminate, and the player unable to make a move loses the game. Your goal is to find for each size (between 1 and n) of the first of the two initial piles the smallest size of the second initial pile that leads to a losing position, and print the sum of those n sizes. n is up to 2 million.

The first step in solving this problem is pretty straightforward. As the games on each pile are independent, we can use the Sprague-Grundy theorem, therefore we just need to find the nimber for a pile of size k for each k. Denoting this nimber as Nk, from the game rules we get Nk = mex(Ni xor NCk, over all i between k-Bk and k-Ak).
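To make the recurrence concrete, here is a direct quadratic evaluation, a sketch for small n only (A, B, C are 1-indexed as in the statement; the real constraints need the machinery described below):

```python
def nimbers(n, A, B, C):
    """N[k] for pile sizes 0..n, straight from the recurrence; O(n^2)."""
    N = [0] * (n + 1)                  # an empty pile allows no moves
    for k in range(1, n + 1):
        seen = {N[i] ^ N[C[k]] for i in range(k - B[k], k - A[k] + 1)}
        N[k] = next(v for v in range(len(seen) + 1) if v not in seen)
    return N

# By Sprague-Grundy, a two-pile position (p, q) is losing iff
# N[p] ^ N[q] == 0, so the answer for a first pile of size p is
# the smallest q with N[q] == N[p].
```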

So we need some data structure that can find the mex on a range, with the added twist that all numbers on the range are first xored with some constant. Finding things on a range is typically done with a segment tree, but finding the mex, even without the xor-constant complication, would require propagating a lot of information along the tree.

The key step to progress further in solving this problem is to actually forget about the ranges for now, and focus on the xor-constant part. Suppose we just have a static set of numbers, and need to answer questions: what is the mex of all those numbers xored with a given constant? In this case it is reasonably clear what to do: we need to determine the mex bit-by-bit, from the highest bit to the lowest bit. Suppose we want to find the k-th bit, having already found out that the answer is equal to r for the bits higher than k; in other words, we know that the answer is in the range [r, r+2^(k+1)), and need to tell if it is in [r, r+2^k) or in [r+2^k, r+2^(k+1)). Because bitwise xor is applied independently to the high and low bits, we simply need to know if there is at least one number missing in our set from the range [r xor s, (r xor s)+2^k), where s is formed by the bits k and higher of our constant. And finding if a number is missing on a range can be done with a balanced tree or again with a segment tree. Note that even though we forgot about the ranges, the ranges have reappeared: instead of ranges on k, we now have ranges on Nk.
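Here is a sketch of that bit-by-bit descent for the static case, with the "is this range fully present" check done by binary search over a sorted list of distinct values (in the full solution, this check is exactly what the segment tree replaces):

```python
from bisect import bisect_left

def mex_after_xor(vals, c, bits=20):
    """mex of {v ^ c for v in vals}; vals is a sorted list of distinct
    non-negative ints, and 2^bits must exceed the largest possible answer."""
    def full(lo, hi):   # does vals contain every integer in [lo, hi)?
        return bisect_left(vals, hi) - bisect_left(vals, lo) == hi - lo
    r = 0
    for k in range(bits - 1, -1, -1):
        s = (c >> k) << k            # bits k and higher of the constant
        if full(r ^ s, (r ^ s) + (1 << k)):
            r += 1 << k              # the lower half is full: bit k is 1
    return r

assert mex_after_xor([0, 1, 3], 0) == 2
assert mex_after_xor([0, 1, 3], 1) == 3    # {1, 0, 2} has mex 3
```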

Now let us reintroduce the ranges on k. First, let us consider only half-ranges: suppose Ak=1 for all k. Then in the above bit-by-bit solution we need to find out if there is at least one number missing from a given range on a suffix of Nk. This can be done by modifying the segment tree approach: let us use a segment tree that, instead of just remembering if a certain number has appeared or not, will remember its rightmost appearance. Then we can find the minimum of those appearances on the needed range, and compare it to k-Bk. In fact, since all ranges of nimbers that we query are aligned with powers of two, each query will exactly correspond to one of the nodes in the segment tree, and therefore can be done in O(1) (but an update still needs to touch O(log(n)) nodes).

What to do about the other side of the range on k, in other words when Ak>1? Here comes another relatively standard trick: since we only look at indices up to k-Ak, we could have executed this query when we were processing k'=k-Ak+1, and at that moment this query would be a half-range with only the left boundary, which we can handle using the procedure described above. So we would like to already compute Nk when processing k'=k-Ak+1; however, we cannot do that, since we might not know NCk at that point yet if Ck>k-Ak. This naturally points us towards persistent data structures: we can modify our segment tree to be able to query not just the current minimum on a range, but also what the minimum on a range was at any previous state of the data structure, in particular when k'=k-Ak+1.

There are several standard ways to do it, one of which is to store the tree as a set of immutable nodes, with each node pointing to its children; every time we need to change the value in a node, we instead clone the node with the new value, together with its whole path to the root. This way we only create O(log(n)) additional nodes per operation, so the total memory usage is still acceptable at O(n*log(n)), but since all nodes are immutable, we can simply query any old root of the tree to get the minimum on a range at a point in the past.
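A minimal path-copying sketch of that idea (the node layout, the -1 "never appeared" sentinel, and all names are mine), storing rightmost appearances and answering range-minimum queries on any past version:

```python
class Node:
    __slots__ = ("l", "r", "mn")
    def __init__(self, l=None, r=None, mn=-1):
        self.l, self.r, self.mn = l, r, mn   # mn = -1: nothing set here yet

def update(node, lo, hi, pos, val):
    """A new version of the tree over [lo, hi) with position pos set to val;
    the old version stays valid, and only O(log n) new nodes are created."""
    if hi - lo == 1:
        return Node(mn=val)
    mid = (lo + hi) // 2
    l, r = (node.l, node.r) if node else (None, None)
    if pos < mid:
        l = update(l, lo, mid, pos, val)
    else:
        r = update(r, mid, hi, pos, val)
    return Node(l, r, min(l.mn if l else -1, r.mn if r else -1))

def query(node, lo, hi, qlo, qhi):
    """Minimum over [qlo, qhi) in a given version (-1 where nothing is set)."""
    if qhi <= lo or hi <= qlo:
        return float("inf")                  # no intersection: neutral
    if node is None:
        return -1                            # whole subtree still unset
    if qlo <= lo and hi <= qhi:
        return node.mn
    mid = (lo + hi) // 2
    return min(query(node.l, lo, mid, qlo, qhi),
               query(node.r, mid, hi, qlo, qhi))

# roots[t] is the version after t updates; old versions remain queryable
roots = [None]
roots.append(update(roots[-1], 0, 8, 3, 5))
roots.append(update(roots[-1], 0, 8, 1, 7))
assert query(roots[1], 0, 8, 1, 2) == -1    # position 1 not set yet here
assert query(roots[2], 0, 8, 1, 2) == 7     # but set in the newer version
```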

I think this problem is educational since it has three steps of "unwrapping the present", as we first solve an easier version of the problem and then gradually add back the full complexity. Each particular step is more or less a well-known trick, but one still needs to find which simplification of the problem to tackle first, and for that it is vital for those well-known tricks to really be "in RAM", as well as to have a good intuition about what is not solvable at all, so that one can explore many directions and find the correct three-step path. If one has to think for half an hour to solve each particular step, there is really no chance to find the correct sequence of three steps in time, as there will necessarily be other promising directions that won't lead anywhere but will waste a lot of solving time.

Thanks for reading, and check back next week!

Sunday, December 10, 2023

A 17107 week

TopCoder SRM 851 was their first round after a long while (problems, results, top 5 on the left). Only three contestants got the 1000 right. Out of those, snuke had a small lead after the coding phase despite a resubmit on the 1000, and he managed to extend the lead with an active challenge phase (+2-2). Well done!

Meta Hacker Cup 2023 Final Round was the last but also the most important event of the week (problems, results, top 5 on the left, analysis). The scoreboard was quite exciting to watch during the round, as different people went for completely different orders of tackling the problems (and also for creative ways to avoid thinking too much about cacti!). The order did not matter in the end, as only Gennady managed to solve all problems, which was actually necessary for him to claim the first place from Benq, who had a better penalty time. Congratulations to Gennady on winning his 5th Hacker Cup title!

When I saw the point values for the problems, the optimal strategy seemed obvious: problem A with its 19 total points effectively has weight 2 for penalty purposes, so assuming the 19 points accurately reflect its difficulty, it is much better to solve it in the beginning. Combine this with the fact that it was a constructive problem, the type I typically enjoy solving, and you can guess what I did for the first 2 hours of the 4-hour round :) I think the main reason it took so long was that I was constantly quite close to a working solution, so it always seemed that I needed just one more small trick, and I went for small tricks instead of rethinking the approach. I ended up accumulating a lot of those small tricks: in addition to the k->2k and k->2k+1 steps that everyone likely used, my solution also used k->4k-1, k->8k-1, ... (therefore using the A=A-1 primitive that the official analysis dismissed so easily :), and also k->3k and k->3k+2; to accommodate such a wide variety of possible steps, I chose the cells to include into the labyrinth outside of the main path dynamically, based on the type of the next primitive I needed.

Even though spending 2 hours on this problem ruined my (mostly theoretical anyway) chances for a good result, I am still grateful to the organizers for not setting the tightest possible constraints and therefore allowing alternative approaches like mine.

One of the problems that I could not crack in the remaining time even though I was very close, and that I think is quite educational despite having a very ugly statement, is Problem E: two players are playing a nim-like game, starting with two piles of stones. In one move, a player first chooses one of the remaining non-empty piles, let's say this pile has k stones. Then they can take between Ak and Bk stones from this pile, and they also must create a new pile with Ck stones (1 <= Ak <= Bk <= k, 0 <= Ck < k). Since 1 <= Ak and Ck < k, this game will eventually terminate, and the player unable to make a move loses the game. Your goal is to find for each size (between 1 and n) of the first of the two initial piles the smallest size of the second initial pile that leads to a losing position, and print the sum of those n sizes. n is up to 2 million.

Thanks for reading, and check back next week for the results of the most important round of December, the last AGC of the year! Thanks to Um_nik, I now know that 12 people (up from 8) qualify for the WTF, which means that even second place will probably do ;)

Saturday, September 23, 2023

A MEX week

Codeforces CodeTON Round 6 was the main event of the last two weeks (problems, results, top 5 on the left, analysis, discussion). orzdevinwang earned a well-deserved first place by solving 8 problems, while everybody else got at most 7 and I barely got 6. Congratulations!

While I could more or less keep up with the leaders on the first 4 easier problems, I was not able to solve E at all and spent a lot of time implementing, speeding up and debugging F, even though the algorithmic solution was clear to me reasonably quickly. On the other hand, I could solve G in 22 minutes, which seems to be the fastest among the top scorers, but it was already too late to catch up :) I guess that's one more lesson to read all problems, at least when one is stuck trying to solve the next problem by difficulty.

Here is problem F that has caused me so much implementation pain: we write down all integers between 1 and n as strings in base k (assuming we have characters for all digits between 0 and k-1). Now we sort those strings lexicographically, for example the first few would typically be 1, 10 (=k), 100 (=k^2), ... How many numbers are in the same position in this sorted order as in the order where we just sort by number? n and k are up to 10^18, and you have to solve 1000 testcases in 3 seconds.

In my previous summary, I've highlighted one of the AWTF problems: n people are standing in a row, each with either a red or a blue hat on. Each person can see the hat colors of the people in front of them (with smaller position numbers), but not their own hat color or the hat colors of the people behind them. Each person also knows the total number of red and blue hats. Then, the following process happens in rounds: in one round, every person who can already deduce their hat color declares that they have done so (they do not declare the color itself). If multiple people can deduce it in the same round, they declare it simultaneously without hearing each other first. In the next round, on the other hand, people can already use the information of who has declared in the previous round to potentially make additional deductions themselves. Which people will eventually declare, and which will still be in the dark about their hat color even after an arbitrarily large number of rounds?

The key step in such problems about logic is to figure out the correct formalization. What exactly does it mean to be able to deduce one's hat color using the information of who has declared in previous rounds? Or in other words, we can start by finding a solution that runs with any time complexity, but that is correct. When I was solving this problem, I only thought of and implemented such a solution for a stress test after my approach did not pass the samples, which in hindsight was way too late.

Here is the formalization: there are C(n,r) possible sequences of hat colors, where r is the globally known number of red hats. After the i-th round, from the point of view of the first person, who does not see any hats, some of those sequences are still possible, and some are not (in other words: if the hats did in fact correspond to this sequence, would everybody have said what they have said?). This set of possible sequences Si also clearly defines what all other people are thinking: for each person, the set of sequences that are possible from their point of view is equal to the intersection of Si with the set of sequences that have the correct hat colors for the hats that they see. This sounds trivial when spelled out, but it was actually not that easy for me to state during the round.

Now, what will each person do during the i-th round? They will look at the set of sequences that are still possible from their point of view (given by the intersection mentioned above), and check if their hat color is the same in all of them. If yes, they will declare, otherwise they won't.

How to compute Si given Si-1 and the declarations? We need to check the declarations that would have happened for each sequence in Si-1 (assuming each person sees the corresponding prefix of that sequence), and remove those sequences where this set of declarations does not match the one for the real sequence of hats. Once again, this looks very simple, almost trivial, but it was actually far from easy to state concisely.

This is how far I've got during the round: I've implemented a slow solution based on the above, and was trying to find some fast heuristic solution that would match it on random inputs. It turns out that this did not lead to a correct solution. Instead, one simply had to speed up the slow solution!
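For what it's worth, such a slow reference solution can be quite compact; here is my attempt at one, following the formalization above (exponential in n, so stress-test territory; person i sees positions j < i):

```python
from itertools import combinations

def eventually_declare(hats):
    """hats: tuple of 0/1. Returns the people who declare at some round."""
    n, r = len(hats), sum(hats)
    # all sequences with r red hats are possible initially
    S = {tuple(1 if i in ones else 0 for i in range(n))
         for ones in combinations(range(n), r)}

    def declarations(seq):
        # who would declare this round if `seq` were the real sequence:
        # person i declares iff every still-possible sequence that matches
        # what they see (the prefix before i) agrees on their own hat
        return tuple(len({s[i] for s in S if s[:i] == seq[:i]}) == 1
                     for i in range(n))

    declared = [False] * n
    while True:
        real = declarations(hats)
        declared = [d or x for d, x in zip(declared, real)]
        new_S = {s for s in S if declarations(s) == real}
        if new_S == S:
            return [i for i in range(n) if declared[i]]
        S = new_S

# person 1 deduces their hat from seeing hat 0; person 0 never learns
assert eventually_declare((1, 0)) == [1]
```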

One had to notice that after each step, the set Si can be described by the lists of possible amounts of red hats in each of the prefixes of the sequence. For example, suppose there are 4 red and 3 blue hats in total. The initial set S0 can then be described as: the empty prefix has 0 red hats; the prefix of length 1 has 0 or 1 red hats; 2 has 0, 1, 2; 3 has 0, 1, 2, 3; 4 has 1, 2, 3, 4; 5 has 2, 3, 4; 6 has 3, 4; and the whole sequence has 4. Every sequence that has 4 red and 3 blue hats satisfies those constraints, and every sequence that satisfies those constraints has 4 red and 3 blue hats.

Then suppose during the first round only the last two people have declared that they know the color of their hats. It turns out that the resulting set S1 can then be described as: the empty prefix has 0 red; 1 has 0, 1; 2 has 0, 1, 2; 3 has 1, 2, 3; 4 has 2, 3; 5 has 2, 4; 6 has 3, 4; 7 has 4.

More generally, if we describe the prefix of length i having j red hats as (i,j), then the (i+1)-th person will declare if exactly one of (i+1,j) and (i+1,j+1) is still possible, and not declare if both are possible. This criterion allows us to compute the declarations that actually happen, and the same criterion then allows us to eliminate states which contradict those declarations. However, we also need to pay attention to also eliminate states that do not have a possible predecessor (if both (i-1,j-1) and (i-1,j) are eliminated, then (i,j) should be eliminated as well), or a possible successor (if both (i+1,j) and (i+1,j+1) are eliminated, then (i,j) should be eliminated as well). This criterion is also the basis of the inductive proof that Si can always be represented in the above form.

Therefore we can maintain the set of states (i,j) that are still possible after each round, and stop when this set of states stops changing.

Thanks for reading, and check back next week!

Sunday, September 10, 2023

A dual week

I have previously posted all the links to the AWTF onsite contests (day 1, day 2; thanks jonathanirvings for the screenshot!), so let us talk about the problems now. I have only solved a few of them; for the remaining ones I have either got the solution from other competitors or read the editorial.

On the first day, the easiest problem A "Save the monsters", similar to many game problems, required one to consider possible good strategies for both players until you find two that achieve the same score, thus proving that it is the answer. I made a bug in my initial submission and therefore assumed that I had a logical issue and that the approach does not work, but instead it was just a coding mistake.

The key to solving problem B "Non-Overlapping Swaps" was the idea that if we swap the elements in positions 1 and 2, then this swap most likely does not contradict the swaps before and after, so we can use this to divide the entire process into two recursive parts with this swap in the middle. There were still details to figure out, as "most likely" does not mean "for sure". Unfortunately, this idea did not occur to me. I did examine swapping the first two positions as the first move and then doing two independent processes for the two resulting cycles, but it was not clear how to transition from one to the other, and in fact I found a small counterexample where making such a first move makes it impossible to finish in time. I did examine other promising ideas, for example I noticed that if we have overlapping swaps with a common end, we can always replace them with non-overlapping ones with the same effect, for example swapping positions (a,c) then (b,c) with a<b<c is equivalent to first swapping (b,c) then (a,b). Therefore I was trying to construct some kind of iterative process that gradually removes overlaps.

I've spent most of the time on problem C "Shrink the Tree". While I could make progress by finding a canonical representation for a tree, I did not have the key idea from the editorial that we should then switch to looking at the problem from the other side: thinking about what the obvious criteria for being able to collapse a given forest are, and when they are not enough, potentially using a stress test to come up with new criteria. The approach of moving from two sides is similar to problem A, and will actually appear many more times in this problemset. One could also say that thinking from the other side makes an implicit assumption that the problem is beautiful: that the real criteria will be easy to state. Otherwise, trying to conjure them up from thin air would be much less promising than the "minimal representation for a forest" approach.

Solving problem D "Welcome to Tokyo!" required doing three things in sequence: applying the trick from Aliens, then looking at the dual problem for a linear program, and finally noticing that solving many instances of that problem with a varying parameter can be done efficiently. My issue was that after applying the trick from Aliens, which I of course considered, we seem to have made the problem more difficult than it was originally, as because of a different party cost we'd have to redo things from scratch every time, and therefore be at least quadratic. Therefore I have discarded this direction as a dead end.

Finally (for day 1), solving problem E "Sort A[i]-i" required noticing some tricky combinatorial identities. I have not spent much time on this problem because I expected the solution to involve heavy manipulations with generating functions, which I am not very good at. It turns out the generating functions were actually only needed to prove the identities, and therefore could probably be avoided completely. To be honest, I am not sure how feasible it was to find those identities in another way, maybe hos_lyric can share the solving process?

I'd like to give the readers a week to think about the second day's problem A "Hat Puzzle", so I will just share the statement for now: n people are standing in a row, each with either a red or a blue hat on. Each person can see the hat colors of the people in front of them (with smaller position numbers), but not their own hat color or the hat colors of the people behind them. Each person also knows the total number of red and blue hats. Then, the following process happens in rounds: in one round, every person who can already deduce their hat color declares that they have done so (they do not declare the color itself). If multiple people can deduce it in the same round, they declare it simultaneously without hearing each other first. In the next round, on the other hand, people can already use the information of who has declared in the previous round to potentially make additional deductions themselves. Which people will eventually declare, and which will still be in the dark about their hat color even after an arbitrarily large number of rounds?

Solving problem B "The Greatest Two" was once again reliant on "assume the problem is beautiful" or "come up with a bound and then assume/prove it is tight" approach: the key trick was to say that for every number we have a range of possible positions, and that those ranges were more or less independent. Simplifying the problem like this makes the solution a straightforward, if a bit tricky, implementation (it took me ~1.5 hours to implement after having this idea I think), but it was not at all obvious to me that this framework is correct at all, even though it can be proven by induction after the fact, so I just took a leap of faith.

Similarly, the solution for problem C "Jewel Pairs" seemed to involve two steps where one comes up with a reasonably simple bound and assumes/proves it: first the description of the matching criteria (from the words "Focusing on the form of the mincut" in the editorial; those words also mean that it might be possible to deduce this form analytically instead of going "from the other side", but I did not try doing so myself as I was focusing on other problems), and then the criteria for dealing with the colors with 2f(c)<=A-B.

The key to solving problem D "Cat Jumps" was finding a way to decompose the overall construction into building blocks that can be combined arbitrarily, so that we can use dynamic programming (or just multiplication) to compute the answer. This is a common theme in many counting problems, but even after reading the editorial I had no idea how to actually come up with the decomposition in this problem. It does not look that we can go from the other side in this case ("What could we reasonably compute with DP for the given constraints?"), and the decomposition itself is too complex to just appear out of nowhere. I did not think much about this problem during the round.

Finally, problem E "Adjacent Xor Game" was once again solved from the other side: one had to hypothesize or prove that all that matters in the end is how many times each bit is flipped (where going from y1 to y2 such that y1^y2=x we count all times a bit is flipped as we count y1, y1+1, ..., y2, not just whether y1 and y2 have a difference in a given bit), and we can just get a bound on the answer independently for each bit, and then take a maximum of those bounds. I have spent time building a more direct solution instead (given the lower bound on y1, how do we find the smallest y2 for a given x?), figured out a rule for that with 4 cases, but this road did not seem to lead anywhere. Maybe had I considered replacing going from y1 to y2 in one go with processing y1y1+1, ..., y2 step-by-step, I would have come up with the criteria, but this idea never occurred to me.

Overall, the problems were beautiful and interesting to solve, even if I was feeling quite stuck for long periods of time :) The biggest common theme seems to be that in 5 out of 10 problems (1A, 1C, 2B, 2C, 2E, and maybe also 1D as linear programming duality is the same idea), one had to stop thinking "how can we optimize/improve what we already can" and go from the other side, "what would be the reasonable bounds/criteria when we clearly can't", and then either prove those are tight, or just assume it. This could be seen as basing the solution on the fact that the problem is beautiful, or at the very least on the fact that it is solvable for the given constraints. So one takeaway is that I should try this solving approach more often.

I'm sorry for a long brain dump, feel free to tell me that what I'm writing makes no sense at all :)

The 2nd Universal Cup Stage 2: SPb also took place over the weekend (problems, results, top 5 on the left). This season follows the highly successful debut season of the Universal Cup, which has more or less taken Open Cup's space as the main team competition outside of the ICPC system. As I understand, Stage 1 of the 2nd Universal Cup was declared unrated because of the server problems, so this was the actual first stage.

On the surface, it might have seemed that making last season's winner USA1 even stronger would be impossible, but they have found a way, demolishing the field in just over three hours with Gennady on the team. Well done!

It would be nice to gather Petr Team again to participate, but given that the number of stages in a year is ~2x that of the Open Cup, with a stage happening almost every weekend, the time commitment required to feel a real part of the proceedings would be way too big. We should try to do a one-off round some time, though :)

Codeforces Round 896 wrapped up the week (problems, results, top 5 on the left, analysis, discussion). Quite fittingly, the first two places in the round went to the problemsetter (well, well) and the winner of AWTF. Congratulations to both!

In the last week's summary, I have mentioned a Codeforces problem: you are given an array of at most 10000 integers, each between 0 and 2^60. In one operation, you split the array in the middle into two parts, compute the bitwise xor of each part, and discard the part where the bitwise xor is smaller. In case they are equal, you may discard either part. After doing this operation several times, you have just one number remaining. Which positions in the initial array could this number correspond to?

The first observation, which the very act of taking bitwise xors of the two parts hints at, is that the bitwise xor of the bitwise xors of the two parts is equal to the bitwise xor of the entire array. Therefore if the bitwise xor of the first part is x, and the bitwise xor of the entire array is s, then the bitwise xor of the other part can simply be found as s^x. And when comparing x and s^x, they will differ exactly in those bits where s has a 1 bit, therefore we need to look at the highest 1 bit of s, and we will always choose the part whose bitwise xor has a 1 in that position. This condition is both necessary and sufficient: any part which has a 1 bit in that position will be chosen.

The fact that only the highest 1 bit matters allows us to speed up the straightforward dynamic programming from O(n^3) to O(n^2), because we can handle multiple transitions at the same time. The dynamic programming will compute for each of the O(n^2) subsegments whether we can reach it. For O(n^3) complexity, for every reachable state we can simply iterate over all ways to move the left boundary while keeping the right boundary constant, or vice versa, and check if a move is possible (by comparing x and s^x).
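A direct implementation of that cubic dynamic programming, usable for small n and stress tests (the discard rule is checked by comparing the two xors directly):

```python
def possible_survivors(a):
    """Indices of `a` that can be the last remaining number; O(n^3)."""
    n = len(a)
    pref = [0] * (n + 1)
    for i, x in enumerate(a):
        pref[i + 1] = pref[i] ^ x
    xr = lambda l, r: pref[r] ^ pref[l]          # xor of a[l:r]

    reach = [[False] * (n + 1) for _ in range(n + 1)]
    reach[0][n] = True                           # the full array
    for length in range(n, 1, -1):               # longest segments first
        for l in range(n - length + 1):
            r = l + length
            if not reach[l][r]:
                continue
            for m in range(l + 1, r):            # split into a[l:m], a[m:r]
                left, right = xr(l, m), xr(m, r)
                if left >= right:                # keep the part with the
                    reach[l][m] = True           # larger xor; on a tie,
                if right >= left:                # either part may be kept
                    reach[m][r] = True
    return [i for i in range(n) if reach[i][i + 1]]
```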

However, we can notice that doing the transitions that try moving from (l,r) to (l,r-2), (l,r-3), ... and the transitions that try moving from (l,r-1) to (l,r-2), (l,r-3), ... have many things in common. In fact, if (l,r) and (l,r-1) have the same highest 1 bit in their bitwise xor, then those transitions are either possible or not possible at the same time.

This hints at what we should be computing in the O(n^2) dynamic programming: for every segment (l,r) we will compute the set of possible highest 1 bits (represented as a bitmask) over all reachable containing segments with the same left boundary (l,r1) where r1>r, and the same for the containing segments with the same right boundary. To compute this bitmask for (l,r) we can take the same bitmask for (l,r+1) and update it with the highest 1 bit of the bitwise xor of the segment (l,r+1), if that segment is reachable. And knowing this bitmask allows us to check if this segment is reachable by simply testing if the bitmask has a non-zero bitwise and with the bitwise xor of this segment.

Another way of putting the same idea is to say that we're introducing intermediate states into our dynamic programming to aid reuse: instead of going from a reachable segment to its prefix or suffix directly, we now first go from a reachable segment to a (segment, highest 1 bit, decision whether we will change the left or the right boundary) tuple, which allows us to add transitions between those tuples, and transitions from those tuples back to segments, in such a way that we have O(1) transitions per tuple instead of O(n) transitions per segment. This would mean going from O(n^2) states and O(n^3) transitions to O(n^2 * #bits) states and O(n^2 * #bits) transitions, but then using bitmasks helps get rid of the #bits factor.

Thanks for reading, and check back next week!

Saturday, September 9, 2023

AWTF22 day 2

AtCoder WTF22 onsite round wrapped up today with the second contest (problems, combined results, top 5 on the left, mirror results, analysis, day 1 livestream, day 2 livestream). Once again I have managed to solve only one problem in 5 hours, but at least this time it was worth more points, and it was more satisfying as I have actually gradually approached the correct solution instead of just pulling it out of thin air. Overall, I remained in the 15th place that I had after the first round, and this is also the place I have when ordered by rating. I guess everything went as expected after all :) Qualifying next time would require crawling back into the top 8, though, and that will be a big challenge (current standings).

The overall top 3 solved non-intersecting sets of problems today, and it worked out the best for jiangly who combined the first place on day 2 with the second place on day 1 to win the event. Huge congratulations to him, ksun48 and tourist!

In a search for external factors that prevented me from doing better, I've noticed that advancing to the finals from the 2021 season was the most correlated with the final results: the 2021 finalists got places 1,2,3,4,6,7,8 and 14. Maybe the problems for the finals were mostly prepared at the same time as the problems for the AGCs from 2021, and therefore require similar solving approaches? Unfortunately, I have advanced in all other years, but not in 2021, hence the poor performance ;) Well done to peti1234 for bucking the trend and taking 5th place!

Huge thanks to the AtCoder organizing team for the amazing event, and to the problemsetters and testers for the problems! I have also enjoyed the social part immensely, meeting some old friends and getting to know new ones. I hope to meet you all again at some other competition in the future!

Thanks for reading, and check back tomorrow for this week's summary where I will also choose some AWTF problems to highlight.

Friday, September 8, 2023

AWTF22 day 1

AtCoder WTF22 onsite day 1 just finished (problems, results, top 5 on the left, mirror results, analysis). I was actually able to get a problem accepted after two hours had passed, with the caveat that this was the only problem I got accepted :) I probably spent about an hour on B and about three hours on C, with no success. In C, I passed all samples except the last one. It turns out that I did have the correct "minimal representation" for a tree at the end; I then tried to merge those representations to obtain a minimal representation for a forest in the same format, but it turned out that this was not possible at all: when merging trees B->W<-B (W is root) and W->B (B is root), the minimal representation is just B, but if we later merge it with a tree W->B<-W (B is root), we get a non-empty representation, while if we choose B->W<-B for the result of the first merge, then we can cancel everything out on the second merge. This counterexample stems from the correct solution idea, which tells us that one of the conditions for being able to cancel everything is having at least two roots of different colors; hence any approach that merges the representations of trees has to be able to tell if we have seen two roots of different colors. I feel that this was pretty hard to come up with without actually coming up with the trick from the editorial directly.

Congratulations to everyone who was able to solve two problems in this round, and especially to ksun48 on the total domination and to hos_lyric for solving E!

Given that tomorrow's round has a 1000-pointer as the first problem, I might have to be content with the points I already have. But I will do my best ;)

Wednesday, September 6, 2023

AWTF22 taifun

One thing that greeted me on arrival to Tokyo was an unusual notification in Google Maps (see on the left). It looked somewhat scary, but then I remembered the amount of dangerous weather notifications that I get in Zurich that typically do not mean anything too unusual, and relaxed a bit. Imagine how much more relaxed I became when this note greeted me in my hotel room (see on the bottom-right).

So, maybe people with experience of Tokyo winds can share: how bad can it be?

In general, my approach to this trip is to avoid jetlag by living more or less on Swiss time. The contests start at 6:00 Zurich time, which is definitely early, but not outrageously so. Given that I'm here for just 4 days, it seems logical to not bother shifting the schedule by much, and to sleep 20:00-5:00 Swiss time (=3:00-12:00 here).

One of the benefits is that I get to do sightseeing in the evening, when it is much more pleasant outside, and there are fewer people everywhere. The flip side of this coin is that many places will be closed at night.

One particularly important challenge is getting food. Lunch happens during dinner time here, so it is not really an issue. I will be late for the hotel breakfast, but it will already be lunchtime, which means that I can get some lunch food in one of the cafes near the Shinjuku station (not sure if the lunch in the hotel restaurants will be quick enough to be in time for the contests). However, what to do about dinner? Any suggestions about where to eat in Tokyo at 1am? Bonus points if your suggestion comes in the next 2 hours ;)



AWTF22 arrival

So, an onsite event unfortunately means a couple of long flights :) Can you guess which of the two photos is the Zurich airport, and which is the Tokyo airport?

Surprisingly, there were no signs announcing the AtCoder WTF at the airport — how could the airport forget to greet the participants? ;) Having encountered the Japanese reality both in a good way (the immigration was well-organized, and apparently there's even a way to fill all forms online in advance) and in a not-so-good way (apparently the tickets for some, but not all, trains can only be bought in cash, so I had to wait for the next train because there was no time to run to an ATM), I have almost arrived at the competition hotel. 
The competition itself takes place on Friday and Saturday, with two five-hour rounds currently scheduled. I do not know if those are just placeholders, but if not, that is quite a challenge for an individual event. I have a feeling that recently I do not get any problems accepted after the 2 hour mark in contests that last longer than 2 hours, so I will have to do very well in the first 2 hours of each day :)

There is probably a way to use Codeforces API to get a more grounded assessment of that feeling. Or maybe there is an existing tool that can help answer the questions like "what would my place be in Round X if it was over after 2 hours"?
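In case someone wants to try, here is roughly what I have in mind, using the public contest.standings endpoint of the Codeforces API; the re-ranking below simply counts problems accepted before the cutoff and ignores penalty and partial scores, so it is only a rough approximation:

```python
import requests

def standings_at_cutoff(contest_id, cutoff_minutes=120, top=100):
    """Approximate 'what if the contest had ended at the cutoff' ranking."""
    resp = requests.get("https://codeforces.com/api/contest.standings",
                        params={"contestId": contest_id,
                                "from": 1, "count": top}).json()
    cutoff = cutoff_minutes * 60
    board = []
    for row in resp["result"]["rows"]:
        handle = ",".join(m["handle"] for m in row["party"]["members"])
        solved = sum(1 for pr in row["problemResults"]
                     if pr["points"] > 0 and
                     pr.get("bestSubmissionTimeSeconds", cutoff + 1) <= cutoff)
        board.append((-solved, handle))
    return [(handle, -neg) for neg, handle in sorted(board)]
```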

Thanks for reading, and check back soon for more AWTF updates. And I'm sorry for the low fraction of the actual algorithmic content this week :)

Sunday, September 3, 2023

A Yoneda week

The 35th International Olympiad in Informatics in Szeged, Hungary was the main event of this week (problems, results, top 5 on the left, day 1 broadcast, day 2 broadcast). Tingqiang Xu and Siyuan Cheng were miles ahead of everybody else (who still did great :)), and only 1 point ended up deciding the overall winner. Congratulations to them but also to all medalists! My inevitable slide in the Hall of Fame continued, seemingly by just 1 place though.

According to the problem page, three out of six problems were created by the amazing team of square1001 and E869120. Creating just one IOI-quality problem is an amazing achievement requiring great ideas and weeks (if not months) of work, so it is hard for me to even imagine creating three in the same year. Well done!

Codeforces ran Pinely Round 2 just a few hours after the first IOI day was over (problems, results, top 5 on the left, my screencast, discussion, analysis). I wonder who is the highest-placed participant who took part in both :) Among the participants well past their IOI days, tourist got his first place using just under an hour out of the three hours available. It could all have changed at the end of the round, as both of the last problems were not completely unsolvable and saw a lot of last-minute submissions, but nobody was able to get all the details exactly right. Congratulations to tourist!

Problem F did not require one to come up with complicated ideas, but instead required some careful "unwrapping of the present" to arrive at the solution, and it was nice that the solution turned out quite easy to implement: you are given an array of at most 10000 integers, each between 0 and 2^60. In one operation, you split the array in the middle into two parts, compute the bitwise xor of each part, and discard the part where the bitwise xor is smaller. In case they are equal, you may discard either part. After doing this operation several times, you have just one number remaining. Which positions in the initial array could this number correspond to?

You might have noticed that the address of this blog has changed to https://blog.mitrichev.ch/. I am going to use this address going forward, even though the old address will keep redirecting to the new one for the foreseeable future. So, welcome to the new home!

It is also a good opportunity to write about my online presence in general. I do not have accounts in the common social networks, such as Facebook, Twitter/X, LinkedIn, TikTok, Instagram, etc. Of the social network-like things I mainly have this blog, my Youtube channel, and my Codeforces account.

Thanks for reading, and check back soon! I will try to post updates about the AWTF during the week.

Sunday, August 27, 2023

An onsite week

Harbour.Space Scholarship Contest 2023-2024 on Codeforces was the main event of the week (problems, results, top 5 on the left, discussion, analysis). Radewoosh solved the first eight problems 20 minutes faster than everybody else, and therefore had all but guaranteed himself first place by the middle of the contest. Still, he kept his focus and managed to get problem I accepted as well with just 3 minutes to go. Huge congratulations!

I skipped this round and therefore cannot tell you much about its problems. Therefore, I'd like to use this post to talk about the upcoming onsite event that you might not be aware of :) I'm not referring to the IOI, which takes place next week in Szeged, Hungary, since I expect everybody interested in it to know where to find more information. Instead, I'd like to talk about the AtCoder WTF that takes place on September 8-9 in Tokyo, Japan.

When AtCoder announced its own onsite competition, the World Tour Finals, back in 2017, the world of onsite algorithmic competitions was quite packed. They managed to host one edition of the WTF before the pandemic hit. All onsite competitions were cancelled or moved online during the pandemic, but instead of returning after global travel became possible again, most were cancelled completely: Google Code Jam, TopCoder Open, likely Meta Hacker Cup. There were a few more Russia-based onsite competitions that are of course no longer a thing. ACM ICPC and high school olympiads such as the IOI are back in onsite mode, but they have upper age bounds for participation.

AtCoder did not move the WTF online, instead they kept running the yearly qualification cycle, promising to invite the top 8 from each year to the onsite round in Japan when it becomes possible again. This is finally happening in two weeks! 18 finalists out of the maximum of 32 from the four qualification cycles will compete on the same problems over two competition days. This approach lies in contrast to the ACM ICPC approach where the finals for the 2022 and 2023 seasons are happening together in Egypt but the actual competitions will be kept separate. Which one do you think makes more sense?

I am really excited both for the competition (where to be honest I no longer stand a chance), and for the onsite event atmosphere. As an added twist, AtCoder have announced the online participation option for the future WTFs; I can't predict how this will work out, but when TopCoder did the same for TCO22, so few people decided to come onsite that they decided to cancel the onsite event altogether. Therefore this might be the last onsite event without the age upper bound ever :)

Thanks for reading, and check back next week!

Sunday, August 20, 2023

An arborescence week

There was no contest that I'd like to mention this week, so let us come back to the AtCoder problem from the previous summary: you are given a connected graph where each edge is either red or blue, and each vertex is also either red or blue. You need to find a spanning tree of this graph such that each vertex has at least one adjacent tree edge of the same color as the vertex, or report that there isn't any.

The first step is to notice that all edges are of three types: type 2 are the edges where both ends have the same color as the edge, type 1 are the edges where one end has the same color as the edge, and type 0 are the edges where no end has the same color as the edge. We will also direct the type 1 edges from the vertex of the correct color to the vertex of the wrong color.

Clearly type 2 edges are the most useful for reaching our goal. In fact, if every vertex has at least one type 2 edge adjacent to it, then we can simply find the spanning forest among the type 2 edges. If it is a spanning tree, then we have solved the problem. If it is not connected, since we have already satisfied the color constraints, we can augment the forest to a tree using the remaining edges in any way.

What if there are some vertices that do not have type 2 edges adjacent to them? If there is a vertex with no outgoing type 1 edges, or in other words a vertex where all adjacent edges have the wrong color, then there is clearly no solution. So we only have to consider the case where some vertices have type 2 edges adjacent to them, and all remaining vertices have at least 1 outgoing type 1 edge. We will call the vertices with type 2 edges adjacent to them type 2 vertices, and the ones without them type 1 vertices.

The key remaining observation is that we cannot solve the problem using only type 1 edges. Each type 1 edge satisfies the color constraint for 1 vertex, but overall we have n vertices but only n-1 edges. So a logical idea is to start with the spanning forest of the type 2 edges that will cover all type 2 vertices, and then repeatedly try to find a type 1 edge that goes from a yet uncovered vertex to an already covered vertex (building something similar to an arborescence). If we manage to cover all vertices this way, then we can once again augment the resulting forest to a tree using the remaining edges. If we get stuck at some point, then there is no solution.

But why is this correct? We have proven that we need at least one type 2 edge, but why is it always correct to take ~all of them? This is a very typical situation for problems involving spanning trees, as they all usually end up relying on some kind of greedy argument, so one could just bet on intuition here, in my opinion. Still, here is the formal proof: consider the situation where our solution gets stuck. At this moment, the set S of uncovered vertices contains only type 1 vertices, and there is no type 1 edge going from this set to the set of covered vertices. But then how could S be covered in any other solution? Since we have at most |S|-1 tree edges within S, and only edges within S can cover its vertices, one of the vertices of S will have to remain uncovered in any tree.
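Putting the whole construction together, here is a sketch (my own naming throughout; colors are 0/1, edges are (u, v, color) triples, the input graph is assumed connected, and the scan in step 2 is kept simple rather than efficient):

```python
class DSU:
    def __init__(self, n):
        self.p = list(range(n))
    def find(self, x):
        while self.p[x] != x:
            self.p[x] = self.p[self.p[x]]   # path halving
            x = self.p[x]
        return x
    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        self.p[ra] = rb
        return True

def colored_spanning_tree(n, edges, vcolor):
    """Returns a list of tree edges, or None if no valid tree exists."""
    dsu, tree = DSU(n), []
    covered = [False] * n
    # Step 1: spanning forest of type 2 edges (both endpoints match the edge).
    for u, v, c in edges:
        if vcolor[u] == c and vcolor[v] == c:
            covered[u] = covered[v] = True
            if dsu.union(u, v):
                tree.append((u, v))
    # Step 2: repeatedly attach an uncovered vertex via a type 1 edge
    # directed from the endpoint of the matching color into the covered part.
    type1 = [(u, v, c) for u, v, c in edges
             if (vcolor[u] == c) != (vcolor[v] == c)]
    progress = True
    while progress:
        progress = False
        for u, v, c in type1:
            frm, to = (u, v) if vcolor[u] == c else (v, u)
            if not covered[frm] and covered[to]:
                covered[frm] = True
                dsu.union(frm, to)
                tree.append((frm, to))
                progress = True
    if not all(covered):
        return None        # some vertex can never get a same-color edge
    # Step 3: augment the forest to a spanning tree with arbitrary edges.
    for u, v, c in edges:
        if dsu.union(u, v):
            tree.append((u, v))
    return tree
```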

Thanks for reading, and check back next week!

Sunday, August 13, 2023

An apiad week

AtCoder Grand Contest 064 was the main event of the week (problems, results, top 5 on the left, analysis, my screencast, race standings). apiad's usual strategy worked spectacularly well this time, as after spending two hours on E he was able to solve just enough easier problems to win the round, while those who started with the four easier problems did not have two hours remaining for E and could not figure out all the details. Congratulations on the victory!

I was one of those starting with the four easier problems. Somewhat unexpectedly, I did not get stuck in them and did actually have a bit more than an hour remaining for E, and made some significant progress on paper: I realized that if the sum of all ai is zero, and the sum of all bj is zero, and we can place the n^2 negated sums -(ai+bj) in the matrix in such a way that every column and row has exactly one of each ai and bj, then the sum of the 2n-1 cells in a cross, which is the sum of the row plus the sum of the column minus the middle cell, will be exactly ai+bj. I've then realized that if the sum of all ai plus the sum of all bj is divisible by 2n-1, then we can get both of those sums equal to zero with some constant shifts. Finally, I've correctly hypothesized that in case it's not divisible by 2n-1, we can only achieve score n^2-1, and figured out how to do so with some more constant shifts for the first row and the first column.

Therefore the only remaining step was to place the n^2 negated sums -(ai+bj) in the matrix in such a way that every column and row has exactly one of each ai and bj. This does not depend on the values of a and b, so this simply needs to be done once for every n. It seemed quite doable as it felt like a more advanced derangement, and derangements are quite dense. I've tried some random placements and noticed that I can find such a placement for odd n, but not for even n. And this is pretty much how far I've got in an hour :)
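For what it's worth, the placement I was searching for randomly is exactly a pair of orthogonal Latin squares, and for odd n there is a classic explicit construction (a known textbook fact rather than anything from this problem's editorial; such pairs also exist for every even n except 2 and 6, but those need considerably cleverer constructions, which matches my random search failing there):

```python
def placement(n):
    """Cell (r, c) gets the index pair (i, j): every row and every column
    then contains each a-index and each b-index exactly once, and every
    pair (i, j) appears exactly once in the whole matrix.
    The (r + c, r + 2c) construction needs gcd(2, n) = 1, i.e. odd n."""
    assert n % 2 == 1
    return [[((r + c) % n, (r + 2 * c) % n) for c in range(n)]
            for r in range(n)]

# quick orthogonality check for a small odd n
cells = [p for row in placement(5) for p in row]
assert len(set(cells)) == 25   # all 25 index pairs are distinct
```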

Judging from the editorial (which I do not understand), it seems that for even n we need to use more degrees of freedom, so while I enjoyed coming up with my approach, I would have needed another huge step to finish the solution, and therefore at least another hour.

I found problem B to be a cute test of how well one understands the spanning tree mechanics (it is amazing how many nice problems can be derived from the relatively simple concept of a spanning tree, and the fact that it can be found greedily!): you are given a connected graph where each edge is either red or blue, and each vertex is also either red or blue. You need to find a spanning tree of this graph such that each vertex has at least one adjacent tree edge of the same color as the vertex, or report that there isn't any. Can you see a way to do it?

Thanks for reading, and check back next week!