Saturday, September 23, 2023

A MEX week

Codeforces CodeTON Round 6 was the main event of the last two weeks (problems, results, top 5 on the left, analysis, discussion). orzdevinwang solved 8 problems and earned a well-deserved first place, while everybody else got at most 7 and I barely got 6. Congratulations!

While I could more or less keep up with the leaders on the first 4 easier problems, I was not able to solve E at all and spent a lot of time implementing, speeding up and debugging F, even though the algorithmic solution was clear to me reasonably quickly. On the other hand, I could solve G in 22 minutes, which seems to be the fastest among the top scorers, but it was already too late to catch up :) I guess that's one more lesson to read all problems, at least when one is stuck trying to solve the next problem by difficulty.

Here is problem F that has caused me so much implementation pain: we write down all integers between 1 and n as strings in base k (assuming we have characters for all digits between 0 and k-1). Now we sort those strings lexicographically; for example, the first few would typically be 1, 10 (=k), 100 (=k^2), ... How many numbers are in the same position in this sorted order as in the order where we just sort by number? n and k are up to 10^18, and you have to solve 1000 testcases in 3 seconds.
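Before anything clever, a tiny brute force pins the statement down (my own sketch, unrelated to the intended solution; with n up to 10^18 the real solution obviously cannot enumerate the numbers, but this is handy for stress tests):

```python
def count_fixed_points(n, k):
    """Count integers in 1..n whose position in the lexicographic order of
    base-k representations matches their position in the numeric order."""
    def to_base_k(x):
        digits = []
        while x:
            digits.append(x % k)
            x //= k
        return digits[::-1]  # most significant digit first

    # Python compares lists of digits lexicographically, which matches
    # comparing the base-k strings (each digit is a single "character").
    by_string = sorted(range(1, n + 1), key=to_base_k)
    return sum(1 for pos, x in enumerate(by_string) if x == pos + 1)
```

For example, with n=20 and k=10 the lexicographic order starts 1, 10, 11, ..., 19, 2, 20, 3, ..., so only the number 1 keeps its position.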

In my previous summary, I highlighted one of the AWTF problems: n people are standing in a row, each with either a red or a blue hat on. Each person can see the hat colors of the people in front of them (with smaller position numbers), but not their own hat color or the hat colors of the people behind them. Each person also knows the total number of red and blue hats. Then, the following process happens in rounds: in one round, every person who can already deduce their hat color declares that they have done so (they do not declare the color itself). If multiple people can deduce it in the same round, they declare it simultaneously without hearing each other first. In the next round, on the other hand, people can already use the information of who has declared in the previous round to potentially make additional deductions themselves. Which people will eventually declare, and which will still be in the dark about their hat color even after a large number of rounds?

The key step in such problems about logic is to figure out the correct formalization. What exactly does it mean to be able to deduce one's hat color using the information of who has declared in previous rounds? In other words, we can start by finding a solution that runs with any time complexity, but that is correct. When I was solving this problem, I only thought of and implemented such a solution for a stress test after my approach did not pass the samples, which in hindsight was way too late.

Here is the formalization: there are C(n,r) possible sequences of hat colors, where r is the globally known number of red hats. After the i-th round, from the point of view of the first person, who does not see any hats, some of those sequences are still possible, and some are not (in other words: if the hats did in fact correspond to this sequence, would everybody have said what they have said?). This set of possible sequences S_i also clearly defines what all other people are thinking: for each person, the set of sequences that is possible from their point of view is equal to the intersection of S_i with the set of sequences that have the correct colors for the hats that they see. This sounds trivial when spelled out, but it was actually not that easy for me to state during the round.

Now, what will each person do during the i-th round? They will look at the set of sequences that are still possible from their point of view (given by the intersection mentioned above), and check if their hat color is the same in all of them. If yes, they will declare, otherwise they won't.

How to compute S_i given S_{i-1} and the declarations? We need to check the declarations that would have happened for each sequence in S_{i-1} (assuming each person sees the corresponding prefix of that sequence), and remove those sequences where this set of declarations does not match the one for the real sequence of hats. Once again, this looks very simple, almost trivial, but it was actually far from easy to state concisely.

This is how far I got during the round: I implemented a slow solution based on the above, and was trying to find some fast heuristic solution that would match it on random inputs. It turns out that this did not lead to a correct solution. Instead, one simply had to speed up the slow solution!
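For concreteness, here is what such a slow but correct solution could look like (my own reconstruction of the formalization above; it enumerates all C(n,r) sequences, so it is only usable for stress tests on small n):

```python
from itertools import combinations

def simulate(n, r, actual):
    """Brute-force hat puzzle: returns, for each person, the round in which
    they declare (None if they never do). Person p sees hats at positions < p.
    True = red hat; `actual` must contain exactly r red hats."""
    # All sequences with exactly r red hats are initially possible.
    S = {tuple(i in reds for i in range(n)) for reds in combinations(range(n), r)}
    assert tuple(actual) in S

    declared = [None] * n
    round_no = 0
    while True:
        round_no += 1

        # Who would declare this round if the real sequence were s?
        # Person p declares iff all sequences matching their view agree on t[p].
        def declarers(s):
            return frozenset(
                p for p in range(n)
                if len({t[p] for t in S if t[:p] == s[:p]}) == 1
            )

        real = declarers(tuple(actual))
        for p in real:
            if declared[p] is None:
                declared[p] = round_no
        # Keep only the sequences consistent with the observed declarations.
        new_S = {s for s in S if declarers(s) == real}
        if new_S == S:
            return declared
        S = new_S
```

For instance, with n=2 and r=1 the second person declares in round 1 (they see the other hat), while the first person never learns their color.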

One had to notice that after each step, the set S_i can be described by the list of possible counts of red hats in each prefix of the sequence. For example, suppose there are 4 red and 3 blue hats in total. The initial set S_0 can then be described as: the empty prefix has 0 red hats; the prefix of length 1 has 0 or 1 red hats; 2 has 0, 1, 2; 3 has 0, 1, 2, 3; 4 has 1, 2, 3, 4; 5 has 2, 3, 4; 6 has 3, 4; and the whole sequence has 4. Every sequence that has 4 red and 3 blue hats satisfies those constraints, and every sequence that satisfies those constraints has 4 red and 3 blue hats.

Then suppose during the first round only the last two people declared that they know the color of their hats. It turns out that the resulting set S_1 can then be described as: the empty prefix has 0 red; 1 has 0, 1; 2 has 0, 1, 2; 3 has 1, 2, 3; 4 has 2, 3; 5 has 2, 4; 6 has 3, 4; 7 has 4.

More generally, if we describe the prefix of length i having j red hats as the state (i,j), then the (i+1)-th person will declare if exactly one of (i+1,j) and (i+1,j+1) is still possible, and will not declare if both are possible. This criterion allows us to compute the declarations that actually happen, and the same criterion then allows us to eliminate states which contradict those declarations. However, we also need to eliminate states that do not have a possible predecessor (if both (i-1,j-1) and (i-1,j) are eliminated, then (i,j) should be eliminated as well), or a possible successor (if both (i+1,j) and (i+1,j+1) are eliminated, then (i,j) should be eliminated as well). This criterion is also the basis of the inductive proof that S_i can always be represented in the above form.

Therefore we can maintain the set of states (i,j) that are still possible after each round, and stop when this set of states stops changing.
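Here is a sketch of that state-based solution in Python (my own reconstruction of the rules above, so the details may differ from the intended implementation; stress-testing against a brute force over all sequences is a good sanity check). `hats` uses True for red, and the returned list says who eventually declares:

```python
def fast_hats(hats):
    """Maintain possible (prefix length, red count) states and eliminate
    states contradicting the observed declarations until a fixed point."""
    n, r = len(hats), sum(hats)
    pref = [0] * (n + 1)
    for i, h in enumerate(hats):
        pref[i + 1] = pref[i] + h
    # possible[i] = red-hat counts j still possible for the prefix of length i
    possible = [set(range(max(0, i - (n - r)), min(i, r) + 1))
                for i in range(n + 1)]
    declared = [False] * n
    while True:
        changed = False

        # Person i+1 declares iff exactly one of (i+1,j), (i+1,j+1) is possible.
        def declares(i, j):
            return (j in possible[i + 1]) != (j + 1 in possible[i + 1])

        real = [declares(i, pref[i]) for i in range(n)]
        for i in range(n):
            if real[i] and not declared[i]:
                declared[i] = changed = True
        # Eliminate states that contradict the observed declarations.
        for i in range(n):
            bad = {j for j in possible[i] if declares(i, j) != real[i]}
            if bad:
                possible[i] -= bad
                changed = True
        # Propagate: drop states without a possible predecessor or successor.
        prop = True
        while prop:
            prop = False
            for i in range(n + 1):
                bad = {j for j in possible[i]
                       if (i > 0 and j not in possible[i - 1]
                           and j - 1 not in possible[i - 1])
                       or (i < n and j not in possible[i + 1]
                           and j + 1 not in possible[i + 1])}
                if bad:
                    possible[i] -= bad
                    prop = changed = True
        if not changed:
            return declared
```

On the 4-red/3-blue example above, with the actual prefix of length 5 containing 2 red hats, the first round eliminates exactly the states described in the text (e.g. (5,3) and (3,0)), and the last two people declare.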

Thanks for reading, and check back next week!

Sunday, September 10, 2023

A dual week

I have previously posted all the links to the AWTF onsite contests (day 1, day 2; thanks to jonathanirvings for the screenshot!), so let us talk about the problems now. I have only solved a few of them; for the remaining ones I have either got the solution from other competitors or read the editorial.

On the first day, the easiest problem A "Save the monsters", similar to many game problems, required one to consider possible good strategies for both players until you find two that achieve the same score, thus proving that it is the answer. I had a bug in my initial submission and therefore assumed that I had a logical issue and the approach did not work, but it was just a coding mistake.

The key to solving problem B "Non-Overlapping Swaps" was the idea that if we swap the elements in positions 1 and 2, then this swap most likely does not contradict the swaps before and after it, so we can use it to divide the entire process into two recursive parts with this swap in the middle. There were still details to figure out, as "most likely" does not mean "for sure". Unfortunately, this idea did not occur to me. I did examine swapping the first two positions as the first move and then doing two independent processes for the two resulting cycles, but it was not clear how to transition from one to the other, and in fact I found a small counterexample where making such a first move makes it impossible to finish in time. I did examine other promising ideas; for example, I noticed that if we have overlapping swaps with a common end, we can always replace them with non-overlapping ones with the same effect: swapping positions (a,c) then (b,c) with a&lt;b&lt;c is equivalent to first swapping (b,c) and then (a,b). Therefore I was trying to construct some kind of iterative process that gradually removes overlaps.

I spent most of the time on problem C "Shrink the Tree". While I could make progress by finding a canonical representation for a tree, I did not have the key idea from the editorial that we should then switch to looking at the problem from the other side: thinking about what the obvious criteria for being able to collapse a given forest are, and when they are not enough, potentially using a stress test to come up with new criteria. The approach of moving from two sides is similar to problem A, and will actually appear many more times in this problemset. One could also say that thinking from the other side makes an implicit assumption that the problem is beautiful: that the real criteria will be easy to state. Otherwise, trying to conjure them out of thin air would be much less promising than the "minimal representation for a forest" approach.

Solving problem D "Welcome to Tokyo!" required doing three things in sequence: applying the trick from Aliens, then looking at the dual problem for a linear program, and finally noticing that solving many instances of that problem with a varying parameter can be done efficiently. My issue was that after applying the trick from Aliens, which I of course considered, we seem to have made the problem more difficult than it was originally: because of the different party cost we would have to redo things from scratch every time, and therefore be at least quadratic. Therefore I discarded this direction as a dead end.

Finally (for day 1), solving problem E "Sort A[i]-i" required noticing some tricky combinatorial identities. I have not spent much time on this problem because I expected the solution to involve heavy manipulations with generating functions, which I am not very good at. It turns out the generating functions were actually only needed to prove the identities, and therefore could probably be avoided completely. To be honest, I am not sure how feasible it was to find those identities in another way, maybe hos_lyric can share the solving process?

I'd like to give the readers a week to think about the second day's problem A "Hat Puzzle", so I will just share the statement for now: n people are standing in a row, each with either a red or a blue hat on. Each person can see the hat colors of the people in front of them (with smaller position numbers), but not their own hat color or the hat colors of the people behind them. Each person also knows the total number of red and blue hats. Then, the following process happens in rounds: in one round, every person who can already deduce their hat color declares that they have done so (they do not declare the color itself). If multiple people can deduce it in the same round, they declare it simultaneously without hearing each other first. In the next round, on the other hand, people can already use the information of who has declared in the previous round to potentially make additional deductions themselves. Which people will eventually declare, and which will still be in the dark about their hat color even after a large number of rounds?

Solving problem B "The Greatest Two" was once again reliant on the "assume the problem is beautiful", or "come up with a bound and then assume/prove it is tight", approach: the key trick was to say that for every number we have a range of possible positions, and that those ranges are more or less independent. Simplifying the problem like this makes the solution a straightforward, if a bit tricky, implementation (it took me ~1.5 hours to implement after having this idea, I think), but it was not at all obvious to me that this framework was correct, even though it can be proven by induction after the fact, so I just took a leap of faith.

Similarly, the solution for problem C "Jewel Pairs" seemed to involve two steps where one comes up with a reasonably simple bound and assumes/proves it: first the description of the matching criteria (from the words "Focusing on the form of the mincut" in the editorial; those words also mean that it might be possible to deduce this form analytically instead of going "from the other side", but I did not try doing so myself as I was focusing on other problems), and then the criteria for dealing with the colors where 2f(c)&lt;=A-B.

The key to solving problem D "Cat Jumps" was finding a way to decompose the overall construction into building blocks that can be combined arbitrarily, so that we can use dynamic programming (or just multiplication) to compute the answer. This is a common theme in many counting problems, but even after reading the editorial I had no idea how to actually come up with the decomposition in this problem. It does not look like we can go from the other side in this case ("What could we reasonably compute with DP for the given constraints?"), and the decomposition itself is too complex to just appear out of nowhere. I did not think much about this problem during the round.

Finally, problem E "Adjacent Xor Game" was once again solved from the other side: one had to hypothesize or prove that all that matters in the end is how many times each bit is flipped (where going from y1 to y2 such that y1^y2=x we count all times a bit is flipped as we count y1, y1+1, ..., y2, not just whether y1 and y2 have a difference in a given bit), and we can just get a bound on the answer independently for each bit, and then take a maximum of those bounds. I have spent time building a more direct solution instead (given the lower bound on y1, how do we find the smallest y2 for a given x?), figured out a rule for that with 4 cases, but this road did not seem to lead anywhere. Maybe had I considered replacing going from y1 to y2 in one go with processing y1, y1+1, ..., y2 step-by-step, I would have come up with the criteria, but this idea never occurred to me.

Overall, the problems were beautiful and interesting to solve, even if I was feeling quite stuck for long periods of time :) The biggest common theme seems to be that in 5 out of 10 problems (1A, 1C, 2B, 2C, 2E, and maybe also 1D, as linear programming duality is the same idea), one had to stop thinking "how can we optimize/improve what we already can" and go from the other side, "what would be the reasonable bounds/criteria when we clearly can't", and then either prove those are tight, or just assume it. This could be seen as basing the solution on the fact that the problem is beautiful, or at the very least on the fact that it is solvable for the given constraints. So one takeaway is that I should try this solving approach more often.

I'm sorry for a long brain dump, feel free to tell me that what I'm writing makes no sense at all :)

The 2nd Universal Cup Stage 2: SPb also took place over the weekend (problems, results, top 5 on the left). This season follows the highly successful debut season of the Universal Cup, which has more or less taken the Open Cup's place as the main team competition outside of the ICPC system. As I understand it, Stage 1 of the 2nd Universal Cup was declared unrated because of server problems, so this was the first actual stage.

On the surface, it might have seemed that making last season's winner USA1 even stronger would be impossible, but they have found a way, demolishing the field in just over three hours with Gennady on the team. Well done!

It would be nice to gather Petr Team again to participate, but given that the number of stages in a year is ~2x that of the Open Cup, with a stage happening almost every weekend, the time commitment required to feel a real part of the proceedings would be way too big. We should try to do a one-off round some time, though :)

Codeforces Round 896 wrapped up the week (problems, results, top 5 on the left, analysis, discussion). Quite fittingly, the first two places in the round went to the problemsetter (well, well) and the winner of AWTF. Congratulations to both!

In the last week's summary, I have mentioned a Codeforces problem: you are given an array of at most 10000 integers, each between 0 and 2^60. In one operation, you split the array in the middle into two parts, compute the bitwise xor of each part, and discard the part where the bitwise xor is smaller. In case they are equal, you may discard either part. After doing this operation several times, you have just one number remaining. Which positions in the initial array could this number correspond to?

The first observation, hinted at by the fact that we take bitwise xors of the two parts, is that the bitwise xor of the bitwise xors of the two parts is equal to the bitwise xor of the entire array. Therefore if the bitwise xor of the first part is x, and the bitwise xor of the entire array is s, then the bitwise xor of the other part can simply be found as s^x. When comparing x and s^x, they differ exactly in those bits where s has a 1; therefore we need to look at the highest 1 bit of s, and we will always keep the part whose xor has a 1 in that position. This condition is both necessary and sufficient: any part which has a 1 bit in that position will be chosen.

The fact that only the highest 1 bit matters allows us to speed up the straightforward dynamic programming from O(n^3) to O(n^2), because we can handle multiple transitions at the same time. The dynamic programming will compute for each of the O(n^2) subsegments whether we can reach it. For O(n^3) complexity, for every reachable state we can simply iterate over all ways to move the left boundary while keeping the right boundary constant, or vice versa, and check if a move is possible (by comparing x and s^x).
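The O(n^3) version can be written down directly from the statement (a sketch of mine for small n; it compares the actual xors of the two parts rather than using the highest-bit observation):

```python
def reachable_positions(a):
    """O(n^3) brute force: which single positions can remain at the end."""
    n = len(a)
    pre = [0] * (n + 1)
    for i, x in enumerate(a):
        pre[i + 1] = pre[i] ^ x
    xor = lambda l, r: pre[r] ^ pre[l]  # xor of a[l:r]

    # reach[l][r]: can the process arrive at the segment a[l:r]?
    reach = [[False] * (n + 1) for _ in range(n + 1)]
    reach[0][n] = True
    for length in range(n, 1, -1):          # longer segments first
        for l in range(0, n - length + 1):
            r = l + length
            if not reach[l][r]:
                continue
            s = xor(l, r)
            for m in range(l + 1, r):       # split point
                x = xor(l, m)               # xor of the left part
                y = s ^ x                   # xor of the right part
                if x >= y:
                    reach[l][m] = True      # keep left (ties allow either)
                if y >= x:
                    reach[m][r] = True      # keep right
    return [i for i in range(n) if reach[i][i + 1]]
```

For example, on [1, 2] only the last position survives (the right part has the larger xor at the only split), while on [1, 1] the total xor is 0, so either position can remain.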

However, we can notice that doing the transitions that try moving from (l,r) to (l,r-2), (l,r-3), ... and the transitions that try moving from (l,r-1) to (l,r-2), (l,r-3), ... have many things in common. In fact, if (l,r) and (l,r-1) have the same highest 1 bit in their bitwise xor, then those transitions are either possible or not possible at the same time.

This hints at what we should be computing in the O(n^2) dynamic programming: for every segment (l,r) we will compute the set of possible highest 1 bits (represented as a bitmask) over all reachable containing segments with the same left boundary (l,r1) where r1&gt;r, and the same for the containing segments with the same right boundary. To compute this bitmask for (l,r) we can take the bitmask for (l,r+1) and update it with the highest 1 bit of the bitwise xor of the segment (l,r+1), if that segment is reachable. Knowing this bitmask allows us to check if a segment is reachable by simply testing if the bitmask has a non-zero bitwise and with the bitwise xor of the segment.

Another way of putting the same idea is to say that we are introducing intermediate states into our dynamic programming to aid reuse: instead of going from a reachable segment to its prefix or suffix directly, we now first go from a reachable segment to a (segment, highest 1 bit, decision whether we will change the left or the right boundary) tuple, which allows us to add transitions between those tuples, and transitions from those tuples back to segments, in such a way that we have O(1) transitions per tuple instead of O(n) transitions per segment. This would mean going from O(n^2) states and O(n^3) transitions to O(n^2 * #bits) states and O(n^2 * #bits) transitions, but then using bitmasks helps get rid of the #bits factor.
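Here is a sketch of the resulting O(n^2) solution as I understand it (my reconstruction; a spare high bit serves as a marker for containing segments whose total xor is 0, from which either part may be kept, and for the real n=10000 one would reduce the O(n^2) memory with rolling arrays):

```python
def reachable_positions_fast(a):
    """O(n^2) version using highest-1-bit masks of containing segments."""
    n = len(a)
    FLAG = 1 << 61  # marker for containing segments with total xor 0
    pre = [0] * (n + 1)
    for i, x in enumerate(a):
        pre[i + 1] = pre[i] ^ x
    xor = lambda l, r: pre[r] ^ pre[l]

    def hb(s):
        # highest 1 bit of s; FLAG for s == 0 (any split keeps either part)
        return FLAG if s == 0 else 1 << (s.bit_length() - 1)

    # maskL[l][r]: highest bits of reachable containing segments (l, r1), r1 > r
    # maskR[l][r]: same for containing segments (l1, r), l1 < l
    maskL = [[0] * (n + 1) for _ in range(n + 1)]
    maskR = [[0] * (n + 1) for _ in range(n + 1)]
    reach = [[False] * (n + 1) for _ in range(n + 1)]
    for length in range(n, 0, -1):          # longer segments first
        for l in range(0, n - length + 1):
            r = l + length
            if r < n:
                maskL[l][r] = maskL[l][r + 1] | \
                    (hb(xor(l, r + 1)) if reach[l][r + 1] else 0)
            if l > 0:
                maskR[l][r] = maskR[l - 1][r] | \
                    (hb(xor(l - 1, r)) if reach[l - 1][r] else 0)
            # reachable iff some containing segment lets us keep exactly this part
            reach[l][r] = (l == 0 and r == n) or \
                bool((maskL[l][r] | maskR[l][r]) & (xor(l, r) | FLAG))
    return [i for i in range(n) if reach[i][i + 1]]
```

On small arrays this matches the O(n^3) brute force, e.g. [3, 1, 2] (total xor 0) leaves positions 0 and 2 reachable but not position 1.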

Thanks for reading, and check back next week!

Saturday, September 9, 2023

AWTF22 day 2

AtCoder WTF22 onsite round wrapped up today with the second contest (problems, combined results, top 5 on the left, mirror results, analysis, day 1 livestream, day 2 livestream). Once again I managed to solve only one problem in 5 hours, but at least this time it was worth more points, and it was more satisfying as I actually gradually approached the correct solution instead of just pulling it out of thin air. Overall, I remained in the 15th place that I had after the first round, and this is also my place when ordered by rating. I guess everything went as expected after all :) Qualifying next time would require crawling back into the top 8, though, and that will be a big challenge (current standings).

The overall top 3 solved non-intersecting sets of problems today, and it worked out the best for jiangly who combined the first place on day 2 with the second place on day 1 to win the event. Huge congratulations to him, ksun48 and tourist!

In a search for external factors that prevented me from doing better, I noticed that advancing to the finals from the 2021 season was the most correlated with the final results: the 2021 finalists got places 1, 2, 3, 4, 6, 7, 8 and 14. Maybe the problems for the finals were mostly prepared at the same time as the problems for the AGCs from 2021, and therefore require similar solving approaches? Unfortunately, I advanced in all other years, but not in 2021, hence the poor performance ;) Well done peti1234 for bucking the trend and taking 5th place!

Huge thanks to the AtCoder organizing team for the amazing event, and to the problemsetters and testers for the problems! I have also enjoyed the social part immensely, meeting some old friends and getting to know new ones. I hope to meet you all again at some other competition in the future!

Thanks for reading, and check back tomorrow for this week's summary where I will also choose some AWTF problems to highlight.

Friday, September 8, 2023

AWTF22 day 1

AtCoder WTF22 onsite day 1 just finished (problems, results, top 5 on the left, mirror results, analysis). I was actually able to get a problem accepted after two hours had passed, with the caveat that this was the only problem I got accepted :) I probably spent about an hour on B and about three hours on C, with no success. In C, I passed all samples except the last one. I did have the correct "minimal representation" for a tree at the end, but then I tried to merge those representations to obtain a minimal representation for a forest in the same format, and this turned out to be impossible: when merging trees B-&gt;W&lt;-B (W is root) and W-&gt;B (B is root), the minimal representation is just B, but if we later merge it with a tree W-&gt;B&lt;-W (B is root), we get a non-empty representation, while if we choose B-&gt;W&lt;-B for the result of the first merge, then we can cancel everything out on the second merge. This counterexample stems from the correct solution idea, which tells us that one of the conditions for being able to cancel everything is having at least two roots of different colors; hence any approach that merges the representations of trees has to be able to tell if we have seen two roots of different colors. I feel that this was pretty hard to come up with without actually coming up with the trick from the editorial directly.

Congratulations to everyone who was able to solve two problems in this round, and especially to ksun48 on the total domination and to hos_lyric for solving E!

Given that tomorrow's round has a 1000-pointer as the first problem, I might have to be content with the points I already have. But I will do my best ;)

Wednesday, September 6, 2023

AWTF22 taifun

One thing that greeted me on arrival to Tokyo was an unusual notification in Google Maps (see on the left). It looked somewhat scary, but then I remembered the number of dangerous weather notifications that I get in Zurich that typically do not mean anything too unusual, and relaxed a bit. Imagine how much more relaxed I became when this note greeted me in my hotel room (see on the bottom-right).

So, maybe people with experience of Tokyo winds can share: how bad can it be?

In general, my approach to this trip is to avoid jetlag by living more or less on Swiss time. The contests start at 6:00 Zurich time, which is definitely early, but not outrageously so. Given that I'm here for just 4 days, it seems logical to not bother shifting the schedule by much, and to sleep 20:00-5:00 Swiss time (=3:00-12:00 here).

One of the benefits is that I get to do sightseeing in the evening, when it is much more pleasant outside, and there are fewer people everywhere. The flip side of this coin is that many places will be closed at night.

One particularly important challenge is getting food. Lunch happens during dinner time here, so it is not really an issue. I will be late for the hotel breakfast, but it will already be lunchtime, which means that I can get some lunch food in one of the cafes near the Shinjuku station (not sure if the lunch in the hotel restaurants will be quick enough to be in time for the contests). However, what to do about dinner? Any suggestions about where to eat in Tokyo at 1am? Bonus points if your suggestion comes in the next 2 hours ;)

AWTF22 arrival

So, an onsite event unfortunately means a couple of long flights :) Can you guess which of the two photos is the Zurich airport, and which is the Tokyo airport?

Surprisingly, there were no signs announcing the AtCoder WTF at the airport — how could the airport forget to greet the participants? ;) Having encountered the Japanese reality both in a good way (the immigration was well-organized, and apparently there's even a way to fill all forms online in advance) and in a not-so-good way (apparently the tickets for some, but not all, trains can only be bought in cash, so I had to wait for the next train because there was no time to run to an ATM), I have almost arrived at the competition hotel. 
The competition itself takes place on Friday and Saturday, with two five-hour rounds currently scheduled. I do not know if those are just placeholders, but if not, that is quite a challenge for an individual event. I have a feeling that recently I do not get any problems accepted after the 2 hour mark in contests that last longer than 2 hours, so I will have to do very well in the first 2 hours of each day :)

There is probably a way to use the Codeforces API to get a more grounded assessment of that feeling. Or maybe there is an existing tool that can help answer questions like "what would my place be in Round X if it was over after 2 hours"?

Thanks for reading, and check back soon for more AWTF updates. And I'm sorry for the low fraction of the actual algorithmic content this week :)

Sunday, September 3, 2023

A Yoneda week

The 35th International Olympiad in Informatics in Szeged, Hungary was the main event of this week (problems, results, top 5 on the left, day 1 broadcast, day 2 broadcast). Tingqiang Xu and Siyuan Cheng were miles ahead of everybody else (who still did great :)), and only 1 point ended up deciding the overall winner. Congratulations to them, and also to all medalists! My inevitable slide in the Hall of Fame continued, seemingly by just 1 place though.

According to the problem page, three out of six problems were created by the amazing team of square1001 and E869120. Creating just one IOI-quality problem is an amazing achievement requiring great ideas and weeks (if not months) of work, so it is hard for me to even imagine creating three in the same year. Well done!

Codeforces ran Pinely Round 2 just a few hours after the first IOI day was over (problems, results, top 5 on the left, my screencast, discussion, analysis). I wonder who is the highest-placed participant who took part in both :) Among the participants well past their IOI days, tourist got his first place using just under an hour out of the three hours available. It could all have changed at the end of the round, as both of the last problems were not completely unsolvable and saw a lot of last-minute submissions, but nobody was able to get all the details exactly right. Congratulations to tourist!

Problem F did not require one to come up with complicated ideas, but instead required some accuracy "unwrapping the present" to arrive at the solution, and it was nice that the solution turned out quite easy to implement: you are given an array of at most 10000 integers, each between 0 and 2^60. In one operation, you split the array in the middle into two parts, compute the bitwise xor of each part, and discard the part where the bitwise xor is smaller. In case they are equal, you may discard either part. After doing this operation several times, you have just one number remaining. Which positions in the initial array could this number correspond to?

You might have noticed that the address of this blog has changed to I am going to use this address going forward, even though the old address will keep redirecting to the new one for the foreseeable future. So, welcome to the new home!

It is also a good opportunity to write about my online presence in general. I do not have accounts in the common social networks, such as Facebook, Twitter/X, LinkedIn, TikTok, Instagram, etc. Of the social network-like things I mainly have this blog, my Youtube channel, and my Codeforces account.

Thanks for reading, and check back soon! I will try to post updates about the AWTF during the week.