Algorithms Weekly by Petr Mitrichev: October 2019

Monday, October 28, 2019

A congratulations week

TopCoder SRM 762 was the first contest of Jul 1 - Jul 7 week (problems, results, top 5 on the left). There were only three correct submissions for the hard problem, and tourist has spent just 35 minutes on it compared to Egor's and uwi's 50, therefore his first place was clear. Well done to all three!

Codeforces Round 572 followed on Friday (problems, results, top 5 on the left, analysis). Um_nik has completely dominated the field, completing all problems with half an hour remaining. Congratulations on the great performance!

Finally, TopCoder Open 2019 Round 3A wrapped up the week on Saturday (problems, results, top 5 on the left, parallel round results, my screencast). yutaka1999's times on all problems were quite good but not the fastest after the coding phase, but many other high-scoring competitors have failed either the challenge or the system test phase, and Yuta emerged in clear first place. Well done!

Thanks for reading, and check back for more :)

Sunday, October 27, 2019

A fusion week

The Jun 24 - Jun 30 week was quite calm, therefore we can divert our full attention to the problem from the previous summary: you are given a graph with n<=15 vertices and m<=200 edges. For each k between n-1 and m, compute the number of spanning sets of edges of size k (ones that connect all n vertices together). The numbers need to be computed modulo 10⁹+7.

If we didn't care about the number of edges, and just wanted to count all spanning sets, the following standard approach would be used: let's count all sets (there are 2^m of them) and subtract the sets which are not spanning. Each set that is not spanning has a uniquely determined connected component containing the vertex 0. So for each possible non-full subset of vertices containing the vertex 0 we need to subtract the number of subsets of edges with both ends within that component that are spanning within that components times two to the power of number of edges with both ends outside that component. Thus we can determine the number of spanning subsets for each subset of vertices containing the vertex 0 using dynamic programming with a running time of O(3ⁿ) if we precompute the number of edges within that subset for each subset of vertices, and the powers of two.

Adding a parameter for the number of edges makes this dynamic programming run in O(3ⁿ*m²): for each of O(3ⁿ) (set, subset) pairs we need to iterate over the number of edges in the subset, and the number of edges being added outside the subset, and our dynamic programming transition looks like

dp[set,a+b]-=dp[subset,a]*comb(outside_edges(set,subset),b)

Where comb(outside_edges(...),b) is the combination number that counts the ways to pick b edges to be added in the disconnected part. O(3ⁿ*m²) was too slow in this problem, but how to make this solution faster?

In many problems (for example in the TopCoder problem from three summaries ago), the key idea is about collapsing a subproblem as a separate dynamic programming to reduce the number of states or the processing time for each transition in the main dynamic programming. In this problem, we're going to do the opposite: the combination number is computed by its own dynamic programming, so we're going to unwrap that dynamic programming and fuse it with the main one!

Our state will now be the set of vertices, the number of edges we got so far, and the number of outside edges, for each of which we need to take a decision whether we add it to our edge counter or not.

We can then do (I'm describing a "forward" dynamic programming everywhere in this post)

sub[set,a,outside_edges(set,subset)] += dp[subset,a]

for all subsets of set and all values of a, then

sub[set,a,o-1] += sub[set,a,o]
sub[set,a+1,o-1] += sub[set,a,o]

to reconstruct the combination number by trying two possibilities on whether to include the next outside edge, and finally

dp[set,a] -= sub[set,a,0]

to subtract the disconnected graphs once we have made all decisions on the outside edges.

The first statement runs in O(3ⁿ*m), the second block of statements runs in O(2ⁿ*m²), and the last statement in O(2ⁿ*m), so the running time now is O(3ⁿ*m+2ⁿ*m²) which is fast enough.

By fusing the combination number dynamic programming into the main one we could no longer reuse its results between different transitions in the main part, so our complexity could become O(3ⁿ*m³) after doing this. But we achieved a different type of reuse: we can now do the part that adds the outside edges for all values of subset of a given set, and for all values of a for a given set, at the same time instead of doing it separately for each (subset, a) combination by iterating over b as we did before.

This dynamic programming fusion looks to be a quite general technique, but I couldn't remember many problems where it appeared, apart from one large cluster of problems: dynamic programming on "broken profile" can be viewed as an application of such fusion to dynamic programming on "normal profile" and the sub-dynamic programming between two profiles. Do you have more examples?

Thanks for reading, and check back for more!

Saturday, October 26, 2019

A Stonefeang week

There were a couple of rounds during the Jun 17 - Jun 23 week. One of them was Codeforces Round 569 (problems, results, top 5 on the left, analysis). Radewoosh and ACRush have managed to finish the problemset in the last minutes of the contest (just 43 seconds before the end in case of Radewoosh!), and thus jumped far above everybody else. Well done!

The other round was TopCoder SRM 761 (problems, results, top 5 on the left, my screencast). The hard problem was solved by just 3 contestants, and my solution employed a somewhat general trick that I nevertheless don't remember appearing in many other problems. You are given a graph with n<=15 vertices and m<=200 edges. For each k between n-1 and m, compute the number of spanning sets of edges of size k (ones that connect all n vertices together). The numbers need to be computed modulo 10⁹+7.

Thanks for reading, and check back for my solution of this problem!

A twenty billion week

TopCoder SRM 760 was the main event of the Jun 10 - Jun 16 week, and even it did not have so many participants since it took place at a Europe-unfriendly time (problems, results, top 5 on the left). fjzzq2002 had a healthy lead after the coding phase, and had only increased it through challenges. Congratulations on a convincing victory!

In my previous summary, I have mentioned a Google Code Jam problem: you are playing a game that starts with 10¹² coins arranged in a row. In one turn, each player must pick any 10¹⁰ consecutive coins and remove them. The remaining coins are not shifted, therefore the coins that were on both sides of the removed segment do not become consecutive. The player who can't make a move loses. The first player always picks a valid move uniformly at random, and you are playing for the second player. Your goal is to win at least 499 games out of 500.

The key idea is the following: a block of exactly 2*10¹⁰ consecutive coins is extremely useful when playing against a random player. If a random player makes a move inside this block, it will almost certainly be not at the boundary, and thus no further moves will be possible there, so unless we make a move at the boundary ourselves, this block behaves exactly like a block of size 2*10¹⁰-1, or a block of size 10¹⁰. However, if it's our turn and we need to do so, we can move at the boundary of this block and leave a block of size 10¹⁰ behind which still allows one more move. In other words, we can skip a move if we have a block of size exactly 2*10¹⁰available.

Skipping a move turns a losing position for you into a losing position for other player, which means we can win with certainty if we know if the current position is winning or losing :) However, we don't know that in the general case. Instead of trying to determine it, let's just do the following: if there is a block of size >=3*10¹⁰, we can make a move that creates a block of size exactly 2*10¹⁰, let's do so. Otherwise if there's still a block of size at least 2*10¹⁰+1, let's make any move inside any such block.

At some point we will end up in a situation where all remaining blocks are of size between 10¹⁰ and 2*10¹⁰. The random player makes exactly one move in any such block, so it's very easy to determine if a position is winning or losing: losing positions have an even number of such blocks. If we are in a losing position, and there's still at least one block of size 2*10¹⁰ remaining, we can use the above trick to skip a turn and give the other player a losing position.

And since we always create such blocks while there are still large blocks remaining, and the random player can destroy them but does not always do so as it's more likely to move inside a bigger remaining block, the probability that we will have at least one block of size 2*10¹⁰ remaining when we reach that "simple" state with no bigger blocks is essentially 1.

Thanks for reading, and check back for more!

A week independent of x

The Jun 3 - Jun 9 week started with TopCoder Open 2019 Round 2B, the second and last chance to qualify for round 3 (problems, results, top 5 on the left, parallel round results). The problems were a bit harder than in Round 2A, and only two participants could solve the hard one: Deretin and Kostroma. Congratulations to both, and to all advancers to Round 3!

Codeforces Round 564 followed on Friday (problems, results, top 5 on the left, analysis). The last two problems turned out to be unsolvable during the round, and CauchySheep was the fastest on the first five. Congratulations on the victory!

Finally, Google Code Jam Online Round 3 selected the top 25 that would travel to San Francisco for the onsite finals (problems, results, top 5 on the left). The hardest problem turned out to be unsolvable in this round as well, and the speed of solving the remaining problems was key. ACRushTC beat the second place by just 3 minutes, and the following gaps were even smaller. Congratulations to Tiancheng and to everybody who qualified for the onsite!

The easiest problem in this round required a bit of creativity: you are playing a game that starts with 10¹² coins arranged in a row. In one turn, each player must pick any 10¹⁰ consecutive coins and remove them. The remaining coins are not shifted, therefore the coins that were on both sides of the removed segment do not become consecutive. The player who can't make a move loses. The first player always picks a valid move uniformly at random, and you are playing for the second player. Can you win at least 499 games out of 500?

In my previous summary, I have mentioned quite a few problems. The first one was from TopCoder: 500 table tennis players are sitting in a row. Repeatedly, we do the following: pick two adjacent players uniformly at random, play a game between them, and eliminate the player who lost, until there is only one player left. Each player has a rating r_i, and the probability that the i-th player wins against the j-th player is r_i/(r_i+r_j). What is the probability that the given player wins the entire tournament?

The most straightforward dynamic programming would compute for each l<=t<=r the probability that the t-th player wins in a tournament where the players from l-th to r-th participate. This dynamic programming would have O(n³) states though, and each transition would require iterating over the last non-winning player remaining, and the splitting point between the two segments that they beat, resulting in a complexity in the ballpark of O(n⁵) which is too much.

The first observation is to notice that we have two basically independent tournaments happening to the left and to the right of the winning player, so we can compute the probabilities that the t-th player wins in the [l;t] and in [t;r] segments and multiply them. Therefore we can afford to have a quadratic number of states: the probabilities that the first and the last player wins in each segment, and the running time of the entire solution becomes O(n⁴).

In order to get a further speedup, we need to become a bit more explicit with the formulas. If we denote the probability that the l-th player wins in the segment [l;r] as p_l,r, the probability that the r-th player wins in that segment as q_l,r, and the probability that the i-th player beats the j-th player in a direct encounter as w_i,j, then our current solution can be viewed as:

p_l,r=sum_x(sum_y(1/(r-l)*p_l,x*q_x+1,y*p_y,r*w_l,y))

Where y denotes the player we beat the last, and x denotes the right boundary of the segment of players that we beat directly or indirectly before beating y. Note that the value of x does not actually depend on the outcomes of the games, just on which game is the last one: if we imagine that initially we have n players and n-1 "potential games" placed between them, then at every turn we take one potential game and remove it, so the order of games is just a permutation of n-1 elements, and the value of x is the last element of that permutation, so each possible value has the same probability of 1/(n-1), which transforms to 1/(r-l) in our formula.

There are two magical things about this result:

The fact that after we condition on a given x, the games happening in [l;x] and [x+1;r] segments are independent and random in the same sense as in the original problem, so that dynamic programming can work at all.
The fact that the probability for any x is just 1/(r-l) and crucially does not depend on x enables the most exciting magic of this problem to happen, because we can now reorder our summation as

p_l,r=sum_y(1/(r-l)*sum_x(p_l,x*q_x+1,y)*p_y,r*w_l,y))

And notice that the inner sum on x does not depend on r, and can be computed as an additional dynamic programming table u_l,y and reused between different values of r, thus reducing the overall running time to O(n³). Had the probability depended on x, we would not be able to collapse the inner summation, and the problem would be much less interesting.

The official editorial does not mention this magical property at all, so I'm wondering if there's a less magical implied explanation there :)

The second problem I mentioned was from Codeforces: you have 300000 objects, with each object having two attributes: a value between -10⁹ and 10⁹, and a nonzero mask with at most 62 bits. Given another mask s, we compute f(s) as the sum of values of objects which have an even number of shared bits between their mask and s minus the sum of values of objects which have an odd number of shared bits between their mask and s. f(0), which is just the sum of all values, is not equal to zero. Your goal is to find any s such that f(0)*f(s) is negative, in other words to flip the sign of the sum.

Without loss of generality, let's assume that f(0) is positive, so our goal is to make the sum negative.

First, we can notice that the masks simply mean that we have 62 subsets of our objects, we're allowed to flip the sign of values in any of those subsets, and each object belongs to at least one subset.

Had the subsets been disjoint, the solution would have been easy: we could just flip the sign of all subsets which have a positive sum. Then all subsets would have a zero or negative sum, and we would have at least one negative subset since the total sum is not zero.

It turns out that our problem is almost as easy: let's fix an arbitrary order in which we will traverse the subsets, and say that each object truly belongs only to the last subset that contains it, in that order. Then we can go through the subsets in that order, and flip all subsets which have a positive sum of values of objects that truly belong to it. Note that the values we are summing here could have their signs flipped by the previous subsets that those elements belonged but not truly belonged to.

Since the object is never touched after we process the subset it truly belongs to, the total sum of values in the end is equal to the sum of sums of values of truly belonging objects for each subset as we process it, and all those sums are zero or negative by construction, and at least one of them is negative, so the overall sum is negative as needed.

Finally, an AtCoder problem was discussed as well: you are given at most 1000 groups of red balls, and at most 1000 groups of blue balls. Each group contains at most 10 balls in the same location, and the coordinates of those locations are integers not exceeding 10⁹ by absolute value. The total number of red balls is equal to the total number of blue balls. Consider all possible matchings where each red ball is matched with one blue ball and vice versa. You need to find the maximum sum of Manhattan distances between the matched balls.

The naive min-cost-max-flow network would look as follows: edges from the source to each red group with capacity equal to the size of the group and cost 0, edges from each red group to each blue group with infinite capacity and cost equal to negative Manhattan distance between the points, and edges from each blue group to the sink with capacity equal to the size of the group and cost 0.

The negative Manhattan distance has the following form: -abs(x₁-x₂)-abs(y₁-y₂). Suppose we know how each absolute value expands (the signs of its argument), for example: -(x₁-x₂)+(y₁-y₂). This can be reformulated as (y₁-x₁)+(x₂-y₂). Let's insert a vertex in the middle of this edge, and split its cost into two: the edge from the red group to this vertex has cost (y₁-x₁), and the edge from this vertex to the blue group has cost (x₂-y₂).

So far we did not change much, but here comes the key idea: we have only four types of those intermediate vertices, corresponding to the four ways the absolute values can be expanded, and for all intermediate vertices of the same type the costs of the edges going into them from a fixed red group are the same (since the value (y₁-x₁) does not depend on the blue group), and the costs of the edges from them to a fixed blue group are the same. Therefore the natural next step is to collapse all intermediate vertices of the same type into one, and any valid flow in the old network naturally corresponds to a valid flow in the new network with the same cost.

The new network has more valid flows, though: now it is possible for the flow to go from a red group to a blue group through a "wrong" type of intermediate vertex. However, such way has a higher cost than going through the correct type of intermediate vertex (since expanding the absolute value in the wrong way results in a lower value), and therefore will never appear in a minimum cost flow, so the value of the min-cost-max-flow in the new network is the same, and the new network has linear size!

Thanks for reading, and check back for more.

A Kotlin week

Codeforces hosted the first round of its Kotlin Heroes series during the May 27 - Jun 2 week (problems, results, top 5 on the left, my screencast, analysis). All solutions needed to be in Kotlin, and there were two main approaches: either writing in Kotlin from scratch, or writing in Java and using IntelliJ IDEA's built-in Java-to-Kotlin converter. In the top 5, myself and abacabadabacaba have used the converter approach. Some level of Kotlin understanding was still required for us, as the converter did not work flawlessly all the time. If you're interested in details, the screencast should demonstrate some of the issues I had to overcome :)

TopCoder SRM 759 followed a day later (problems, results, top 5 on the left, analysis). Just tourist and ecnerwal were able to solve all three problems, and 200 challenge points gave tourist a clear edge :) Congratulations to both!

The hard problem in this round required a few nice observations to make the dynamic programming fast enough. 500 table tennis players are sitting in a row. Repeatedly, we do the following: pick two adjacent players uniformly at random, play a game between them, and eliminate the player who lost, until there is only one player left. Each player has a rating r_i, and the probability that the i-th player wins against the j-th player is r_i/(r_i+r_j). What is the probability that the given player wins the entire tournament?

Codeforces Global Round 3 took place on Saturday (problems, results, top 5 on the left, my screencast, analysis). I was able to squeeze in a randomized heuristic solution for G, but spent quite a bit more time doing so than mnbvmar and tourist spent implementing the correct solution :) Congratulations for mnbvmar for the win!

Problem F in this round was very easy for me, but quite difficult and with a lot of system test failures overall: you have 300000 objects, with each object having two attributes: a value between -10⁹ and 10⁹, and a nonzero mask with at most 62 bits. Given another mask s, we compute f(s) as the sum of values of objects which have an even number of shared bits between their mask and s minus the sum of values of objects which have an odd number of shared bits between their mask and s. f(0), which is just the sum of all values, is not equal to zero. Your goal is to find any s such that f(0)*f(s) is negative, in other words to flip the sign of the sum. Can you see how to do it?

AtCoder Grand Contest 034 was the last event of this week (problems, results, top 5 on the left, my screencast, analysis). Problem D in this round was not so hard to reduce to min-cost-flow, but the hard part was to make sure that the network has a linear number of edges instead of quadratic. Some people managed to skip the hard part by using a very efficient flow implementation :)

Here's what the problem was about: you are given at most 1000 groups of red balls, and at most 1000 groups of blue balls. Each group contains at most 10 balls in the same location, and the coordinates of those locations are integers not exceeding 10⁹ by absolute value. The total number of red balls is equal to the total number of blue balls. Consider all possible matchings where each red ball is matched with one blue ball and vice versa. You need to find the maximum sum of Manhattan distances between the matched balls. Can you see how to make the network linear?

In my previous summary, I have mentioned an Open Cup problem: you have n (n<=50) cells arranged in a circle and a token in one of the cells. Then you repeatedly do the following: pick a number uniformly at random from the set {-2,-1,+1,+2}, and move the token this many cells along the circle (positive numbers corresponding to the clockwise movement, and negative to the counter-clockwise one). What is the expected number of moves before the token visits the same cell twice? The cells that are passed through in the middle of a -2/+2 move do not count as being visited.

The most straightforward dynamic programming solution would just remember the set of already visited cells and the position of the token, but there are too many possible combinations. The first optimization we can make is to avoid storing the position of the token, instead rotating the set so that the token is always in a fixed position.

This does not make the solution fast enough, so we need another key optimization idea: notice that we can never jump over two consecutive visited cells, so if we find the closest such pair to the left and to the right of the token, everything outside those does not really matter, and we can assume that all outside cells are visited as well, for example replacing the state 0101110T001011010 with 1111110T001011111.

Now the number of reachable states is actually polynomial, since outside the all-one part we will always have a prefix/suffix of alternating 0s and 1s, and maybe some 0s in between those. We could try to encode them effectively using that observation, but we don't actually need to: we can still represent them as bitmasks, and just rely on the fact that the number of reachable bitmasks will be small. Since the largest testcase is obvious, we can run our solution on it to make sure it's fast enough before submitting it. And even if it weren't, we could precalculate the answers.

Thanks for reading, and check back for more!

A past glory week

TopCoder Open 2019 Round 2A was the first round of the May 20 - May 26 week (problems, results, top 5 on the left, parallel round results, my screencast). It was the first time everybody except the three stage winners and 30 Round 4 qualifiers from the SRM stages could compete, nevertheless the top 3 was taken by participants who did not participate in SRMs at all for 6+ months and just came back for the TCO. Congratulations to them and to everybody else who made it to Round 3!

Open Cup 2018-19 Grand Prix of Ural took place on Sunday (results, top 5 on the left). Four teams solved everything, albeit with quite a variety in the number of incorrect attempts :) Congratulations to team Past Glory on the win!

Problem J required a nice but not very difficult observation. You have n (n<=50) cells arranged in a circle and a token in one of the cells. Then you repeatedly do the following: pick a number uniformly at random from the set {-2,-1,+1,+2}, and move the token this many cells along the circle (positive numbers corresponding to the clockwise movement, and negative to the counter-clockwise one). What is the expected number of moves before the token visits the same cell twice? The cells that are passed through in the middle of a -2/+2 move do not count as being visited.

This round wrapped up the Open Cup 2018-19 season, pending the onsite finals in August in Petrozavodsk (overall results, top 5 on the left). With 20 rounds, this season was the biggest to date, and the round creep is not unlike what happens in the other Grand Prix-based competition, Formula 1 :) Congratulations to the top teams, and I'm especially glad to see that the geography of the top of the standings is expanding further. Here's hoping that the next season has even more stronger teams and less rounds :P

Finally, Codeforces Round 562 followed later on Sunday (problems, results, top 5 on the left, analysis). The mysterious user Vn_nV took the first place and was the only contestant to solve all problems. Well done!

Thanks for reading, and check back for more!

Monday, October 14, 2019

An undo week

Google Code Jam 2019 Round 2 was the main event of the May 13 - May 19 week (problems, results, top 5 on the left). The top 1000 all advanced to Round 3, but of course the strongest competitors used this chance to compete for the top spot in a very strong field. mnbvmar came out first with a 15-minute advantage over the second place. Congratulations!

In my previous summary, I have mentioned a TopCoder problem: you are given two sequences a and b of n<=100 elements each, with each element being an integer between 1 and 100, and an integer s between 1 and the sum of all integers in the two sequences. We will permute each of the sequences, and then start taking numbers in the order a₁, b₁, a₂, b₂, ..., until the sum of the numbers already taken is greater than or equal to s, which will happen after taking a number either from a or from b. In how many of the (n!)² possible choices of the two permutations we reach s after taking a number from a? Return the answer modulo 10⁹+7.

Let's iterate over which element x of a will actually cause the sum to exceed s, and the position k at which this element is in the permuted array a. Then we know that the sum p of first k-1 elements of permuted a plus the sum q of the first k-1 elements of permuted b is less than s, and p+q+x is greater than or equal to s. We can use relatively standard dynamic programming to find the number of ways to achieve each sum p with k-1 elements of a without using x and each sum q with k-1 elements of b. We can then iterate over p and q such that s-x<=p+q<s to find our answer.

How quick is this solution? For simplicity, let's use n to denote both the number of items and the maximum value of an item. We iterate over x and k, and then over p and q. There at most n² values of p, and at most n valid values of q for each p, so overall complexity for this part is O(n⁵). We also have to run the dynamic programming for each value of excluded item x and each item being taken, which iterates over k and p to update multiple states, so the complexity for this part is also O(n⁵).

This is a bit too much for n=100, so let's speed up both parts to O(n⁴). In the first part, we can notice that valid values of q for each value of p form a segment, so we can compute prefix sums to be able to find a sum of values for a segment of q's in O(1), making overall running time of the first part O(n⁴).

For the second part, we will avoid recomputing the entire dynamic programming for each excluded item x using a very nice trick. First, let's compute the dynamic programming without excluding any items: now for each k and p we know the number of ways to get p as a sum of k items from a. Now, for each excluded item x we will "undo" one step of this dynamic programming: the number of ways to get p as sum of k items from a without using item x is equal to the number of ways to get p as sum of any k items from a minus the number of ways to get p-x as sum of k-1 items from a without using item x. This allows to handle each value of x in O(n³), and makes the running time of the second part and of the entire solution O(n⁴) overall.

I find this trick particularly nice because normally we compute this dynamic programming in some specific order, and therefore there is only one item for which excluding it from consideration is an undo step. However, since the end result of the dynamic programming does not depend on the order in which we process the items, we can imagine for each item that it was the last to be processed, and undo its addition.

Thanks for reading, and check back for more!

A diameter week

The May 6 - May 12 week contained TopCoder SRM 758 (problems, results, top 5 on the left, analysis). The hard problem in this round was a nice dynamic programming exercise: you are given two sequences a and b of n<=100 elements each, with each element being an integer between 1 and 100, and an integer s between 1 and the sum of all integers in the two sequences. We will permute each of the sequences, and then start taking numbers in the order a₁, b₁, a₂, b₂, ..., until the sum of the numbers already taken is greater than or equal to s, which will happen after taking a number either from a or from b. In how many of the (n!)² possible choices of the two permutations we reach s after taking a number from a? Return the answer modulo 10⁹+7.

Codeforces Round 559 wrapped up the week on Sunday (problems, results, top 5 on the left, analysis). ecnerwala has solved the more difficult problem F, but he lost enough points on the first four problems that mnbvmar could get first place with E. Congratulations to both and also to ainta, the only other contestant to solve five problems!

In my previous summary, I have mentioned two problems. The first one came from AtCoder: two players are playing a game on a tree with 200000 vertices. The players make alternating turns, in one turn a player can choose any vertex as the root of the tree, and then all leaves of the resulting rooted tree are removed (or the root itself if it's the only remaining vertex). The player who has to move after the last vertex has been removed loses the game. Who will win if both players play optimally?

First of all, let's consider the difference between "remove all leaves of a rooted tree" and "remove all leaves of an unrooted tree" operation. It's not very big: the set of leaves is the same, with the possible exception of the root: when we pick a leaf of the unrooted tree as the root, it will not get removed, but all other leaves would. If we pick a non-leaf vertex as the root, then we remove all leaves.

Let's assume for now we never pick a leaf as the root until we have to. The process is then deterministic and can be simulated, but can we do better than plain simulation? It turns out that we can, by using quite typical knowledge about the structure of trees. Each tree has a diameter d, trees with even diameter have one center — a vertex such that its distances to all other vertices are <=d/2, and trees with odd diameter have two centers connected with an edge, such that the entire tree is split into two halves, and the distances from each center to all vertices in its half are <=(d-1)/2. We can now notice that the removal of all leaves decreases the maximum distances to the center(s) by 1, and therefore the diameter of the new tree becomes d-2 (because we can reach any vertex from any other vertex in (d/2-1)+(d/2-1) in the even case, and in (d-1)/2+1+(d-1)/2 in the odd case). We will decrease the diameter by 2 until we reach a tree with 1 or 2 vertices.

Now, what changes when we can pick a leaf as the root? If we pick one end of a diameter as the root, then the diameter will only decrease by 1 (since all its vertices but one will remain). Therefore each player in this game has a choice to decrease the diameter by 1 or by 2, until there are only 1 or 2 vertices remaining, in which case there's no choice anymore. A tree with diameter 0 is winning (since we can make a move, but the opponent can not), a tree with diameter 1 is losing (since only two moves are remaining), a tree with diameter 2 or 3 is thus winning (because we can get to diameter 1 in one move which is a losing position), a tree with diameter 4 is losing, and so on: the losing positions are trees with diameter equal to 1 modulo 3.

This is one of many problems where the understanding of maximum distances on trees through diameters and centers makes the solution possible, so I encourage you to understand the concepts if you haven't done already :)

The second problem was from Codeforces: there are n coins of three different colors, but you don't know which color each coin has. In order to determine that, you have 7 rounds available. In one round, you can choose an arbitrary set of non-intersecting pairs of coins, and you will be told whether the coins in each pair are of the same color or not. After the 7 rounds you need to separate all coins into three groups by color. Can you see how to do it?

We can use the first two rounds to ask about all pairs of adjacent coins (in an arbitrary ordering): in the first round we ask about 1-2, 3-4, 5-6, ..., and in the second round we handle the remaining 2-3, 4-5, 6-7, ... Those answers enable us to split the coins into segments of consecutive coins of the same color, such as: AAABBBBCCDDDEFFF...

We know that A and B are different, and that B and C are different, but we don't know anything about the relation of A and C. So in the next two rounds, we can then compare A-C, B-D, E-G, F-H, ..., and C-E, D-F, G-I, H-J, ...

Now we know that each color is not equal to the previous color, and we know if it's equal to the color before the previous color. Since there are only three colors, this allows us to uniquely determine all colors if we fix the first two colors arbitrarily: a color that is not equal to the color before the previous color is the third color that complements the last two. And we needed just 4 rounds to solve the task out of the allowed 7!

Thanks for reading, and check back for more.

Saturday, October 12, 2019

A null week

Long time no see :)

The Apr 29 - May 5 week had a lot of rounds, starting with Codeforces Round 556 on Monday (problems, results, top 5 on the left, analysis). Just two contestants solved everything, but Um_nik got ahead of other top finishers by solving problem C much faster. Well done!

TopCoder SRM 757 followed early on Tuesday (problems, results, top 5 on the left). This round wrapped up the selection of the third direct TCO19 finalist, and rng_58 has won this honor very convincingly (link, click on "Stage 3"). As for the round itself, ksun48 was the only contestant to solve everything, and even -50 during the challenge phase did not affect his result. Congratulations on the victory!

TCO19 Round 1B on Wednesday selected the final 247 participants of Round 2 (problems, results, top 5 on the left, parallel round results). Even though 750 contestants could advance, only 247 have managed to get a positive score. Just like in Round 1A, the challenge phase was vital to get the first place. Congratulations to homo_sapiens!

Google Code Jam 2019 Round 1C on Saturday was also the last Round 1, this time with 1500 final spots for Round 2 up for grabs (problems, results, top 5 on the left). logicmachine was 4 minutes faster than the competition, and qualified to the Round 2 in style :) Well done!

AtCoder Grand Contest 033 also took place on Saturday (problems, results, top 5 on the left, my screencast, analysis). tourist has produced another dominating performance and completed all problems a good half an hour faster than everybody else — a super impressive result!

Problem C was a bit standard but still cute: two players are playing a game on a tree with 200000 vertices. The players make alternating turns, in one turn a player can choose any vertex as the root of the tree, and then all leaves of the resulting rooted tree are removed (or the root itself if it's the only remaining vertex). The player who has to move after the last vertex has been removed loses the game. Who will win if both players play optimally?

Codeforces Round 557 also happened on Saturday (problems, results, top 5 on the left, onsite round results, analysis). Benq was just a tiny bit faster overall, and has also protected his problem solving highest score with a successful challenge. Congratulations on the first place!

Problem E was of the constructive kind, and I tend to enjoy those disproportionately :) There are n coins of three different colors, but you don't know which color each coin has. In order to determine that, you have 7 rounds available. In one round, you can choose an arbitrary set of non-intersecting pairs of coins, and you will be told whether the coins in each pair are of the same color or not. After the 7 rounds you need to separate all coins into three groups by color. Can you see how to do it?

Finally, Open Cup 2018-19 Grand Prix of Daejeon wrapped up the week on Sunday (results, top 5 on the left). Team MIT NULL has won this time in large part thanks to having just one incorrect attempt, compared to 18 for the World Champion team Red Panda in the second place :) Jumping a bit into the present, 2/3 of that team together with Scott Wu have joined the ranks of other veteran teams like ours for the 2019-20 Open Cup season — always great to see strong teams sticking around after their second ICPC World Finals!

Thanks for reading, and check back for more as I hope to climb back into the present soon :)