## Sunday, August 27, 2023

### An onsite week

## Sunday, August 20, 2023

### An arborescence week

The first step is to notice that all edges are of three types: type 2 are the edges where both ends have the same color as the edge, type 1 are the edges where one end has the same color as the edge, and type 0 are the edges where no end has the same color as the edge. We will also direct the type 1 edges from the vertex of the correct color to the vertex of the wrong color.

Clearly type 2 edges are the most useful for reaching our goal. In fact, if every vertex has at least one type 2 edge adjacent to it, then we can simply find the spanning forest among the type 2 edges. If it is a spanning tree, then we have solved the problem. If it is not connected, since we have already satisfied the color constraints, we can augment the forest to a tree using the remaining edges in any way.

What if there are some vertices that do not have type 2 edges adjacent to them? If there is a vertex with no outgoing type 1 edges, or in other words a vertex where all adjacent edges have the wrong color, then there is clearly no solution. So we only have to consider the case where some vertices have type 2 edges adjacent to them, and all remaining vertices have at least 1 outgoing type 1 edge. We will call the vertices with type 2 edges adjacent to them type 2 vertices, and the ones without them type 1 vertices.

The key remaining observation is that we cannot solve the problem using only type 1 edges. Each type 1 edge satisfies the color constraint for 1 vertex, but overall we have *n* vertices but only *n*-1 edges. So a logical idea is to start with the spanning forest of the type 2 edges that will cover all type 2 vertices, and then repeatedly try to find a type 1 edge that goes from a yet uncovered vertex to an already covered vertex (building something similar to an arborescence). If we manage to cover all vertices this way, then we can once again augment the resulting forest to a tree using the remaining edges. If we get stuck at some point, then there is no solution.

But why is this correct? We have proven that we need at least one type 2 edge, but why is it always correct to take (almost) all of them? This is a very typical situation for problems involving spanning trees, as they usually end up relying on some kind of greedy argument, so one could just bet on intuition here in my opinion. Still, here is the formal proof: consider the situation where our solution gets stuck. At this moment, the set *S* of uncovered vertices contains only type 1 vertices, and there is no type 1 edge going from this set to the set of covered vertices. But then how could *S* be covered in any other solution? Since any tree has at most |*S*|-1 edges within *S*, and only edges within *S* can cover its vertices, one of them will have to remain uncovered in any tree.
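The three steps above fit in a short sketch. Here is a minimal Python implementation of the algorithm, assuming the graph is given as an edge list of (u, v, color) triples plus a vertex-color list; the representation and function names are my own.

```python
def color_spanning_tree(n, edges, vcolor):
    """Find a spanning tree in which every vertex has at least one adjacent
    tree edge of its own color, or return None. edges[i] = (u, v, c);
    vcolor[v] is vertex v's color. Returns the indices of the tree edges."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []

    def union(u, v, i):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append(i)

    covered = [False] * n
    # Step 1: spanning forest of the type 2 edges; both endpoints get covered.
    for i, (u, v, c) in enumerate(edges):
        if vcolor[u] == c == vcolor[v]:
            union(u, v, i)
            covered[u] = covered[v] = True
    # Step 2: repeatedly attach an uncovered vertex to a covered one via a
    # type 1 edge directed out of the uncovered vertex (arborescence-style).
    progress = True
    while progress:
        progress = False
        for i, (u, v, c) in enumerate(edges):
            for a, b in ((u, v), (v, u)):
                if not covered[a] and covered[b] and vcolor[a] == c:
                    union(a, b, i)  # a is still an isolated vertex, so this links
                    covered[a] = True
                    progress = True
    if not all(covered):
        return None  # we got stuck: some vertex cannot satisfy its constraint
    # Step 3: augment the forest to a spanning tree with arbitrary edges.
    for i, (u, v, c) in enumerate(edges):
        union(u, v, i)
    return tree if len(tree) == n - 1 else None
```

For example, on a triangle with vertex colors r, r, b and edge colors r, b, b, the red edge covers both red vertices and one blue type 1 edge attaches the blue vertex.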

Thanks for reading, and check back next week!

## Sunday, August 13, 2023

### An apiad week

I was one of those starting with the four easier problems. Somewhat unexpectedly, I did not get stuck in them and did actually have a bit more than an hour remaining for E, and made some significant progress on paper: I realized that if the sum of all *a _{i}* is zero, and the sum of all *b _{j}* is zero, and we can place the *n*^{2} negated sums -(*a _{i}*+*b _{j}*) in the matrix in such a way that every column and row has exactly one of each *a _{i}* and *b _{j}*, then the sum of the 2*n*-1 cells in a cross, which is the sum of the row plus the sum of the column minus the middle cell, will be exactly *a _{i}*+*b _{j}*. I've then realized that if the sum of all *a _{i}* plus the sum of all *b _{j}* is divisible by 2*n*-1, then we can get both of those sums equal to zero with some constant shifts. Finally, I've correctly hypothesized that in case it's not divisible by 2*n*-1, then we can only achieve score *n*^{2}-1, and figured out how to do so with some more constant shifts for the first row and the first column.

Therefore the only remaining step was to place the *n*^{2} negated sums -(*a _{i}*+*b _{j}*) in the matrix in such a way that every column and row has exactly one of each *a _{i}* and *b _{j}*. This does not depend on the values of *a _{i}* and *b _{j}*, so this simply needs to be done once for every *n*. It seemed quite doable as it felt like a more advanced derangement, and derangements are quite dense. I've tried some random placements and noticed that I can find such a placement for odd *n*, but not for even *n*. And this is pretty much how far I've got in an hour :)
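For odd *n* one such placement can even be written down explicitly: put *a _{(r+c) mod n}* and *b _{(r-c) mod n}* into cell (*r*, *c*). This Latin-square-style assignment is my own choice, not from the contest; the map (r, c) → (i, j) is a bijection exactly when inverting 2 modulo *n* is possible, i.e. for odd *n*. A quick Python check of the cross-sum property:

```python
import random

def cross_matrix(a, b):
    # place -(a[i] + b[j]) with i = (r + c) % n, j = (r - c) % n: each row
    # and each column then contains every index i exactly once and every
    # index j exactly once (this particular placement is my own choice)
    n = len(a)
    return [[-(a[(r + c) % n] + b[(r - c) % n]) for c in range(n)]
            for r in range(n)]

n = 5
random.seed(1)
a = [random.randint(-10, 10) for _ in range(n)]
b = [random.randint(-10, 10) for _ in range(n)]
a[-1] -= sum(a)  # adjust the last element so that sum(a) == 0 (test only)
b[-1] -= sum(b)  # likewise sum(b) == 0
m = cross_matrix(a, b)

# for odd n the map (r, c) -> (i, j) is a bijection, so all n^2 pairs occur
pairs = {((r + c) % n, (r - c) % n) for r in range(n) for c in range(n)}
assert len(pairs) == n * n

# every row and column sums to -(sum(a) + sum(b)) = 0, so each cross equals
# minus its middle cell: exactly a_i + b_j for the pair stored there
for r in range(n):
    for c in range(n):
        cross = sum(m[r]) + sum(m[k][c] for k in range(n)) - m[r][c]
        assert cross == a[(r + c) % n] + b[(r - c) % n]
```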

Judging from the editorial (which I do not understand), it seems that for even *n* we need to use more degrees of freedom, therefore while I enjoyed coming up with my approach, I needed another huge step to finish the solution, and therefore needed at least another hour.

I found problem B to be a cute test of how well one understands the spanning tree mechanics (it is amazing how many nice problems can be derived from a relatively simple concept of a spanning tree, and the fact that it can be found greedily!): you are given a connected graph where each edge is either red or blue, and each vertex is also either red or blue. You need to find a spanning tree in this graph such that for each vertex it has at least one adjacent edge of the same color as the vertex, or report that there isn't any. Can you see a way to do it?

Thanks for reading, and check back next week!

## Friday, August 4, 2023

### A power of two week

You are given a set *S* of integers between 1 and *M* (*M*<=500). In one operation, you pick an element of the set uniformly at random, and increment it by 1. If the incremented element coincides with another element of the set, we keep only one copy, so the size of the set decreases by 1. If the incremented element exceeds *M*, we discard it, so the size of the set also decreases by 1 in that case. We are given the starting set *S*, and need to find the expected number of steps before it becomes empty.

When I was thinking about this problem, I convinced myself that any solution would need to keep track of the number of elements in the set *S*, as we divide the probabilities by that number on each step, and unlike multiplication, division does not lend itself to nice linearity-of-expectation arguments. Well, it turns out it does :)

Consider two adjacent numbers from the initial set *S*. How do they move? Well, either the lower number increases by 1, in which case it might actually merge with the higher number, or the higher number increases by 1, in which case it might disappear if it exceeds *M*, and then the lower number will also disappear after a few more steps. What is the expected number of steps that the lower number makes? Here comes the stunning observation: we do not actually need to know anything about the rest of the set *S* to compute this expected number of steps! No matter what happens to the rest of the set, every time one of those two numbers moves, it will be the lower number with probability 50%, and the higher number with probability 50%. So we can implement a relatively standard dynamic programming to compute the expectation.

And then, as you have probably guessed by this point, thanks to the linearity of expectation the answer to the problem can be computed as the sum of those expected numbers of steps over all pairs of adjacent numbers in the input, plus the expected number of steps for the highest number (which is just *M*+1 minus that number).
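Here is a small Python sketch of this decomposition, using exact fractions; the pair DP and its states are my reading of the argument above, not the official solution.

```python
from fractions import Fraction
from functools import lru_cache

M = 4  # small bound for illustration; the actual problem allows M <= 500

@lru_cache(maxsize=None)
def pair_steps(lo, hi):
    """Expected number of steps made by the lower element of an adjacent
    pair at positions (lo, hi). Whatever the rest of the set does, each
    move of this pair hits lo or hi with probability 1/2."""
    # the lower element moves: one step; if it lands on hi, the pair merges
    move_lo = 1 + (Fraction(0) if lo + 1 == hi else pair_steps(lo + 1, hi))
    # the higher element moves: if it walks past M it disappears, and the
    # lower element then needs exactly M + 1 - lo more steps of its own
    move_hi = Fraction(M + 1 - lo) if hi + 1 > M else pair_steps(lo, hi + 1)
    return (move_lo + move_hi) / 2

def expected_steps(s):
    """Linearity of expectation: sum the lower elements' expected step
    counts over all adjacent pairs, plus the steps of the top element."""
    s = sorted(s)
    total = Fraction(M + 1 - s[-1])
    total += sum(pair_steps(a, b) for a, b in zip(s, s[1:]))
    return total
```

For *M*=4, `expected_steps([1, 2])` gives 83/16, which matches a direct expectation computed over full set states, and, fittingly, has a power of two in the denominator.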

The really unexpected aspect of this solution for me is that we end up only ever dividing by 2, even though on the surface it looks like we need to divide by numbers up to |*S*|. Having learned this from the editorial, I immediately remembered one thing that I considered doing during the contest: can we try to reconstruct the answers to the sample cases in the fraction form from the values modulo 1000000007 that are given in the problem statement (750000009, 300277731 and 695648216)? Knowing the solution, I realized that those fractions would have a power of two in the denominator, and that might have been just the clue I needed to solve this problem.

I've tried doing this now, and it turns out this reconstruction is not so easy. Just trying to find a fraction with the smallest numerator and denominator yields:

750000009 = 15/4

300277731 = 26062/39247

695648216 = 35459/32906

Which is clearly incorrect for the last two cases, as the answer cannot be less than 1 or so close to 1. Adding some reasonable boundaries on the value of the fraction (it must be between 10 and 50 for the last two cases), we get:

750000009 = 15/4 = 3.75

300277731 = 133137/8642 = 15.40580884054617

695648216 = 133345/10938 = 12.19098555494606

Which actually looks plausible, but is still incorrect as we know from the above. However, we know from the problem statement that the denominator cannot have prime factors greater than |*S*|, which in our case is 5. Adding that constraint produces:

750000009 = 15/4 = 3.75

300277731 = 1241063/65536 = 18.937118530273438

695648216 = 582323/32768 = 17.771087646484375

Which looks even more plausible, to the point that it might be correct (I did not check), and in any case might have delivered the key hint about only ever dividing by 2.
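For the record, here is a Python sketch of the kind of constrained search I mean, restricted to power-of-two denominators plus a plausible value range; the bounds passed in are my own guesses, not part of the problem.

```python
MOD = 10**9 + 7  # the modulus from the problem statement

def reconstruct(value, lo, hi):
    """Search for a fraction num / 2^k that is congruent to `value`
    modulo MOD and whose real value lies in [lo, hi].
    Returns (numerator, denominator) or None."""
    denom = 1
    while denom < MOD:
        num = value * denom % MOD
        if lo * denom <= num <= hi * denom:  # i.e. lo <= num/denom <= hi
            return num, denom
        denom *= 2
    return None

print(reconstruct(750000009, 1, 10))  # -> (15, 4)
```

Of course a hit is only a candidate: a wrong denominator can produce a numerator in range by chance, which is exactly the ambiguity described above.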

Now, some questions to the reader :) Do you routinely try to convert the sample answers from the "modulo prime" form to the fraction form? Did it ever help you solve a problem? Do you have some tricks on how to do this more reliably? If the problemsetters are reading this, did you consider this trick when choosing the samples for this problem?

Now, on to an AtCoder problem: you are given *N*<=1000 pairs of integers (*A _{i}*, *B _{i}*), each integer between 0 and 10^{9}. In one operation, you choose two integers *x*<*y* between 0 and 10^{18}, and replace each *A _{i}* with (*A _{i}*+*x*) mod *y* (note that you apply the same *x* and *y* to all *A _{i}*). You need to make it so that *A _{i}*=*B _{i}* for all *i* after at most *N* operations, or report that it is impossible.

During the round, I guessed correctly that this is almost always possible, except for the cases like *A _{1}*=*A _{2}*, but *B _{1}*≠*B _{2}*. However, I could not find a nice way to reorder the numbers. I was thinking mostly about operations where *y* is bigger than the biggest number, to have at least some control over what is happening. In that case, by choosing the appropriate *x*, we can split the numbers into two parts and interleave those parts, but that does not give us enough control to achieve an arbitrary ordering.

One other thing that came to mind when thinking about this interleaving operation is that it can only make the numbers closer to each other, but not pull them apart. However, when no interleaving actually happens, in other words when we just do a cyclic shift of the numbers, then we are not reordering, but we can pull the two halves apart as much as we'd like.
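A tiny illustration of that cyclic-shift view (the numbers are my own):

```python
def apply_op(a, x, y):
    # one operation from the problem: the same x and y for every element
    return [(v + x) % y for v in a]

a = [3, 10, 20]
shifted = apply_op(a, 10, 25)  # y = 25 exceeds every element
# values become [13, 20, 5]: element 20 wrapped around to the bottom, so the
# sorted order of the indices is rotated but not otherwise reordered, while
# the gaps between cyclically adjacent values have changed
order_before = sorted(range(len(a)), key=a.__getitem__)
order_after = sorted(range(len(a)), key=shifted.__getitem__)
rotations = [order_after[k:] + order_after[:k] for k in range(len(a))]
assert order_before in rotations  # the cyclic order is preserved
```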

This is how far I've got during the round, and it turns out that this is the right direction, but I missed one final trick. Here is a working plan: in the first *N*-1 operations, we do *N*-1 cyclic shifts, which allows us to create almost arbitrary gaps between pairs of adjacent numbers, but still keeps the cyclic order intact. And then in the *N*-th operation we do a massive reordering and everything falls into place.

Knowing the plan, it is relatively easy to figure out the details. Assuming the last operation uses *x*=0 and some *y*=*Y* that is bigger than all *B _{i}*, we just need to make *A _{i}*=*Y*·*k _{i}*+*B _{i}*. And now we can choose the *k _{i}* in such a way that the cyclic order of those *A _{i}* coincides with the initial cyclic order of the *A _{i}*; moreover, we can choose any particular cyclic shift of this cyclic order, which then makes planning the first *N*-1 operations easy. You can find more details in the editorial.
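To see why the last operation finishes the job, a minimal check (*Y*, the targets, and the multipliers here are hypothetical values of mine; in the real solution the *k _{i}* are chosen to realize the required cyclic order):

```python
Y = 100          # hypothetical: any Y larger than every B_i works
B = [7, 3, 42]   # hypothetical targets
k = [2, 0, 1]    # hypothetical multipliers realizing the needed cyclic order
A = [Y * ki + bi for ki, bi in zip(k, B)]  # state before the last operation
# the final operation with x = 0, y = Y strips the multiples of Y away
assert [(v + 0) % Y for v in A] == B
```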

However, it is still unclear to me how I could have arrived at this final trick without just "seeing" it out of nowhere. Even though the operation used in this problem is simple, the space of ways to use it is endless, and it was hard for me to somehow narrow down the search. For many problems, it is more or less clear where the difficulty lies and that you must use a certain resource optimally to arrive at the solution, and this helps cut less promising directions quickly when trying to solve the problem. However, in problems like this one, one cannot be sure that the direction is correct until they see a full solution.

To the 174 people who solved this problem during the round: how did you arrive at the solution? Is there some intermediate reasoning step that makes the last leap of faith smaller, or is it just small enough for you as it is?

Thanks for reading, and check back next week!