Algorithms Weekly by Petr Mitrichev

Sunday, December 30, 2018

A probably prime week

This week had one contest from each of the major platforms to wrap up 2018. First off, TopCoder held an experimental SRM 745 on Friday (problems, results, top 5 on the left, analysis). I did not participate myself, but I'd guess that solving 6 problems in 75 minutes felt a bit like SnarkNews Winter/Summer Series. The change of format did not stop Stonefeang from continuing his string of excellent results and winning this round. Well done!

AtCoder held its last contest of the year (and thus the last contest included in the onsite finals selection) on Saturday, the Grand Contest 030 (problems, results, top 5 on the left, my screencast, analysis, onsite finals selection standings, top 8 below on the right). Reading all problems really paid off in this round, as different contestants found different problems the hardest. For me problems B and C were the most difficult — in fact, I could not come up with a solution for any of them. It seems that the top two contestants cospleermusora and tourist had a similar story with problem C, but since it was worth less than problems E and F they ended up on top of the contestants with a different set of 5 problems solved. Congratulations to cospleermusora on the win!

Here is that problem C that was obvious for some and impossible to get for others: you are given a number k which is at most 1000. You need to come up with any nxn toroidal grid (the first row is adjacent to the last row, and the first column is adjacent to the last column), where you choose the number n but it must be at most 500, such that each cell of the grid is colored with one of the k colors, each color is used at least once, and for all pairs of colors i and j all cells with color i must have the same number of neighbors with color j. Can you see a way?

Codeforces held the Good Bye 2018 round on Sunday (problems, results, top 5 on the left, my screencast, analysis). tourist had one more great performance, going through the problems in order and finishing all of them more than 30 minutes faster than the only other contestant to solve everything, MakeRussiaGreatAgain. Congratulations to both!

My round was going pretty well right up to the point when I googled the main idea of the solution for problem G, but then made a bug in the implementation (I forgot that the square roots are always returned modulo n, and not modulo the "remaining" number that I still need to factorize), and instead of looking for it I assumed there's a problem with BigInteger.isProbablePrime in Codeforces and tried to work around it for quite some time. I've found this post with no answers and the post linked from it, and the fact that some of my submissions were getting "Denial of judgement" strongly supported the isProbablePrime failure theory.

Problem H had a pretty convoluted statement which boiled down to: there is a pile with n<=200000 stones, and two players are playing Nim with a fixed set of possible moves a₁, a₂, ..., a_k: in each turn a player can take a number of stones that is equal to one of the a_i, and the player who can't make a move loses. Your goal is to find the nimbers for all values of n between 1 and 200000.

I've tried to solve this problem at the end of the contest, but my mental state after fighting with G for an hour was not ideal :) I realized that I needed to speed up the standard quadratic approach 64 times using bitsets, but could not find a good way to do that. Can you see a way?

Thanks for reading, and check back [hopefully] tomorrow for my take on the best problems of 2018!

Saturday, December 29, 2018

A wave week

The Dec 17 - Dec 23 week contained a surprise extra Open Cup round: Open Cup 2018-19 Grand Prix of Xi'An (results, top 5 on the left, official round results, overall Open Cup standings so far). The Moscow SU team solved three problems in the last hour, including problem A which was solved by only three teams out of 150 attempts, and ended up winning by a problem. Congratulations on the great performance!

Codeforces Round 528 followed on Sunday evening (problems, results, top 5 on the left, analysis). It was mnbvmar's turn to be the only contestant to solve the hardest problem F, and thus win with a huge margin. Congratulations on the victory!

In my previous summary, I have mentioned several problems. The first one came from TopCoder: you are given a rooted tree with 100000 vertices. Each vertex has a demand d_i and a cost c_i. Your goal is to assign some non-negative value v_i to each vertex so that the demand of each vertex is less than or equal to the sum of values on the path from this vertex to the root of the tree, and the sum of c_i*v_i is minimized.

Let's look at the top-most vertices with non-zero value in the solution (such that all their parents and ancestors have zero value). Those vertices are not parents of one another, and thus they satisfy exactly one unit of demand of themselves and their descendants. If we now reduce their value by 1 and repeat, we can split into our solution into "waves" each wave satisfying one unit of demand in all remaining vertices with non-zero demand.

We can also notice that each wave can be constructed as follows: first, start with a wave that contains only the root vertex. Then repeatedly do the following: take any vertex with zero demand that is in the wave, remove it from the wave but add all its children. Each such operation changes the cost of the wave by the sum of costs of the children minus the cost of the removed vertex. Let's call this value the delta of the removed vertex.

Now we'll build the waves one by one while always maintaining the minimum cost wave. Initially such wave is just a root vertex. As soon as the number of waves we've built reaches the demand of a vertex, this vertex becomes available for replacement, which would change the weight of the wave by its delta. If the delta is positive, we don't need to do anything. If the delta is negative, and the vertex is in the minimum cost wave, then we need to do the replacement. If the delta is negative but the vertex is not yet in the minimum cost wave, we will need to do its replacement as soon as it gets into the wave, in other word we should add its delta to the delta of its parent. If the delta of the parent becomes negative, then we propagate it to its parent, and so on, so that at every moment we have only the optimal wave plus some non-negative deltas remaining.

In order for this to run fast, we need to directly propagate to the closest non-zero ancestor instead of going through all parents in the chain to it, which can be accomplished by keeping a disjoint-set union data structure where each set has a vertex with non-zero delta as its representative, and contains all vertices with zero deltas which have it as the closest non-zero ancestor.

This was quite tricky to come up with, but the implementation was rather simple and without any corner cases or data structures more complex than disjoint-set union.

The second problem I mentioned came from AtCoder: you are given n vertices, and n-1 sets of those vertices, each of size at least 2. Your goal is to pick exactly two vertices in each of the n-1 sets in such a way that if we draw the n-1 edges connecting each chosen pair to each other, we get a tree. n is at most 100000, and the total size of the given sets is at most 200000.

The first step is to think when is this actually impossible. The sample cases provided a helpful example: there was a case where there were two vertices that belonged to just one set, and moreover to the same set. Since only one edge will come out of this set, we will not be able to connect both of those vertices to the rest of the graph.

This obstacle can be generalized as follows: if there's a set of vertices of size k which is less than n such that the number of input sets containing at least one of those vertices is less than k, there's no solution.

But wait, we have seen such obstacles before! Consider the bipartite graph where the left part is the input vertices and the right part is the input sets, with edges connecting each set with its members. If there's no obstacle as described above, then by Hall's theorem in this graph there's a matching covering any (n-1)-element subset of the left part.

Let's find any such matching, it covers some (n-1)-element subset of vertices. The fact that all other (n-1)-element subsets can also be covered by a matching means that if we run one more step of the maximum matching algorithm, it must find augmenting paths from the only remaining uncovered vertex to all covered vertices (so that flipping this augmenting path would result in the solution for the corresponding (n-1)-element subset). These augmenting paths form a subtree of the residual network.

Now we can see that if we take any vertex covered by matching, its previous vertex in the augmenting path tree (which will be in the right part, so represent a set), and the set's previous vertex in the augmenting path tree (which will be in the left part, so represent a vertex), then we get two vertices belonging to the corresponding set, and those n-1 pairs are exactly the solution to our problem as they necessarily form a tree!

Finally, there was an Open Cup problem: two players are playing a game on a rooted tree with 100000 vertices. They make moves in turn. In each move of the first player, she colors one of the leaves of the tree black (initially all leaves are colored white). The root of the tree also initially contains a token. In each move of the second player, she can move the token from a vertex to any adjacent vertex. The second player wins if she moves the token to a leaf that is still white. The first player wins when all leaves are colored black and the second player has not yet won. Who will win if both players play optimally?

First, we can notice that it doesn't make sense for the second player to go up, as they will end up in a strictly worse state than before. So for each subtree, there's at most one moment in the game when the token enters this subtree, and then the game continues just in this subtree.

Moreover, suppose that the first player has made x moves into this subtree before the token enters it. Then it's clear that there's some boundary b such that when x<b the second player wins if the token enters this subtree, and when x>=b the first player wins if the token enters this subtree. This finally gives us a linear-time dynamic programming solution: we will compute this boundary b for each subtree going from the bottom to the top.

Thanks for reading, and check back tomorrow!

A Kuhn's week

The Dec 10 - Dec 16 week was livelier than a few previous ones. Codeforces Round 526 started the fun right on Monday (problems, results, top 5 on the left, analysis). Radewoosh continued his string of excellent results, was the only contestant to solve five problems, got the first place with a huge margin and also overtook tourist for the first place in the rating list — very well done!

TopCoder SRM 744 took place on Friday (problems, results, top 5 on the left, my screencast, analysis). Trying to learn from my experience in the TCO final, when approaching the hard problem I have decided to not implement relatively straightforward solutions involving either heavy-light decomposition or propagating piecewise-linear functions up the tree, but instead think a bit more to come up with an easier to implement solution. The strategy has almost backfired as I got my "simple" solution to work with less than two minutes into the contest. However, it did work in the end as a few other contestants going for the straightforward but complicated implementation approaches have failed because of subtle bugs (for example, ainta was above me after the coding phase but his solution failed as he should've used a multiset instead of a set, presumably inside the representation of a piecewise-linear function or something similar).

Here's that problem: you are given a rooted tree with 100000 vertices. Each vertex has a demand d_i and a cost c_i. Your goal is to assign some non-negative value v_i to each vertex so that the demand of each vertex is less than or equal to the sum of values on the path from this vertex to the root of the tree, and the sum of c_i*v_i is minimized. What's the easiest to implement approach, in your opinion?

AtCoder Grand Contest 029 followed on Saturday (problems, results, top 5 on the left, my screencast with commentary, analysis). The problemset was relatively approachable this time, so to win one had to be fast and to minimize incorrect attempts. LHiC executed very well on both fronts and got first place with the margin of 12 penalty minutes. Well done!

I was submitting my last problem around 88-th minute for the first time, but got time limit exceeded on just one testcase. My solution involved Dinic's maximum flow algorithm that I have copy-pasted from a previous solution. Later I submitted the same solution in C++ and it passed in just 0.5s, while the time limit is 4s. Maybe somebody can tell why Java is more than 8x slower this time?

Of course, later I tried submitting Kuhn's maximum matching algorithm in Java instead, which is supposed to be quadratic in the worst case, but it actually passed within the time limit :)

Reducing the problem to maximum matching is also quite fun. Here's the statement: you are given n vertices, and n-1 sets of those vertices, each of size at least 2. Your goal is to pick exactly two vertices in each of the n-1 sets in such a way that if we draw the n-1 edges connecting each chosen pair to each other, we get a tree. n is at most 100000, and the total size of the given sets is at most 200000.

Open Cup 2018-19 Grand Prix of Peterhof took place on Sunday (results, top 5 on the left). Compared to a few previous rounds, the Moscow SU team has cut down dramatically on incorrect attempts, and thus got their first win of the season. Congratulations!

Problem G was a nice dynamic programming exercise. Two players are playing a game on a rooted tree with 100000 vertices. They make moves in turn. In each move of the first player, she colors one of the leaves of the tree black (initially all leaves are colored white). The root of the tree also initially contains a token. In each move of the second player, she can move the token from a vertex to any adjacent vertex. The second player wins if she moves the token to a leaf that is still white. The first player wins when all leaves are colored black and the second player has not yet won. Who will win if both players play optimally? Can you see a way to avoid traversing the entire ~2¹⁰⁰⁰⁰⁰ state space?

Finally, Codeforces ran Avito Cool Challenge 2018 later on Sunday (problems, results, top 5 on the left, analysis). tourist was a bit slower than LHiC but was able to find two challenges and regain the first place in the round, and with it the first place in the rating list. Congratulations!

Thanks for reading, and check back for more!

A lazy week

The Dec 3 - Dec 9 week featured TopCoder SRM 743 (problems, results, top 5 on the left, my screencast, analysis). Stonefeang was quite close to increasing his lead in the overall standings as he was temporarily in first place during the challenge phase after finding a successful challenge. However, ksun48 then found a successful challenge of his own, regained his first place from the coding phase, and joined Stonefeang at the top of the overall leaderboard. Congratulations!

Thanks for reading (was this the shortest post of the year?) and check back for more!

A colorful week

TopCoder SRM 742 was the first event of the Nov 26 - Dec 2 week (problems, results, top 5 on the left, analysis). This was the first round of a new three-month race to a TCO19 onsite spot, and Stonefeang got the best possible start to this race thanks to having the fastest time on two out of three problems. Congratulations!

NEERC 2018, now known as Northern Eurasia Programming Contest 2018, was the main event of the week in Russia and some neighboring countries (problems, results, top 12 on the left, parallel round results, analysis). The world champions from the Moscow SU were trailing during most of the contest, sometimes by more than one problem, but pulled themselves together in the end and still won the contest thanks to being the only team to solve problem H. Congratulations on the great comeback!

The performance of the team White Sprite in the parallel round shows that there is still a lot of room for improvement for the Russian teams, though :)

Thanks for reading, and check back for more!

A center week

The Nov 19 - Nov 25 week had its contests on Sunday, the first one being Open Cup 2018-19 Grand Prix of Dolgoprudny (results, top 5 on the left). For the third time in the last four Open Cup rounds, the team Past Glory was first and the team Moscow SU: Red Santa was second, with quite a huge gap stemming from the large amount of incorrect attempts for the Red Santa. Congratulations to both teams on yet another strong performance!

Codeforces hosted Mail.Ru Cup 2018 Round 3 later that day (problems, results, top 5 on the left, overall results, analysis). Radewoosh continued his fine recent form, solving all problems with 15 minutes to spare and with just two incorrect attempts, compared to V--o_o--V's nine. Congratulations on the win!

In my previous summary, I have mentioned two TCO onsite problems. The first one was constructive: you are given an integer n up to 10¹⁸. You need to return any 4-connected figure which has exactly n different domino tilings, and which fits into a relatively small bounding box: the area of the bounding box must not exceed 14400, and the height of the bounding box must not exceed 13.

It's very easy to build multiplication: given two figures with a and b tilings respectively we can build a figure with a*b tilings by concatenating them possibly with a small insertion in the middle so that no new spurious tilings appear.

However, in order to build arbitrary numbers we usually need both multiplication and addition, and addition is where the main difficulty lies: since the number of domino tilings is a "global" object, it's not clear how we could achieve "either a or b" behavior.

The official analysis shows two great ways to achieve addition, based on the idea that we can build figures that have only one tiling with one "parity" and lots of tilings with other "parity". Please look there for more details (the problem name is "DominoTiling").

The second problem I mentioned was of the more traditional optimization style: you are given n<=200 points on the plane. You need to connect some pairs of points with straight line edges, so that you build exactly n-1 edge and the result forms a tree. What is the minimum possible diameter of the resulting tree (the maximum distance between its vertices, were the length of each edge is equal to the corresponding Euclidean distance)?

Solving this problem required a good understanding of how distances in trees work. More specifically, if we take any diameter of a tree, and find its middle point (which might be a vertex and might be in the middle of an edge), this point will be the center of the tree: all vertices of the tree will be at a distance of at most d/2 from it (where d is the diameter). Moreover, the opposite is also true: if there's a point in the tree such that all vertices are at a distance of at most d/2 from it, then the diameter of the tree is at most d. The important aspect here is that we don't have to consider the distances between pairs of vertices anymore — all that matters are the distances to the center.

Suppose the center lies in the middle of the edge connecting vertices u and v. Then for the part of the tree containing u the distances to the center are going through u. Then we can notice that because Euclidean distance satisfies the triangle inequality, if we rearrange that part of the tree to have all other vertices connected directly to u, the distances to the center will not increase, which means that the diameter will not increase, too.

In other words, we can assume that the optimal tree always looks like an edge u-v, and all other vertices are connected to either u or v.

Moreover, we can see from the above argument that all that matters in the optimal tree is the maximum distance from u to a vertex connected to it, and the maximum distance from v to a vertex connected to it, and thus when deciding whether to connect each other vertex to u or v we just need to think about those two numbers. For example, suppose that w is the vertex farthest from u that is connected to u. Then we can connect to u all vertices that are closer to u than w without increasing the diameter, so the set of vertices connected to u will be a prefix of all vertices sorted by distance to u (during the contest I've sorted by the difference of distances to u and v, which also works).

This allows an O(n³) solution: try all possibilities for u and v, all prefixes of the list vertices sorted by distance to u. Connect the vertices from the prefix to u and the rest to v, and find the minimum diameter of all trees built in this way.

Note that we still need to find the diameters of those trees properly, as we're not guaranteed that the center will be on the u-v edge for any such tree, and finding the diameter properly requires finding the maximum and second maximum edge connected to each of u and v, which can be done by computing prefix and suffix maximums and second maximums.

The "second maximum" part can be avoided if we remember that we don't really care about the cases where the center is on the u-v edge — we just need to make sure that those cases don't incorrectly get a too small diameter computed. For example, Kevin does in fact check in his solution if the diameter is on the u-v edge and throws away the tree if it's not; however, since the distances are floating-point we might get rounding issues here if the optimal center is in fact a vertex, so he had to handle that by adding an epsilon to the comparison. Gennady, on the other hand, just computes the diameter in pessimistic fashion for such trees: if the center ends up being outside the u-v edge, we compute the diameter as 2 times the maximum distance from u and v to other vertices. This will never be less than the actual diameter, and for the cases where the center is u or v this will be equal to it, just as we need.

Thanks for reading, and check back for more!

Friday, December 28, 2018

A 2015 week

TopCoder Open 2018 onsite event in Dallas was headlining the Nov 12 - Nov 18 week. The first semifinal took place earlier on Thursday (problems, results on the left, broadcast, analysis). tourist ended up dominating the proceedings by solving the 1000-pointer in just 15 minutes, but of course the main goal in this round was getting into the top 4. There was quite some drama involved as Egor ended up fifth after resubmitting his solution for the 250-pointer because the original solution could not handle exactly one case correctly: 1x1 grid. Congratulations to tourist, Errichto, Um_nik and ACRush on advancing to the final!

The medium problem in this round was a great combination of simple statement and exciting problem solving, while still being medium difficulty: you are given an integer n up to 10¹⁸. You need to return any 4-connected figure which has exactly n different domino tilings, and which fits into a relatively small bounding box: the area of the bounding box must not exceed 14400, and the height of the bounding box must not exceed 13 (I guess this second condition was required to make it possible to check the output for correctness in reasonable time). Can you see a way?

Semifinal 2 followed very late on Thursday (problems, results on the left, broadcast, my screencast, analysis). Before the semifinal rounds I was telling other contestants that I expect that a fast 250 should be mostly enough to get to 4 out of 8, as you can expect some people to fail the system tests and ending up above them is enough with such generous advancement rules. Well, apparently I had to support my theory by example this time :) You can see from the screencast that I knew that my medium solution was incorrect, but I could not fix it (it turned out that the approach is probably impossible to get to work), and there was just too much implementation involved in the hard problem. During the challenge phase I was very happy when my medium was challenged because the challenger was not Kankuro, and thus I was still solidly in top 4.

The final round took place one day later, on Friday (problems, results on the left, broadcast, my screencast, analysis). There were a couple of important decisions that shaped this round for me.

First, I've decided to implement a quite tedious solution for the easy problem (which was actually incorrect but not caught by system tests). That led to the situation where after submitting the easy I was quite a bit behind, and seeing as the leader tourist has opened the medium problem after solving easy, it felt that going for the hard would increase my chances for the first place.

Then I was actually able to solve the hard problem quite quickly thanks to coming up almost immediately with the key solution idea: that diagonals can be handled independently. This might've been helped by the fact that I have seen this idea before, and in a very similar setting: my previous TopCoder Open victory in 2015 had medium problem with the reference solution using exactly this idea (I did not come up with this idea in 2015, so you can't say that one idea got me two wins, but still the parallels are amusing).

Then I've tried to solve the medium problem, had some good ideas but could not make the solution work, and had almost given up — but then Kevin has submitted his medium with about 9 minutes to go and overtook me in the standings. This was apparently just the extra bit of motivation I needed, as I've then resumed my attempts at solving the medium and managed to submit it with just about 50 seconds remaining in the coding phase! Once again some remarkable parallels to the situation three years ago :)

Here's the medium problem that caused all this excitement: you are given n<=200 points on the plane. You need to connect some pairs of points with straight line edges, so that you build exactly n-1 edge and the result forms a tree. What is the minimum possible diameter of the resulting tree (the maximum distance between its vertices, were the length of each edge is equal to the corresponding Euclidean distance)?

After the dust from the TCO finals had settled a bit, Open Cup Grand Prix of Siberia took place on Sunday (results, top 5 on the left). Both the team Past Glory and the ICPC world champion team from Moscow SU were able to solve 11 problems, but with 13 penalty attempts against 3 and corresponding debugging time the Moscow team had almost 50% higher penalty time. Congratulations to the team Past Glory on the victory!