Algorithms Weekly by Petr Mitrichev

Monday, January 27, 2020

Best problems of 2019

Just like in the last couple of years (2017, 2018), I've gone through the problems I mentioned in 2019 to find the ones I liked the most. I have also looked at some of the problems recommended in this post, in this post, and in various private messages. Of course, this is still an entirely subjective exercise, and it is certainly easier for me to like a problem that I have solved or tried to solve than one that I did not. Here is the shortlist (for those interested, here is a slightly bigger one), as usual in chronological order:

The Open Cup problem "Alien Invasion" about finding an area of a polygon by interactively asking about areas of convex hulls, by ???, with solution in this post.
The AtCoder problem "Triangular Lamps Hard" about finding the original state of a cellular automaton on a triangular grid, by rng_58 and yosupo, with solution in this post.
The Open Cup problem "Equanimous" about inserting + and - between digits of a number to get the smallest positive value and grouping all numbers in a large segment by that value, by tangjz, with solution in this post.
The TopCoder problem "SpanningSubgraphs" about counting spanning subgraphs with each number of edges, by lewin, with solution in this post.
The IOI problem "Split the Attractions" about splitting a graph into three parts of given size such that at least two are connected, by LGM and Saeed_Reza, with solutions discussed in this Codeforces post.

Which one do you think is the very best? Also, please help me fill in the unknown problem authors in comments!

TCO20 stage 2 leaderboard

Since the official leaderboard for TCO20 stage 2 is not yet ready, I've put together a small script to compute it. Here's the current top 30:

Rank	Handle	Score	Points
1	Petr	14	3206.22
2	tourist	13	3309.85
3	lyrically	12	2646.81
4	bqi343	12	2301.09
5	Um_nik	10	2383.93
6	hitonanode	10	1588.43
7	yosupo	9	1537.25
8	_aid	9	1506.97
9	natsugiri	9	1485.10
10	kmjp	9	1464.26
11	maroon_kuri	9	1232.54
12	neal_wu	9	1152.15
13	IH19980412	9	1134.90
14	ShadoWsaZ	8	1328.25
15	KevinWan	7	1516.60
16	ksun48	7	1349.06
17	Egor	7	1149.52
18	redocpod	7	1140.82
19	Vasyl[alphacom]	7	1120.55
20	cerberus97	7	781.12
21	socketnaut	7	552.92
22	Kalam132	6	1623.00
23	KKT89	6	1396.76
24	ecnerwal	6	1245.65
25	darnley	6	846.29
26	kuniavski	6	821.25
27	square1001	6	788.87
28	keymoon	6	777.42
29	nwin	6	763.49
30	Jatana	6	640.78

Enjoy! In the future I will most likely just rerun the notebook instead of making new posts, so the updated standings will appear there.

Sunday, January 26, 2020

A cyclotomic week

TopCoder SRM 776 was the main event of this week (problems, results, top 5 on the left, analysis). After the coding phase it seemed as if bqi343 would catch up with me in the TCO20 race, but I was quite lucky twice: first, my incorrect solution for the 1000 passed the system tests; second, bqi343's 250 failed. As a bonus, I have now learned about cyclotomic polynomials (I guess it's more like re-learned: surely my mathematics degree should have covered this).

The medium problem was very nice as well. There are n=2*a+b pieces of string, out of which a have both ends red, a have both ends green, and b have one red and one green end, so we have n red ends and n green ends in total. We will randomly pair the red and green ends in one of n! possible ways, and tie the corresponding ends together. What is the expected number of cycles we will get? a and b are up to a million.

In my previous summary, I have mentioned a sub-problem of another TopCoder problem: for which pairs of positive integers a <= b can we split all integers from the set {a, a+1, a+2, ..., b-1, b} into two parts with equal sum?

First of all, the sum of all numbers (a+b)*(b-a+1)/2 must be even. Since the two parts in the product (a+b)*(b-a+1) have different parity, one of the parts must be divisible by 4 for the sum to be even. In case the size of the set (b-a+1) is divisible by 4, we can always make such a split: for each four consecutive numbers, we can split them independently as x+(x+3)=(x+1)+(x+2).

Now, what happens when (a+b) is divisible by 4? The size of the set is odd in this case, so the two parts must have different sizes: the smaller part will have at most (b-a)/2 elements, and the bigger part at least (b-a)/2+1 elements. The sum of the (b-a)/2 biggest elements in the set is equal to (b-a)/2*(b+b-(b-a)/2+1)/2=(b-a)*(3b+a+2)/8. The sum of the (b-a)/2+1 smallest elements in the set is equal to ((b-a)/2+1)*(a+a+(b-a)/2)/2=(b-a+2)*(3a+b)/8. If the former is smaller than the latter, clearly there's no good split, as the part with fewer elements will always have the smaller sum.

It turns out that this condition is not just necessary but also sufficient: if we can somehow get the smaller part to have bigger or equal sum, we can make it have equal sum because we can always repeatedly decrease the sum by 1: find two numbers x and x+1 such that x is in the bigger part and x+1 is in the smaller part, and swap them. This argument is the most beautiful part of the solution in my opinion.
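Here is a minimal Python sketch of that swapping argument for the odd-size case. It assumes we start the smaller part from the (b-a)/2 biggest elements and then lower its sum one step at a time; the function name split_by_swaps is made up for illustration and is not taken from the problem or its reference solution.

```python
def split_by_swaps(a, b):
    """Split {a, ..., b} (odd size) into two equal-sum parts, or return None."""
    size = b - a + 1
    total = (a + b) * size // 2
    # This sketch only handles the odd-size case with an even total.
    if size % 2 == 0 or total % 2 != 0:
        return None
    target = total // 2
    m = (b - a) // 2
    # Start with the m biggest elements as the smaller part.
    small = set(range(b - m + 1, b + 1))
    current = sum(small)
    if current < target:
        return None  # even the biggest m elements are not enough
    while current > target:
        # Find x+1 in the smaller part with x in the bigger part, then swap them.
        x_plus_1 = min(v for v in small if v > a and v - 1 not in small)
        small.remove(x_plus_1)
        small.add(x_plus_1 - 1)
        current -= 1
    big = [v for v in range(a, b + 1) if v not in small]
    return sorted(small), big


print(split_by_swaps(1, 7))
print(split_by_swaps(10, 14))
```

Each loop iteration lowers the sum by exactly 1, so this is only meant for small inputs; a real solution would jump to the right configuration directly.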

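The condition (b-a)*(3b+a+2)/8>=(b-a+2)*(3a+b)/8 can be simplified: with d=b-a, the left side times 8 is d*(4a+3d+2) and the right side times 8 is (d+2)*(4a+d), so their difference is 2*d*d-8a, and the condition becomes d*d>=4a, or b>=a+2*sqrt(a). Thus our final answer looks like: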

either b-a+1 is divisible by 4, or
a+b is divisible by 4 and b>=a+2*sqrt(a).
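The criterion is easy to cross-check against a brute force on small ranges. The sketch below is only such a check, with function names chosen for illustration; it writes b>=a+2*sqrt(a) in the integer form (b-a)*(b-a)>=4*a to avoid floating point.

```python
def can_split_formula(a, b):
    """The criterion above, in integer arithmetic."""
    d = b - a
    if (d + 1) % 4 == 0:
        return True
    return (a + b) % 4 == 0 and d * d >= 4 * a


def can_split_bruteforce(a, b):
    """Subset-sum over {a, ..., b}, using a Python int as a bitset."""
    total = sum(range(a, b + 1))
    if total % 2 != 0:
        return False
    reachable = 1  # bit s is set when some subset has sum s
    for v in range(a, b + 1):
        reachable |= reachable << v
    return bool((reachable >> (total // 2)) & 1)


mismatches = [
    (a, b)
    for a in range(1, 41)
    for b in range(a, 41)
    if can_split_formula(a, b) != can_split_bruteforce(a, b)
]
print(mismatches)
```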

Thanks for reading, and check back next week (hopefully for the best problem of 2019 vote as well)!

Tuesday, January 21, 2020

A mathematics week

There were two rounds last week. TopCoder SRM 775 took place on Thursday (problems, results, top 5 on the left, analysis). Tourist has earned a commanding victory while having the fastest time on all three problems, which also meant that nobody could get the 5 points towards the TCO20 qualification. Well done :)

The main part of the hard problem was a nice puzzle that could well appear in a mathematics olympiad: for which pairs of positive integers a <= b can we split all integers from the set {a, a+1, a+2, ..., b-1, b} into two parts with equal sum?

Codeforces Round 614 followed on Sunday (problems, results, top 5 on the left, analysis). There was just one accepted solution for each of the two hardest problems, coming from Um_nik and tourist, who therefore occupied the first two places with a huge margin. Um_nik's problem was worth more points, and he therefore won the round. Congratulations!

Thanks for reading, and check back next week!

Monday, January 20, 2020

A matroid week

Two weeks ago, TopCoder SRM 774 started a new race for a TCO20 spot (problems, results, top 5 on the left). The 1000-pointer was tricky both algorithmically and from the implementation standpoint, causing a few resubmissions and a few failed systests for high-scoring solutions. As a result, the total scores were not that high and the importance of the challenge phase was amplified.

In my previous summary, I have mentioned a Codeforces problem: you are given a 20x20 grid colored as a chessboard, with the top-left corner colored black. Some of the cells are removed from the grid, the remaining cells form a 4-connected piece and include the top-left corner. You need to insert some walls between the remaining cells in such a way that we get a good labyrinth: there must be exactly one way to get from each cell to each other cell. Moreover, we want each black cell (remember the chessboard coloring) except the top-left cell to not be a dead-end in the labyrinth: each such cell must have at least two accessible neighbors.

When solving this problem during the round, I have made the correct first step: we want to find a spanning tree where the degree of each black cell is at least two, which is equivalent to finding a spanning forest where the degree of each black cell is at least two as we can always add more edges to get a tree.

But then I've tried to find some greedy approach that takes two edges for each black vertex without forming cycles, realized that it's not always possible, thought that if we want to add an edge that forms a cycle we need to remove some other edge from this cycle and choose another edge for its black vertex, and then somehow failed to notice that I'm just describing finding an alternating path for a matroid intersection problem :) The matroids in question are the cycle matroid and the matroid where independent sets have degree <=2 for each black vertex, and we need to check if the biggest independent set in their intersection has degree of exactly 2 for each black vertex.

In that summary, I have also described a solution to an AtCoder problem that felt like unexplained magic. Um_nik has brought a simpler quadratic solution to my attention: instead of starting from n scores of 1 and adding 1 to a suffix n-1 times, we can start from n scores of n and subtract 1 from a prefix any number of times! The reason we don't need to limit the number of operations to n-1 in this approach is that if the first problem has zero or negative score, then the constraint about the sum of the first k+1 numbers being greater than the sum of the last k numbers would be necessarily violated.

This means we can remove the number of operations dimension from our dynamic programming and it becomes quadratic. This is not directly equivalent to the magical solution, but at least it explains why there's a fertile ground for one.
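A possible way to turn this into code is sketched below, under a few assumptions that are not spelled out above: we count non-decreasing score sequences, k denotes (n-1)/2 rounded down, and subtracting 1 from a prefix of length p lowers the value "sum of the k+1 smallest minus sum of the k largest" by min(p,k+1)-max(0,p-(n-k)). That value starts at n and must stay positive, so we count multisets of prefixes with total cost at most n-1. The function names are illustrative, and the brute force is there only for cross-checking.

```python
from itertools import combinations_with_replacement


def count_good_bruteforce(n):
    """Count non-decreasing sequences in [1, n]^n that are good for every k."""
    count = 0
    for scores in combinations_with_replacement(range(1, n + 1), n):
        # Worst case for each k: the k+1 smallest versus the k largest.
        if all(sum(scores[:k + 1]) > sum(scores[n - k:]) for k in range(n)):
            count += 1
    return count


def count_good_prefix_dp(n, m):
    """Quadratic DP: start from all scores n, subtract 1 from prefixes."""
    k = (n - 1) // 2
    budget = n - 1  # total cost must stay below the starting value n
    ways = [0] * (budget + 1)
    ways[0] = 1 % m
    for p in range(1, n + 1):
        cost = min(p, k + 1) - max(0, p - (n - k))
        # Unbounded knapsack step: the prefix of length p can be used repeatedly.
        for w in range(cost, budget + 1):
            ways[w] = (ways[w] + ways[w - cost]) % m
    return sum(ways) % m


print([count_good_prefix_dp(n, 10**9 + 7) for n in range(1, 6)])
```

Every prefix has cost at least 1, which is why the number of operations never needs to be tracked separately.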

I have also promised to organize a poll about the best problem of 2019, but for that I need to review all my posts from last year and also the other excellent candidates you shared with me on Codeforces, so this will take some more time. Stay tuned :)

Thanks for reading, and check back for more!

Sunday, January 5, 2020

A red maxflow week

Codeforces ran two contests this week. Hello 2020, as the name suggests, was the first round of the year (problems, results, top 5 on the left, my screencast, analysis). Only four contestants could solve the hardest problem G, and only two of them also solved the remaining problems: mnbvmar and TLE. They had roughly the same speed as well, but mnbvmar only had two attempts that failed pretests compared to TLE's ten, and that's what made the difference. Congratulations to both!

I could solve the first five problems reasonably quickly, and I was quite excited about inventing the randomized solution to problem D and quickly recognizing that problem E is more or less equivalent to a very old problem about counting the number of 4-tuples of points that form a convex quadrilateral (I have a feeling that I wrote about it in this blog, but I seem to be unable to find the entry). However, the last two problems proved insurmountable for me, and I spent most of the time trying to get solutions that were clearly not the intended ones to work: max-flow on a graph of size n*log(n) in F (it turns out it was possible to succeed in this way — check out izban's solution as an example), and repeated randomized search in G. I guess the time might have been better spent just thinking on paper, but then the screencast would not be so exciting :)

On a related note, quite a few people have noticed that I've switched to C++ in the recent contests, and asked why. I don't have much to add to this comment by Egor. In the past I have tried switching to C++ a few times and noticed that I keep fighting with it during the contests instead of solving problems, and I do have a similar feeling now as well despite the better tools. However, I will try to keep using C++ for a longer time to see if things improve :)

Here is the hardest problem from this round for you to try as well: you are given a 20x20 grid colored as a chessboard, with the top-left corner colored black. Some of the cells are removed from the grid, the remaining cells form a 4-connected piece and include the top-left corner. You need to insert some walls between the remaining cells in such a way that we get a good labyrinth: there must be exactly one way to get from each cell to each other cell. Moreover, we want each black cell (remember the chessboard coloring) except the top-left cell to not be a dead-end in the labyrinth: each such cell must have at least two accessible neighbors.

Codeforces Round 612 followed a day later (problems, results, top 5 on the left). The sets of problems solved by the top contestants were very diverse (even though not visible in the top 5 screenshot, problem F was also solved by two contestants), but in the end ainta just solved more problems and won. Well done!

In my previous summary, I have mentioned a couple of problems. The first one came from AtCoder: an assignment of integer scores between 1 and n (not necessarily distinct) to n programming contest problems (n<=5000) is called good if for each k the total score of every set of k problems is strictly less than the total score of every set of k+1 problems. How many good assignments of scores exist, modulo the given prime m?

The first step to solve this problem is to notice that we can get rid of the "for each k" qualifier, replacing it with just k=(n-1)/2, rounded down. The reason for this is that in case the constraint is violated for smaller k, we can just add the same problems to both sides to reach k=(n-1)/2, and in case it is violated for larger k, the two sets necessarily have an intersection, so we can remove the intersection until we reach k=(n-1)/2 as well. Also, instead of "every set" we can say: the total score of k problems with the largest scores must be less than the total score of k+1 problems with the lowest scores.

To enforce that last constraint, it would be useful if our problem scores were sorted. This can be achieved with the following more or less standard process: we start with all problem scores equal to 1. Now we do the following n-1 times: add 1 to a suffix of problem scores (this suffix could also be empty or all problems).

Now we can keep track of how each such operation affects the value we're interested in: the sum of (n-1)/2+1 smallest elements minus the sum of (n-1)/2 largest elements (with (n-1)/2 rounded down). Going from the empty suffix to the full suffix, the operations change that value by 0,-1,-2,...,-(n-1)/2,-(n-1)/2 (if n is even),-((n-1)/2-1),...,-2,-1,0,1. Let's denote that multiset of changes as C. In the end we need the value to be positive, and it starts with 1 (when all problem scores are 1).

Our problem can then be restated as follows: consider all ways to choose n-1 values with replacement from the multiset C (the same value can be chosen as many times as we want). How many ways have a non-negative sum?
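This restatement can be sanity-checked on small n. The sketch below assumes that we count non-decreasing score sequences (which is what the suffix construction enumerates) and that (n-1)/2 is rounded down; all function names are made up for illustration.

```python
from itertools import combinations_with_replacement


def change_multiset(n):
    """Change caused by adding 1 to a suffix, indexed by suffix length 0..n."""
    k = (n - 1) // 2
    changes = []
    for suffix_len in range(n + 1):
        # How many of the k+1 smallest / k largest positions the suffix covers.
        in_smallest = max(0, suffix_len - (n - (k + 1)))
        in_largest = min(suffix_len, k)
        changes.append(in_smallest - in_largest)
    return changes


def count_via_changes(n):
    """Ways to choose n-1 suffixes (with repetition) with non-negative total."""
    changes = change_multiset(n)
    return sum(
        1
        for picks in combinations_with_replacement(range(n + 1), n - 1)
        if sum(changes[s] for s in picks) >= 0
    )


def count_directly(n):
    """Non-decreasing sequences where the k+1 smallest beat the k largest."""
    k = (n - 1) // 2
    return sum(
        1
        for scores in combinations_with_replacement(range(1, n + 1), n)
        if sum(scores[:k + 1]) > sum(scores[n - k:])
    )


print(change_multiset(5))
print([(count_via_changes(n), count_directly(n)) for n in range(1, 7)])
```

Both counters are exponential and only serve as a check of the reduction, not as a solution.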

Since the sum needs to be non-negative and the only possible positive change is 1, this yields an O(n³) dynamic programming solution that can be sped up to O(n²*log(n)) and get accepted by skipping the states from which we can't reach the final state: let's process the elements of C in decreasing order, and maintain dp_{i,j,k} as the number of ways to choose j values from the first i elements of C such that their sum is k. The answer is the sum of dp_{n+1,n-1,k} over all k>=0.

However, there exists a magical way to speed this up to O(n²). Let's rearrange our dynamic programming slightly: now we will process the elements of C in increasing order, and dp_{i,j,k} will now be the number of ways to choose j values from the first i elements of C such that their sum is -k. Also, we will stop before we process the only positive element 1 (because that would need special handling anyway as the sums stop being only negative), but also (!) before we process one of the two zeros that we have. In other words, we will only consider the first n-1 elements of C in increasing order.

How to compute the answer from the last row of this dynamic programming? Suppose we have computed dp_{n-1,j,k}. Now we need to add some number of 0s and some number of 1s, let's denote those as z and o respectively. We need to have j+z+o=n-1 to get n-1 total changes, and k<=o to get a non-negative sum. The solutions to these two constraints look like: o=k, z=n-1-j-k; o=k+1, z=n-1-j-k-1, and so on until z reaches 0. The number of solutions is thus max(0, n-j-k), so our answer is the sum of max(0, n-j-k)*dp_{n-1,j,k} over all j and k.

Here comes the magic: since max(0, n-j-k) only depends on j+k and not on j and k separately, and since our transitions just add one to j and the current element of C to k, and so the thing being added does not depend on the values of j and k themselves, we can collapse our dynamic programming states to keep track of j+k only, instead of j and k separately! This means we'll have O(n²) states and O(n²) complexity.
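The collapsed dynamic programming might look like the sketch below. It assumes (n-1)/2 is rounded down and that the elements of C are computed from the suffix lengths as described earlier; the array is indexed by t=j+k only, and choosing an element c of C adds 1-c to t. The function name is illustrative.

```python
def count_good_collapsed(n, m):
    """O(n^2) count via states indexed by t = j + k only."""
    half = (n - 1) // 2
    # Multiset C: change for each suffix length 0..n, in increasing order.
    changes = sorted(
        max(0, s - (n - (half + 1))) - min(s, half) for s in range(n + 1)
    )
    # Keep the first n-1 elements: all negative ones and a single zero.
    kept = changes[:n - 1]
    # States with t >= n have weight max(0, n - t) = 0, so we drop them.
    ways = [0] * n
    ways[0] = 1 % m
    for c in kept:
        step = 1 - c  # one more chosen value, and -c more negative sum
        for t in range(step, n):
            ways[t] = (ways[t] + ways[t - step]) % m
    return sum((n - t) * ways[t] for t in range(n)) % m


print([count_good_collapsed(n, 10**9 + 7) for n in range(1, 6)])
```

The inner loop runs in increasing order of t so that the same element of C can be chosen any number of times.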

What remains a mystery is: how does one come up with this magic? I guess one could just stumble upon it while trying different approaches. Maybe a more principled way is to use the approach from my old post: if we just implement the O(n³) dynamic programming which processes the elements of C in increasing order, and find out the contribution of each state to the final answer, we can notice that the contributions of states with the same value of j+k are the same and collapse them. Is there any other way that makes this observation look less magical?

The second problem I mentioned was from Codeforces: you are given n integers a_i such that i-n<=a_i<=i-1 (i goes from 1 to n), in other words one integer from [-(n-1),0], one from [-(n-2),1], ..., one from [0,n-1]. You need to find any nonempty subset with a zero sum. You are guaranteed that such a subset always exists, which is by itself quite a hint. n is up to a million.

The key idea here is to realize that keeping integers from different segments is a bit clumsy, so let's shift the segments: the first one by n-1, the second one by n-2, and so on. Now all integers are chosen from the segment [0, n-1] which is nice and symmetric, but instead of a zero-sum subset we need to find a subset where the sum of values equals the sum of shifts.

To restate, we have reduced our problem to the following: you are given n integers a_0, a_1, ..., a_{n-1}, each between 0 and n-1. You need to find a nonempty set S of indices such that Σ_{i∈S} i = Σ_{i∈S} a_i.

When I obtained this reduced problem during the round, it felt really familiar to me, so I've tried to google the answer without much success. It turns out that it's simply quite easy: build a graph from the arrows i->a_i, and just find any cycle in this graph. The indices corresponding to this cycle will satisfy the equality above since the sums will just be the same numbers but in different order.
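In code, the cycle search can be sketched as follows. This version works directly on the original 1-based input: undoing the shift, the arrow from i goes to i-a_i, which always lies in [1, n]. The function name and the choice to start walking from index 1 are assumptions made for illustration.

```python
def zero_sum_subset(a):
    """a[i-1] holds a_i with i-n <= a_i <= i-1; returns 1-based indices."""
    seen_at = {}  # index -> its position in the walk
    path = []
    i = 1
    # Follow the arrows i -> i - a_i until some index repeats.
    while i not in seen_at:
        seen_at[i] = len(path)
        path.append(i)
        i = i - a[i - 1]
    # The part of the walk from the first visit of i onwards is a cycle.
    return path[seen_at[i]:]


cycle = zero_sum_subset([-3, 1, 1, 1])
print(cycle, sum([-3, 1, 1, 1][i - 1] for i in cycle))
```

Along the cycle the values i-a_i are a permutation of the indices i themselves, so the a_i on the cycle sum to zero.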

I will be doing a poll for the best problem of 2019 soon, and I will mostly be picking the candidates from the problems I explained in this blog. However, I realize that there were many great problems in 2019 that I just did not encounter, so if there is a problem you feel should be included in the shortlist that was in a contest that I did not participate in, please mention it in comments! Feel free to also post links to similar discussions, such as this post.

Thanks for reading, and see you next week!

Friday, January 3, 2020

A mobile-first week

Last week wrapped up the competitive 2019 with two rounds. AtCoder Grand Contest 041 took place on Saturday (problems, results, top 5 on the left, analysis). mnbvmar, ecnerwal and Um_nik in the first three places each solved a different set of problems, and mnbvmar's set was the best one. In the last minute he also tried to submit a solution for E that just tries random decisions until time runs out, but that did not fly. Nevertheless, congratulations on the win!

I was writing this round on the go, and around the middle of the round my laptop shut down because of low battery, which was very exciting as you might guess :) After I've turned it back on it continued to work for a long time somehow, suggesting that the problem lies with the battery level detection. A few minutes before the end of the round it shut down again, so I've tried to fix my solution to problem A from my phone. Unfortunately I was a few seconds too slow, otherwise it would have been a nice achievement :)

Problem D in this round had an awesome intended solution, even though most contestants managed to squeeze in a more boring one: an assignment of integer scores between 1 and n (not necessarily distinct) to n programming contest problems (n<=5000) is called good if for each k the total score of every set of k problems is strictly less than the total score of every set of k+1 problems. How many good assignments of scores exist, modulo the given prime m?

This round has concluded the selection of 8 AtCoder World Tour finalists who will come to Japan in February for the onsite finals (results, top 8 on the right). With this win, mnbvmar has jumped onto a departing train (there is such an idiom in Russian, вскочить на подножку уходящего поезда, but I'm not sure if there is a direct English equivalent or a different idiom with the same meaning; is there?) and overtook eatmore by just 6 points. See you all in Japan!

Codeforces Good Bye 2019 followed on Sunday (problems, results, top 5 on the left, my screencast, analysis). Not without help from a notorious coincidence and a bad day for tourist, Radewoosh solved everything, won the round, ended 2019 as the top rated, and was still a bit salty. Still, congratulations!

Problem G generated some conflicting opinions, but I have enjoyed solving it quite a bit: you are given n integers a_i such that i-n<=a_i<=i-1 (i goes from 1 to n), in other words one integer from [-(n-1),0], one from [-(n-2),1], ..., one from [0,n-1]. You need to find any nonempty subset with a zero sum. You are guaranteed that such a subset always exists, which is by itself quite a hint. n is up to a million.

Thanks for reading, and see you in the present!