Algorithms Weekly by Petr Mitrichev: August 2015

Monday, August 31, 2015

A week with SSE

TopCoder SRM 666 happened very early on Wednesday (problems, results, top 5 on the left). Zxqfl won this round despite tough competition that included two coders with "target" rating yeputons and Endagorion - congratulations!

The easy problem in this round was a nice graph theory exercise. You're given a tree and one vertex of the tree is selected. What's the maximum number of different vertices a walk of length L starting from the given vertex can visit?

Codeforces Round 318 ("RussianCodeCup Thanks-Round") took place on Saturday (problems, results, top 5 on the left). Marcin_smu has claimed the first place thanks to a solution to problem E that managed to pass the system tests - congratulations! In the post-match discussion it turned out that both passing solutions for problem E were not intended to pass: Marcin's solution gave a wrong answer on some inputs, and the only other passing solution from logicmachine relied on SIMD CPU instructions to speed up a O(n^2) solution, while the intended complexity was O(n*sqrt(n)).

While both issues could probably be avoided by investing more effort into the round preparation, I think they actually highlight one of the main good aspects of competitive algorithmic programming: it's extremely objective. Every solution is judged against the same testcases, under the same time and memory limits, and completely automatically.

Thanks for reading, and check back next week!

Sunday, August 30, 2015

A week with 42

On Saturday morning, TopCoder Open 2015 Round 2D gave everybody the last chance to get into the top 40 and advance to Round 3 (problems, results, top 5 on the left, parallel round results). Of course, getting into top 1 is still better than getting into top 40, so congratulations to tomerun on winning this round!

Here are the statistics on Round 3 advancers by country:

Russian Federation - 42: Kankuro, kuniavski, pashka, Endagorion, qwerty787788, 2rf, niyaznigmatul, VArtem, darnley, yeputons, eatmore, SergeiFedorov, Egor, Baz93, Petr, hmich, meshanya,LoRd_TaPaKaH, andrewzta, Belonogov, Jedi_Knight, Burunduk1, ilyakor, mikhailOK, RiaDWaW, homo_sapiens, knightL, Michael_Levin, Um_nik, KalininN, Dlougach, Merkurev, Akhi_Anton,HellKitsune, Zlobober, Enot, UdH-WiNGeR, malcolm734, Shapo, _aid, Nikita.Podguzov, Gassa
Japan - 24: uwi, ir5, hogloid, chokudai, Komaki, anta, snuke, stqn, __math, zerokugi, sky58, kawatea, semiexp, asi1024, Lepton_s, ynasu, j_gui0121, tomerun, tozangezan, LayCurse, sugim48, kusano,DEGwer, wo[]
China - 22: Blue.Mary, xhm1994, Sevenkplus, jiry_2, xyz111, ACRush, cgy4ever, zhj1997, zyxwvu164, nhzp339, liympanda, sky_love_high, OIerRobbin, wenhanhuang, jiangyouno1, Meriones,UESTC_elfness, panjf1987, matthew99a, justever86, Shizhouxing, Hillboy
Ukraine - 10: RAVEman, Vasyl[alphacom], dzhulgakov, K.A.D.R, maksay, Gluk, vlad89, LeBron, sdya, Deretin
United States - 10: pieguy, theycallhimtom, lg5293, iridescent, scott_wu, aceofdiamonds, xiaowuc1, SourSpinach, Gaytrodex, ecnerwal
South Korea - 6: Kriii, ainu7, RRx, kcm1700, sukh1222, xhae
Belarus - 5: tourist, ivan_metelsky, forest, Arterm, subscriber
Poland - 5: Eryx, Psyho, pparys, Marcin_smu, mnbvmar
Croatia - 4: lovro, IvL, Zuza, ikatanic
Taiwan - 3: dreamoon, ShikXD, tmt514
South Africa - 2: bmerry, HiltonLange
United Kingdom - 2: al13n, PaulJefferys
Netherlands - 2: krijgertje, davidv1992
Czech Republic - 2: fhlasek, cibulka
Bulgaria - 3: espr1t, danch0, anrieff
Slovakia - 2: baklazan4247, Xellos0
Latvia - 2: eduardische, Alex_2oo8
Brazil - 2: ffao, rafaelhirama
Hungary - 1: ecsirke
Australia - 1: John Dethridge
Georgia - 1: nika
Finland - 1: Sisu
Switzerland - 1: W4yneb0t
Romania - 1: klamathix
Sweden - 1: Gullesnuffs
Bangladesh - 1: shakilaust
Indonesia - 1: handojo1
Mexico - 1: macs
Norway - 1: Im2Good
Iran - 1: EbTech
Canada - 1: zxqfl
Singapore - 1: eugenesh

Codeforces Round 317 ("AimFund Thanks-Round") took place on Saturday evening (problems, results, top 5 on the left). Subscriber has continued his excellent recent form with another victory - great job!

Problem C was a very nice exercise that involved careful reduction to a graph problem, and the observation that the graph problem in question is actually not difficult at all :) The problem statement was: you're given a boolean expression as an AND of several OR-clauses, where each OR-clause is an OR of several terms, and each term is either a variable, or a negation of a variable, and you need to check if the expression evaluates to true for some variable values. Sounds familiar? Well, the familiar problem is NP-complete, whereas this one had one additional restriction: each variable appears at most twice in the formula (including appearances in negated form).

In order for the formula to evaluate to true, each OR-clause has to evaluate to true, meaning that at least one term in each OR-clause has to evaluate to true. If a variable has just one appearance, or if both its appearances are the same (both negated or both not negated), then we can just set its value to make all OR-clauses containing this variable evaluate to true, and forget about those OR-clauses. In case one of the variables has only one appearance in the remaining OR-clauses, we can also set its value, and so on. After this process ends, we're left with some OR-clauses and variables that appear exactly twice in them, once in normal form, and once in negated form.

Here's where graphs come into play: let's make the OR-clauses the vertices of an undirected graph, and connect two OR-clauses with an edge if they share a variable. The problem now asks: is there a way to pick a direction for each edge in this graph in such a way that every vertex has at least one incoming edge?

Had the original problem been formulated in graph terms, the number of solutions would probably be much higher. The remaining graph problem is not difficult, can you see its solution?

In last week's summary, I've promised to explain the solution to Merlin QA, so let me recall the problem statement first. You are given 100 8-dimensional vectors, and you're adding them in one of 100! possible orders, but with a small twist: every time one of the coordinates of the sum is negative, it becomes zero. Here's a 2-dimensional 2-vector example: if you have vectors (5, -2) and (-3, 1), then adding them in the given order yields (0, 0) -> (5, -2) -> (5, 0) -> (2, 1), while adding them in the reverse order yields (0, 0) -> (-3, 1) -> (0, 1) -> (5, -1) -> (5, 0). Your goal is to pick the order of addition that maximizes the sum of coordinates of the resulting vector.

The key idea is: let's consider a slightly different problem! Instead of making each coordinate equal to zero whenever it becomes negative, let's say that we can change any coordinate to zero at any time.

But is this problem really that much different? It's not hard to see that the answers to those problems will always be the same. On one side, setting coordinates to zero whenever they become negative is included in setting them to zero at any desired time, so the answer to the first problem is less than or equal to the answer to the second problem. On the other hand, if we ever reset a positive coordinate to zero or skip resetting a negative coordinate to zero, the answer will not increase, and thus the answer to the second problem is exactly equal to the answer to the first problem!

The newly formulated equivalent problem is much easier to solve, because there's nothing "conditional" happening. For each vector, its contribution to the answer is equal to the sum of its coordinates that will not be reset to 0 after this vector is added. If we consider the last time each coordinate is reset to 0, and reorder the coordinates in decreasing order of that time, then each vector will always contribute a sum of its first k coordinates for some k. Of course, we should now pick k greedily to maximize the contribution. Since we don't know the last time each coordinate is reset to 0, we should consider all 8! possible permutations of coordinates, and apply the above greedy solution to each of them.

Thanks for reading, and check back soon for this week's summary!

Tuesday, August 18, 2015

A week with Merlin

Codeforces Round 315 gave this week a great start on Monday (problems, results, top 5 on the left). Congatulations to Nikolay on his second Codeforces victory!

Problem D was a really nice opportunity for everybody to demonstrate their creativity. You were given 100000 lines on a plane, and needed to check if it's possible to pick 5 points such that every given line passes through at least one chosen point.

Even finding all pairwise intersections of those lines would not fit into the time limit, so we have to somehow reduce the search space. There are two main approaches to do it: randomized and deterministic. The main idea of the randomized approach is: if we just take two random lines from our set, how likely is it that they intersect in one of the 5 sought points? The main idea of the deterministic approach is: how many lines do we actually need out of 100000 to find one of the sought cover points or to determine that it's impossible to cover them with 5 points?

By the way (you might already know this, of course) for most programming contest problems you can submit your solutions after the contest ends, just for fun and to check your ideas - and that includes the problems I mention in this post, so feel free to take a stab at them without the contest pressure. Solving the problems after the contest ends is usually called "upsolving" in the Russian community, but the word doesn't feel right to me. How would you call it?

Google Code Jam 2015 Onsite Finals in Seattle was the main event of the week (problems, results, top 5 on the left, live stream recording). Gennady has continued his unbelievable victory streak well into the second year running - amazing talent and consistency! Bruce Merry has solved the same amount of problems including the most difficult one, but two of his solutions turned out to contain small bugs. Both of those problems, somewhat unusually for Code Jam, had large inputs that were substantially different from the small inputs, and thus one could not use the small's instant feedback to verify the solution for the large. That has cost Bruce dearly.

I was the author of problem A "Costly Binary Search", but I think problem E "Merlin QA" was the most beautiful in this round. Without the background story (which was very nice by iself), here's what the task boiled down to. You are given 100 8-dimensional vectors, and you're adding them in one of 100! possible orders, but with a small twist: every time one of the coordinates of the sum is negative, it becomes zero. Here's a 2-dimensional 2-vector example: if you have vectors (5, -2) and (-3, 1), then adding them in the given order yields (0, 0) -> (5, -2) -> (5, 0) -> (2, 1), while adding them in the reverse order yields (0, 0) -> (-3, 1) -> (0, 1) -> (5, -1) -> (5, 0). Your goal is to pick the order of addition that maximizes the sum of coordinates of the resulting vector.

I encourage you to try solving this problem by yourself for now, as the main idea is so beautiful, and I'll come back with the solution next week.

Google Distributed Code Jam 2015 Onsite Finals took place in Seattle one day later (problems, results, top 5 on the left). Once again had Bruce failed two of his large inputs, but this time he was so far ahead of the rest of the field that it hardly mattered. In fact, he could've failed one more large input and still win the contest!

Thanks for reading, and see you next week.

Sunday, August 16, 2015

A week with zero as a divisor

Yandex.Algorithm final round took place online last week (problems in Russian, results, top 5 on the left). Gennady has won the round thanks to a very important quality: he didn't make any bugs in his solutions :) Congratulations!

It was nicely pointed out that I fail to possess this quality for the third year in a row in exactly the same way: by having a bug in problem C.

In 2013, I've submitted C in a hurry without any testing, and it was not a big surprise that it failed. Here's the diff between my submission and the one that passes - as you can see, it's essentially one more corner case not covered by the original submissions:

+ int extraDelta;
...
+ extraDelta = 0;

...

+ if (totalP + 2 + 4 * (n - total) == p && oldTotalP - 2 + 4 * (n - total + 1) == p && total >= 2) {

+ have[id] = false;

+ have[99 * MAX + 100] = true;

+ totalP += 2;

+ extra = n - total;

+ extraDelta = 1;

+ printBoundary();

+ return true;

+ }

...

- return (r == c && 100-r <= extra);

+ return (r + extraDelta == c && 100-c <= extra);

In 2014, the bug was very stupid, and the best recipe for avoiding its repetition seemed to be simply not making such stupid bugs :) Of course, testing would help tremendously, but the contest format almost rules that out.

- if (hexamino[r][c] == 1) {

+ if (hexamino[r][c] == sides[1][1][0]) {

This year, the bug is equally stupid, and the fix is equally short:
+ for (int i = 0; i < n; ++i) if (pr[i] == 0) pr[i] = i;

Here's the code snippet where that line makes a difference - its goal is to find any prime divisor for all numbers up to n:
int n = (int) 1e6;
int[] pr = new int[n + 1];
Arrays.fill(pr, 0);
for (int i = 2; i * i <= n; ++i)
if (pr[i] == 0) {
pr[i] = i;
for (int j = i * i; j <= n; j += i) pr[j] = i;
}

A very typical question about programming contests goes like this: do the skills you've learned help you in your job as a software engineer? However, my very ugly snippet above begs for the opposite question: do the skills learned on your job as a software engineer help achieve better results in algorithmic programming competitions?

Here are some ways I could've avoided this bug (and thus would've walked away with some more cash), and I think all of them can be learned in a software engineering job:

Hardcoding n=1e6 is a bad idea. Had I not done that, I would've noticed the bug on the sample input where n=11 and prime number 11 is actually used in the output. I did have the array up to the value of n from the input at one point, but then changed to the fixed 1e6 boundary to make sure a possible timeout shows immediately on the first testcase.
Using two different types of special values (pr[i] == 0 and pr[i] == i) to denote the same thing is a bad idea. I should've just filled with pr[i] == i initially.
The previous issue has appeared because I've initially had the code written with boolean array "pr", then converted it to integer array since I needed any divisor in the solution below. Had I thought the solution through before starting coding, this wouldn't have happened.

But why did all those bugs still happen, despite the 8 year software engineering experience? It seems to be a case of over-confidence to me. I was so confident that I could write the solution for this easy problem quickly, that I didn't think it through, and I've made quick patches without thinking about the consequences. Hopefully this is a lesson learned, both for the next year and for the actual software engineering work!

IOI 2015 took place in Almaty one week earlier (problems, results, top 5 on the left). Congratulations to Jeehak Yoon on the full score and the victory, and to Mikhail Ipatov on the close pursuit! This year's problemset consisted of 4 relatively normal format problems, and two somewhat unusual ones.

The first unusual problem, scales, asked to come up with a strategy to sort 6 coins using a very weird comparison that takes 3 or even 4 coins as inputs and has three possible results of each comparison (see the problem statement for more details). Since the number of coins was extremely small - just 6 - solving this problem boiled down to manually experimenting on your own machine with various comparisons and sets of outcomes that appear, as I understand.

The second unusual problem, towns, challenged one with learning some properties about a hidden tree using a small amount of requests of distances between tree's leaves. This problem also requires one to come up with an algorithm, and the unusual side is the fact that you get to pick which part of input you want to see.

I love that the IOI continues to pick such unusual problems, and hope that they'll continue to do so!

Thanks for reading, and check back soon for this week's summary.

Wednesday, August 5, 2015

A week that's never sorry

The problems and results of last week's VK Cup 2015 have been published after the online mirror took place on Thursday (problems, results, top 5 on the left, online mirror results with shuffled problems). Congratulations to Team Never Sorry who have squeezed out the first place by being the only team to solve problem C, taking advantage of the dynamic scoring system. The problem in question wasn't extremely difficult - it required one nice idea that gives most of the solution and then a lot of careful logical reasoning to figure out all corner cases: consider a tree, and for each vertex find the set of vertices at distance at most 2 from it. You're given those n sets in random order (so you don't know which vertex each set corresponds to), and you need to reconstruct the orignial tree. The size of the tree is at most 1000.

TopCoder SRM 664 wrapped up this week on Saturday with three mathy problems (problems, results, top 5 on the left, my screencast). The easiest one went like this: we have two numbers, x and y, and repeatedly double the smaller number, subtracting it from the larger number at the same time, so that the sum stays the same. For example, if we start from (1, 6), then we get (2, 5), then (4, 3), then (1, 6) again and so on. Which two number are we going to get after two billion steps? x and y are up to 1 billion.

Many people have noticed that several examples all lead to a periodic sequence quickly, and implemented general period-finding algorithms. However, it turns out that the period can be rather large, and we got an eventful challenge phase.

In fact, it's an interesting (and not very difficult) question by itself: what is the largest period length under the given constraints?

Thanks for reading, and check back next week!