This week had the now traditional last contest of the year: Good Bye 2017 at Codeforces (problems, results, top 5 on the left, analysis, my screencast). The hardest problem H, after removing some layers, ended up being about finding a proper vertex coloring with the smallest number of colors for a graph with n<=23 vertices. My first thought was: surely for such a classical problem the best approaches are well-known? However, it turned out that the algorithms I could remember were an O(3^n) subset dynamic programming and various heuristics to speed up backtracking, both of which seemed too slow. Thanks to this paper I have now learned how to do this in O(n*2^n), see point 4.2 "Inclusion-exclusion". Always nice to discover better approaches to classical problems!
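Here is a minimal sketch of that inclusion-exclusion approach in Python (the function name and the adjacency-bitmask representation are my own choices, and this is meant to be illustrative rather than contest-fast; Python's big integers conveniently sidestep the overflow that a C++ solution would have to handle, for example with modular arithmetic):

```python
# A sketch of the O(n*2^n) inclusion-exclusion chromatic number computation
# (point 4.2 of the paper); adj[v] is assumed to be a bitmask of the neighbours of v.

def chromatic_number(n, adj):
    # ind[S] = number of independent sets (including the empty one) contained in S
    ind = [0] * (1 << n)
    ind[0] = 1
    for S in range(1, 1 << n):
        v = (S & -S).bit_length() - 1                        # lowest vertex of S
        # either v is excluded, or v is taken and its neighbours are excluded
        ind[S] = ind[S & ~(1 << v)] + ind[S & ~((1 << v) | adj[v])]
    for k in range(1, n + 1):
        # number of ordered k-tuples of independent sets whose union is all vertices;
        # it is positive exactly when the graph is k-colorable
        total = sum((-1 if (n - bin(S).count("1")) % 2 else 1) * ind[S] ** k
                    for S in range(1 << n))
        if total > 0:
            return k
    return n

print(chromatic_number(3, [0b110, 0b101, 0b011]))            # a triangle -> 3
```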
In my previous summary, I have mentioned a TopCoder problem: you are given n=2000000 tasks, to be done in m=2000 days, at most one task per day (so most of the tasks will end up not being done). For each task you know three numbers: the segment of days when it can be done (from l_i to r_i), and the profit c_i from doing it. You need to pick the task to be done on each day in such a way that the total profit is maximized.
This can be naturally viewed as a weighted maximum matching problem, which, thanks to the transversal matroid structure, can be solved greedily: try to add tasks to the matching in decreasing order of profit, looking for an augmenting path each time, and only reset the "visited" state of the vertices when the matching is actually increased (this is called "Kuhn's algorithm" in the Russian-speaking community, but I couldn't find an authoritative link for it). Since the matching is increased at most m times, we process each edge of the graph at most m times. The problem is that we have O(n*m) edges, so the total running time seems to be O(n*m^2). However, we can notice that for each task that we end up not adding to the matching we process its edges only once (since we will never visit it in subsequent iterations), so the number of edges that are actually revisited is O(m^2), and the running time is only O(n*m + m^3), which is of course still too slow.
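Before any speedups, the matching step itself looks roughly like this (a Python sketch with hypothetical names; the inner loop still enumerates all O(m) days of a task, which is exactly what the trick below removes):

```python
# A rough sketch of the greedy matching: tasks in decreasing order of profit,
# Kuhn's augmenting-path search from each, and the "visited" marks on days
# reset only when the matching actually grows.
import sys
sys.setrecursionlimit(10000)         # augmenting paths can be up to m tasks deep

def max_total_profit(m, tasks):
    # tasks: list of (c, l, r) -- profit c, allowed days l..r (1-based, inclusive)
    order = sorted(range(len(tasks)), key=lambda i: -tasks[i][0])
    task_of = [None] * (m + 1)       # day -> index of the task assigned to it
    visited = [False] * (m + 1)

    def augment(i):
        _, l, r = tasks[i]
        for day in range(l, r + 1):  # O(m) edges per task -- the bottleneck
            if not visited[day]:
                visited[day] = True
                if task_of[day] is None or augment(task_of[day]):
                    task_of[day] = i
                    return True
        return False

    total = 0
    for i in order:
        if augment(i):
            total += tasks[i][0]
            visited = [False] * (m + 1)   # reset only after the matching increases
    return total

print(max_total_profit(2, [(10, 1, 1), (7, 1, 2), (5, 1, 1)]))  # -> 17
```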
But now we can use another relatively standard trick: let's reduce the number of "active" edges from O(m^2) to O(m*log m) in the following manner: instead of a matching problem we'll now have a flow problem, and we'll add auxiliary vertices between the left and right parts. The vertices will correspond to segments. First, we'll pick the middle point b=m/2, and add vertices corresponding to segments [1,b], [2,b], ..., [b-1,b], [b,b], together with infinite-capacity edges in this manner: [1,b]->[2,b]->...->[b,b]; [1,b]->1; [2,b]->2; ..., [b,b]->b. The idea is that if we add an edge from some vertex in the left part to a segment [a,b], then the flow can go on to any vertex in the right part between a and b using those infinite edges, and only to those vertices. We handle the other half in a similar way: [b+1,b+1]<-[b+1,b+2]<-...<-[b+1,m-1]<-[b+1,m]. These two sets of segments already allow us to handle any task that can be done on both day b and day b+1 using only two edges: to [l_i,b] and to [b+1,r_i].
We can then apply the same procedure recursively to the segments [1,b] and [b+1,m]: for example, we'll pick the middle point c=b/2, and build the auxiliary vertices [1,c]->[2,c]->...->[c,c] and [c+1,c+1]<-[c+1,c+2]<-...<-[c+1,b], and so on. Overall we'll have log m sets of m auxiliary vertices each (one per level of recursion), with two outgoing edges for each vertex, and every task can now be handled with at most two edges to auxiliary vertices instead of O(m) edges to the right part. This approach resembles the centroid decomposition of a tree applied to a chain; we could also have used a sparse table to achieve the same result.
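To make the construction concrete, here is a Python sketch (helper names are mine, and the actual flow augmentation on top of this graph is omitted) that builds the auxiliary vertices level by level and finds, for a task with segment [l, r], the at most two vertices it should be connected to:

```python
# At every recursion level each day d inside the current range [lo, hi] gets one
# auxiliary node: segment [d, b] in the left half or [b+1, d] in the right half,
# chained towards the split point b.

def build_aux(m):
    nodes = {}      # (level, day) -> auxiliary node id; the days themselves are 1..m
    edges = []      # "infinite"-capacity edges of the auxiliary structure
    next_id = m + 1

    def rec(lo, hi, level):
        nonlocal next_id
        if lo >= hi:
            return
        b = (lo + hi) // 2
        prev = None
        for d in range(b, lo - 1, -1):   # left half: [d, b] -> day d, [d, b] -> [d+1, b]
            nodes[(level, d)] = next_id
            edges.append((next_id, d))
            if prev is not None:
                edges.append((next_id, prev))
            prev, next_id = next_id, next_id + 1
        prev = None
        for d in range(b + 1, hi + 1):   # right half: [b+1, d] -> day d, [b+1, d] -> [b+1, d-1]
            nodes[(level, d)] = next_id
            edges.append((next_id, d))
            if prev is not None:
                edges.append((next_id, prev))
            prev, next_id = next_id, next_id + 1
        rec(lo, b, level + 1)
        rec(b + 1, hi, level + 1)

    rec(1, m, 0)
    return nodes, edges

def task_endpoints(l, r, m, nodes):
    # the at most two vertices a task with segment [l, r] should be connected to
    if l == r:
        return [l]                       # a one-day segment goes straight to the day
    lo, hi, level = 1, m, 0
    while True:
        b = (lo + hi) // 2
        if r <= b:
            hi = b                       # the whole segment is in the left half
        elif l > b:
            lo = b + 1                   # the whole segment is in the right half
        else:
            return [nodes[(level, l)], nodes[(level, r)]]   # it crosses the split
        level += 1

nodes, edges = build_aux(8)
print(task_endpoints(2, 7, 8, nodes))    # two auxiliary vertices, covering [2,4] and [5,7]
```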
Now our graph has O(m*log m) active edges, plus O(n) edges that are visited only once, so the total running time is O(n + m^2*log m), which is fast enough.
That's it for the contests of 2017 — thanks for reading, and check back tomorrow (hopefully) for the best problems of 2017!