The first step is to notice that all edges are of three types: type 2 are the edges where both ends have the same color as the edge, type 1 are the edges where one end has the same color as the edge, and type 0 are the edges where no end has the same color as the edge. We will also direct the type 1 edges from the vertex of the correct color to the vertex of the wrong color.
Clearly type 2 edges are the most useful for reaching our goal. In fact, if every vertex has at least one type 2 edge adjacent to it, then we can simply find the spanning forest among the type 2 edges. If it is a spanning tree, then we have solved the problem. If it is not connected, since we have already satisfied the color constraints, we can augment the forest to a tree using the remaining edges in any way.
What if there are some vertices that do not have type 2 edges adjacent to them? If there is a vertex with no outgoing type 1 edges, or in other words a vertex where all adjacent edges have the wrong color, then there is clearly no solution. So we only have to consider the case where some vertices have type 2 edges adjacent to them, and all remaining vertices have at least 1 outgoing type 1 edge. We will call the vertices with type 2 edges adjacent to them type 2 vertices, and the ones without them type 1 vertices.
The key remaining observation is that we cannot solve the problem using only type 1 edges. Each type 1 edge satisfies the color constraint for 1 vertex, but overall we have n vertices but only n-1 edges. So a logical idea is to start with the spanning forest of the type 2 edges that will cover all type 2 vertices, and then repeatedly try to find a type 1 edge that goes from a yet uncovered vertex to an already covered vertex (building something similar to an arborescence). If we manage to cover all vertices this way, then we can once again augment the resulting forest to a tree using the remaining edges. If we get stuck at some point, then there is no solution.
But why is this correct? We have proven that we need at least one type 2 edge, but why is it always correct to take ~all of them? This is a very typical situation for problems involving the spanning trees, as they all usually end up relying on some kind of greedy argument, so one could just bet on intuition here in my opinion. Still, here is the formal proof: consider the situation where our solution gets stuck. At this moment, the set S of uncoverted vertices contains only type 1 vertices, and there is no type 1 edge going from this set to the set of covered vertices. But then how could S be covered in any other solution? Since we have at most |S|-1 edges within S, and only edges within S can cover its vertices, one of them will have to remain uncovered in any tree.
Thanks for reading, and check back next week!