LeetFree - Leaked Interview questions from Google Facebook Amazon Microsoft LinkedIn


Given a collection of intervals, merge all overlapping intervals.

For example,
Given [1,3],[2,6],[8,10],[15,18],
return [1,6],[8,10],[15,18].

Approach #1 Connected Components [Time Limit Exceeded]
Approach #2 Sorting [Accepted]

Approach #1 Connected Components [Time Limit Exceeded]

Intuition

If we draw a graph (with intervals as nodes) that contains undirected edges between all pairs of intervals that overlap, then all intervals in each connected component of the graph can be merged into a single interval.

Algorithm

With the above intuition in mind, we can represent the graph as an adjacency list, inserting directed edges in both directions to simulate undirected edges. Then, to determine which connected component each node is in, we perform graph traversals from arbitrary unvisited nodes until all nodes have been visited. To do this efficiently, we store visited nodes in a Set, allowing for constant-time containment checks and insertion. Finally, we consider each connected component, merging all of its intervals by constructing a new Interval with start equal to the minimum start among them and end equal to the maximum end.

This algorithm is correct because it is essentially the brute-force solution: we compare every interval to every other interval, so we know exactly which pairs overlap. The connected component search is needed because two intervals may not overlap directly, but might overlap indirectly via a third interval that overlaps both. The example below shows this more clearly.
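The steps above can be sketched in Python as follows. This is a minimal illustration, not the article's own code: it assumes each interval is a two-element `[start, end]` list rather than an `Interval` object, treats intervals that share an endpoint as overlapping, and uses the hypothetical names `overlaps` and `merge_by_components`.

```python
from collections import defaultdict


def overlaps(a, b):
    """Closed intervals overlap when each one starts no later than the other ends."""
    return a[0] <= b[1] and b[0] <= a[1]


def merge_by_components(intervals):
    n = len(intervals)

    # Build the adjacency list by comparing every pair of intervals.
    # Nodes are indices, so duplicate intervals are handled correctly.
    graph = defaultdict(list)
    for i in range(n):
        for j in range(i + 1, n):
            if overlaps(intervals[i], intervals[j]):
                graph[i].append(j)  # edge i -> j
                graph[j].append(i)  # edge j -> i simulates an undirected edge

    visited = set()
    merged = []
    for start in range(n):
        if start in visited:
            continue
        # Iterative depth-first traversal of one connected component,
        # tracking the minimum start and maximum end seen so far.
        visited.add(start)
        stack = [start]
        lo, hi = intervals[start]
        while stack:
            node = stack.pop()
            lo = min(lo, intervals[node][0])
            hi = max(hi, intervals[node][1])
            for neighbor in graph[node]:
                if neighbor not in visited:
                    visited.add(neighbor)
                    stack.append(neighbor)
        merged.append([lo, hi])

    # Sorting the result only makes the output order deterministic.
    return sorted(merged)
```

Calling `merge_by_components([[1, 3], [2, 6], [8, 10], [15, 18]])` on the example from the problem statement yields `[[1, 6], [8, 10], [15, 18]]`.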

Components Example

Although (1, 5) and (6, 10) do not directly overlap, either would overlap with the other if first merged with (4, 7). There are two connected components, so if we merge their nodes, we expect to get the following two merged intervals:

(1, 10), (15, 20)
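The indirect overlap can be checked by hand with a few lines of Python. This is only a sketch over the three intervals named in the text; the helper `overlaps` is a hypothetical name, and intervals are assumed to be closed `(start, end)` tuples.

```python
def overlaps(a, b):
    """Closed intervals overlap when each one starts no later than the other ends."""
    return a[0] <= b[1] and b[0] <= a[1]


a, b, c = (1, 5), (6, 10), (4, 7)

# (1, 5) and (6, 10) share no point, so there is no direct edge between them.
direct = overlaps(a, b)

# Both overlap (4, 7), so all three nodes land in the same connected component.
linked_via_c = overlaps(a, c) and overlaps(c, b)

# Merging a component takes the minimum start and the maximum end.
component = [a, b, c]
merged = (min(s for s, _ in component), max(e for _, e in component))
```

Here `direct` is `False`, `linked_via_c` is `True`, and `merged` is `(1, 10)`, matching the first merged interval above.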


Complexity Analysis

Time complexity : $O(n^2)$

Building the graph costs $O(V + E) = O(V) + O(E) = O(n) + O(n^2) = O(n^2)$ time, as in the worst case all intervals are mutually overlapping. Traversing the graph has the same cost (although it might appear higher at first) because our visited set guarantees that each node will be visited exactly once. Finally, because each node is part of exactly one component, the merge step costs $O(V) = O(n)$ time. This all adds up as follows:

$O(n^2) + O(n^2) + O(n) = O(n^2)$

Space complexity : $O(n^2)$

As previously mentioned, in the worst case, all intervals are mutually overlapping, so there will be an edge for every pair of intervals. Therefore, the memory footprint is quadratic in the input size.
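To make the worst case concrete, here is a short count, assuming (as in the algorithm description) that the adjacency list stores every undirected edge once in each direction. When all $n$ intervals overlap one another, the graph is complete, so:

$
\begin{aligned}
|E| &= \binom{n}{2} = \frac{n(n-1)}{2} \\
\text{adjacency-list entries} &= 2|E| = n(n-1) = O(n^2)
\end{aligned}
$

For example, $n = 1000$ mutually overlapping intervals would produce $499500$ edges and $999000$ adjacency-list entries.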

Approach #2 Sorting [Accepted]

Intuition

If we sort the intervals by their start value, then each set of intervals that can be merged will appear as a contiguous "run" in the sorted list.

Algorithm

First, we sort the list as described. Then, we insert the first interval into our merged list and consider each remaining interval in turn as follows: If the current interval begins after the previous interval ends, then they do not overlap and we can append the current interval to merged. Otherwise, they do overlap, and we merge them by updating the end of the previous interval if it is less than the end of the current interval.
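A minimal Python sketch of this procedure follows. It assumes intervals are two-element `[start, end]` lists rather than `Interval` objects, and the function name `merge` is chosen here for illustration. It sorts a copy, so the caller's list is left untouched.

```python
def merge(intervals):
    merged = []
    # Sort a copy of the input by start value.
    for start, end in sorted(intervals, key=lambda iv: iv[0]):
        if not merged or start > merged[-1][1]:
            # The current interval begins after the previous one ends:
            # no overlap, so it starts a new run.
            merged.append([start, end])
        else:
            # Overlap: extend the previous interval if the current one ends later.
            merged[-1][1] = max(merged[-1][1], end)
    return merged
```

With the example from the problem statement, `merge([[1, 3], [2, 6], [8, 10], [15, 18]])` gives `[[1, 6], [8, 10], [15, 18]]`. The `max` call matters when an interval is fully contained in the previous one, as in `merge([[1, 4], [2, 3]])`, which gives `[[1, 4]]`.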

A simple proof by contradiction shows that this algorithm always produces the correct answer. First, suppose that the algorithm at some point fails to merge two intervals that should be merged. This would imply that there exists some triple of indices $i$, $j$, and $k$ in a list of intervals $ints$ such that $i < j < k$ and ($ints[i]$, $ints[k]$) can be merged, but neither ($ints[i]$, $ints[j]$) nor ($ints[j]$, $ints[k]$) can be merged. From this scenario follow several inequalities:

$
\begin{aligned}
ints[i].end < ints[j].start \\
ints[j].end < ints[k].start \\
ints[i].end \geq ints[k].start
\end{aligned}
$

We can chain these inequalities (along with the following inequality, implied by the well-formedness of the intervals: $ints[j].start \leq ints[j].end$) to demonstrate a contradiction:

$
\begin{aligned}
ints[i].end < ints[j].start \leq ints[j].end < ints[k].start \\
ints[i].end \geq ints[k].start
\end{aligned}
$

Therefore, all mergeable intervals must occur in a contiguous run of the sorted list.
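The claim can also be spot-checked empirically. The sketch below is not part of the proof: it searches small random lists, sorted by start, for the triple that the proof says cannot exist. The names `mergeable` and `has_gap_triple` are hypothetical, and intervals are assumed to be closed `(start, end)` tuples.

```python
import random
from itertools import combinations


def mergeable(a, b):
    """Closed intervals can be merged when each starts no later than the other ends."""
    return a[0] <= b[1] and b[0] <= a[1]


def has_gap_triple(ints):
    """True if some i < j < k has (i, k) mergeable but neither (i, j) nor (j, k)."""
    for i, j, k in combinations(range(len(ints)), 3):
        if (mergeable(ints[i], ints[k])
                and not mergeable(ints[i], ints[j])
                and not mergeable(ints[j], ints[k])):
            return True
    return False


random.seed(0)
for _ in range(200):
    ints = []
    for _ in range(6):
        start = random.randint(0, 20)
        ints.append((start, start + random.randint(0, 5)))
    ints.sort(key=lambda iv: iv[0])
    # In a list sorted by start, the forbidden triple never appears.
    assert not has_gap_triple(ints)
```

Without sorting, such a triple is easy to construct: `has_gap_triple([(1, 5), (20, 25), (4, 7)])` is `True`, which is why the sort step is required.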

Sorting Example

Consider the example above: once the intervals are sorted, all mergeable intervals form contiguous blocks.


Complexity Analysis

Time complexity : $O(n \log n)$

Other than the sort invocation, we do a simple linear scan of the list, so the runtime is dominated by the $O(n \log n)$ complexity of sorting.

Space complexity : $O(1)$ (or $O(n)$)

If we can sort intervals in place, we do not need more than constant additional space. Otherwise, we must allocate linear space to store a copy of intervals and sort that.
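The in-place variant can be sketched as follows, assuming intervals are mutable `[start, end]` lists and that overwriting the caller's list is acceptable. The name `merge_in_place` is hypothetical. Note that Python's built-in `list.sort` may itself use temporary memory internally, so the $O(1)$ figure here refers only to the storage the merge step adds.

```python
def merge_in_place(intervals):
    """Sort and merge inside the caller's list, which is modified."""
    intervals.sort(key=lambda iv: iv[0])
    write = 0  # index of the last merged interval written so far
    for read in range(1, len(intervals)):
        if intervals[read][0] > intervals[write][1]:
            # No overlap: move the current interval into the next output slot.
            write += 1
            intervals[write] = intervals[read]
        else:
            # Overlap: extend the interval in the current output slot.
            intervals[write][1] = max(intervals[write][1], intervals[read][1])
    # Drop the leftover tail beyond the merged prefix.
    del intervals[write + 1:]
    return intervals
```

Replacing `intervals.sort(...)` with a loop over `sorted(intervals, ...)` and a separate output list gives the $O(n)$ variant, which leaves the input unchanged.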

Analysis and solutions written by: @emptyset

56. Merge Intervals
