The Merge Intervals Pattern
Sort by start, sweep and merge — the one instinct that cracks meeting rooms, calendar conflicts, and range overlap problems in O(n log n). Master the overlap test and you'll recognize this pattern before you finish reading the problem statement.
The mental model: a sweep across a timeline
Stop thinking about intervals as pairs of numbers. Think of them as paint strokes on a number line. Some strokes overlap. Some touch. Some have gaps between them. Your job is usually one of:
- Merge — combine overlapping strokes into a single longer stroke.
- Count — how many strokes are active at once? (conference rooms needed)
- Select — which strokes can you keep so none of them overlap?
- Insert — drop a new stroke in and re-merge where it lands.
Every one of these problems has the same first move: sort by start time. After that, you're running a sweep line from left to right, and the algorithm falls out naturally from the question you're answering.
flowchart LR
A["Raw intervals\n(any order)"] -->|"sort by start"| B["Ordered by start time"]
B --> C{"next.start\n<= current.end?"}
C -->|"yes — overlap"| D["Merge: extend end\nto max(cur.end, next.end)"]
C -->|"no — gap"| E["Commit current,\nstart new interval"]
D --> C
E --> F["Done"]
style A fill:#7c5cff,color:#fff
style D fill:#22d3ee,color:#0a0a0f
style E fill:#00ff9d,color:#0a0a0f
The sweep is O(n) after the sort. So the whole pattern costs O(n log n) — dominated by sorting, not the merge logic. That's also the floor: you can't do better, because a comparison-based sort of n intervals can't be faster than O(n log n). See sorting basics if you need a refresher on why.
The overlap test — and the edge case that bites everyone
After sorting by start, two adjacent intervals [a, b] and [c, d] (where a <= c because sorted) overlap if and only if:
c <= b
That's it. The next interval's start is at or before the current interval's end. If c > b, there's a gap and they're disjoint.
When you merge, the new end is max(b, d) — not just d. This is the bug people ship: they assume the later interval must end later. It doesn't. Consider [1, 10] and [2, 4]: the second interval is fully contained inside the first. After merging, the end is still 10, not 4. Taking max handles both cases — partial overlap and full containment.
# The core merge logic — internalize this
merged_end = max(current_end, next_end) # NOT just next_end
The other gotcha is the touching-endpoint question. Do [1, 3] and [3, 5] overlap? They share exactly the point 3. Most interview problems say yes — treat c <= b as overlapping. A few (especially half-open interval problems) say no — use c < b. Read the problem statement carefully and add a comment in your code.
Problem 1: Merge Intervals
Given a list of intervals, merge all overlapping ones and return the minimal non-overlapping set.
The tell: the word "merge" or "combine overlapping." You're not counting or selecting — you're collapsing the list.
Skeleton:
- Sort by start.
- Initialize result with the first interval.
- For each subsequent interval: if it overlaps the last one in result, extend; otherwise append.
def merge(intervals: list[list[int]]) -> list[list[int]]:
# Sort by start — this is almost always step 1
intervals.sort(key=lambda x: x[0])
merged = [intervals[0]]
for start, end in intervals[1:]:
last_start, last_end = merged[-1]
if start <= last_end: # overlap: extend
merged[-1][1] = max(last_end, end)
else: # gap: commit new interval
merged.append([start, end])
return merged
Time: O(n log n). Space: O(n) for the output (O(1) extra if you're writing in-place, but in Python that gets messy).
Notice merged[-1] is always the "current" interval you're building. The loop never looks backward beyond that — sorting gives you this luxury.
Problem 2: Insert Interval
Insert a new interval into a sorted, non-overlapping list. Return the merged result.
This one trips people up because they reach for a re-sort, which wastes the structure you already have. Don't. The existing list is already sorted — use a sweep.
The pattern: three phases — collect intervals that end before the new one starts, merge everything that overlaps the new interval, collect everything that starts after it ends.
def insert(intervals: list[list[int]], new_interval: list[int]) -> list[list[int]]:
result = []
i = 0
n = len(intervals)
start, end = new_interval
# Phase 1: intervals that end before new_interval starts — no overlap
while i < n and intervals[i][1] < start:
result.append(intervals[i])
i += 1
# Phase 2: merge all overlapping intervals into new_interval
while i < n and intervals[i][0] <= end:
start = min(start, intervals[i][0])
end = max(end, intervals[i][1])
i += 1
result.append([start, end])
# Phase 3: intervals that start after new_interval ends — no overlap
while i < n:
result.append(intervals[i])
i += 1
return result
Time: O(n). The list is already sorted, so no sort step — the whole thing is one pass. This is the rare intervals problem that's O(n), not O(n log n).
The phase 2 condition is intervals[i][0] <= end — the next interval's start is within (or touching) our running end. We keep expanding end as we absorb overlapping intervals. When the loop exits, [start, end] is the fully merged new interval.
Problem 3: Non-Overlapping Intervals (the greedy angle)
Given a set of intervals, find the minimum number you need to remove so that the rest don't overlap.
This is the classic interval scheduling problem, and it calls for a different sort order. Sort by end time instead of start time.
Why? Because greedy instinct tells you: if you're trying to keep as many intervals as possible, always keep the one that finishes soonest. It leaves the most room on the timeline for future intervals. Remove everything that conflicts with your choice, then repeat.
def erase_overlap_intervals(intervals: list[list[int]]) -> int:
if not intervals:
return 0
# Sort by END time — this is the key insight
intervals.sort(key=lambda x: x[1])
removals = 0
prev_end = intervals[0][1] # end of the last interval we kept
for start, end in intervals[1:]:
if start < prev_end:
# Overlap — remove this interval (keep the one with earlier end,
# which is intervals[i-1] since we sorted by end)
removals += 1
# prev_end stays: we're keeping the earlier-ending interval
else:
# No overlap — keep this interval
prev_end = end
return removals
Time: O(n log n). The counter-intuitive part: when there's an overlap, we increment removals but don't update prev_end. We're implicitly discarding the current (later-ending) interval and keeping the previous one — because the earlier end gives more runway.
Problem 4: Meeting Rooms II — bring the heap
How many conference rooms does your company need?
Meeting Rooms I is a warmup: sort by start, check if any adjacent pair overlaps — if yes, you need at least two rooms. But Meeting Rooms II asks for the actual minimum count. That requires tracking how many meetings are simultaneously happening.
The heap is the right tool here. A min-heap of end times tells you, at any moment, the earliest time a room becomes free.
{ "type": "heap", "variant": "min", "title": "Min-heap of meeting end times (room availability)", "data": [10, 14, 20] }
Watch how the heap root is always the room that frees up soonest. When a new meeting arrives, we check whether it can reuse that room.
import heapq
def min_meeting_rooms(intervals: list[list[int]]) -> int:
if not intervals:
return 0
# Sort by start time
intervals.sort(key=lambda x: x[0])
# Min-heap of end times — each entry represents one booked room
end_times = []
for start, end in intervals:
if end_times and end_times[0] <= start:
# The earliest-ending room is free by the time this meeting starts
# Reuse it: pop the old end time, push the new one
heapq.heapreplace(end_times, end)
else:
# No room is free — open a new one
heapq.heappush(end_times, end)
# Number of rooms currently in the heap = rooms we opened
return len(end_times)
Time: O(n log n) — sort plus n heap operations, each O(log n). Space: O(n) worst case (all meetings overlap, one room each).
The key line is end_times[0] <= start: the earliest-finishing room's end time is at or before this meeting's start. heapreplace pops the min and pushes the new end in one O(log n) operation — cleaner than a separate pop and push.
Note the edge case: <= vs <. If a room's meeting ends exactly when the next one starts, can you reuse it? Depends on whether your calendar handles back-to-back meetings. Most problems say yes (<=), but double-check.
Complexity summary
| Problem | Sort by | Time | Space | Key idea |
|---|---|---|---|---|
| Merge Intervals | Start | O(n log n) | O(n) | Extend end on overlap |
| Insert Interval | Pre-sorted | O(n) | O(n) | Three-phase sweep |
| Non-Overlapping Intervals | End | O(n log n) | O(1) | Greedy: keep earliest-ending |
| Meeting Rooms I | Start | O(n log n) | O(1) | Any overlap? return false |
| Meeting Rooms II | Start | O(n log n) | O(n) | Min-heap of end times |
The patterns at a glance
Sort by start when you're merging or counting. Sort by end when you're selecting (maximizing kept intervals / minimizing removals).
This isn't arbitrary. Sorting by end serves the greedy proof: keeping the earliest-finishing interval is always at least as good as any other choice, because it constrains future decisions the least. That proof doesn't work for start times. See greedy for the full treatment of why greedy strategies need this kind of exchange argument.
Pitfalls people actually ship
Forgetting to sort first. I've seen this in production code. Someone reads the input, assumes it's already sorted (it was in their test data), and deploys. The overlap check silently skips real conflicts. Always sort, or assert the list is sorted.
Using next.end instead of max(current.end, next.end) when merging. Kills you on contained intervals like [1,10], [2,4]. The merged interval must be [1,10], not [1,4]. This is the single most common bug in merge intervals implementations.
Treating Meeting Rooms I and II as the same problem. They're not. Meeting Rooms I is a yes/no question answerable with a sort and one comparison loop. Meeting Rooms II needs the heap. Know which one you're solving before you write a line of code.
Off-by-one on the overlap condition. start < end vs start <= end. If [1,3] and [3,5] are in your input and you're using <, you'll leave them unmerged when the problem expects them merged. Read the problem, pick your operator, comment it.
Re-sorting in Insert Interval. The input list is already sorted. If you throw it into a sort before inserting, you've turned an O(n) problem into O(n log n) for no reason — and signaled to your interviewer that you didn't notice the constraint.
When NOT to use this pattern
Your "intervals" don't have a total order. Merge intervals assumes your ranges live on a single dimension (time, numbers, etc.) and you can sort them. Multi-dimensional ranges (2D rectangles, time + priority) don't sort cleanly — you need different tools.
You're doing this repeatedly on a mutable set. If meetings are being added and removed continuously and you need the room count after every change, a static sort-and-sweep is the wrong shape. Look at a balanced BST or a segment tree that supports dynamic updates.
You have a massive sorted dataset and sparse updates. Re-sorting from scratch after every insert is O(n log n) each time. Binary search the right position and insert — O(log n) find, O(n) insert into an array, but with a structure like a sorted linked list or a balanced BST you can do O(log n) total. Still, for most interview-scale inputs (n ≤ 10⁵), the sort is fine.
You need the union or intersection of two large interval lists efficiently. Two-pointer merge of already-sorted lists is O(n + m) — faster than re-sorting. If you're given two sorted lists, don't concatenate and sort; use two pointers advancing through each list independently.
The merge intervals pattern is deceptively simple — sort, sweep, one condition, done. What makes it stick in interviews is the discipline to recognize when a problem is an intervals problem (meetings, bookings, ranges, schedule conflicts), and the fluency to pick the right variant: merge vs. count vs. select vs. insert. Get the overlap test tattooed on your brain (next.start <= current.end), remember when to swap to sort-by-end, and you'll recognize the shape before you finish reading the problem.
Frequently asked questions
▸How do you check if two intervals overlap?
Two intervals [a, b] and [c, d] overlap if and only if a <= d AND c <= b — equivalently, neither one ends before the other starts. The cleanest single-condition form after sorting by start: the next interval's start is <= the current interval's end (c <= b). If c > b, there's a gap and they don't overlap. Watch the edge case: [1,3] and [3,5] share just the point 3 — whether that counts as an overlap depends on the problem. Most LeetCode-style problems treat touching endpoints as overlapping.
▸Why do you sort by start time before merging intervals?
Sorting by start time gives you a sweep-line guarantee: once you've processed interval i, no future interval can start earlier than it. That means you only ever need to compare the current interval against the last one you committed to your result — no need to look backward. Without sorting, you'd have to check every pair (O(n²)). Sorting costs O(n log n) and reduces the merge itself to a single O(n) pass.
▸What is the difference between the Meeting Rooms I and Meeting Rooms II problems?
Meeting Rooms I asks: can one person attend all meetings? Sort by start, check if any adjacent pair overlaps — O(n log n), no heap needed. Meeting Rooms II asks: what is the minimum number of rooms (or workers) needed? Now you need to track how many meetings are happening simultaneously at any point. Sort by start, use a min-heap of end times — each time a new meeting starts, if the earliest-ending room is already free, reuse it; otherwise open a new room. Heap size at the end is your answer.
▸When should I sort by end time instead of start time?
Sort by end time when you're optimizing for how many intervals you can keep — classic greedy activity selection. The logic: always pick the interval that finishes earliest, because it leaves the most room for future intervals. This is the engine behind the non-overlapping intervals problem (minimum removals to make all intervals disjoint). If you're merging or inserting, sort by start. If you're selecting/removing, sort by end.
▸What are common signs that a problem is an interval problem in disguise?
Look for: meetings, bookings, schedules, calendar events, tasks with start/end times, numeric ranges, IP address ranges, server availability windows. Any time a problem gives you pairs of numbers that represent a span and asks about conflicts, coverage, or count — that's intervals. The tell in the problem text is usually 'overlapping' or 'no two X share a Y' or 'minimum number of Z needed.'
You may also like
The Top-K Elements Pattern
How a heap of size k gives you the k largest (or smallest, or most frequent) items in O(n log k) — and why a min-heap is the counterintuitive tool for finding the biggest things.
Prefix Sums
Precompute running totals so any range sum becomes one subtraction. The prefix sum technique turns O(n) per query into O(1) — and the hashmap combo unlocks a whole class of subarray counting problems.
Monotonic Stacks
The monotonic stack is the O(n) trick for finding the next greater or smaller element — a sorted stack you maintain on the fly, with one deceptively simple pop-while loop at its core. Used in Daily Temperatures, Largest Rectangle in Histogram, and Trapping Rain Water.