Cyclic Sort
The O(n) time, O(1) space trick for arrays that hold numbers in a known range 1..n: place each value at its natural index, then scan once for anything out of place. Missing numbers, duplicates, and corrupt pairs fall out as a natural consequence.
The moment you should recognize this
You're looking at a LeetCode problem. The array has n integers. The problem says something like "the integers are in the range 1 to n" or "the array contains n+1 integers each in the range 1 to n." Then it asks one of: find the missing number, find all duplicates, find the duplicate and the missing, find the first missing positive. And the follow-up whispers "try to do it in O(1) extra space."
That is cyclic sort. The hash-map solution appears immediately to most people (and works), but the O(1) space constraint exists specifically to push you toward this technique.
Two conditions, both need to be true:
- The values have a known, bounded range directly tied to the array length.
- The problem asks about membership in that range — what is missing, what appeared twice, what broke the sequence.
If you see both, you are almost certainly looking at a cyclic sort problem. The two pointers pattern also gives you O(1) space, but it works on sorted arrays with a pair condition — a structurally different shape. Cyclic sort owns the "values in a range, find the anomaly" shape.
How it works
Here's the core sort in twelve lines:
def cyclic_sort(nums):
i = 0
while i < len(nums):
# where does nums[i] belong?
correct = nums[i] - 1 # value v belongs at index v-1
if nums[i] != nums[correct]: # not already in the right spot
nums[i], nums[correct] = nums[correct], nums[i] # swap it home
else:
i += 1 # current slot is correct, advance
return nums
The invariant: do not advance i until nums[i] is in its correct slot. Every swap places exactly one element correctly. Since there are at most n elements, there are at most n swaps total, regardless of how the inner loop appears to nest. That is the whole O(n) argument.
The guard nums[i] != nums[correct] instead of i != correct is load-bearing. If two elements share the same value (a duplicate), nums[i] == nums[correct] even though i != correct — the swap would be infinite. The equality check on values breaks the cycle.
flowchart TD
A["i = 0"] --> B{"nums[i] == nums[correct]?"}
B -- "No (not in place, not a duplicate)" --> C["swap nums[i] ↔ nums[correct]<br/>(stay at i)"]
C --> B
B -- "Yes (correct slot or duplicate)" --> D["advance i++"]
D --> E{i < n?}
E -- Yes --> B
E -- No --> F["sort complete — scan for anomalies"]
style C fill:#7c5cff,color:#fff
style F fill:#00ff9d,color:#0a0a0f
After the sort, the answer scan is always the same shape: walk the array, check each slot. The exact question determines what you report.
Worked problems
Problem 1: Missing Number (easy warm-up)
Given an array
numsof n distinct integers in range[0, n], find the one missing number.
This is the gentlest introduction. The range is [0, n] — that's n+1 possible values but only n slots, so exactly one is missing.
def missing_number(nums):
i, n = 0, len(nums)
while i < n:
correct = nums[i] # value v belongs at index v (range is 0..n)
if correct < n and nums[i] != nums[correct]:
nums[i], nums[correct] = nums[correct], nums[i]
else:
i += 1
# scan: which index doesn't hold its expected value?
for i in range(n):
if nums[i] != i:
return i
return n # 0..n-1 are all present, so n is missing
Two things worth noting. First, the range is [0, n] not [1, n], so value v goes to index v (not v-1). Adjust for the range you're given. Second, n itself can't fit in any slot (the array only has indices 0 to n-1), so we skip it with the correct < n guard. After the sort, the first misaligned index is the answer. If every index i holds value i, then n is missing.
Time: O(n). Space: O(1). The Gauss formula n*(n+1)/2 - sum(nums) also solves this in one line — but only for the single-missing case. The cyclic sort generalizes.
Problem 2: Find All Duplicates
Given an array of n integers where each integer is in
[1, n], some elements appear twice. Find all duplicates in O(n) time and O(1) extra space.
def find_all_duplicates(nums):
i = 0
while i < len(nums):
correct = nums[i] - 1
if nums[i] != nums[correct]:
nums[i], nums[correct] = nums[correct], nums[i]
else:
i += 1
# scan: wherever arr[i] != i+1, arr[i] is a duplicate
duplicates = []
for i in range(len(nums)):
if nums[i] != i + 1:
duplicates.append(nums[i])
return duplicates
After the sort, slot i should hold i+1. If it doesn't, the value sitting there is a duplicate (it has a twin already parked at its home slot, which is why this one couldn't get there).
{ "type": "array", "title": "after cyclic sort: slot 1 and slot 4 hold wrong values — those are the duplicates", "data": [1, 3, 3, 4, 5] }
Concretely: if nums = [4, 3, 2, 7, 8, 2, 3, 1], after sorting you get something like [1, 2, 3, 4, 3, 2, 7, 8]. Indices 4 and 5 are wrong — they hold 3 and 2, which are the duplicates.
The result array is O(k) where k is the number of duplicates, which the problem expects. The algorithm itself uses no extra space for the sort.
Problem 3: Find the Corrupt Pair
An array contains n integers in
[1, n], but one number appears twice and one is missing. Find[duplicate, missing].
This is the "one element stole another's slot" scenario. The sort reveals both crimes at once.
def find_corrupt_pair(nums):
i = 0
while i < len(nums):
correct = nums[i] - 1
if nums[i] != nums[correct]:
nums[i], nums[correct] = nums[correct], nums[i]
else:
i += 1
for i in range(len(nums)):
if nums[i] != i + 1:
return [nums[i], i + 1] # [duplicate, missing]
return [-1, -1]
The mismatch index i tells you two things: i+1 is missing (nothing placed it), and nums[i] is the duplicate (it showed up when its own home slot was already occupied). One scan, two answers.
Problem 4: First Missing Positive (hard)
Given an unsorted array of integers — including negatives and values larger than n — find the smallest positive integer not in the array. O(n) time, O(1) space.
This is the problem where cyclic sort gets challenged. The values are not clean — you have negatives, zeros, and numbers well above n. The trick: only values in [1, n] matter. Values outside that range can never be the answer (if 1 through n are all present, the answer is n+1; if any of them is absent, the answer is smaller than or equal to n). So treat out-of-range values as squatters — place them anywhere that keeps a valid value from being blocked.
def first_missing_positive(nums):
n = len(nums)
i = 0
while i < n:
correct = nums[i] - 1 # value v belongs at index v-1
if 1 <= nums[i] <= n and nums[i] != nums[correct]:
nums[i], nums[correct] = nums[correct], nums[i]
else:
i += 1 # out of range or duplicate: skip
# scan: first slot where arr[i] != i+1
for i in range(n):
if nums[i] != i + 1:
return i + 1
return n + 1 # 1..n all present
The guard 1 <= nums[i] <= n is doing real work. A negative number, zero, or a number above n will never occupy its "correct" slot since that slot doesn't exist. Without the guard you'd swap them into positions that corrupt slots you care about, or loop forever on an out-of-range value that can't reach any valid home.
Input [3, 4, -1, 1] after sort becomes [1, -1, 3, 4]. Index 1 has -1 instead of 2, so the answer is 2. Input [1, 2, 0] after sort becomes [1, 2, 0]. Index 2 has 0 instead of 3, so the answer is 3. Every case reduces to "first broken slot."
Complexity, summarized
| Phase | Time | Space |
|---|---|---|
| Cyclic sort pass | O(n) | O(1) |
| Answer scan | O(n) | O(1) |
| Total | O(n) | O(1) |
The sort is better than comparison-based sorting (O(n log n) lower bound) because it cheats: it is not comparing elements against each other, it is computing a destination directly from each element's value. That destination-from-value trick only works when values are in a known bounded range — so it is strictly a special case, not a general replacement for sorting.
Pitfalls people actually hit
The infinite loop on duplicates. The swap guard must compare values, not indices. If you write while i != correct: swap, you spin forever when nums[i] == nums[correct] with i != correct. This is the #1 bug in cyclic sort implementations. Always guard on nums[i] != nums[correct].
Off-by-one from the range. A 1-indexed range (values 1..n) maps value v to index v-1. A 0-indexed range (values 0..n-1) maps value v to index v. Mix these up and every slot will look wrong. Read the problem statement twice, write the mapping once as a comment, commit to it.
Mutating the input without permission. Cyclic sort rearranges the input array in-place. In an interview, say this explicitly — "I'm going to modify the input array; is that okay?" Most problems allow it; if not, copy first (which costs O(n) space, defeating the point unless you need the original for some other reason).
Treating out-of-range values in "first missing positive." This is the subtle one. A value like -7 or n+100 must be skipped during the sort, not swapped into an arbitrary slot. The 1 <= nums[i] <= n guard handles it — do not write nums[i] > 0 and nums[i] <= n without checking your edge cases for nums[i] == 0.
Forgetting that the sort mutates the array. If your problem asks you to return indices (not values), you might need to record where things moved. Cyclic sort does not track provenance — it gives you sorted values, not a permutation log.
When NOT to use cyclic sort
Cyclic sort is a specialist, not a generalist. It breaks down immediately outside its niche.
| Situation | Better approach |
|---|---|
| Values not in a bounded integer range | General sorting or a different technique |
| Need original indices preserved (unsorted output) | Hash map instead of in-place sort |
| Finding one missing number (simple case) | Gauss formula or XOR — fewer lines, same complexity |
| Membership test in general set | Hash table — O(1) average lookup |
| Array has floats, strings, or mixed types | Cyclic sort cannot apply; needs a comparison-based sort |
The arrays article notes that in-place rearrangement is one of the four core array interview patterns. Cyclic sort is the canonical representative of that pattern when the values carry implicit positional information. Outside that specific shape, you are better off with a different tool.
The two structures cyclic sort most directly replaces for these problems are: a hash set (O(n) space, O(n) time — same time, worse space) and sorting (O(n log n) time, O(1) space with an in-place sort — worse time, same space). Cyclic sort wins on both dimensions simultaneously, which is why interviewers love it for the in-range-values problems.
If you are not sure which technique to reach for, ask yourself these questions in order:
- Are the values integers in a known range tied to the array length? → cyclic sort candidate.
- Is the problem about what is present or missing in that range? → almost certainly cyclic sort.
- Does the constraint say O(1) extra space? → confirmed: cyclic sort.
Three yeses and you should be writing the sort loop before you finish reading the problem.
Frequently asked questions
▸What is the cyclic sort pattern?
Cyclic sort is a technique for arrays where every element is a number in a known range — typically 1 to n, or 0 to n-1. Because you know where each number belongs (value v goes to index v-1), you can sort the array in-place by repeatedly swapping each element to its correct slot. After the sort, a single linear scan reveals anything out of place: an index i where arr[i] != i+1 means either i+1 is missing or arr[i] is a duplicate. The whole thing is O(n) time and O(1) extra space.
▸Why is cyclic sort O(n) even though it has a nested loop structure?
The outer loop visits every index, and the inner swap can run multiple times before moving on — that looks like it could be O(n²). But each swap moves exactly one element to its correct final position, and no element is moved twice (once an element is in the right slot, the algorithm never touches it again). So the total number of swaps across the entire sort is at most n, which means the total work is O(n), not O(n²). It is the same amortized argument as dynamic array resizing.
▸What is the 'tell' that a problem wants cyclic sort?
Two signals together: (1) the input is an array of integers and (2) the problem says the values are in a known range — usually 1 to n or 0 to n-1, where n is the array's length. Any follow-up asking you to find missing numbers, find duplicates, or identify the 'corrupt pair' without extra space is almost certainly cyclic sort. The constraint 'O(1) extra space' closes the door on hash maps and makes cyclic sort the intended solution.
▸How does cyclic sort find missing or duplicate numbers?
After the sort pass, every slot i should hold the value i+1 (for a 1..n range). Scan the array. If arr[i] != i+1, then i+1 is the missing number and arr[i] is the duplicate (or simply missing, depending on the problem). The mismatch is the anomaly. For multiple missing/duplicate values, collect all the mismatched indices in one scan.
▸What is the difference between cyclic sort and the XOR / Gauss formula tricks for missing number?
The XOR trick and Gauss's sum formula (n*(n+1)/2 minus the actual sum) find a single missing number in O(n) time and O(1) space, and they are slightly simpler to code for that one case. Cyclic sort is more general: it handles multiple missing numbers, multiple duplicates, and variants where out-of-range values exist (like first missing positive). If the problem is just 'find the one missing number,' either approach works. If the problem is harder — all duplicates, all missing, the corrupt pair — reach for cyclic sort.
You may also like
The Top-K Elements Pattern
How a heap of size k gives you the k largest (or smallest, or most frequent) items in O(n log k) — and why a min-heap is the counterintuitive tool for finding the biggest things.
Prefix Sums
Precompute running totals so any range sum becomes one subtraction. The prefix sum technique turns O(n) per query into O(1) — and the hashmap combo unlocks a whole class of subarray counting problems.
Monotonic Stacks
The monotonic stack is the O(n) trick for finding the next greater or smaller element — a sorted stack you maintain on the fly, with one deceptively simple pop-while loop at its core. Used in Daily Temperatures, Largest Rectangle in Histogram, and Trapping Rain Water.