Beginnerasked at Amazonasked at Googleasked at Metaasked at Microsoft

Arrays

The data structure everything else is built on. Why arr[i] is instant, why inserting in the middle hurts, and the handful of patterns that turn arrays into interview wins.

6 min read2026-01-10Ironclad Academy
Complexity cheat sheet
Operation
Average
Worst
Access by indexone multiply-and-add
O(1)
O(1)
Search (unsorted)
O(n)
O(n)
Insert at end (push)amortized; O(n) on a resize
O(1)
O(n)
Insert at front / middleeverything after shifts
O(n)
O(n)
Delete at front / middle
O(n)
O(n)
// DEPTH
the full breakdown — requirements, capacity, evolution, trade-offs

The mental model: a row of numbered lockers

Picture a hallway of identical lockers, numbered starting at 0. That is an array. Two properties make it what it is:

  1. The lockers are adjacent — locker 5 is physically next to locker 6.
  2. Every locker is the same size — so you can compute exactly where locker i is without walking the hallway.

That second point is the magic trick. If you know the hallway starts at address base, and each locker is size bytes wide, then locker i lives at base + i × size. The computer does one multiply and one add and lands on the exact byte. No searching. This is why arr[1000000] is exactly as fast as arr[3].

flowchart LR
    subgraph "One contiguous block"
      direction LR
      A0["arr[0]<br/>5"] --- A1["arr[1]<br/>2"] --- A2["arr[2]<br/>9"] --- A3["arr[3]<br/>1"] --- A4["arr[4]<br/>7"]
    end
    B["address = base + i × size"] -.-> A2
    style A2 fill:#7c5cff,color:#0a0a0f

Contrast that with a linked list, where each node only knows where the next node is. To reach the 1000th node you have to hop through 999 others. The array's contiguous layout is what makes index access O(1) — and it is also what makes the middle painful.

Where arrays hurt: the shift

Because there are no gaps, inserting in the middle is not a quiet local edit. To put a value at index i, you first have to open a hole by shifting every element from i onward one slot to the right. Delete is the mirror image: remove the element, then shift everything after it left to close the hole.

{ "type": "array", "title": "the shift tax", "data": [5, 2, 9, 1, 7] }

Run the insert val@i and delete@i operations and watch the cascade — each shift is one move, and there can be up to n of them. That is the O(n) staring you in the face. Then try access[i]: one jump, no scan. The contrast is the whole lesson.

The practical rule that falls out of this:

You want to…CostWhy
Read or overwrite arr[i]O(1)direct address
Append to the endO(1)*nothing to shift
Insert/delete at the frontO(n)everything shifts
Insert/delete in the middleO(n)the tail shifts
Find a value (unsorted)O(n)must scan

* amortized — see dynamic arrays below.

If your algorithm is doing lots of front insertions, the array is quietly punishing you, and a deque or linked list may fit better. But be honest about how often that actually happens — most code reads and appends.

Static vs. dynamic arrays

A static array has a fixed size decided up front. C's int a[10] and Java's int[] are static — ten slots, no more, ever.

A dynamic array grows on demand. Python's list, JavaScript's Array, Java's ArrayList, and C++'s std::vector are all dynamic. The trick they use is geometric growth:

  1. Start with a small backing array (say, capacity 4).
  2. When it fills, allocate a new block 1.5× or 2× larger.
  3. Copy the old elements over, then continue appending into the new block.

That copy is O(n) — so isn't append O(n)? No, and this is the subtle, interview-favorite part.

Why append is O(1) amortized

Because the array doubles, resizes get exponentially rarer. To grow from 1 to n elements, you copy 1 + 2 + 4 + ... + n/2 + n ≈ 2n elements total across all resizes. Spread that over n appends and each one costs about 2 copies on average — a constant. That is what "amortized O(1)" means: any single append might be expensive, but the average over many is cheap.

flowchart LR
    C4["cap 4<br/>full"] -->|"resize → copy 4"| C8["cap 8"]
    C8 -->|"resize → copy 8"| C16["cap 16"]
    C16 -->|"resize → copy 16"| C32["cap 32"]
    style C4 fill:#ff4d8d,color:#0a0a0f
    style C32 fill:#00ff9d,color:#0a0a0f

The cost you don't see on a Big-O chart: a resize means a fresh allocation and a full copy, which can cause a latency spike at exactly the wrong moment. If you know roughly how many elements you'll have, pre-sizing (new ArrayList<>(10000), vec.reserve(10000)) skips the resizes entirely.

The part interviews actually test

Knowing that arr[i] is O(1) won't get you hired. Recognizing which array pattern a problem wants will. Almost every array question is a disguised version of one of these:

  • Two pointers — one index from each end (or a slow/fast pair) walking toward each other. Reverses, palindrome checks, pair-sum on a sorted array, dedup in place. Turns many O(n²) brute forces into O(n).
  • Sliding window — a moving range [left, right] over the array, expanding and contracting to maintain some property. "Longest substring without repeats," "max sum of k consecutive elements." O(n) instead of O(n·k).
  • Prefix sums — precompute running totals so any range sum becomes a single subtraction. Range-query problems drop from O(n) per query to O(1).
  • In-place rearrangement — swap elements around using O(1) extra space. Dutch-national-flag, move-zeroes, rotate.

Each of these has its own deep dive in the Techniques section. The point here: when you see an array problem, your first question shouldn't be "what data structure?" — it's already an array — it should be "which of these four patterns is hiding in it?"

A worked example: remove duplicates in place

Given a sorted array, drop the duplicates without allocating a second array. This is the two-pointer pattern in miniature.

def remove_duplicates(arr):
    if not arr:
        return 0
    write = 1                      # next slot to write a unique value
    for read in range(1, len(arr)):
        if arr[read] != arr[write - 1]:
            arr[write] = arr[read]  # keep it
            write += 1
    return write                    # new logical length

One pointer read scans every element; the other, write, marks where the next unique value goes. Because both only ever move forward, it is a single O(n) pass with O(1) extra space — no shifting, no second array. That "two indices moving with different rules" instinct is exactly what the visualizer's search walk is building in you.

When not to reach for an array

Arrays are the right default, but they have natural enemies:

  • Constant front-insertion → use a deque (double-ended queue) for O(1) at both ends.
  • Frequent membership tests ("is x in here?") → a hash set gives O(1) lookups vs. the array's O(n) scan.
  • Always need the min/max → a heap keeps the extreme element one step away.
  • Key→value lookups → a hash table, not a parallel pair of arrays.

But notice every one of those structures is itself usually built on top of an array underneath. Hash tables use an array of buckets; heaps are arrays in disguise; dynamic arrays back most "lists." Learn the array cold and the rest of the course is a series of clever arrangements of this one idea.

// FAQ

Frequently asked questions

Why is accessing an array element by index O(1)?

Because the elements sit in one contiguous block of memory and every element is the same size. To find element i, the CPU computes a single address — base_address + i × element_size — and reads it directly. There is no scanning or following of pointers, so the cost is constant no matter how big the array is or which index you ask for.

Why is inserting into the middle of an array O(n)?

An array has no gaps. To insert at index i, every element from i to the end must shift one slot to the right to open a hole, which is up to n moves. The same is true for deletion in reverse — the elements after the hole shift left to close it. Only insertions and deletions at the very end avoid the shift.

What is the difference between a static array and a dynamic array?

A static array has a fixed capacity chosen when it is created (C arrays, Java `int[]`). A dynamic array (Python `list`, Java `ArrayList`, C++ `std::vector`, JavaScript `Array`) grows automatically: when it fills up it allocates a bigger block — usually 1.5× or 2× — and copies everything over. That copy is O(n), but it happens rarely enough that appends are O(1) amortized.

When should I use an array instead of a linked list?

Reach for an array when you need fast random access by index, you mostly read or append at the end, and you care about cache performance — contiguous memory is dramatically faster to scan. Prefer a linked list only when you insert and delete in the middle constantly and rarely index by position. In practice arrays win the vast majority of the time.

What does "amortized O(1)" mean for appending to a dynamic array?

It means that although a single append can occasionally cost O(n) (when the array is full and must resize and copy), the cost averaged over many appends is constant. Because the array doubles its capacity, resizes happen exponentially less often, so n appends cost O(n) total — O(1) each on average.

// RELATED

You may also like