
Tries: Efficient String Search and Autocomplete

You're building a messaging app for mobile phones. Users type messages, and you want to add a spell checker, that red squiggly line under misspelled words that appears as they type.

Sounds simple enough. But here are the constraints:

Check spelling on every keystroke. You have ~10ms before it feels laggy. The English dictionary has 170,000 words, and your app gets 100MB of RAM total. Users can add custom words, names, slang, brand names, so the dictionary is dynamic. You can't send words to a server for privacy reasons, so everything runs locally. And of course, no false positives or negatives.

Your first instinct: store all the words in an array and scan through it. But 170,000 comparisons per keystroke? That's 1.7 million string comparisons per second if the user types fast. Your app will drain the battery and lag behind their typing.

You need something faster. Much faster.

The first wall: strings aren't numbers

"Binary search!" you think. "That's O(log n) lookups. Problem solved."

You store 170,000 words in a sorted array. Binary search means log₂(170,000) ≈ 17 comparisons per lookup. That's way better than 170,000.

But here's the catch: strings aren't numbers.

When you compare two numbers, it's instant. When you compare two strings, you have to check character by character. Comparing "antique" to "monkey" requires checking at least one character to know that 'a' < 'm'. But comparing "antique" to "anthem" requires checking four characters before you find the difference.

// Searching for "antique" in a sorted dictionary
//
// Comparison 1: "antique" vs "monkey"
//   Check: 'a' vs 'm' → 'a' < 'm' → go left (1 char checked)
//
// Comparison 2: "antique" vs "elephant"
//   Check: 'a' vs 'e' → 'a' < 'e' → go left (1 char checked)
//
// Comparison 3: "antique" vs "anthem"
//   Check: 'a' vs 'a' → equal
//   Check: 'n' vs 'n' → equal
//   Check: 't' vs 't' → equal
//   Check: 'i' vs 'h' → 'i' > 'h' → go right (4 chars checked)
//
// ... and so on for 17 comparisons

On average, you're comparing about 6 characters per string (the average English word length). With 17 binary search steps, that's ~102 character comparisons per word.
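
Here's a minimal sketch of that hidden cost (the 8-word dictionary, compareStrings, and charsExamined are illustrative, not from a real implementation): a binary search whose comparator counts every character it examines.

// Count every character examined during binary search over strings
let charsExamined = 0;

// Comparator that works character by character, like real string comparison
function compareStrings(a: string, b: string): number {
  const len = Math.min(a.length, b.length);
  for (let i = 0; i < len; i++) {
    charsExamined++;
    if (a[i] !== b[i]) return a[i] < b[i] ? -1 : 1;
  }
  return a.length - b.length; // a prefix sorts before the longer string
}

function binarySearch(sorted: string[], target: string): boolean {
  let lo = 0, hi = sorted.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const cmp = compareStrings(target, sorted[mid]);
    if (cmp === 0) return true;
    if (cmp < 0) hi = mid - 1;
    else lo = mid + 1;
  }
  return false;
}

const dictionary = ["a", "an", "and", "ant", "anthem", "antique", "any", "at"];
binarySearch(dictionary, "antique"); // true, found in 2 steps
console.log(charsExamined);          // 10 characters examined for just 2 comparisons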

Step through a binary search below to see the hidden cost. Search for "antique" and watch how many characters the algorithm actually examines. Unlike comparing numbers (which is instant), string comparison is proportional to the length of the common prefix.

[Interactive demo: Binary Search: The String Comparison Problem. Searching for "antique" takes 4 comparisons but 18 characters examined, about 4.5 per step.]

That's acceptable for checking one word. But spell checkers need more than exact match. When the user types "th", you need to suggest words starting with "th" (that's autocomplete). With binary search, finding all words starting with "th" means checking potentially thousands of words. Too slow.

Binary search works for numbers, but strings are sequences. We need something that exploits the structure of strings themselves.

The second wall: BSTs duplicate everything

What if we use a binary search tree? Self-balancing, O(log n) lookups, well-understood...

Let's insert these 8 words: "a", "an", "and", "ant", "anthem", "antique", "at", "any".

              "anthem"
             /        \
         "and"      "antique"
         /    \          \
      "an"   "ant"       "at"
      /         \
    "a"        "any"

This works. Search time is O(log n × m) where n is the number of words and m is the word length. For 170,000 words, that's about 17 × 6 = 102 character comparisons. Same as binary search on an array.

But look at the tree more carefully. Notice the waste.

The prefix "an" appears in "an", "and", "ant", "anthem", "antique", and "any". That's 6 nodes all storing "an" as part of their key. The prefix "ant" appears in "ant", "anthem", and "antique" across 3 nodes. Each node stores the complete word, even though most words share prefixes.

Let's count the bytes:

Word      Length    Stored in node
a         1 byte    a
an        2 bytes   an
and       3 bytes   and
ant       3 bytes   ant
anthem    6 bytes   anthem
antique   7 bytes   antique
any       3 bytes   any
at        2 bytes   at
Total     27 bytes  (just for strings)

Plus each node needs 8 bytes for a left pointer, 8 bytes for a right pointer, and 4 bytes for metadata (balance factor, etc.). That's 20 bytes per node × 8 nodes = 160 bytes of overhead, for a total of ~187 bytes for 8 words.

For the full 170,000-word dictionary: average word length of 6 characters gives ~1 MB for strings, plus 170,000 × 20 bytes = ~3.4 MB for node overhead. Total: ~4.4 MB.

That fits in memory. But the real problem is the redundancy. The prefix "ant" is stored 3 times. "An" is stored 6 times. Every shared prefix is duplicated across every word that uses it.

Build the BST yourself below. Add and remove words to see how the tree structure changes. Notice how each node stores the complete word, duplicating shared prefixes.

[Interactive demo: BST Builder: See the Duplication. With these 8 words, the BST stores 27 characters; a trie needs only 13.]

BSTs store complete words in each node, duplicating every shared prefix. For natural language text where many words share prefixes (un-, re-, anti-, pre-), this wastes significant space. And search time is still O(log n × m), dependent on dictionary size.

What if we could store common prefixes only once? If "an" is a prefix of "and", "ant", and "anthem", why store it three separate times?

What if the path IS the data?

Let's think differently. In a BST, each node stores a complete word. But what if the word isn't stored in the node, but in the path to the node?

Here's the idea: instead of storing "ant" in a single node, we store it as a path of three edges: 'a' → 'n' → 't'.

Step through the insertion of all 8 words below and watch how the trie reuses shared prefixes. When we insert "and", we don't create new nodes for 'a' and 'n', we reuse the existing path.

[Interactive demo: The Breakthrough: Path-Based Storage, inserting all 8 words step by step, starting from an empty trie with just a root node.]

This is a trie (pronounced "try", from retrieval). Each edge represents one character. The path from root to any node spells out a prefix. Nodes marked as "end of word" indicate complete words.

In a trie, search time is O(m) where m is the word length. It doesn't matter if your dictionary has 100 words or 100 million words. Searching "cat" always takes exactly 3 character checks.

For a 170,000-word dictionary: a BST needs log₂(170,000) × 6 = ~102 character comparisons while a trie needs just 6 (the word length). The trie is 17× faster.

Start with an empty trie below. Insert "ant", then "and", then "at". Watch how the 'a' node is reused for all three words. Now insert "anthem" and see how it reuses the entire "ant" path, only adding new nodes for "hem".

[Interactive demo: Trie Builder. Insert words and watch the node count and prefix reuse change.]

So what does this structure actually look like in code?

What does a trie actually look like?

A trie is an n-ary tree. Unlike a binary tree (2 children), a trie has one potential child per character in your alphabet.

For lowercase English letters, each node can have up to 26 children. For ASCII, 128. The key insight: one child per unique next character.

class TrieNode {
  // Does a complete word end at this node?
  isEndOfWord: boolean = false;
 
  // Map from character to child node
  // Only stores children that actually exist
  children: Map<string, TrieNode> = new Map();
}
 
class Trie {
  // Root represents the empty string ""
  root: TrieNode = new TrieNode();
}

That's the entire structure. Each node needs just two things: a boolean ("Does a word end here?") and a map of children ("What characters can follow?").

The root node is special. It represents the empty string and is never marked as a complete word (unless you want to store the empty string in your dictionary).

Why use a Map instead of an array of 26 slots? Sparse storage. Most nodes only have 1-3 children. A Map only stores children that exist, while an array would waste 23+ empty slots per node.
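
For contrast, here's a sketch of the array-based alternative (ArrayTrieNode and childFor are hypothetical names): O(1) child lookup, but 26 slots allocated per node whether they're used or not.

// Alternative: a fixed array of 26 slots for 'a'..'z'
class ArrayTrieNode {
  isEndOfWord: boolean = false;
  // Index 0 = 'a', 1 = 'b', ..., 25 = 'z'; most slots stay null
  children: (ArrayTrieNode | null)[] = new Array(26).fill(null);

  childFor(char: string): ArrayTrieNode | null {
    return this.children[char.charCodeAt(0) - 97]; // 97 = 'a'
  }
}

With 1-3 children per typical node, the Map wins on memory; the fixed array wins on raw lookup speed when the alphabet is small and known in advance.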

Now that we know the structure, let's see how to use it.

Walking the path

Searching a trie is beautifully simple. Start at the root and follow one edge per character. For each character in the search word, check if the current node has a child for that character. If it does, move to that child. If it doesn't, the word isn't in the trie. After consuming all characters, check if the current node is marked as "end of word" to distinguish complete words from mere prefixes.

// Inside class Trie:
search(word: string): boolean {
  let node = this.root;

  for (const char of word) {
    // Does this character have an edge?
    if (!node.children.has(char)) {
      return false; // No path for this character
    }
    node = node.children.get(char)!;
  }

  // We've followed the full path, is it a complete word?
  return node.isEndOfWord;
}

There are two ways a search can fail.

The first is when the path doesn't exist. Search for "any" in a trie containing only ["ant", "and"]. We follow 'a' and 'n' successfully, but then there's no 'y' edge. Step through it:

[Interactive demo: Failure Mode 1: Path Doesn't Exist, searching for "any" in a trie containing "ant" and "and".]

The search fails immediately at the missing character. The trie has edges for 't' (leading to "ant") and 'd' (leading to "and"), but nothing for 'y'.

The second failure mode is subtler. The path exists, but it's not a complete word. Search for "an" in the same trie. We successfully follow the path 'a' → 'n'. All characters are consumed, but the 'n' node is NOT marked as "end of word" because we never inserted "an" by itself, only "ant" and "and".

[Interactive demo: Failure Mode 2: Path Exists But Not a Word, searching for "an" in the same trie.]

The prefix exists, but it's not a word in our dictionary. This is why the isEndOfWord flag is essential.

Time complexity: O(m) where m = word length. Independent of dictionary size.
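
The same walk without the final flag check answers a different question: "does any word start with this?" Many trie implementations expose it as a startsWith method; here's a hypothetical sketch in the same style:

// Inside class Trie (hypothetical companion method):
startsWith(prefix: string): boolean {
  let node = this.root;
  for (const char of prefix) {
    if (!node.children.has(char)) {
      return false; // the path itself is missing
    }
    node = node.children.get(char)!;
  }
  return true; // the path exists, whether or not a word ends here
}

With only "ant" and "and" inserted, search("an") returns false but startsWith("an") returns true, which is exactly the distinction the isEndOfWord flag encodes.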

Try different searches below. Search for "ant" (success), then "an" (fails: path exists but node isn't marked), then "any" (fails: no 'y' edge). Step through character by character to see exactly where each search succeeds or fails.

[Interactive demo: Search Step-by-Step.]

Searching is just walking a path. But how do we build those paths in the first place?

Building new paths

Insertion follows the same path-walking logic, but creates nodes when needed. Start at the root. For each character, check if a child exists. If not, create one. Move to the child (existing or newly created). After all characters, mark the final node as "end of word".

// Inside class Trie:
insert(word: string): void {
  let node = this.root;

  for (const char of word) {
    // Create child if it doesn't exist
    if (!node.children.has(char)) {
      node.children.set(char, new TrieNode());
    }
    node = node.children.get(char)!;
  }

  // Mark this node as end of a word
  node.isEndOfWord = true;
}

Three scenarios show how this works in practice.

When extending an existing path (insert "anthem" into a trie containing "ant"), we follow 'a', 'n', 't' (all exist), then create new edges for 'h', 'e', 'm'. We reused the "ant" prefix and only created 3 new nodes.

When branching from an existing path (insert "and" into a trie containing "ant"), we follow 'a', 'n' (both exist), then create 'd' as a new branch from the 'n' node. Just one new node.

When marking an existing node (insert "an" into a trie containing "ant"), we follow 'a', 'n' (both exist), and simply mark the 'n' node as end of word. No new nodes at all.

Time complexity: O(m) where m = word length.
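
Here are those three scenarios as a usage sketch, assuming the insert and search methods above are attached to the Trie class from earlier:

const trie = new Trie();
trie.insert("ant");     // creates 'a' → 'n' → 't', marks 't' as a word

trie.insert("anthem");  // reuses 'a', 'n', 't'; creates 'h', 'e', 'm'
trie.insert("and");     // reuses 'a', 'n'; creates just the 'd' branch
trie.insert("an");      // creates nothing, only marks the existing 'n'

trie.search("an");      // true: the 'n' node is now flagged
trie.search("anth");    // false: path exists, but no word ends there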

Start with "ant" already in the trie below. Insert "and" and watch how 'a' and 'n' are reused. Insert "a" itself and see that no new nodes are created at all.

[Interactive demo: Insertion Step-by-Step, starting from a trie containing "ant".]

We can walk paths and build paths. But the real payoff comes from something neither BSTs nor sorted arrays can do efficiently.

The killer feature: prefix queries

This is where tries demolish every other data structure.

In a BST, finding all words starting with "an" requires traversing the entire tree and checking each word's prefix. That's O(n × m) in the worst case: checking all 170,000 words, comparing up to 6 characters each. For autocomplete on every keystroke, that's way too slow.

In a trie, words with the same prefix are already grouped together. They're all in the same subtree. Finding them is a two-step process: walk to the node representing the prefix (O(p) where p = prefix length), then collect all complete words in that subtree via DFS.

// Inside class Trie:
autocomplete(prefix: string): string[] {
  // Step 1: Navigate to the prefix node
  let node = this.root;
  for (const char of prefix) {
    if (!node.children.has(char)) {
      return []; // Prefix doesn't exist
    }
    node = node.children.get(char)!;
  }

  // Step 2: Collect all words from this subtree
  const results: string[] = [];

  function collect(node: TrieNode, currentWord: string) {
    if (node.isEndOfWord) {
      results.push(currentWord);
    }
    for (const [char, child] of node.children) {
      collect(child, currentWord + char);
    }
  }

  collect(node, prefix);
  return results;
}

Consider finding all words starting with "an" in a trie containing ["a", "an", "and", "ant", "anthem", "antique", "any", "at"]. Navigate root → 'a' → 'n' (2 steps). DFS from the 'n' node finds "an", "and", "ant", "anthem", "antique", "any". The 't' branch (for "at") is never visited at all.
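
As a usage sketch (assuming the autocomplete method above lives on the earlier Trie class), that example looks like this:

const trie = new Trie();
for (const w of ["a", "an", "and", "ant", "anthem", "antique", "any", "at"]) {
  trie.insert(w);
}

trie.autocomplete("an");
// → ["an", "and", "ant", "anthem", "antique", "any"]
// The "at" branch is never visited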

For a 170,000-word dictionary, prefix "an" matching ~1,000 words:

Approach   Operations    Calculation
BST        ~1,020,000    170,000 words × 6 chars each
Trie       ~6,002        2 (navigate) + 1,000 × 6 (collect matches)

The trie is 170× faster.

Autocomplete isn't a search, it's a traversal. Tries pre-organize words by prefix, so finding all matches means walking a pre-filtered subtree. No scanning, no checking, no wasted comparisons.

Type "an" below and watch all matching words appear instantly. The trie walks to the 'n' node in O(2) time, then collects everything below. Type "ant" to narrow results further.

[Interactive demo: Autocomplete Search. Try typing "a" or "the".]

Click on any node in the trie below to see all words that share that prefix. Every node represents a potential autocomplete query, and the subtree below contains all matching results.

[Interactive demo: Interactive Trie Explorer. Click any node to see all words sharing that prefix.]

The numbers don't lie

Let's put concrete numbers on the BST vs trie comparison.

[Interactive demo: BST vs Trie Comparison. In the BST, "ant" appears in 3 different keys; in the trie, "an" and "ant" are each stored exactly once.]

With natural language dictionaries (high prefix overlap), tries save 30-40% memory and are 17× faster for search. With random strings (low overlap), BSTs can be more space-efficient, but tries are still faster for search. The sweet spot: any text with common prefixes.

Metric                       BST            Trie         Winner
Search time                  O(log n × m)   O(m)         Trie (17× faster for 170k words)
Insert time                  O(log n × m)   O(m)         Trie
Space (high prefix overlap)  ~4.4 MB        ~3 MB        Trie (30% smaller)
Space (low prefix overlap)   ~4.4 MB        ~5+ MB       BST
Prefix queries               O(n × m)       O(p + k×m)   Trie (170× faster)
Sorted traversal             Natural        Extra work   BST

Tries win for natural language dictionaries, prefix-heavy workloads (autocomplete, spell check, search suggestions), real-time constraints (mobile apps, search bars), and large dictionaries where O(m) vs O(log n × m) matters. BSTs might be better for random strings with no prefix overlap, when you need sorted iteration, for very small dictionaries where simplicity wins, or when memory is abundant and prefix queries are rare.

But we've only added words so far. What happens when we need to remove them?

Tearing down paths

Deleting from a trie has a tradeoff: do you want fast or memory-efficient?

When you delete "ant" from a trie that also contains "anthem", do you remove the 't' node? No, "anthem" still needs it. But if "ant" was the only word using that path, leaving empty nodes wastes memory. So you have a choice: fast deletion (just unmark the node) or correct deletion (prune empty branches).

The fast way simply walks to the node and clears the flag:

// Inside class Trie ("delete" is a reserved word for standalone
// functions, but it's a legal method name):
delete(word: string): boolean {
  let node = this.root;
  for (const char of word) {
    if (!node.children.has(char)) return false;
    node = node.children.get(char)!;
  }

  if (!node.isEndOfWord) return false;

  node.isEndOfWord = false; // Just unmark it
  return true;
}

This runs in O(m) but leaves "dangling branches," nodes that have no words below them.

The correct way prunes empty branches by walking back up the tree after unmarking. A node gets deleted only if it's not a complete word AND has no children. The cascade stops when it hits a node that's either a complete word or has other children.

// Inside class Trie:
deleteWithPruning(word: string): void {
  // The helper's boolean means "this child may be pruned",
  // not "the word was deleted", so the root ignores it.
  this.deleteHelper(this.root, word, 0);
}

private deleteHelper(node: TrieNode, word: string, index: number): boolean {
  if (index === word.length) {
    if (!node.isEndOfWord) return false;
    node.isEndOfWord = false;
    return node.children.size === 0; // Can parent delete us?
  }

  const char = word[index];
  const child = node.children.get(char);
  if (!child) return false;

  const shouldDeleteChild = this.deleteHelper(child, word, index + 1);

  if (shouldDeleteChild) {
    node.children.delete(char);
    // Can we also be deleted?
    return !node.isEndOfWord && node.children.size === 0;
  }

  return false;
}

Step through deleting "ant" from a trie containing ["ant", "anthem"]. Watch how pruning stops at the 't' node because "anthem" still needs it.

[Interactive demo: Delete with Pruning: Shared Path, deleting "ant" from a trie containing "ant" and "anthem".]

The 't' node isn't deleted because it has a child 'h' that "anthem" needs. Pruning only removes nodes that are both not marked as complete words and have no children.

Now step through deleting "anthem" from a trie containing only ["anthem"]. This time, every node cascades away.

[Interactive demo: Delete with Pruning: Full Cascade, deleting "anthem" from a trie containing only "anthem".]

Every node from 'm' back to 'a' is deleted because none of them have children or are marked as words. This cascading deletion is what makes pruning memory-efficient.
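
Both deletion scenarios, as a usage sketch with the methods above attached to the Trie class:

const trie = new Trie();
trie.insert("ant");
trie.insert("anthem");

trie.deleteWithPruning("ant");    // 't' survives: "anthem" still needs it
trie.search("anthem");            // true

trie.deleteWithPruning("anthem"); // now every node cascades away
trie.search("anthem");            // false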

The tradeoff is straightforward: if deletions are frequent and memory is constrained, use pruning. If deletions are rare and speed is critical, skip it.

But even with pruning, standard tries have one more inefficiency worth addressing.

Squeezing out more space

Standard tries have a subtle problem: long chains of single-child nodes.

If your trie contains only "anthem", you have a chain: root → a → n → t → h → e → m. That's 6 nodes in a row, and every node before the final 'm' has exactly one child. Why store 6 nodes when we could store 1 node reached by a single edge labeled "anthem"?

A radix trie (also called compressed trie or Patricia trie) collapses these single-child chains.

Standard trie for ["anthem", "antique"]:

root → a → n → t → h → e → m ✓
                 → i → q → u → e ✓

Nodes: root + a + n + t + h + e + m + i + q + u + e = 11 nodes

Radix trie for the same words:

root → "ant" → "hem" ✓
            → "ique" ✓

Nodes: root + "ant" + "hem" + "ique" = 4 nodes. That's 60% fewer nodes.

When you insert a word that requires splitting an edge (say, inserting "ant" into a radix trie with only "anthem"), the "anthem" edge splits into "ant" + "hem", and the "ant" node is marked as end of word.

Before: root → "anthem" ✓
After:  root → "ant" ✓ → "hem" ✓
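
Here's a minimal sketch of a radix-trie node and lookup (RadixNode and radixSearch are illustrative names; insertion with edge splitting is omitted): each edge consumes a whole label at once.

// Radix trie sketch: each edge carries a multi-character label.
// Sibling labels never share a first character, so at most one can match.
class RadixNode {
  isEndOfWord: boolean = false;
  children: Map<string, RadixNode> = new Map(); // label → child
}

function radixSearch(root: RadixNode, word: string): boolean {
  let node = root;
  let rest = word;

  while (rest.length > 0) {
    let advanced = false;
    for (const [label, child] of node.children) {
      if (rest.startsWith(label)) {
        rest = rest.slice(label.length); // consume the whole label at once
        node = child;
        advanced = true;
        break;
      }
    }
    if (!advanced) return false; // no edge label matches the remaining chars
  }

  return node.isEndOfWord;
}

The complexity lives in insertion: it must find the longest common prefix between the new word and an existing label, then split that edge, exactly the "anthem" → "ant" + "hem" split shown above.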

Toggle between Standard Trie and Radix Trie below. Notice how radix compresses the "ant" prefix into a single edge, then branches to "hem" and "ique". This is what production systems actually use.

[Interactive demo: Standard Trie vs Radix Trie. For "abc", the standard trie needs 4 nodes; the radix trie needs 2.]
Aspect          Standard   Radix
Memory          Higher     30-50% less
Complexity      Simpler    More complex
Search time     O(m)       O(m)
Prefix queries  Fast       Still fast

Radix tries save 40-60% memory compared to standard tries. Insertion and deletion are more complex (edge splitting and merging), but search is still O(m). You'll find them in the Linux kernel's radix tree for page cache lookups, Git's packfile index, IP routing tables for CIDR block lookups, and many database indexes.

Bringing it together: the mobile spell checker

Let's solve the original problem. Here's a complete spell checker using a trie:

class SpellChecker {
  private trie: Trie;
 
  constructor(words: string[]) {
    this.trie = new Trie();
    for (const word of words) {
      this.trie.insert(word.toLowerCase());
    }
  }
 
  // Is this word spelled correctly?
  check(word: string): boolean {
    return this.trie.search(word.toLowerCase());
  }
 
  // Suggest corrections for a misspelled word
  suggest(misspelled: string): string[] {
    const word = misspelled.toLowerCase();
 
    // Find the longest prefix of the input that exists in the trie
    const prefix = this.longestValidPrefix(word);
    if (!prefix) return [];
 
    // Return all words starting with that prefix
    return this.trie.autocomplete(prefix).slice(0, 10);
  }
 
  // Longest leading substring of `word` that exists as a path in the trie
  private longestValidPrefix(word: string): string | null {
    let node = this.trie.root;
    let depth = 0;
 
    for (const char of word) {
      if (!node.children.has(char)) break;
      node = node.children.get(char)!;
      depth++;
    }
 
    return depth > 0 ? word.slice(0, depth) : null;
  }
}

Usage:

const checker = new SpellChecker(englishDictionary); // 170,000 words
 
// User types "teh"
checker.check("teh");     // false
checker.suggest("teh");   // ["the", "tea", "ten", "team", ...]
 
// User types "antiq"
checker.check("antiq");   // false (incomplete word)
checker.suggest("antiq"); // ["antique", "antiquity", "antiquarian", ...]

Performance with a 170,000-word dictionary:

Operation                    Trie                       BST (for comparison)
Load dictionary              ~100 ms                    ~150 ms
Memory usage                 ~1.7 MB (radix: ~1.2 MB)   ~4.4 MB
Single lookup                ~1 μs                      ~100 μs
Autocomplete (1000 results)  ~100 μs                    ~100 ms

It fits in 2 MB (well under the 100 MB budget), responds in < 1 ms per keystroke (well under the 10 ms deadline), works completely offline, and supports adding custom words with a single insert() call.
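
Supporting user dictionaries needs no extra machinery. A hypothetical addWord wrapper (not part of the class above) is one line:

// Inside class SpellChecker (hypothetical addition):
addWord(word: string): void {
  this.trie.insert(word.toLowerCase());
}

// A user adds a brand name; it's recognized on the next check
checker.addWord("Zelda");
checker.check("zelda"); // true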

Tries transform the mobile spell checker from an impossible problem into a trivial one. By exploiting prefix sharing, 170,000 words fit in 1.7 MB and search takes microseconds. The right data structure makes hard problems easy.

Where tries shine (and where they don't)

Use tries when:

  • You need prefix queries (autocomplete, spell check, search suggestions)
  • Dictionary is large (10,000+ words)
  • "Starts with" queries are common
  • Memory is constrained and prefixes overlap (mobile, embedded)
  • Dictionary is dynamic (add/remove words)
  • Lookup time must not depend on dictionary size

Don't use tries when:

  • Dictionary is tiny (<100 words): a hash set is simpler
  • Only exact match, never prefix search: a hash table is O(1)
  • Keys are integers or numbers: a BST or hash table is better
  • Need sorted traversal: a BST is naturally sorted
  • Random strings with no prefix overlap: a BST uses less memory

Real-world examples: Google Search autocomplete, IDE code completion (IntelliSense), phone keyboard autocorrect, IP routing with radix tries for CIDR block matching, DNS domain name lookups, spell checkers in word processors and browsers, and DNA sequence databases in genomics.

The journey we took

We started with a mobile spell checker: 170,000 words, 10ms deadline, 100MB budget. Binary search on a sorted array gave us O(log n × m), but autocomplete was too slow. BSTs had the same time complexity and wasted space duplicating shared prefixes. The solution was to store prefixes as paths in a trie, giving us O(m) search independent of dictionary size.

Three insights emerged along the way.

Prefix sharing saves memory. In natural language text where words share common prefixes (un-, re-, anti-, pre-), tries save 30-50% memory compared to storing complete words in each node.

O(m) time is independent of dictionary size. Whether your dictionary has 100 words or 100 million, searching "cat" always takes 3 character checks. This makes tries uniquely suited for large, growing dictionaries.

Prefix queries are traversals, not searches. Tries pre-organize words by prefix. Autocomplete for "th" doesn't search. It walks to the "th" node and collects everything below. That's why it's 100-1000× faster than scanning.

The next time you type in a search box and see suggestions appear before you finish typing, that's a trie working. A data structure designed around a simple insight: strings have prefixes, and we can share them.