Hindley-Milner Type Inference

Some programming languages make you write types everywhere:

Map<String, List<Integer>> buildIndex(List<String> words) {
    Map<String, List<Integer>> index = new HashMap<>();
    // ... 50 more lines of type annotations
}

Others let you skip the types entirely, but crash at runtime when things go wrong:

def build_index(words):
    index = {}
    # ... works until someone passes None

But a third category does something remarkable: they figure out all the types automatically and catch errors before your code runs. No annotations needed. No runtime type errors possible.

buildIndex words =
    -- Compiler infers: [String] -> Map String [Int]
    -- Type error? Won't compile.

How? The answer is the Hindley-Milner (HM) type inference algorithm. It's one of the most elegant algorithms in computer science, and it treats your code as a constraint puzzle: every operation adds a constraint, and solving those constraints reveals the types.

In this post, we're going to build this algorithm from scratch. We'll start with simple expressions, add function types, then tackle the tricky part: polymorphism (generics that work automatically). Along the way, we'll see:

Why type inference is constraint solving at its core
How unification makes two types compatible
What let-polymorphism means and why it matters
How the algorithm guarantees to find the most general type
Why this is decidable when full polymorphic lambda calculus isn't

Think of this as "how arrays work" but for type systems. We're not learning a language. We're understanding the algorithm that powers ML, OCaml, Haskell, Rust's inference, and more.

Let's build it.

The Problem: Types Hide Until Runtime

Consider this Python function:

def calculate_discount(price, percent):
    discount = price * percent
    return price - discount
 
# Works fine
calculate_discount(100, 0.1)  # Returns 90.0
 
# Silently wrong
calculate_discount("100", "0.1")  # Returns "100" - no error!
 
# Crashes later
result = calculate_discount(100, "10%")
# TypeError: unsupported operand type(s) for *: 'int' and 'str'

The bug doesn't reveal itself until runtime, potentially days after deployment when a user enters "10%" instead of 0.1.

What We Need

To catch this bug earlier, we need to:

Track what type each variable is
Check operations are valid (can't multiply int and string)
Do this before running the code

Python's approach: write type hints manually, run mypy separately.

OCaml's approach: the compiler figures out types automatically and refuses to compile invalid code.

Let's see how we might invent this ourselves.

Building a Type Checker: First Attempt

Imagine we're designing a simple type checker. We'll start with basic types:

int - integers like 42
string - text like "hello"
bool - true or false

Rule 1: Literal Values Have Obvious Types

When we see a literal value, its type is clear:

42        (* int *)
"hello"   (* string *)
true      (* bool *)

Easy enough. Let's formalize this:

Type Rule: Literals
42 : int
"hello" : string
true : bool

Rule 2: Operations Constrain Types

When we see an operation, it constrains what types are allowed:

5 + 3        (* Both must be int, result is int *)
"hi" ^ "!"   (* Both must be string, result is string *)
5 > 3        (* Both must be int, result is bool *)

Now we have a problem to solve:

The Problem: What if we don't know the types yet? How do we check if operations are valid when variables could be anything?

let add a b = a + b

We don't know what a and b are, but we know:

The + operator requires both arguments to be int
Therefore, a : int and b : int
And the result must be int

The operation constrains the unknown types!

Type Inference: Solving a Puzzle

Here's the key insight: type inference is constraint solving.

Try It Out

Step through the visualization above. Notice how each operation adds a constraint, and by the end the compiler has figured out the complete type. The function never needed type annotations!

Key Insight

Type inference works by collecting constraints from how you use variables, then solving those constraints like a puzzle. You never write the types. The compiler figures them out by analyzing your code.

Why Python Can't Do This

You might wonder: why doesn't Python infer types like this?

The answer: Python's semantics make type inference impossible in the general case.

Python Allows Type Confusion

def add(a, b):
    return a + b
 
add(1, 2)         # Returns 3 (int)
add("hi", "!")    # Returns "hi!" (string)
add([1], [2])     # Returns [1, 2] (list)

The same function works with multiple types because + is overloaded. To infer a single type, the compiler would need to see every call site. That's impractical.

Python Mutates Types

x = 5           # x is int
x = "hello"     # Now x is string!
x = [1, 2, 3]   # Now x is list!

Python allows variables to change type. OCaml doesn't:

let x = 5 in      (* x : int *)
let x = "hello"   (* This creates NEW binding, doesn't change x *)

In OCaml, x stays int. The second let shadows it with a different variable that happens to have the same name.

Key Insight: Type inference requires type stability: variables can't change type, and operations can't work with multiple types. This is a fundamental design choice, not a limitation. Python chose flexibility; OCaml chose safety.

The Power of Generics (Polymorphism)

Now for the clever part. What about functions that work with ANY type?

OCaml automatically makes functions generic when they don't constrain types. Try it below:

Try It Out

Click different type buttons to see how the same function works with integers, strings, booleans, and even lists. The type variable α (alpha) gets instantiated to whatever type you pass in.

Key Insight

Python requires you to use TypeVar and complex annotations for generic functions. OCaml infers polymorphism automatically. If a function doesn't constrain the type, it works with all types.

Catching Bugs: Type Errors Before Runtime

Now we get to the payoff. Type inference prevents entire categories of bugs:

Try It Out

Click through the examples above. For each error, try to figure out what the type checker caught, then click "Show Fix" to see the corrected code. Notice how the compiler knows the exact problem.

Key Insight

Type inference catches entire categories of bugs at compile time: null references, wrong argument counts, type mismatches, missing cases. These bugs can't exist in compiled OCaml code.

How It Actually Works: Hindley-Milner

The type inference algorithm OCaml uses is called Hindley-Milner. Here's how it works step-by-step:

Try It Out

Click through each step or use "Auto Play" to watch the algorithm unfold. Notice how the code, variables, and constraints evolve together until we have the complete type.

Key observation: This algorithm is fast (linear time for most programs) and complete (finds the most general type possible).

The Secret Sauce: Unification

The key step in Hindley-Milner is unification, which makes two types compatible by replacing variables. Try it:

Key Insight

Unification is the core mechanism that makes type inference work. By systematically replacing type variables with concrete types or other variables, the algorithm figures out what types make the code consistent.

What You Can't Do (And Why That's Good)

Type inference has limitations by design. Some things you simply can't write:

1. Truly Dynamic Code

Python:

def mystery(x):
    if isinstance(x, int):
        return x + 1
    elif isinstance(x, str):
        return x.upper()
    else:
        return None

This can't exist in OCaml because the return type changes based on the input type. And that's good! This function is hard to reason about and prone to bugs.

Instead, you'd use explicit types:

type input = Int of int | Str of string
 
let mystery x =
  match x with
  | Int n -> Int (n + 1)
  | Str s -> Str (String.uppercase_ascii s)

Now the type system tracks what you're doing.

2. Implicit Type Conversions

Python:

result = 5 + 5.0  # int + float = float (automatic conversion)

OCaml:

let result = 5 + 5.0
(* COMPILE ERROR: This expression has type float
   but an expression was expected of type int *)

You must be explicit:

let result = 5.0 +. 5.0  (* Different operator for floats *)
let result = float_of_int 5 +. 5.0  (* Explicit conversion *)

This seems annoying until you realize it prevents bugs where conversions lose precision or behave unexpectedly.

Warning: Type inference removes boilerplate, but it doesn't remove precision. You still need to think about types. The compiler just checks your thinking automatically.

Test Your Understanding

Want to check your knowledge? Try the puzzle below. It pulls together everything you've learned about type inference:

Summary: What We Built

Let's recap our type inference system:

Literals have obvious types - 42 is int, "hi" is string
Operations constrain types - + requires both arguments to be int
Unification solves constraints - Type variables get replaced with concrete types
Polymorphism is automatic - Functions generic over types that aren't constrained
Errors caught at compile time - Invalid operations prevented before code runs

This is Hindley-Milner type inference. You get:

✓ No type annotations required
✓ Complete type safety
✓ Zero runtime overhead
✓ Excellent error messages
✓ Fast compilation

Key Insight: Type inference isn't magic. It's constraint solving. The compiler collects facts about how you use variables, then solves those constraints to determine types. You never write types, but you get all the safety of static typing.

Next Steps

Want to try OCaml yourself? Here's how to start:

Install OCaml

# macOS
brew install opam
opam init
opam install utop
 
# Linux
apt-get install opam
opam init
opam install utop

Try It

utop  # Interactive REPL
 
# Type some expressions:
let add x y = x + y;;
# Notice the inferred type: int -> int -> int
 
let identity x = x;;
# Notice: 'a -> 'a (polymorphic!)

Learn More

Real World OCaml (free online) - Comprehensive practical guide
Try OCaml (try.ocamlpro.com) - Browser-based playground
OCaml.org - Official documentation and tutorials

Final Thoughts

Type inference changed how I think about programming. Coming from Python, I thought types were overhead, something you add when the code is "done" to make it "safer."

OCaml showed me that types are the design. When you think about types first, you design better APIs, avoid whole categories of bugs, and refactor with confidence. The compiler figures them out without you writing the types.

Python is still wonderful for scripts, data analysis, and rapid prototyping. But for systems where correctness matters, where you need to refactor fearlessly, where bugs cost real money, type inference is a superpower.

Give OCaml a try. The type system might frustrate you at first. "Why won't this compile?!" But then you'll realize: the compiler is catching bugs you didn't know existed. And you'll wonder how you ever lived without it.

Happy type checking!