Hindley-Milner Type Inference
Some programming languages make you write types everywhere:
Map<String, List<Integer>> buildIndex(List<String> words) {
Map<String, List<Integer>> index = new HashMap<>();
// ... 50 more lines of type annotations
}Others let you skip the types entirely, but crash at runtime when things go wrong:
def build_index(words):
index = {}
# ... works until someone passes NoneBut a third category does something remarkable: they figure out all the types automatically and catch errors before your code runs. No annotations needed. No runtime type errors possible.
buildIndex words =
-- Compiler infers: [String] -> Map String [Int]
-- Type error? Won't compile.How? The answer is the Hindley-Milner (HM) type inference algorithm. It's one of the most elegant algorithms in computer science, and it treats your code as a constraint puzzle: every operation adds a constraint, and solving those constraints reveals the types.
In this post, we're going to build this algorithm from scratch. We'll start with simple expressions, add function types, then tackle the tricky part: polymorphism (generics that work automatically). Along the way, we'll see:
- Why type inference is constraint solving at its core
- How unification makes two types compatible
- What let-polymorphism means and why it matters
- How the algorithm guarantees to find the most general type
- Why this is decidable when full polymorphic lambda calculus isn't
Think of this as "how arrays work" but for type systems. We're not learning a language. We're understanding the algorithm that powers ML, OCaml, Haskell, Rust's inference, and more.
Let's build it.
The Problem: Types Hide Until Runtime
Consider this Python function:
def calculate_discount(price, percent):
discount = price * percent
return price - discount
# Works fine
calculate_discount(100, 0.1) # Returns 90.0
# Silently wrong
calculate_discount("100", "0.1") # Returns "100" - no error!
# Crashes later
result = calculate_discount(100, "10%")
# TypeError: unsupported operand type(s) for *: 'int' and 'str'The bug doesn't reveal itself until runtime, potentially days after deployment when a user enters "10%" instead of 0.1.
What We Need
To catch this bug earlier, we need to:
- Track what type each variable is
- Check operations are valid (can't multiply int and string)
- Do this before running the code
Python's approach: write type hints manually, run mypy separately.
OCaml's approach: the compiler figures out types automatically and refuses to compile invalid code.
Let's see how we might invent this ourselves.
Building a Type Checker: First Attempt
Imagine we're designing a simple type checker. We'll start with basic types:
int- integers like 42string- text like "hello"bool- true or false
Rule 1: Literal Values Have Obvious Types
When we see a literal value, its type is clear:
42 (* int *)
"hello" (* string *)
true (* bool *)Easy enough. Let's formalize this:
Rule 2: Operations Constrain Types
When we see an operation, it constrains what types are allowed:
5 + 3 (* Both must be int, result is int *)
"hi" ^ "!" (* Both must be string, result is string *)
5 > 3 (* Both must be int, result is bool *)Now we have a problem to solve:
The Problem: What if we don't know the types yet? How do we check if operations are valid when variables could be anything?
let add a b = a + bWe don't know what a and b are, but we know:
- The
+operator requires both arguments to beint - Therefore,
a : intandb : int - And the result must be
int
The operation constrains the unknown types!
Type Inference: Solving a Puzzle
Here's the key insight: type inference is constraint solving.
Step through the visualization above. Notice how each operation adds a constraint, and by the end the compiler has figured out the complete type. The function never needed type annotations!
Type inference works by collecting constraints from how you use variables, then solving those constraints like a puzzle. You never write the types. The compiler figures them out by analyzing your code.
Why Python Can't Do This
You might wonder: why doesn't Python infer types like this?
The answer: Python's semantics make type inference impossible in the general case.
Python Allows Type Confusion
def add(a, b):
return a + b
add(1, 2) # Returns 3 (int)
add("hi", "!") # Returns "hi!" (string)
add([1], [2]) # Returns [1, 2] (list)The same function works with multiple types because + is overloaded. To infer a single type, the compiler would need to see every call site. That's impractical.
Python Mutates Types
x = 5 # x is int
x = "hello" # Now x is string!
x = [1, 2, 3] # Now x is list!Python allows variables to change type. OCaml doesn't:
let x = 5 in (* x : int *)
let x = "hello" (* This creates NEW binding, doesn't change x *)In OCaml, x stays int. The second let shadows it with a different variable that happens to have the same name.
Key Insight: Type inference requires type stability: variables can't change type, and operations can't work with multiple types. This is a fundamental design choice, not a limitation. Python chose flexibility; OCaml chose safety.
The Power of Generics (Polymorphism)
Now for the clever part. What about functions that work with ANY type?
OCaml automatically makes functions generic when they don't constrain types. Try it below:
Click different type buttons to see how the same function works with integers, strings, booleans, and even lists. The type variable α (alpha) gets instantiated to whatever type you pass in.
Python requires you to use TypeVar and complex annotations for generic functions. OCaml infers polymorphism automatically. If a function doesn't constrain the type, it works with all types.
Catching Bugs: Type Errors Before Runtime
Now we get to the payoff. Type inference prevents entire categories of bugs:
Click through the examples above. For each error, try to figure out what the type checker caught, then click "Show Fix" to see the corrected code. Notice how the compiler knows the exact problem.
Type inference catches entire categories of bugs at compile time: null references, wrong argument counts, type mismatches, missing cases. These bugs can't exist in compiled OCaml code.
How It Actually Works: Hindley-Milner
The type inference algorithm OCaml uses is called Hindley-Milner. Here's how it works step-by-step:
Click through each step or use "Auto Play" to watch the algorithm unfold. Notice how the code, variables, and constraints evolve together until we have the complete type.
Key observation: This algorithm is fast (linear time for most programs) and complete (finds the most general type possible).
The Secret Sauce: Unification
The key step in Hindley-Milner is unification, which makes two types compatible by replacing variables. Try it:
Unification is the core mechanism that makes type inference work. By systematically replacing type variables with concrete types or other variables, the algorithm figures out what types make the code consistent.
What You Can't Do (And Why That's Good)
Type inference has limitations by design. Some things you simply can't write:
1. Truly Dynamic Code
Python:
def mystery(x):
if isinstance(x, int):
return x + 1
elif isinstance(x, str):
return x.upper()
else:
return NoneThis can't exist in OCaml because the return type changes based on the input type. And that's good! This function is hard to reason about and prone to bugs.
Instead, you'd use explicit types:
type input = Int of int | Str of string
let mystery x =
match x with
| Int n -> Int (n + 1)
| Str s -> Str (String.uppercase_ascii s)Now the type system tracks what you're doing.
2. Implicit Type Conversions
Python:
result = 5 + 5.0 # int + float = float (automatic conversion)OCaml:
let result = 5 + 5.0
(* COMPILE ERROR: This expression has type float
but an expression was expected of type int *)You must be explicit:
let result = 5.0 +. 5.0 (* Different operator for floats *)
let result = float_of_int 5 +. 5.0 (* Explicit conversion *)This seems annoying until you realize it prevents bugs where conversions lose precision or behave unexpectedly.
Warning: Type inference removes boilerplate, but it doesn't remove precision. You still need to think about types. The compiler just checks your thinking automatically.
Test Your Understanding
Want to check your knowledge? Try the puzzle below. It pulls together everything you've learned about type inference:
Summary: What We Built
Let's recap our type inference system:
- Literals have obvious types -
42isint,"hi"isstring - Operations constrain types -
+requires both arguments to beint - Unification solves constraints - Type variables get replaced with concrete types
- Polymorphism is automatic - Functions generic over types that aren't constrained
- Errors caught at compile time - Invalid operations prevented before code runs
This is Hindley-Milner type inference. You get:
- ✓ No type annotations required
- ✓ Complete type safety
- ✓ Zero runtime overhead
- ✓ Excellent error messages
- ✓ Fast compilation
Key Insight: Type inference isn't magic. It's constraint solving. The compiler collects facts about how you use variables, then solves those constraints to determine types. You never write types, but you get all the safety of static typing.
Next Steps
Want to try OCaml yourself? Here's how to start:
Install OCaml
# macOS
brew install opam
opam init
opam install utop
# Linux
apt-get install opam
opam init
opam install utopTry It
utop # Interactive REPL
# Type some expressions:
let add x y = x + y;;
# Notice the inferred type: int -> int -> int
let identity x = x;;
# Notice: 'a -> 'a (polymorphic!)Learn More
- Real World OCaml (free online) - Comprehensive practical guide
- Try OCaml (try.ocamlpro.com) - Browser-based playground
- OCaml.org - Official documentation and tutorials
Final Thoughts
Type inference changed how I think about programming. Coming from Python, I thought types were overhead, something you add when the code is "done" to make it "safer."
OCaml showed me that types are the design. When you think about types first, you design better APIs, avoid whole categories of bugs, and refactor with confidence. The compiler figures them out without you writing the types.
Python is still wonderful for scripts, data analysis, and rapid prototyping. But for systems where correctness matters, where you need to refactor fearlessly, where bugs cost real money, type inference is a superpower.
Give OCaml a try. The type system might frustrate you at first. "Why won't this compile?!" But then you'll realize: the compiler is catching bugs you didn't know existed. And you'll wonder how you ever lived without it.
Happy type checking!