How encryption actually works
Alice wants to send Bob a secret message. Eve is watching their entire conversation. How do they communicate privately?
This problem has been around for millennia. We'll start with one simple operation and build up to the system that protects every password, every bank transfer, and every private message on the internet.
No advanced math. Just a few building blocks and the problems they solve.
The foundation: XOR
Almost every modern cipher uses one operation somewhere inside it: XOR.
XOR compares two bits and asks: "Are these different?" If yes, output 1. If no, output 0. Click the bits below to try it.
What makes XOR useful for encryption is that it's reversible. If you XOR some data with a secret value, XORing the result with that same value gives you the original data back. The same operation encrypts and decrypts.
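Here is a minimal sketch of that reversibility in Python, where `^` is the XOR operator. The values of `data` and `secret` are arbitrary examples chosen for illustration.

```python
# Truth table: XOR outputs 1 only when the two bits differ.
for a in (0, 1):
    for b in (0, 1):
        print(a, "XOR", b, "=", a ^ b)

data = 0b10110010    # example byte (arbitrary)
secret = 0b01101100  # example secret value (arbitrary)

scrambled = data ^ secret       # "encrypt"
restored = scrambled ^ secret   # XORing with the same value undoes it

print(f"{data:08b} -> {scrambled:08b} -> {restored:08b}")
```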
Building your first encryption system
To encrypt text, you first convert each character into bits, then XOR those bits with a key. Type a message and a key below to see this in action.
Your readable message becomes gibberish. XOR it with the same key again, and the original message comes back.
When the key is truly random, at least as long as the message, and never reused, this scheme is called a one-time pad, and it's mathematically unbreakable. Without the key, every possible message of that length fits the ciphertext equally well, so no computer, no matter how powerful, can crack it.
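A small sketch of the idea, assuming the message is encoded as UTF-8 bytes. The helper name `xor_bytes` and the sample message are made up for this example; the key comes from Python's `secrets` module.

```python
import secrets

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each byte of data with the matching byte of key."""
    if len(key) < len(data):
        raise ValueError("a one-time pad key must be at least as long as the message")
    return bytes(d ^ k for d, k in zip(data, key))

message = "Meet at noon".encode("utf-8")
pad = secrets.token_bytes(len(message))   # random key, used once, same length as the message

ciphertext = xor_bytes(message, pad)      # looks like random bytes
recovered = xor_bytes(ciphertext, pad)    # the same operation decrypts

print(ciphertext.hex())
print(recovered.decode("utf-8"))
```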
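But there's a practical problem: the key has to be as long as the message, and you have to get it to the other person secretly before you can send anything.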
We need encryption that works with short keys.
Making short keys work
Algorithms like AES solve this by taking a short key (just 256 bits, or 32 bytes) and scrambling data through repeated rounds of substitution and shuffling. Run in a mode such as counter (CTR) mode, AES turns that key into a long stream of pseudorandom bits. The output looks random, but it's entirely determined by its inputs, so the same key (and the same per-message starting value) always produces the same stream. You XOR this stream with your message, and it works for messages of any length.
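To make the "short key in, long stream out" idea concrete, here is a toy sketch. It is not AES: Python's standard library has no AES, so SHA-256 over a counter stands in as the pseudorandom generator. The function names and the example key are assumptions made for illustration.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Stretch a short key into `length` pseudorandom bytes.
    Stand-in for a real cipher: hashes key + counter, block after block."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def stream_xor(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt by XORing data with the key-derived stream."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))

key = bytes(range(32))  # 32 bytes = 256 bits; a real key must be random
message = b"A message much longer than the 32-byte key that protects it. " * 3

ciphertext = stream_xor(key, message)
print(stream_xor(key, ciphertext) == message)
```

Encrypting two messages this way with the same key would reuse the same stream, which is why real modes mix a unique per-message value into the stream.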
But using a short key introduces new problems.
Problem 1: Patterns leak through
You can use an unbreakable encryption algorithm and still leak your secrets. The problem isn't the algorithm itself but how you apply it. Type some repeating text below, like "AAABBBCCC".
In ECB mode (Electronic Codebook), each block of text is encrypted independently with the same key. The same input block always produces the same output. Patterns in your data become patterns in your ciphertext.
Encrypt an image in ECB mode, and you can still see the image. The famous "ECB penguin" demonstrates this: the encrypted image clearly shows a penguin shape.
CBC mode (Cipher Block Chaining) fixes this by XORing each encrypted block into the next block of plaintext before that block is encrypted, starting from a random initialization vector. The same letter in different positions produces different outputs because each block depends on everything before it. The pattern disappears.
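The difference between the two modes can be sketched in a few lines. The block cipher here is a hypothetical 4-round Feistel network built on SHA-256, standing in for AES purely for illustration. It is not secure, and all function names are assumptions. Only encryption is shown, and the input length is a multiple of the block size, so no padding is needed.

```python
import hashlib

BLOCK = 16  # bytes per block, matching AES's 128-bit block size

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Hypothetical stand-in for AES: a 4-round Feistel network. Not secure."""
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for rnd in range(4):
        mask = hashlib.sha256(key + bytes([rnd]) + right).digest()[:BLOCK // 2]
        left, right = right, xor(left, mask)
    return left + right

def split_blocks(data: bytes) -> list:
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]

def ecb_encrypt(key: bytes, data: bytes) -> bytes:
    # Every block is encrypted on its own: equal inputs give equal outputs.
    return b"".join(toy_block_encrypt(key, b) for b in split_blocks(data))

def cbc_encrypt(key: bytes, iv: bytes, data: bytes) -> bytes:
    # Each plaintext block is XORed with the previous ciphertext block first.
    out, prev = [], iv
    for b in split_blocks(data):
        prev = toy_block_encrypt(key, xor(b, prev))
        out.append(prev)
    return b"".join(out)

key = b"k" * 32
iv = bytes(range(BLOCK))       # fixed so the run is repeatable; real IVs are random
plaintext = b"A" * BLOCK * 3   # three identical blocks

ecb_blocks = split_blocks(ecb_encrypt(key, plaintext))
cbc_blocks = split_blocks(cbc_encrypt(key, iv, plaintext))
print("distinct ECB blocks:", len(set(ecb_blocks)))
print("distinct CBC blocks:", len(set(cbc_blocks)))
```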
The demo uses many colors intentionally to make the pattern visible. In real encryption, each "color" would be a block of pseudorandom bytes.
Problem 2: Can short keys be guessed?
If our key is only 256 bits, can't an attacker just try every possible key?
Each additional bit doubles the number of possible keys:
- 64 bits: at a trillion guesses per second, every key can be tried in about seven months
- 128 bits: at the same rate, roughly a billion times longer than the universe has existed
- 256 bits: forget about it
This exponential growth is why key length matters. "256-bit AES" isn't marketing; it's the difference between crackable and impossible.
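The arithmetic behind those estimates is easy to check. This sketch assumes the trillion-guesses-per-second attacker from the list above and takes the age of the universe as roughly 13.8 billion years. The constant and function names are made up for the example.

```python
GUESSES_PER_SECOND = 10**12            # the attacker's speed assumed in the text
SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 1.38e10        # approximate

def years_to_try_all_keys(bits: int) -> float:
    """Years needed to try every key of the given length (worst case)."""
    return 2**bits / GUESSES_PER_SECOND / SECONDS_PER_YEAR

for bits in (64, 128, 256):
    years = years_to_try_all_keys(bits)
    print(f"{bits}-bit key: about {years:.2e} years "
          f"({years / AGE_OF_UNIVERSE_YEARS:.2e} times the age of the universe)")
```

On average an attacker finds the key after trying half the keyspace, which halves these figures and changes nothing about the conclusion.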
So we can encrypt with a short key, and that key is too long to guess. But we still have our original problem.
Sharing secrets in public
For centuries, this seemed unsolvable. To encrypt, you need a shared secret. To share a secret, you need encryption. It's circular.
Then, in 1976, Whitfield Diffie and Martin Hellman figured out that two strangers can agree on a secret while being watched.
Imagine Alice and Bob want to agree on a secret color while Eve watches their entire conversation.
- Alice and Bob agree on a public starting color (Eve sees this)
- Each secretly picks their own private color (Eve doesn't see these)
- Each mixes their private color with the public one and sends the result (Eve sees the mixed colors)
- Each takes what they received and mixes in their private color again
Both arrive at the same final color. Eve saw every message they exchanged, but she can't figure out the result because mixing colors is easy while un-mixing them is practically impossible.
The real math
In practice, Diffie-Hellman uses numbers instead of colors. The "mixing" operation is modular exponentiation, and the "un-mixing" problem is called the discrete logarithm, which is believed to be computationally infeasible for large, well-chosen numbers.
For the small numbers here, you could work it out by hand. In real systems, these numbers have hundreds of digits, and no known method can reverse them in any reasonable amount of time on today's computers.
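Here is the exchange with deliberately tiny numbers, as a sketch. The prime `p = 23`, base `g = 5`, and both private values are a common textbook choice, not parameters taken from any real system. Python's built-in `pow(base, exp, mod)` does the modular exponentiation.

```python
p, g = 23, 5          # public: a small prime and a base (Eve sees these)
a, b = 6, 15          # private: Alice's and Bob's secret numbers

A = pow(g, a, p)      # Alice sends g^a mod p (Eve sees this)
B = pow(g, b, p)      # Bob sends g^b mod p (Eve sees this)

alice_secret = pow(B, a, p)   # (g^b)^a mod p
bob_secret = pow(A, b, p)     # (g^a)^b mod p, the same number

print(A, B, alice_secret, bob_secret)

# With numbers this small, Eve can simply try every exponent.
# That search is exactly what becomes infeasible at real sizes.
recovered_a = next(x for x in range(1, p) if pow(g, x, p) == A)
print("Eve recovers a =", recovered_a)
```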
Two strangers can now establish a shared secret key over a public channel. Once they have that key, they use it with a fast algorithm like AES to encrypt the actual messages.
But we have one more problem.
Proving who you are
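Diffie-Hellman lets two strangers agree on a secret, but it doesn't tell you who you're agreeing with. Eve could sit in the middle, run one exchange with Alice and another with Bob, and read everything that passes through. This is called a man-in-the-middle attack. We need a way for Bob to prove his identity. This requires two new tools.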
Tool 1: Hash functions
A hash function takes any input and produces a fixed-size "fingerprint." Change a single character, and the fingerprint changes completely.
This is the avalanche effect. You can't predict what change will produce what output, and you can't reverse it: given a hash, there is no practical way to work out the original input.
The demo uses a simplified hash for illustration. Real systems use SHA-256 or similar cryptographic hash functions.
Hashes alone don't prove identity, but they let you verify that data hasn't been tampered with. We'll need this property in a moment.
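You can see both properties with SHA-256 from Python's `hashlib`. The two sample strings and the helper names are invented for this sketch.

```python
import hashlib

def fingerprint(text: str) -> str:
    """SHA-256 hash of the text, as 64 hex characters (256 bits)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count how many of the 256 bits differ between two hashes."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")

original = fingerprint("Pay Bob $100")
tampered = fingerprint("Pay Bob $900")   # one character changed

print(original)
print(tampered)
print("bits that differ:", differing_bits(original, tampered), "of 256")

# Tamper check: recompute the fingerprint and compare with the one you trust.
print(fingerprint("Pay Bob $100") == original)
```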
Tool 2: Asymmetric keys
So far, every encryption scheme we've looked at uses one key that both encrypts and decrypts. This is called symmetric encryption. There's another approach: generate two mathematically linked keys that work as a pair.
With asymmetric encryption, the public key encrypts and the private key decrypts. Anyone can encrypt a message to you using your public key, but only you can read it.
Symmetric encryption is fast but requires both sides to have the same key. Asymmetric encryption removes the need to share a secret key in advance, but it's far slower, often by a factor of a thousand or more. Real systems use both: asymmetric to exchange keys, symmetric for the actual data.
RSA is the most famous asymmetric system. It's based on a simple asymmetry: multiplying two large primes is easy, but factoring their product back into those primes is incredibly hard.
For small numbers, factoring is trivial. For 2048-bit RSA, where the number has more than 600 digits, the best known methods would take billions of years on today's computers. The numbers in this demo are intentionally small for illustration.
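A toy version of RSA fits in a few lines. The primes 61 and 53 and the exponent 17 are a standard textbook example, far too small to be secure. Real RSA also pads the message before encrypting, which this sketch leaves out. `pow(e, -1, phi)` computes a modular inverse and needs Python 3.8 or later.

```python
p, q = 61, 53                 # two secret primes (tiny, for illustration)
n = p * q                     # public modulus
phi = (p - 1) * (q - 1)
e = 17                        # public exponent
d = pow(e, -1, phi)           # private exponent: e * d = 1 (mod phi)

message = 65                  # a message encoded as a number smaller than n
ciphertext = pow(message, e, n)     # anyone can do this with the public key (n, e)
decrypted = pow(ciphertext, d, n)   # only the holder of d can undo it

print(n, d, ciphertext, decrypted)

def factor(modulus: int) -> tuple:
    """Trial division: instant for small numbers, hopeless for 600-digit ones."""
    f = 2
    while modulus % f:
        f += 1
    return f, modulus // f

print(factor(n))   # recovering p and q breaks the key
```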
Putting them together: digital signatures
Now we have both pieces we need. Hash functions create a fingerprint of any data. Asymmetric keys let you encrypt with one key and decrypt with the other. If you combine them, you can prove identity.
With asymmetric keys like RSA's, you can work in reverse: encrypt with your private key, and anyone can decrypt with your public key. As a way of keeping secrets this is useless, since anyone can read the result. But it proves you wrote it, because only you have the private key.
A digital signature works like this:
- Hash your document (creates a fingerprint)
- Encrypt the hash with your private key (this is the signature)
- Anyone can decrypt with your public key and compare hashes
If the hashes match, two things are proven:
- Authenticity: only you could have created that signature
- Integrity: the document hasn't changed since you signed it
Now Bob can prove he's Bob. He signs a message with his private key. Alice verifies it with his public key. Eve can't forge it because she doesn't have Bob's private key.
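The three signature steps above can be sketched with toy RSA numbers. Everything here is an assumption made for illustration: the key is built from two known Mersenne primes and is far too small for real use, and the hash is simply reduced modulo `N`, where real signature schemes apply a padding scheme instead.

```python
import hashlib

# Bob's toy key pair (hypothetical values; real keys use much larger random primes).
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537                                # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))        # private exponent (Python 3.8+)

def digest(document: bytes) -> int:
    """Step 1: hash the document, then squeeze the hash below N (toy shortcut)."""
    return int.from_bytes(hashlib.sha256(document).digest(), "big") % N

def sign(document: bytes) -> int:
    """Step 2: transform the hash with the private key. Only Bob can do this."""
    return pow(digest(document), D, N)

def verify(document: bytes, signature: int) -> bool:
    """Step 3: undo the signature with the public key and compare hashes."""
    return pow(signature, E, N) == digest(document)

contract = b"I, Bob, agree to pay Alice $100."
signature = sign(contract)

print(verify(contract, signature))                               # genuine document
print(verify(b"I, Bob, agree to pay Alice $900.", signature))    # altered document
```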
The chain of trust
Digital signatures let Bob prove his identity, but only if Alice already has Bob's public key. How does she get it? If Eve intercepts that too, we're back to the man-in-the-middle problem.
The solution is a chain of trust.
Bob's public key comes wrapped in a certificate: a statement saying "this key belongs to Bob," signed by someone Alice already trusts. That signer's key might be signed by someone else, forming a chain back to a root certificate authority that's pre-installed in Alice's browser.
If any signature in the chain is invalid, the browser shows a warning.
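Here is a rough sketch of walking such a chain. It is not how real certificates are encoded. The `Certificate` fields, the names "Root CA" and "Intermediate CA", and the toy RSA keys built from small known primes are all assumptions made for illustration. It also leaves out expiry dates, revocation, and other checks a browser performs.

```python
import hashlib
from dataclasses import dataclass

E = 65537  # public exponent shared by all toy keys

def make_key(p: int, q: int):
    """Toy RSA key pair from two primes: returns (public, private)."""
    n = p * q
    return (n, E), (n, pow(E, -1, (p - 1) * (q - 1)))

def digest(data: bytes, n: int) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n

@dataclass
class Certificate:
    subject: str
    public_key: tuple
    issuer: str
    signature: int = 0

    def body(self) -> bytes:
        # The statement being signed: "this key belongs to this subject".
        return f"{self.subject}|{self.public_key}|{self.issuer}".encode()

def issue(subject, public_key, issuer, issuer_private) -> Certificate:
    cert = Certificate(subject, public_key, issuer)
    n, d = issuer_private
    cert.signature = pow(digest(cert.body(), n), d, n)
    return cert

def signed_by(cert: Certificate, issuer_public) -> bool:
    n, e = issuer_public
    return pow(cert.signature, e, n) == digest(cert.body(), n)

def verify_chain(chain, trusted_roots) -> bool:
    """chain runs from Bob's certificate upward; the top one must be signed by a trusted root."""
    for cert, parent in zip(chain, chain[1:]):
        if cert.issuer != parent.subject or not signed_by(cert, parent.public_key):
            return False
    root_key = trusted_roots.get(chain[-1].issuer)
    return root_key is not None and signed_by(chain[-1], root_key)

root_pub, root_priv = make_key(2**89 - 1, 2**107 - 1)
ca_pub, ca_priv = make_key(2**61 - 1, 2**127 - 1)
bob_pub, _ = make_key(2**19 - 1, 2**31 - 1)
eve_pub, _ = make_key(2**13 - 1, 2**17 - 1)

trusted_roots = {"Root CA": root_pub}   # what ships pre-installed in Alice's browser
ca_cert = issue("Intermediate CA", ca_pub, "Root CA", root_priv)
bob_cert = issue("bob.example", bob_pub, "Intermediate CA", ca_priv)

# Eve swaps her own key into Bob's certificate but cannot re-sign it.
forged = Certificate("bob.example", eve_pub, "Intermediate CA", bob_cert.signature)

print(verify_chain([bob_cert, ca_cert], trusted_roots))
print(verify_chain([forged, ca_cert], trusted_roots))
```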
Everything together: TLS
When you see the padlock in your browser, all of these pieces work together in milliseconds.
- Your browser and the server agree on encryption methods
- The server sends its certificate: its public key, signed by a trusted authority
- Diffie-Hellman establishes a shared secret
- All data flows through symmetric encryption (AES) using that shared secret
The asymmetric operations (slow) happen once at the start. The symmetric operations (fast) handle all the actual data. Every real security protocol, from HTTPS to Signal to file encryption, is built from the same small set of primitives: hashing for integrity, symmetric encryption for bulk data, and asymmetric keys for key exchange and signatures.
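The "slow once, fast afterwards" pattern can be sketched end to end with stand-ins. This is not TLS: the certificate check is omitted, the Diffie-Hellman group is a small made-up one (`P`, `G`), and a SHA-256 keystream replaces AES. All names are hypothetical.

```python
import hashlib
import secrets

P = 2**127 - 1   # a known prime; real groups are far larger and standardized
G = 3            # base chosen for illustration

def keypair():
    """Pick a random private number and compute the public value to send."""
    private = secrets.randbelow(P - 3) + 2
    return private, pow(G, private, P)

def session_key(my_private: int, their_public: int) -> bytes:
    """Slow step, done once: derive a 256-bit symmetric key from the shared secret."""
    shared = pow(their_public, my_private, P)
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()

def stream_xor(key: bytes, data: bytes) -> bytes:
    """Fast step, done for all data: XOR with a key-derived stream (AES stand-in)."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))

# Handshake: each side sends only its public value.
browser_private, browser_public = keypair()
server_private, server_public = keypair()
browser_key = session_key(browser_private, server_public)
server_key = session_key(server_private, browser_public)

# Data: the browser encrypts, the server decrypts with the key it derived itself.
request = stream_xor(browser_key, b"GET /account")
print(stream_xor(server_key, request))
```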
What you've built
You started with one operation, XOR, and solved problem after problem:
XOR is reversible. The same operation encrypts and decrypts.
Short keys work. Algorithms like AES turn a short key into a long pseudorandom stream. With a proper cipher mode, a 256-bit key is far beyond the reach of brute force.
Key exchange is possible. Diffie-Hellman lets strangers agree on secrets while being watched. Operations that are easy to compute but hard to reverse make this work.
Identity can be proven. Hash functions fingerprint data. Asymmetric keys enable signatures. Certificate chains establish trust.
Real systems layer everything. TLS combines key exchange, symmetric encryption, hashing, and signatures into a single handshake that completes in milliseconds.
Next time you see that padlock, you'll know what's happening: a certificate is verified, a key exchange occurs, and your data flows through an encrypted channel.