Every device on the internet has an address: an IP address like 192.0.2.1 or 2001:db8::1. These addresses are what the IP layer uses to route packets. But humans don't navigate the web by memorizing numbers. We type domain names: google.com, github.com, example.org.
Someone, or something, has to translate those names into addresses. That something is the Domain Name System (DNS), one of the most important yet often invisible systems on the internet. It's a globally distributed database that answers an enormous volume of queries every second.
Without DNS, you'd need a central server that knows every domain on Earth. That server would be a single point of failure. One outage would break the entire internet. Instead, DNS distributes responsibility across thousands of servers, each managing its own piece of the name space.
How do we map human-readable names to IP addresses at internet scale, without a single point of failure?
Create a hierarchical, distributed database where different organizations manage different parts of the name space. Use a query protocol that routes requests through a series of servers until the authoritative answer is found.
The names in DNS are hierarchical, arranged as a tree with an unnamed root at the top. Below the root are the top-level domains (TLDs): com, org, net, edu, uk, de, and hundreds more. Each TLD can contain domains, which can contain subdomains, nested as deep as you want.
Under com, for example, sit domains like google.com, example.com, and microsoft.com. The DNS hierarchy allows millions of domains to be organized efficiently without central coordination.
A domain name like mail.google.com. is read right-to-left:

- The trailing dot stands for the unnamed root of the tree.
- com is the top-level domain.
- google is a domain registered under com.
- mail is a host (or subdomain) within google.com.
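As a quick illustration (a minimal sketch, not part of any DNS library), splitting a name on dots and reversing the labels makes that right-to-left structure explicit in Python:

# Break a fully qualified name into labels, then list them root-first.
name = "mail.google.com."
labels = name.rstrip(".").split(".")
print(labels[::-1])  # ['com', 'google', 'mail']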
This hierarchy is key to DNS's scalability. Want to add a server at newhost.cs.berkeley.edu? You only need to change records at cs.berkeley.edu. The organization managing berkeley.edu doesn't need to know about it, and neither does the organization managing edu.
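Here is a toy sketch of that idea in Python. It is not how real name servers store zones; the server names and the placeholder address are made up purely for illustration. Each zone holds only its own records plus delegations to child zones, so adding a host touches exactly one zone:

# Each zone keeps only its own records plus delegations to child zones.
zones = {
    "edu": {"berkeley.edu": ("NS", "ns.berkeley.edu")},
    "berkeley.edu": {"cs.berkeley.edu": ("NS", "ns.cs.berkeley.edu")},
    "cs.berkeley.edu": {},
}

# Adding newhost.cs.berkeley.edu changes one zone; edu and berkeley.edu
# are untouched.
zones["cs.berkeley.edu"]["newhost.cs.berkeley.edu"] = ("A", "192.0.2.42")
print(zones["cs.berkeley.edu"])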
There are a few kinds of TLDs:

- Generic TLDs: com, org, net, edu, gov, mil. Unrestricted or restricted to specific uses.
- Country-code TLDs (ccTLDs): uk, de, jp, ca. Two-letter codes from the ISO 3166 standard.
- Infrastructure: arpa. Special-purpose for reverse DNS lookups.

Some countries have sold names in their ccTLDs for creative uses. The domain cnn.tv is technically a registration in Tuvalu's TLD. flic.kr is a truncated Flickr URL using South Korea's TLD. These are sometimes called "domain hacks."
Each part of the name space is managed by name servers. The organization responsible for a domain arranges for at least two name servers to hold information about the names they manage.
A zone is a subtree of the name space that can be administered separately. For example, UC Berkeley administers berkeley.edu as a zone. The Computer Science department administers cs.berkeley.edu as a separate zone. This delegation allows each level to manage its own names without coordinating with every other level.
Zone information exists in at least two places: a primary name server and one or more secondary name servers. If the primary goes down, secondaries can still answer queries.
Secondaries get their zone data through a process called a zone transfer. The primary sends a complete copy of the zone database to each secondary, which stores it and uses it to answer client queries.
A name server is authoritative for a zone if it holds a complete copy of that zone's data, whether as primary or secondary. It can also cache data learned from other servers. When answering a query, the server indicates whether the data came from its authoritative database or from the cache.
DNS is queried constantly. Every web request starts with a DNS lookup. If every lookup had to traverse to a root server, then to a TLD server, then to the authoritative server, the system would collapse under its own load.
The solution: caching. Name servers cache answers they learn and serve them for future queries, up to a time limit called the TTL (Time To Live). A record might have a TTL of 3600 seconds. For that hour, any server that learns the answer can serve it without asking the authoritative server again.
How caching works:

1. A client asks its local resolver for a name.
2. The resolver checks its cache. If an unexpired entry exists, it answers immediately.
3. On a cache miss, the resolver queries the authoritative servers, stores the answer along with its TTL, and returns it to the client.
4. Once the TTL expires, the entry is discarded, and the next query for that name goes back to the authoritative servers.
DNS even caches failures. If a query for nonexistent-domain.com returns "name not found," that negative result is cached too. Applications that repeatedly query for names that don't exist won't hammer the DNS system; they'll get the cached negative answer instead.
Client-side systems like Windows, macOS, and Linux maintain their own DNS caches, often configured to limit how long cached entries persist.
Zone administrators can lower the TTL before making changes. Lower TTL means cached entries expire faster, so the old data doesn't linger in caches worldwide. After the change is deployed, they raise the TTL again to reduce query load.
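To make the TTL mechanism concrete, here is a minimal Python sketch of the kind of expiring cache a resolver maintains. The class and method names are illustrative, not taken from any real resolver implementation.

import time

class TTLCache:
    """Toy DNS-style cache: entries expire after their TTL."""

    def __init__(self):
        self._entries = {}  # name -> (answer, expiry timestamp)

    def put(self, name, answer, ttl_seconds):
        # Remember when this answer stops being valid.
        self._entries[name] = (answer, time.time() + ttl_seconds)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None                  # cache miss
        answer, expires_at = entry
        if time.time() >= expires_at:
            del self._entries[name]      # TTL expired: evict and miss
            return None
        return answer                    # cache hit

cache = TTLCache()
cache.put("example.com", "192.0.2.1", ttl_seconds=3600)
print(cache.get("example.com"))  # served from cache until the hour is up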
DNS is a protocol, and protocols need rules. The DNS query/response protocol is elegant: simple queries sent over UDP, with TCP as a fallback for larger responses.
When a UDP response is truncated (traditionally, anything over 512 bytes), the server sets the TC (truncation) flag in the header, and the client can re-issue the query over TCP.
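A small sketch of how a client might check that flag in a raw response buffer. The function name is just for illustration; the 0x0200 mask for the TC bit follows the standard header layout described in the next section.

import struct

def is_truncated(response: bytes) -> bool:
    # The flags field is the second 16-bit word of the DNS header.
    (flags,) = struct.unpack("!H", response[2:4])
    return bool(flags & 0x0200)  # TC (truncation) bit

# If is_truncated(udp_response) is True, retry the same query over TCP port 53.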
A typical DNS query takes a journey through the system:
The steps:

1. The client sends the query for example.com to its recursive resolver.
2. If the answer isn't cached, the resolver asks a root server, which refers it to the com TLD servers.
3. The resolver asks a com TLD server, which refers it to example.com's authoritative servers.
4. The resolver asks an authoritative server, which returns the answer.
5. The resolver caches the answer and sends it back to the client.

This is called a recursive query because the client hands the whole job to a resolver, which chases the answer on its behalf. Root servers and TLD servers typically don't do recursion; they return a referral instead. Recursive resolvers (like those run by ISPs) handle the work of chasing referrals.
All DNS operations (queries, responses, zone transfers) use a single message format. Understanding it shows how much information is packed into each message.
Every DNS message starts with a 12-byte header containing:
// DNS Header (12 bytes)
[Transaction ID: 2 bytes]
[Flags: 2 bytes]
[Question Count: 2 bytes]
[Answer Count: 2 bytes]
[Authority Count: 2 bytes]
[Additional Count: 2 bytes]
// Followed by variable-length sections:
[Questions...]
[Answers...]
[Authority records...]
[Additional records...]
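As a sketch of how compact this header is, Python's standard struct module can pack all six fields into those 12 bytes. The field values below (a random ID, the recursion-desired flag, one question) describe an example query header only, not a complete DNS message.

import random
import struct

# Pack the six 16-bit header fields in network byte order.
transaction_id = random.randint(0, 0xFFFF)
flags = 0x0100              # RD (recursion desired) set; everything else zero
qdcount, ancount, nscount, arcount = 1, 0, 0, 0

header = struct.pack("!6H", transaction_id, flags,
                     qdcount, ancount, nscount, arcount)
assert len(header) == 12    # the fixed-size DNS header

# The same format string unpacks a header received off the wire.
fields = struct.unpack("!6H", header)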
DNS names can be long. A response might reference the same domain multiple times. Instead of repeating mail.example.com in every record, DNS uses compression: a pointer to where example.com appears earlier in the message.
If example.com starts at byte 32 in the message, a pointer is encoded as two bytes with the top 2 bits set to 1, and the remaining 14 bits forming an offset from the start of the message. This saves significant space in large responses.
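A sketch of that encoding in Python; the offset value 32 mirrors the example above.

import struct

offset = 32                  # where example.com already appears in the message
pointer = 0xC000 | offset    # top two bits set to 11, low 14 bits hold the offset
encoded = struct.pack("!H", pointer)
print(encoded.hex())         # 'c020': two bytes instead of repeating the name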
DNS isn't just for looking up IP addresses. The system stores many types of records, each identified by a type code.
- A (IPv4 address): example.com A 192.0.2.1
- AAAA (IPv6 address): example.com AAAA 2001:db8::1
- NS (name server): example.com NS ns1.example.com.
- CNAME (alias): www.example.com CNAME example.com.
- MX (mail exchanger): example.com MX 10 mail.example.com.
- TXT (arbitrary text): example.com TXT "v=spf1 include:_spf.example.com ~all"
- SRV (service location): _sip._tcp.example.com SRV 10 60 5060 sipserver.example.com.

Each record has its own TTL value. An A record might have a TTL of 3600 (one hour), while an MX record has a TTL of 86400 (one day). Short TTLs suit records that change frequently; long TTLs suit stable records. This reduces query load while keeping data fresh.
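If you want to look these record types up programmatically, one option is the third-party dnspython library (assumed installed here: pip install dnspython, version 2.x). A minimal sketch:

import dns.resolver  # third-party: dnspython

# Query a few record types for example.com and print whatever comes back.
for rdtype in ("A", "AAAA", "MX", "TXT"):
    try:
        answers = dns.resolver.resolve("example.com", rdtype)
    except dns.resolver.NoAnswer:
        continue  # the zone has no record of this type
    for rdata in answers:
        print(rdtype, rdata)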
Let's trace what happens when you visit example.com:
1. Your browser doesn't know the IP address for example.com, so it calls the system resolver.
2. The system resolver forwards the query to a recursive resolver (typically your ISP's or a public one).
3. If the recursive resolver doesn't have the answer cached, it asks a root server and is referred to the com TLD servers.
4. A com TLD server refers it to example.com's authoritative name servers.
5. An authoritative server returns the A record with the IP address.
6. The answer travels back to your browser, which opens a connection to that address.

Throughout this journey, each server that handles the query caches the result. If another client queries within the TTL window, it gets the cached answer immediately.
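From an application's point of view, all of this is hidden behind a single call into the system resolver. A minimal Python sketch using the standard library:

import socket

# Ask the operating system's resolver for addresses of example.com.
# Behind this one call sits the whole chain of caches and referrals above.
for family, _, _, _, sockaddr in socket.getaddrinfo("example.com", 443):
    print(family.name, sockaddr[0])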
DNS is the glue that makes the domain-name-based web possible. It's also why domain names cost money: they're certificates of uniqueness in a hierarchical namespace that everyone agrees on.
It's why DNS attacks are so damaging. Compromise DNS and you control what IP addresses users are directed to. DNS spoofing tricks clients into believing false answers. DNS hijacking takes control of a domain's records. DDoS attacks on DNS servers disrupt resolution worldwide.
This is why DNSSEC (DNS Security Extensions) exists: to let resolvers verify that answers are authentic. But that's another story.
Next time you browse the web, DNS is silently working in the background. It's fast, reliable, and built on the principle that distributed systems are more resilient than centralized ones.
Try it yourself: Open your terminal and run dig example.com to see a real DNS query and response. Use dig +trace example.com to see the full journey from root to authoritative server.