Every device on the internet has an address: an IP address like 192.0.2.1 or 2001:db8::1. These addresses are what the IP layer uses to route packets. But humans don't navigate the web by memorizing numbers. We type domain names: google.com, github.com, example.org.
Someone, or something, has to translate those names into addresses. That something is the Domain Name System (DNS), one of the most important yet often invisible systems on the internet. It's a globally distributed database that answers an enormous volume of queries every second.
Without DNS, you'd need a central server that knows every domain on Earth. That server would be a single point of failure. One outage would break the entire internet. Instead, DNS distributes responsibility across thousands of servers, each managing its own piece of the name space.
How do we map human-readable names to IP addresses at internet scale, without a single point of failure?
Create a hierarchical, distributed database where different organizations manage different parts of the name space. Use a query protocol that routes requests through a series of servers until the authoritative answer is found.
The names in DNS are hierarchical, arranged as a tree with an unnamed root at the top. Below the root are the top-level domains (TLDs): com, org, net, edu, uk, de, and hundreds more. Each TLD can contain domains, which can contain subdomains, nested as deep as you want.
Under com, for example, sit domains like google.com, example.com, and microsoft.com. The DNS hierarchy allows millions of domains to be organized efficiently without central coordination.
A domain name like mail.google.com. is read right-to-left:

- The trailing dot stands for the unnamed root of the tree.
- com is the top-level domain.
- google is a domain registered under com.
- mail is a host (or subdomain) within google.com.
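As a quick illustration (a minimal sketch, not part of any DNS library), splitting a name on dots and reversing the labels makes that right-to-left structure explicit in Python:

# Break a fully qualified name into labels, then list them root-first.
name = "mail.google.com."
labels = name.rstrip(".").split(".")
print(labels[::-1])  # ['com', 'google', 'mail']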
This hierarchy is key to DNS's scalability. Want to add a server at newhost.cs.berkeley.edu? You only need to change records at cs.berkeley.edu. The organization managing berkeley.edu doesn't need to know about it, and neither does the organization managing edu.
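Here is a toy sketch of that idea in Python. It is not how real name servers store zones; the server names and the placeholder address are made up purely for illustration. Each zone holds only its own records plus delegations to child zones, so adding a host touches exactly one zone:

# Each zone keeps only its own records plus delegations to child zones.
zones = {
    "edu": {"berkeley.edu": ("NS", "ns.berkeley.edu")},
    "berkeley.edu": {"cs.berkeley.edu": ("NS", "ns.cs.berkeley.edu")},
    "cs.berkeley.edu": {},
}

# Adding newhost.cs.berkeley.edu changes one zone; edu and berkeley.edu
# are untouched.
zones["cs.berkeley.edu"]["newhost.cs.berkeley.edu"] = ("A", "192.0.2.42")
print(zones["cs.berkeley.edu"])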
There are a few kinds of TLDs:

- Generic TLDs: com, org, net, edu, gov, mil. Unrestricted or restricted to specific uses.
- Country-code TLDs (ccTLDs): uk, de, jp, ca. Two-letter codes from the ISO 3166 standard.
- Infrastructure: arpa. Special-purpose for reverse DNS lookups.

Some countries have sold names in their ccTLDs for creative uses. The domain cnn.tv is technically a registration in Tuvalu's TLD. flic.kr is a truncated Flickr URL using South Korea's TLD. These are sometimes called "domain hacks."
Each part of the name space is managed by name servers. The organization responsible for a domain arranges for at least two name servers to hold information about the names they manage.
A zone is a subtree of the name space that can be administered separately. For example, UC Berkeley administers berkeley.edu as a zone. The Computer Science department administers cs.berkeley.edu as a separate zone. This delegation allows each level to manage its own names without coordinating with every other level.
Zone information exists in at least two places: a primary name server and one or more secondary name servers. If the primary goes down, secondaries can still answer queries.
Secondaries get their zone data through a process called a zone transfer. The primary sends a complete copy of the zone database to each secondary, which stores it and uses it to answer client queries.
A name server is authoritative for a zone if it holds a complete copy of that zone's data, whether as primary or secondary. It can also cache data learned from other servers. When answering a query, the server indicates whether the data came from its authoritative database or from the cache.
DNS is queried constantly. Every web request starts with a DNS lookup. If every lookup had to traverse to a root server, then to a TLD server, then to the authoritative server, the system would collapse under its own load.
The solution: caching. Name servers cache answers they learn and serve them for future queries, up to a time limit called the TTL (Time To Live). A record might have a TTL of 3600 seconds. For that hour, any server that learns the answer can serve it without asking the authoritative server again.
How caching works:

1. A client asks its local resolver for a name.
2. The resolver checks its cache. If an unexpired entry exists, it answers immediately.
3. On a cache miss, the resolver queries the authoritative servers, stores the answer along with its TTL, and returns it to the client.
4. Once the TTL expires, the entry is discarded, and the next query for that name goes back to the authoritative servers.
DNS even caches failures. If a query for nonexistent-domain.com returns "name not found," that negative result is cached too. Applications that repeatedly query for names that don't exist won't hammer the DNS system; they'll get the cached negative answer instead.
Client-side systems like Windows, macOS, and Linux maintain their own DNS caches, often configured to limit how long cached entries persist.
Zone administrators can lower the TTL before making changes. Lower TTL means cached entries expire faster, so the old data doesn't linger in caches worldwide. After the change is deployed, they raise the TTL again to reduce query load.
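To make the TTL mechanism concrete, here is a minimal Python sketch of the kind of expiring cache a resolver maintains. The class and method names are illustrative, not taken from any real resolver implementation.

import time

class TTLCache:
    """Toy DNS-style cache: entries expire after their TTL."""

    def __init__(self):
        self._entries = {}  # name -> (answer, expiry timestamp)

    def put(self, name, answer, ttl_seconds):
        # Remember when this answer stops being valid.
        self._entries[name] = (answer, time.time() + ttl_seconds)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None                  # cache miss
        answer, expires_at = entry
        if time.time() >= expires_at:
            del self._entries[name]      # TTL expired: evict and miss
            return None
        return answer                    # cache hit

cache = TTLCache()
cache.put("example.com", "192.0.2.1", ttl_seconds=3600)
print(cache.get("example.com"))  # served from cache until the hour is up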
DNS is a protocol, and protocols need rules. The DNS query/response protocol is elegant: simple queries sent over UDP, with TCP as a fallback for larger responses.
When a UDP response is truncated (traditionally, anything over 512 bytes), the server sets the TC (truncation) flag in the header, and the client can re-issue the query over TCP.
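A small sketch of how a client might check that flag in a raw response buffer. The function name is just for illustration; the 0x0200 mask for the TC bit follows the standard header layout described in the next section.

import struct

def is_truncated(response: bytes) -> bool:
    # The flags field is the second 16-bit word of the DNS header.
    (flags,) = struct.unpack("!H", response[2:4])
    return bool(flags & 0x0200)  # TC (truncation) bit

# If is_truncated(udp_response) is True, retry the same query over TCP port 53.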
A typical DNS query takes a journey through the system:
The steps:

1. The client sends the query for example.com to its recursive resolver.
2. If the answer isn't cached, the resolver asks a root server, which refers it to the com TLD servers.
3. The resolver asks a com TLD server, which refers it to example.com's authoritative servers.
4. The resolver asks an authoritative server, which returns the answer.
5. The resolver caches the answer and sends it back to the client.

This is called a recursive query because the client hands the whole job to a resolver, which chases the answer on its behalf. Root servers and TLD servers typically don't do recursion; they return a referral instead. Recursive resolvers (like those run by ISPs) handle the work of chasing referrals.
All DNS operations (queries, responses, zone transfers) use a single message format. Understanding it shows how much information is packed into each message.
Every DNS message starts with a 12-byte header containing:
// DNS Header (12 bytes)
[Transaction ID: 2 bytes]
[Flags: 2 bytes]
[Question Count: 2 bytes]
[Answer Count: 2 bytes]
[Authority Count: 2 bytes]
[Additional Count: 2 bytes]
// Followed by variable-length sections:
[Questions...]
[Answers...]
[Authority records...]
[Additional records...]
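As a sketch of how compact this header is, Python's standard struct module can pack all six fields into those 12 bytes. The field values below (a random ID, the recursion-desired flag, one question) describe an example query header only, not a complete DNS message.

import random
import struct

# Pack the six 16-bit header fields in network byte order.
transaction_id = random.randint(0, 0xFFFF)
flags = 0x0100              # RD (recursion desired) set; everything else zero
qdcount, ancount, nscount, arcount = 1, 0, 0, 0

header = struct.pack("!6H", transaction_id, flags,
                     qdcount, ancount, nscount, arcount)
assert len(header) == 12    # the fixed-size DNS header

# The same format string unpacks a header received off the wire.
fields = struct.unpack("!6H", header)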
DNS names can be long. A response might reference the same domain multiple times. Instead of repeating mail.example.com in every record, DNS uses compression: a pointer to where example.com appears earlier in the message.
If example.com starts at byte 32 in the message, a pointer is encoded as two bytes with the top 2 bits set to 1, and the remaining 14 bits forming an offset from the start of the message. This saves significant space in large responses.
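A sketch of that encoding in Python; the offset value 32 mirrors the example above.

import struct

offset = 32                  # where example.com already appears in the message
pointer = 0xC000 | offset    # top two bits set to 11, low 14 bits hold the offset
encoded = struct.pack("!H", pointer)
print(encoded.hex())         # 'c020': two bytes instead of repeating the name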
DNS isn't just for looking up IP addresses. The system stores many types of records, each identified by a type code.
- A (IPv4 address): example.com A 192.0.2.1
- AAAA (IPv6 address): example.com AAAA 2001:db8::1
- NS (name server): example.com NS ns1.example.com.
- CNAME (alias): www.example.com CNAME example.com.
- MX (mail exchanger): example.com MX 10 mail.example.com.
- TXT (arbitrary text): example.com TXT "v=spf1 include:_spf.example.com ~all"
- SRV (service location): _sip._tcp.example.com SRV 10 60 5060 sipserver.example.com.

Each record has its own TTL value. An A record might have a TTL of 3600 (one hour), while an MX record has a TTL of 86400 (one day). Short TTLs suit records that change frequently; long TTLs suit stable records. This reduces query load while keeping data fresh.
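If you want to look these record types up programmatically, one option is the third-party dnspython library (assumed installed here: pip install dnspython, version 2.x). A minimal sketch:

import dns.resolver  # third-party: dnspython

# Query a few record types for example.com and print whatever comes back.
for rdtype in ("A", "AAAA", "MX", "TXT"):
    try:
        answers = dns.resolver.resolve("example.com", rdtype)
    except dns.resolver.NoAnswer:
        continue  # the zone has no record of this type
    for rdata in answers:
        print(rdtype, rdata)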
Let's trace what happens when you visit example.com:
1. Your browser doesn't know the IP address for example.com, so it calls the system resolver.
2. The system resolver forwards the query to a recursive resolver (typically your ISP's or a public one).
3. If the recursive resolver doesn't have the answer cached, it asks a root server and is referred to the com TLD servers.
4. A com TLD server refers it to example.com's authoritative name servers.
5. An authoritative server returns the A record with the IP address.
6. The answer travels back to your browser, which opens a connection to that address.

Throughout this journey, each server that handles the query caches the result. If another client queries within the TTL window, it gets the cached answer immediately.
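From an application's point of view, all of this is hidden behind a single call into the system resolver. A minimal Python sketch using the standard library:

import socket

# Ask the operating system's resolver for addresses of example.com.
# Behind this one call sits the whole chain of caches and referrals above.
for family, _, _, _, sockaddr in socket.getaddrinfo("example.com", 443):
    print(family.name, sockaddr[0])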
DNS is the glue that makes the domain-name-based web possible. It's also why domain names cost money: they're certificates of uniqueness in a hierarchical namespace that everyone agrees on.
It's why DNS attacks are so damaging. Compromise DNS and you control what IP addresses users are directed to. DNS spoofing tricks clients into believing false answers. DNS hijacking takes control of a domain's records. DDoS attacks on DNS servers disrupt resolution worldwide.
This is why DNSSEC (DNS Security Extensions) exists: to let resolvers verify that answers are authentic. But that's another story.
Next time you browse the web, DNS is silently working in the background. It's fast, reliable, and built on the principle that distributed systems are more resilient than centralized ones.
Try it yourself: Open your terminal and run dig example.com to see a real DNS query and response. Use dig +trace example.com to see the full journey from root to authoritative server.