The Internet Protocol: How Data Travels Across the World
Understanding IP, the workhorse protocol that carries all Internet traffic. Learn how IPv4 and IPv6 deliver datagrams across networks, handle fragmentation, and leave reliability to higher-layer protocols.
Introduction
The Internet Protocol (IP) is the workhorse of the TCP/IP protocol suite. Every piece of data that travels across the Internet, whether it's a TCP connection, UDP message, ICMP ping, or IGMP multicast, gets wrapped in an IP datagram for transmission. Understanding IP is fundamental to understanding how the Internet works.
IP provides a best-effort, connectionless datagram delivery service. Let's break down what that means:
- Best-effort: IP tries to deliver packets, but makes no guarantees. If a router is congested, it may simply drop packets. There's no promise that packets will arrive at all.
- Connectionless: IP doesn't maintain connection state. Each datagram is handled independently. Two consecutive packets to the same destination can take different paths and arrive out of order.
- Datagram delivery: IP sends self-contained packets, not streams. This is different from TCP, which provides a reliable, ordered stream.
Any reliability requirements are handled by higher layers. TCP adds sequence numbers, acknowledgments, and retransmissions. UDP leaves error handling to the application. ICMP can report problems but not fix them.
IP's simplicity is actually a feature. By making no guarantees, IP can be implemented efficiently everywhere in the network. Routers don't need to remember anything about earlier packets or connections; each datagram is forwarded on its own. This simplicity enabled the Internet to scale globally.
The IP Header
Every IP datagram starts with a header containing essential information about the packet: where it came from, where it's going, what protocol it contains, and more. Let's examine both IPv4 and IPv6 headers.
IPv4 Header
The IPv4 header is typically 20 bytes (five 32-bit words), though it can be up to 60 bytes if options are included. Its fields are described below.
Key IPv4 Header Fields Explained
Version and IHL (4 bits each)
The Version field contains the IP version: 4 for IPv4, 6 for IPv6. The IHL (Internet Header Length) field specifies how many 32-bit words the header contains. Since this is a 4-bit field, the maximum header size is 15 × 4 = 60 bytes. With no options, IHL = 5 (20 bytes). That leaves at most 40 bytes for options, a strict limit that is one reason IPv4 options are rarely used today.
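As a small illustration of the arithmetic above, the first header byte can be split into its two 4-bit fields. This is only a sketch; the function name `split_version_ihl` is made up for this example.

```python
def split_version_ihl(first_byte: int) -> tuple[int, int]:
    """Return (version, header length in bytes) from the first IPv4 header byte."""
    version = first_byte >> 4        # upper 4 bits
    ihl_words = first_byte & 0x0F    # lower 4 bits, counted in 32-bit words
    return version, ihl_words * 4    # convert words to bytes


print(split_version_ihl(0x45))  # (4, 20): IPv4 header with no options
print(split_version_ihl(0x4F))  # (4, 60): the largest header IHL can describe
```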
DS Field and ECN (8 bits total)
The Differentiated Services (DS) Field replaces the original Type of Service byte. Six bits form the Differentiated Services Code Point (DSCP), which tells routers which forwarding behavior to apply to this packet; how each code point is actually treated depends on network policy, not simply on the numeric value. The remaining two bits are used for Explicit Congestion Notification (ECN): routers can mark packets when congested, signaling the sender to slow down before packets are dropped.
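The 6 + 2 bit split can be sketched with two bit operations. The helper name `split_ds_byte` is an assumption for illustration; the example value 0xB8 corresponds to DSCP 46, the code point commonly used for Expedited Forwarding.

```python
def split_ds_byte(ds_byte: int) -> tuple[int, int]:
    """Return (dscp, ecn) from the 8-bit DS/ECN byte."""
    dscp = ds_byte >> 2      # upper 6 bits
    ecn = ds_byte & 0b11     # lower 2 bits
    return dscp, ecn


print(split_ds_byte(0xB8))  # (46, 0): DSCP 46, ECN bits clear
print(split_ds_byte(0x03))  # (0, 3): default DSCP, both ECN bits set (congestion experienced)
```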
Total Length (16 bits)
This specifies the total length of the IP datagram, including header and data. Since it's a 16-bit field, the maximum IP datagram is 65,535 bytes. However, most networks (like Ethernet) have a smaller Maximum Transmission Unit (MTU). When a datagram is larger than the MTU, it must be fragmented.
Identification, Flags, and Fragment Offset (32 bits total)
When a datagram is fragmented, the Identification field ensures all fragments are recognized as belonging to the same original datagram. The 3-bit Flags field controls whether fragmentation is allowed and whether more fragments follow. The 13-bit Fragment Offset specifies this fragment's position in the original datagram (in 8-byte units).
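The Flags and Fragment Offset fields share one 16-bit word, so decoding them is a matter of masking. The sketch below assumes the standard bit layout (reserved bit, Don't Fragment, More Fragments, then 13 offset bits); the function and key names are invented for this example.

```python
def split_flags_offset(word: int) -> dict:
    """Decode the 16-bit Flags + Fragment Offset word of an IPv4 header."""
    return {
        "dont_fragment": bool(word & 0x4000),   # DF: routers must not fragment
        "more_fragments": bool(word & 0x2000),  # MF: another fragment follows
        "offset_bytes": (word & 0x1FFF) * 8,    # offset is stored in 8-byte units
    }


print(split_flags_offset(0x4000))  # DF set, offset 0: an unfragmented datagram
print(split_flags_offset(0x203C))  # MF set, offset 60 units = 480 bytes
```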
TTL (Time-to-Live) - 8 bits
The TTL field prevents packets from circulating forever in routing loops. Originally intended to count seconds, it now simply counts hops. Each router decrements TTL by 1 before forwarding. When TTL reaches 0, the packet is discarded and the sender receives an ICMP Time Exceeded message. Recommended initial value is 64; values of 128 or 255 are common.
Protocol (8 bits)
This field identifies the protocol carried in the IP payload. Common values: 6 = TCP, 17 = UDP, 1 = ICMP, 2 = IGMP. This provides demultiplexing so IP can carry multiple protocol types.
Header Checksum (16 bits)
IP computes a checksum over the header only, not the payload. This is different from TCP/UDP, which checksum both header and payload. The checksum is calculated using the Internet checksum algorithm (one's complement sum). If a header is corrupted, the receiving IP implementation discards the packet silently (no error message).
Example: an IPv4 header with the checksum field set to 0x0000, before the checksum is computed:
- Version: 0x4
- IHL: 0x5
- ToS: 0x00
- Total Length: 0x003c (60 bytes)
- Identification: 0x1234
- Flags & Offset: 0x4000
- TTL: 0x40 (64)
- Protocol: 0x06 (TCP)
- Checksum: 0x0000 (to be filled)
- Source IP: 0xc0a80001
- Destination IP: 0x0a000001
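The checksum for the example header above can be worked out with a few lines of Python. This is a minimal sketch of the one's complement sum described in this section, not a production implementation; `internet_checksum` is a name chosen for this example.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """One's complement of the one's complement sum of all 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length input with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                       # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# The example header, with the checksum field (bytes 10-11) still zero.
header = bytes.fromhex("4500 003c 1234 4000 4006 0000 c0a8 0001 0a00 0001")

checksum = internet_checksum(header)
print(hex(checksum))  # 0x5dde

# A receiver runs the same sum over the header including the checksum;
# a result of 0 means the header arrived intact.
filled = header[:10] + struct.pack("!H", checksum) + header[12:]
print(internet_checksum(filled))  # 0
```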
Source and Destination IP Addresses (32 bits each)
IPv4 addresses are 32 bits, typically written in dotted decimal notation (e.g., 192.168.1.1). With 2³² possible addresses (about 4.3 billion), the address space was thought to be adequate when IPv4 was designed in 1981. That assumption didn't age well.
IPv6: The Next Generation
IPv6 was developed to address IPv4's limitations, most notably the exhaustion of the 32-bit address space. The main improvements:
- 128-bit addresses: Instead of 2³² addresses, IPv6 has 2¹²⁸ addresses (about 3.4 × 10³⁸). That's roughly 6.7 × 10²³ addresses per square meter of Earth's surface.
- Fixed 40-byte header: IPv6 headers are always 40 bytes, making processing simpler. Options moved to extension headers.
- No checksum: IPv6 removes the header checksum, so routers no longer have to recompute it at every hop. Bit errors are rare on modern links, and link layers and higher-layer protocols handle error detection.
- No fragmentation in routers: Only the source can fragment. Path MTU Discovery ensures packets fit the network path.
- Extension headers: Instead of options in the main header, IPv6 uses a chain of extension headers for special functions.
IPv6 Header Fields
Version (4 bits): Always 6 for IPv6.
Traffic Class (8 bits): Similar to the DS Field in IPv4. Used for differentiated services and QoS marking.
Flow Label (20 bits): A unique identifier for related packets, helping routers treat packets in the same flow consistently. For example, video packets might have one flow label, while email has another.
Payload Length (16 bits): The size of everything following the IPv6 header (data + extension headers). Unlike IPv4 Total Length, this doesn't include the header itself. Maximum payload: 65,535 bytes.
Next Header (8 bits): Identifies the next header in the chain (TCP, UDP, an extension header, etc.). Value 59 (No Next Header) indicates that nothing follows this header.
Hop Limit (8 bits): Like IPv4's TTL. Decremented by each router; packet discarded when it reaches 0.
Source and Destination Addresses (128 bits each): IPv6 addresses are written in hexadecimal, like 2001:0db8:85a3:0000:0000:8a2e:0370:7334, often abbreviated with :: for consecutive zeros.
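To make the field layout concrete, here is a sketch that packs and unpacks the fixed 40-byte IPv6 header with Python's standard `struct` and `ipaddress` modules. The function names and the sample values (flow label, payload length, destination address) are assumptions made for illustration.

```python
import ipaddress
import struct


def build_ipv6_header(traffic_class, flow_label, payload_len,
                      next_header, hop_limit, src, dst) -> bytes:
    """Pack the fixed IPv6 header: 8 bytes of fields plus two 16-byte addresses."""
    first_word = (6 << 28) | (traffic_class << 20) | flow_label
    fields = struct.pack("!IHBB", first_word, payload_len, next_header, hop_limit)
    return fields + ipaddress.IPv6Address(src).packed + ipaddress.IPv6Address(dst).packed


def parse_ipv6_header(data: bytes) -> dict:
    """Unpack the fixed IPv6 header back into named fields."""
    first_word, payload_len, next_header, hop_limit = struct.unpack("!IHBB", data[:8])
    return {
        "version": first_word >> 28,
        "traffic_class": (first_word >> 20) & 0xFF,
        "flow_label": first_word & 0xFFFFF,
        "payload_length": payload_len,
        "next_header": next_header,
        "hop_limit": hop_limit,
        "src": str(ipaddress.IPv6Address(data[8:24])),   # printed in abbreviated form
        "dst": str(ipaddress.IPv6Address(data[24:40])),
    }


header = build_ipv6_header(0, 0x12345, 20, 6, 64,
                           "2001:0db8:85a3:0000:0000:8a2e:0370:7334", "2001:db8::1")
print(len(header))                 # 40
print(parse_ipv6_header(header))   # src appears as 2001:db8:85a3::8a2e:370:7334
```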
IP Fragmentation
Different networks have different maximum transmission units. Ethernet typically has an MTU of 1500 bytes, and Wi-Fi commonly uses 1500 as well. A serial line might be just 256 bytes. When a datagram is larger than the outgoing link's MTU, it must be fragmented into smaller pieces.
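Suppose a 1500-byte IP datagram needs to traverse a link with a 500-byte MTU. What happens?
In IPv4, any router can fragment a datagram. The router splits the datagram into multiple smaller datagrams, all with the same Identification field so they can be reassembled at the destination. With a 20-byte header, the 1480 bytes of payload are split into three fragments carrying 480 bytes each and a final fragment carrying the remaining 40 bytes.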
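The split for the 1500-byte datagram and 500-byte MTU mentioned above can be planned with a short loop. This sketch only computes sizes and offsets, assumes a 20-byte header with no options, and uses a made-up function name.

```python
def fragment_sizes(total_length: int, mtu: int, header_len: int = 20) -> list[dict]:
    """Plan IPv4 fragments for one datagram (sizes only, no real packets)."""
    payload = total_length - header_len
    # Every fragment except the last must carry a multiple of 8 payload bytes,
    # because Fragment Offset is expressed in 8-byte units.
    max_chunk = (mtu - header_len) // 8 * 8
    if max_chunk <= 0:
        raise ValueError("MTU too small to carry any payload")
    plan = []
    offset = 0
    while offset < payload:
        size = min(max_chunk, payload - offset)
        plan.append({
            "offset_units": offset // 8,              # value for the Fragment Offset field
            "total_length": header_len + size,        # each fragment gets its own header
            "more_fragments": offset + size < payload,
        })
        offset += size
    return plan


for frag in fragment_sizes(1500, 500):
    print(frag)
```

Under these assumptions the plan has four fragments: three of 500 bytes (offsets 0, 60, and 120) with More Fragments set, and a final 60-byte fragment at offset 180.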
IPv6 and Fragmentation
IPv6 changed this model: only the source can fragment, and fragmentation is handled via an extension header. Routers never fragment IPv6 packets. Instead, sources use Path MTU Discovery to learn the smallest MTU along the path and send appropriately sized packets.
This makes IPv6 routers simpler and faster. Router performance isn't slowed by fragmentation logic. The tradeoff is more complexity in the source and more protocol overhead (the Path MTU Discovery process).
Fragmentation works, but it's expensive. If any fragment is lost, the destination cannot reassemble the original datagram, so the whole datagram is lost and must be retransmitted by a higher layer such as TCP. Modern practice is to avoid fragmentation by:
- Using Path MTU Discovery to find the smallest MTU along the path
- Sending datagrams no larger than the path MTU
- Letting TCP choose segment sizes based on MSS (Maximum Segment Size) negotiation
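As a rough illustration of the last point, the MSS a TCP endpoint advertises is usually derived from the MTU by subtracting the IP and TCP header sizes. The sketch below assumes headers without options (20-byte IPv4, 40-byte IPv6, 20-byte TCP); the function name is made up for this example.

```python
TCP_HEADER = 20  # bytes, assuming no TCP options


def mss_for_mtu(mtu: int, ip_header: int) -> int:
    """Largest TCP payload that fits in one unfragmented datagram."""
    return mtu - ip_header - TCP_HEADER


print(mss_for_mtu(1500, 20))  # 1460 for IPv4 over Ethernet
print(mss_for_mtu(1500, 40))  # 1440 for IPv6 over Ethernet
```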
IP Routing
How does a packet get from source to destination? Through routing tables and forwarding decisions at each router.
When a router receives a packet, it makes a simple decision based on the destination IP address:
- Look up the destination IP in the routing table
- Find the matching route with the longest prefix match
- Forward the packet to the next hop specified by that route
- If no route matches, drop the packet and send an ICMP Destination Unreachable message back to the source
The longest prefix match ensures specific routes take priority. For example, for a destination inside 192.168.1.0/24, a route for 192.168.1.0/24 is chosen over the default route 0.0.0.0/0, even though both match.
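The lookup steps above can be sketched with Python's standard `ipaddress` module. The routing table contents and next-hop addresses here are invented for illustration, and a linear scan stands in for the specialized data structures real routers use.

```python
import ipaddress

# Hypothetical routing table: (destination prefix, next hop).
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "203.0.113.1"),      # default route
    (ipaddress.ip_network("192.168.0.0/16"), "10.0.0.2"),
    (ipaddress.ip_network("192.168.1.0/24"), "10.0.0.3"),
]


def lookup(destination: str, routes=ROUTES):
    """Return the next hop of the longest matching prefix, or None if nothing matches."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in routes if addr in net]
    if not matches:
        return None  # a real router would send ICMP Destination Unreachable here
    _, next_hop = max(matches, key=lambda match: match[0].prefixlen)
    return next_hop


print(lookup("192.168.1.7"))  # 10.0.0.3    (the /24 beats the /16 and the default)
print(lookup("192.168.2.7"))  # 10.0.0.2    (only the /16 and the default match)
print(lookup("8.8.8.8"))      # 203.0.113.1 (only the default route matches)
```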
A host (your computer) has a default route pointing to a gateway (usually your router). That gateway has routes pointing to other gateways, forming a hierarchy. Once your packet reaches routers in the ISP's network, those routers hold routes that cover the rest of the Internet, either directly or through their own upstream providers.
Routers build their routing tables using routing protocols (BGP, OSPF, IS-IS). These protocols exchange information about which networks are reachable and how to reach them. BGP specifically is used for interdomain routing, connecting different organizations' networks.
IP Options (IPv4)
IPv4 supports options that can be included in the header on a per-datagram basis. These were useful when the Internet was small and more trusted. Today, most are disabled for security and performance reasons.
Common IPv4 Options
- Source Route (strict/loose): Sender specifies the path the packet should take through specific routers. Deprecated for security.
- Record Route: Packet collects IP addresses of routers it passes through. Limited to 9 addresses due to header size.
- Timestamp: Routers record timestamps. Useful for measuring network delay.
- Router Alert: Tells routers that a packet needs special processing (used by IGMP multicast).
The 60-byte IPv4 header limit leaves only 40 bytes for options, which makes them impractical for many uses. Every node that handles a header with options must process them, adding latency. Some firewalls strip options entirely for security.
IPv6 Extension Headers
IPv6 provides similar functionality to IPv4 options through extension headers. These form a chain, with each header pointing to the next.
IPv6 Header (40 bytes)
|
├─ Next Header: 60 (Destination Options)
|
Destination Options Header
|
├─ Next Header: 44 (Fragment)
|
Fragment Header
|
├─ Next Header: 6 (TCP)
|
TCP Header + Data
The sequence of extension headers is flexible. Some standard extension headers:
- Hop-by-Hop Options: Every router must process these. Rarely used.
- Destination Options: Only the destination processes these.
- Routing Header: Sender specifies waypoints. Type 0 was deprecated due to security concerns.
- Fragment Header: Used when source fragments (routers never fragment IPv6).
- Authentication Header (AH): Provides integrity checking.
- Encapsulating Security Payload (ESP): Provides encryption and authentication.
This design is more elegant than IPv4 options. Apart from Hop-by-Hop Options, IPv6 extension headers are processed only by the nodes they are meant for (not every router), and new extensions can be added without changing the base header.
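Following the chain is a matter of reading each header's Next Header byte and skipping over its length. The sketch below walks the same Destination Options, Fragment, TCP chain shown in the diagram; it handles only four extension header types, the sample bytes are fabricated for illustration, and the names are not from any real library.

```python
import struct

EXTENSION_HEADERS = {0: "Hop-by-Hop Options", 43: "Routing",
                     44: "Fragment", 60: "Destination Options"}
UPPER_LAYER = {6: "TCP", 17: "UDP", 58: "ICMPv6", 59: "No Next Header"}


def walk_chain(first_next_header: int, data: bytes):
    """Return (list of header names, offset where the upper-layer data starts)."""
    names, next_header, pos = [], first_next_header, 0
    while next_header in EXTENSION_HEADERS:
        names.append(EXTENSION_HEADERS[next_header])
        following = data[pos]                 # first byte: type of the header after this one
        if next_header == 44:
            length = 8                        # the Fragment header is always 8 bytes
        else:
            length = (data[pos + 1] + 1) * 8  # length byte counts 8-byte units past the first
        next_header, pos = following, pos + length
    names.append(UPPER_LAYER.get(next_header, f"protocol {next_header}"))
    return names, pos


# Destination Options: next=44, length=0 (8 bytes total), one PadN option as filler.
dest_opts = bytes([44, 0, 1, 4, 0, 0, 0, 0])
# Fragment: next=6 (TCP), reserved, offset/flags word, 32-bit identification.
fragment = bytes([6, 0]) + struct.pack("!HI", 0, 0x1234)
payload = dest_opts + fragment + b"tcp segment bytes"

chain, data_offset = walk_chain(60, payload)
print(chain)        # ['Destination Options', 'Fragment', 'TCP']
print(data_offset)  # 16
```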
ICMP: The Internet Control Message Protocol
IP itself doesn't report errors. When something goes wrong, ICMP messages report the problem.
| Type | Meaning |
|---|---|
| Echo Request (8) | Ping request |
| Echo Reply (0) | Ping response |
| Destination Unreachable (3) | No route to destination or port closed |
| Time Exceeded (11) | TTL reached 0 or fragment reassembly timeout |
| Redirect (5) | Router suggests better path for next packet |
| Parameter Problem (12) | Invalid header field or option |
ICMP messages themselves are carried in IP datagrams. ICMP is not a separate layer; it is identified by protocol number 1 in the IP header.
Tools like ping use ICMP Echo Request/Reply. Tools like traceroute use the Time Exceeded message to discover the path packets take through the network.
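The traceroute idea can be shown without touching the network. The following is a pure simulation under simplifying assumptions: the path is a hard-coded list of hypothetical node names, each router decrements TTL by 1, and the destination's answer is represented as an Echo Reply.

```python
def send_probe(path: list[str], ttl: int) -> tuple[str, str]:
    """Simulate one probe along `path`; return (responder, ICMP message)."""
    for router in path[:-1]:          # every node except the final destination
        ttl -= 1                      # each router decrements TTL before forwarding
        if ttl == 0:
            return router, "Time Exceeded"
    return path[-1], "Echo Reply"     # the probe survived all routers


def traceroute(path: list[str]) -> list[str]:
    """Discover the path by sending probes with TTL = 1, 2, 3, ..."""
    hops = []
    for ttl in range(1, len(path) + 1):
        responder, message = send_probe(path, ttl)
        hops.append(responder)
        if message == "Echo Reply":   # reached the destination, stop probing
            break
    return hops


# Hypothetical path: two routers, then the destination host.
print(traceroute(["router-a", "router-b", "server"]))
# ['router-a', 'router-b', 'server']
```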
IPv4 vs IPv6: A Comparison
| Feature | IPv4 | IPv6 |
|---|---|---|
| Address Size | 32 bits | 128 bits |
| Header Size | 20-60 bytes | 40 bytes (fixed) |
| Checksum | Yes | No |
| Fragmentation | By any router | By source only |
| Options | In header | Extension headers |
| Address Notation | Dotted decimal | Hexadecimal |
| Security | Optional IPsec | IPsec designed in (use still optional) |
Despite IPv6's advantages, IPv4 remains dominant. IPv6 adoption is growing, and dual-stack operation (supporting both) is common in enterprise networks. Many ISPs still offer primarily IPv4, with IPv6 support alongside it.
Network Address Translation (NAT)
NAT became popular after IPv4 address exhaustion became clear. A NAT device (usually a home router) shares one public IPv4 address among many devices on a private network.
- Internal device 192.168.1.100:8000 sends packet to 8.8.8.8:53 (Google DNS)
- NAT rewrites source to 203.0.113.5:45000 (public IP:random port)
- Response from 8.8.8.8:53 arrives at 203.0.113.5:45000
- NAT rewrites destination back to 192.168.1.100:8000
- Internal device receives response as if it came directly
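The translation steps above amount to maintaining a small lookup table. Here is a minimal sketch using the same addresses as the walkthrough; the `Nat` class is invented for this example, and it hands out ports sequentially starting at 45000 rather than randomly, purely to keep the output predictable.

```python
class Nat:
    """Toy port-translating NAT: one public IP shared by many private endpoints."""

    def __init__(self, public_ip: str, first_port: int = 45000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.outbound_map = {}   # (private ip, private port) -> public port
        self.inbound_map = {}    # public port -> (private ip, private port)

    def outbound(self, src, dst):
        """Rewrite the source of a packet leaving the private network."""
        if src not in self.outbound_map:
            self.outbound_map[src] = self.next_port
            self.inbound_map[self.next_port] = src
            self.next_port += 1
        return (self.public_ip, self.outbound_map[src]), dst

    def inbound(self, src, dst):
        """Rewrite the destination of a reply, or return None if no mapping exists."""
        ip, port = dst
        if ip != self.public_ip or port not in self.inbound_map:
            return None          # unsolicited traffic has no mapping and is dropped
        return src, self.inbound_map[port]


nat = Nat("203.0.113.5")
print(nat.outbound(("192.168.1.100", 8000), ("8.8.8.8", 53)))
# (('203.0.113.5', 45000), ('8.8.8.8', 53))
print(nat.inbound(("8.8.8.8", 53), ("203.0.113.5", 45000)))
# (('8.8.8.8', 53), ('192.168.1.100', 8000))
```

The `None` branch in `inbound` is why unsolicited incoming connections fail until a mapping is created some other way.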
NAT works but has limitations. It breaks applications that embed IP addresses in their data (like FTP). Incoming connections need a hole opened explicitly, through UPnP or manual port forwarding. IPv6's abundance of addresses removes the address-sharing reason for NAT.
Conclusion
The Internet Protocol is beautifully simple. Every device that connects to the Internet uses IP to find every other device and deliver messages. IP makes no guarantees, yet the Internet works reliably because of protocols built on top (TCP, applications).
IPv4 continues to dominate, despite being over 40 years old. Its header format is efficient, and NAT extended its address space beyond original expectations. IPv6 is the future, with its vast address space and improved design. Understanding both is essential for modern network engineers.
The next time a packet travels from your device to a server across the world, thank IP for quietly delivering it. It's not glamorous work, but it's the foundation everything else is built on.