Build Your Own TCP/IP Stack: Ethernet & ARP
Imagine you're sitting at your computer. You want to send a message to another computer across the room.
Both computers are connected to the same network switch. Same physical wire, same local network. How hard could this be?
Turns out, there's a fundamental problem we need to solve first. And the solution we invent will be the foundation for everything else.
Let's figure it out.
The Problem: Sending Bytes on a Wire
Your computer has data it wants to send. The other computer is right there, physically connected. You just need to push some bytes down the wire.
If you could send any sequence of bytes you want, how would you make sure the receiving computer knows where your message starts, where it ends, and who it's for?
Think about it for a moment. The wire is shared. Multiple computers might be connected to the same switch, all trying to send data at once. Pure chaos.
We need structure.
Inventing the Frame
Let's start simple. What if we just send our data like this?
[some data bytes...]
Nope. The receiver has no idea when your message starts or ends. The bytes just keep coming. It could be one message or ten messages stuck together.
We need boundaries. What if we add a length field at the start?
[length][data...]
Better! Now the receiver knows: "The next N bytes belong together." But wait, who is this message for? If three computers are on the network, all of them will see these bytes. We need addresses.
[destination][source][length][data...]
Now we're getting somewhere. The destination says who should read this. The source says who sent it (useful for replies). The length says how much data follows.
This is a frame, the envelope that wraps all data traveling on the physical network.
This is what we just invented. Every piece of data that leaves your computer gets wrapped in one of these envelopes.
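As a rough sketch, the toy envelope can be packed and unpacked like this. The 6-byte address width and 2-byte length field are assumptions made purely for illustration (we haven't decided what an address looks like yet), and `pack_frame` and `unpack_frames` are hypothetical helper names.

```python
import struct

ADDR_LEN = 6  # assumed address width in bytes; not decided by the text yet
# destination, source, then a 2-byte length in network byte order
HEADER = struct.Struct(f"!{ADDR_LEN}s{ADDR_LEN}sH")


def pack_frame(destination: bytes, source: bytes, data: bytes) -> bytes:
    """Wrap data in the toy [destination][source][length][data] envelope."""
    return HEADER.pack(destination, source, len(data)) + data


def unpack_frames(stream: bytes):
    """Split a byte stream back into (destination, source, data) tuples."""
    frames = []
    offset = 0
    while offset + HEADER.size <= len(stream):
        destination, source, length = HEADER.unpack_from(stream, offset)
        start = offset + HEADER.size
        frames.append((destination, source, stream[start:start + length]))
        offset = start + length
    return frames


# Two frames sent back to back; the length field keeps them apart.
wire = pack_frame(b"BBBBBB", b"AAAAAA", b"hello") + pack_frame(b"CCCCCC", b"AAAAAA", b"hi")
for destination, source, data in unpack_frames(wire):
    print(destination, source, data)
```

The placeholder addresses (`AAAAAA`, `BBBBBB`, `CCCCCC`) are stand-ins only.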
But look closely at those address fields. What should we put there?
The Addressing Problem
We need some way to identify each computer on the network. Let's consider our options:
Option 1: Use IP addresses like 192.168.1.1
Seems obvious, right? But here's the problem: IP addresses are assigned by software. They can change. Your laptop might be 192.168.1.100 at home and 10.0.0.50 at work. If your network card is looking for a specific IP address burned into it, this breaks.
Also, we haven't invented IP yet. We're building the layer underneath IP.
Option 2: Use something permanent
What if each network card had a unique identifier that never changes? Something assigned at the factory that stays with the hardware forever?
That's exactly what we need.
MAC Addresses: Hardware Identity
Let's make each network card's address permanent. When the manufacturer builds a network card at the factory, they assign it a unique 48-bit number. This is its MAC address (Media Access Control address).
Example: A4:5E:60:D2:4F:38
These addresses are meant to be globally unique. The first 24 bits identify the manufacturer, and the last 24 bits are assigned by that manufacturer, much like a serial number. Your network card keeps this address for its whole life. (Technically you can override it in software, but the address burned into the hardware doesn't change.)
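To make the 24/24 split concrete, here is a small sketch that turns the example address into its 6 raw bytes and separates the two halves. The helper names `mac_to_bytes` and `split_mac` are hypothetical, chosen just for this illustration.

```python
def mac_to_bytes(mac: str) -> bytes:
    """Convert 'A4:5E:60:D2:4F:38' into its 6 raw bytes (48 bits)."""
    raw = bytes.fromhex(mac.replace(":", ""))
    if len(raw) != 6:
        raise ValueError(f"expected 6 bytes, got {len(raw)}")
    return raw


def split_mac(mac: str):
    """Return (manufacturer part, device part) as colon-separated hex strings."""
    raw = mac_to_bytes(mac)
    manufacturer = ":".join(f"{b:02X}" for b in raw[:3])  # first 24 bits
    device = ":".join(f"{b:02X}" for b in raw[3:])        # last 24 bits
    return manufacturer, device


print(split_mac("A4:5E:60:D2:4F:38"))
```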
Now when your network card receives a frame, it can look at the destination MAC address:
- If it matches my MAC? Process the frame.
- If it doesn't match? Ignore it. Not for me.
There's also one special address: FF:FF:FF:FF:FF:FF. This is the broadcast address. Every device on the network accepts frames sent to this address. (We'll see why this matters soon.)
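The accept-or-ignore decision described above fits in a few lines. This is only a sketch of the rule as stated here (match my own address, or the broadcast address); `should_accept` is a hypothetical name, and real network cards apply further rules that this article doesn't cover.

```python
BROADCAST = "FF:FF:FF:FF:FF:FF"


def should_accept(destination_mac: str, my_mac: str) -> bool:
    """Return True if a frame with this destination should be processed."""
    destination = destination_mac.upper()
    return destination == my_mac.upper() or destination == BROADCAST


my_mac = "A4:5E:60:D2:4F:38"
print(should_accept("A4:5E:60:D2:4F:38", my_mac))  # addressed to me
print(should_accept("FF:FF:FF:FF:FF:FF", my_mac))  # broadcast
print(should_accept("00:0C:29:6D:50:25", my_mac))  # someone else's frame
```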
Ethernet frames only travel on the local network. Your home network is one such network (Wi-Fi is technically a different link technology, but it uses the same style of MAC addresses). Your office LAN is another. A frame can't jump between them directly; it can only reach devices on the same local network. To cross networks, we'll need something else (that's what IP is for).
Perfect! Our frame format works. We can send data between computers on the same network.
But now we run into a new problem.
A New Problem Appears
Your application wants to send data to another computer. It tells you: "Send this to 10.0.0.4."
That's an IP address. But the destination field in your Ethernet frame needs a MAC address.
You have no idea what 10.0.0.4's MAC address is. How do you find out?
Take a moment to think about it. You're on a local network. Multiple devices are connected. You need to figure out which MAC address belongs to IP address 10.0.0.4.
What would you try?
The Obvious Solution: Just Ask
Here's what we could do: shout to everyone on the network.
Send a broadcast frame (destination: FF:FF:FF:FF:FF:FF) with a simple question:
"Who has IP address 10.0.0.4? Please tell 10.0.0.1 (that's me)."
Every device on the network receives this broadcast. Most of them ignore it (they're not 10.0.0.4). But one device is 10.0.0.4, and it responds:
"That's me! My MAC address is 00:0C:29:6D:50:25."
Done. Now you know the MAC address and can build your Ethernet frame.
This is embarrassingly simple. Almost too simple. But it works!
This protocol is called ARP (Address Resolution Protocol).
Watching ARP in Action
Let's see exactly how this works. Suppose you're computer 10.0.0.1 and you want to talk to 10.0.0.4.
Step 1: The Broadcast Request
You build an Ethernet frame:
- Destination MAC: FF:FF:FF:FF:FF:FF (broadcast, everyone listens)
- Source MAC: Your MAC address
- Payload: "Who has 10.0.0.4? Tell 10.0.0.1"
This frame flies across the network. Every device receives it because it's a broadcast.
Step 2: Everyone Receives, One Responds
- Computer 10.0.0.2 sees it. "Not for me." Ignores it.
- Computer 10.0.0.3 sees it. "Not for me." Ignores it.
- Computer 10.0.0.4 sees it. "Hey, that's me!"
Step 3: The Reply
Computer 10.0.0.4 builds a reply frame:
- Destination MAC: Your MAC address (it knows from your request)
- Source MAC: 00:0C:29:6D:50:25 (its own MAC)
- Payload: "10.0.0.4 is at 00:0C:29:6D:50:25"
This reply goes directly to you. Not a broadcast, just a point-to-point frame.
You receive it and now you know: 10.0.0.4 = 00:0C:29:6D:50:25.
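The three steps above can be simulated without touching a real network. In this sketch the "network" is just a list of objects, and the MAC addresses for 10.0.0.2 and 10.0.0.3 are made-up placeholders (the text doesn't give them). `Host` and `arp_request` are hypothetical names.

```python
class Host:
    """A device on the simulated local network."""

    def __init__(self, ip, mac):
        self.ip = ip
        self.mac = mac

    def handle_request(self, sender_mac, target_ip):
        """Reply only if the question is about this host's IP; else ignore."""
        if target_ip != self.ip:
            return None  # "Not for me."
        return {
            "destination_mac": sender_mac,  # learned from the request
            "source_mac": self.mac,
            "payload": f"{self.ip} is at {self.mac}",
        }


def arp_request(sender, target_ip, network):
    """Deliver the broadcast to every other host and return the one reply."""
    for host in network:
        if host is sender:
            continue
        reply = host.handle_request(sender.mac, target_ip)
        if reply is not None:
            return reply
    return None  # nobody on the network has that IP


me = Host("10.0.0.1", "A4:5E:60:D2:4F:38")
network = [
    me,
    Host("10.0.0.2", "02:00:00:00:00:02"),  # placeholder MAC
    Host("10.0.0.3", "02:00:00:00:00:03"),  # placeholder MAC
    Host("10.0.0.4", "00:0C:29:6D:50:25"),
]

print(arp_request(me, "10.0.0.4", network))
```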
Now suppose you need to reach 10.0.0.4 again a few seconds later. Do you have to broadcast a second time?
The Cache: Remembering What We've Learned
No. The second time you want to reach 10.0.0.4, you don't need to broadcast, because you already know the answer.
This is the ARP cache, a simple table your computer maintains:
IP Address    MAC Address          Age
10.0.0.1   →  A4:5E:60:D2:4F:38    0s
10.0.0.4   →  00:0C:29:6D:50:25    12s
10.0.0.7   →  B8:27:EB:A1:2F:44    45s
Every time you learn a mapping through ARP, you store it. Until the entry expires, you can skip the broadcast and use the cached answer directly.
Entries expire after a timeout. The exact value varies by operating system and device; something on the order of a minute or two is common on desktop systems. This handles cases where a device gets a new IP address or is replaced. The cache keeps the network from being flooded with ARP broadcasts while still staying current.
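Here is a minimal sketch of how a lookup might combine the cache with the broadcast: check the table first, and only ask the network on a miss or an expired entry. The 120-second timeout is an assumption, and `resolve` and `fake_send_arp_request` are hypothetical names; the fake sender stands in for a real broadcast so we can count how often it is used.

```python
import time

CACHE_TIMEOUT = 120  # seconds; assumed value, real systems differ


def resolve(ip, cache, send_arp_request, now=None):
    """Return the MAC for ip, broadcasting only if the cache can't answer."""
    now = time.monotonic() if now is None else now
    entry = cache.get(ip)
    if entry is not None and now - entry[1] <= CACHE_TIMEOUT:
        return entry[0]  # cache hit: no broadcast needed
    mac = send_arp_request(ip)  # cache miss or expired entry
    if mac is not None:
        cache[ip] = (mac, now)
    return mac


broadcasts = []


def fake_send_arp_request(ip):
    """Stand-in for a real broadcast; records each request it is asked to send."""
    broadcasts.append(ip)
    return {"10.0.0.4": "00:0C:29:6D:50:25"}.get(ip)


cache = {}
resolve("10.0.0.4", cache, fake_send_arp_request, now=0)    # miss: broadcasts
resolve("10.0.0.4", cache, fake_send_arp_request, now=12)   # hit: silent
resolve("10.0.0.4", cache, fake_send_arp_request, now=200)  # expired: broadcasts
print(f"Broadcasts sent: {len(broadcasts)}")
```

Passing `now` explicitly is just a convenience for the demonstration; in normal use the function reads the clock itself.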
You can see your computer's ARP cache right now. Open a terminal and run:
arp -a
Those are all the IP-to-MAC mappings your computer has learned recently.
Building It: Python Implementation
Let's implement what we've learned. Here's working Python code for Ethernet frames and ARP:
Ethernet Frame Builder
import struct


class EthernetFrame:
    """Build and parse Ethernet frames."""

    def __init__(self, dest_mac, src_mac, ethertype, payload):
        self.dest_mac = dest_mac
        self.src_mac = src_mac
        self.ethertype = ethertype  # 0x0800 for IPv4, 0x0806 for ARP
        self.payload = payload

    def to_bytes(self):
        """Convert frame to bytes for transmission."""
        # MAC addresses are 6 bytes each
        dest = bytes.fromhex(self.dest_mac.replace(':', ''))
        src = bytes.fromhex(self.src_mac.replace(':', ''))
        # EtherType is 2 bytes
        etype = struct.pack('!H', self.ethertype)
        # Combine: [Dest MAC][Src MAC][EtherType][Payload]
        return dest + src + etype + self.payload

    @classmethod
    def from_bytes(cls, data):
        """Parse bytes into an Ethernet frame."""
        dest_mac = ':'.join(f'{b:02x}' for b in data[0:6])
        src_mac = ':'.join(f'{b:02x}' for b in data[6:12])
        ethertype = struct.unpack('!H', data[12:14])[0]
        payload = data[14:]
        return cls(dest_mac, src_mac, ethertype, payload)


# Example: Build a frame
frame = EthernetFrame(
    dest_mac='FF:FF:FF:FF:FF:FF',  # Broadcast
    src_mac='A4:5E:60:D2:4F:38',
    ethertype=0x0806,  # ARP
    payload=b'ARP request data here'
)
frame_bytes = frame.to_bytes()
print(f"Frame size: {len(frame_bytes)} bytes")

ARP Cache
import time


class ARPCache:
    """Simple ARP cache with expiration."""

    def __init__(self, timeout=120):
        self.cache = {}  # ip -> (mac, timestamp)
        self.timeout = timeout

    def add(self, ip, mac):
        """Add or update an IP-to-MAC mapping."""
        self.cache[ip] = (mac, time.time())
        print(f"ARP cache: {ip} -> {mac}")

    def lookup(self, ip):
        """Look up MAC address for an IP."""
        if ip not in self.cache:
            return None
        mac, timestamp = self.cache[ip]
        # Check if entry expired
        if time.time() - timestamp > self.timeout:
            del self.cache[ip]
            return None
        return mac

    def clear_expired(self):
        """Remove expired entries."""
        now = time.time()
        expired = [
            ip for ip, (_, ts) in self.cache.items()
            if now - ts > self.timeout
        ]
        for ip in expired:
            del self.cache[ip]
            print(f"ARP cache: {ip} expired")


# Example usage
cache = ARPCache()
# Simulate learning MAC addresses
cache.add('10.0.0.1', 'A4:5E:60:D2:4F:38')
cache.add('10.0.0.4', '00:0C:29:6D:50:25')
# Look up an address
mac = cache.lookup('10.0.0.4')
if mac:
    print(f"Found: 10.0.0.4 is at {mac}")
else:
    print("Need to send ARP request!")

ARP Request/Response
import struct


class ARPPacket:
    """Build and parse ARP packets."""

    # Operation codes
    REQUEST = 1
    REPLY = 2

    def __init__(self, operation, sender_mac, sender_ip, target_mac, target_ip):
        self.operation = operation
        self.sender_mac = sender_mac
        self.sender_ip = sender_ip
        self.target_mac = target_mac
        self.target_ip = target_ip

    def to_bytes(self):
        """Convert ARP packet to bytes."""
        # Hardware type (1 = Ethernet), Protocol type (0x0800 = IPv4)
        htype = struct.pack('!H', 1)
        ptype = struct.pack('!H', 0x0800)
        # Hardware size (6 for MAC), Protocol size (4 for IPv4)
        hlen = struct.pack('!B', 6)
        plen = struct.pack('!B', 4)
        # Operation (1 = request, 2 = reply)
        op = struct.pack('!H', self.operation)
        # Sender MAC (6 bytes) and IP (4 bytes, one per dotted-decimal part)
        sender_mac = bytes.fromhex(self.sender_mac.replace(':', ''))
        sender_ip = bytes(map(int, self.sender_ip.split('.')))
        # Target MAC and IP
        target_mac = bytes.fromhex(self.target_mac.replace(':', ''))
        target_ip = bytes(map(int, self.target_ip.split('.')))
        return (htype + ptype + hlen + plen + op +
                sender_mac + sender_ip + target_mac + target_ip)


# Example: Build ARP request
request = ARPPacket(
    operation=ARPPacket.REQUEST,
    sender_mac='A4:5E:60:D2:4F:38',
    sender_ip='10.0.0.1',
    target_mac='00:00:00:00:00:00',  # Unknown (that's why we're asking!)
    target_ip='10.0.0.4'
)
# Wrap in Ethernet frame (EthernetFrame is the class from the
# Ethernet Frame Builder section above)
frame = EthernetFrame(
    dest_mac='FF:FF:FF:FF:FF:FF',  # Broadcast
    src_mac='A4:5E:60:D2:4F:38',
    ethertype=0x0806,  # ARP
    payload=request.to_bytes()
)
print("ARP request ready to send!")

This is real, working code! You can extend it to actually send frames using raw sockets (requires root/admin privileges). The structure is exactly what we've been discussing: frames contain packets, packets contain data, and everything is just bytes on the wire.
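The class above only builds packets. Going the other way, here is a hedged sketch of parsing the same 28-byte layout back into fields, which is what you would need to read a reply. `parse_arp`, `format_mac`, and `format_ip` are hypothetical helpers, and the sample bytes are constructed by hand rather than captured from a real network.

```python
import struct

# htype, ptype, hlen, plen, operation, sender MAC/IP, target MAC/IP
ARP_FORMAT = struct.Struct("!HHBBH6s4s6s4s")  # 28 bytes for Ethernet + IPv4


def format_mac(raw: bytes) -> str:
    return ":".join(f"{b:02X}" for b in raw)


def format_ip(raw: bytes) -> str:
    return ".".join(str(b) for b in raw)


def parse_arp(data: bytes) -> dict:
    """Parse an Ethernet/IPv4 ARP packet into a dictionary of fields."""
    (htype, ptype, hlen, plen, operation,
     sender_mac, sender_ip, target_mac, target_ip) = ARP_FORMAT.unpack(
        data[:ARP_FORMAT.size])
    if (htype, ptype, hlen, plen) != (1, 0x0800, 6, 4):
        raise ValueError("not an Ethernet/IPv4 ARP packet")
    return {
        "operation": operation,  # 1 = request, 2 = reply
        "sender_mac": format_mac(sender_mac),
        "sender_ip": format_ip(sender_ip),
        "target_mac": format_mac(target_mac),
        "target_ip": format_ip(target_ip),
    }


# Hand-built reply: "10.0.0.4 is at 00:0C:29:6D:50:25", addressed to 10.0.0.1
reply_bytes = ARP_FORMAT.pack(
    1, 0x0800, 6, 4, 2,
    bytes.fromhex("000C296D5025"), bytes([10, 0, 0, 4]),
    bytes.fromhex("A45E60D24F38"), bytes([10, 0, 0, 1]),
)
reply = parse_arp(reply_bytes)
print(f"{reply['sender_ip']} is at {reply['sender_mac']}")
```

The sender fields of a reply are exactly the mapping you would store in the ARP cache.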
Putting It All Together
Let's step back and see what we've invented:
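1. Ethernet Frames: A fixed format for sending data on a physical network. In the real Ethernet II format, a Type field (such as 0x0806 for ARP) sits where our sketch had a length, and a checksum is appended at the end:
[Dest MAC][Source MAC][Type][Payload][Checksum]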
2. MAC Addresses: Permanent hardware identifiers that let network cards recognize frames meant for them
3. ARP: A simple broadcast protocol to discover "which MAC address has this IP address?"
4. ARP Cache: A table to remember recent discoveries and avoid flooding the network with broadcasts
These four pieces work together to solve the fundamental problem: how to send data between computers on the same physical network.
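The Python classes earlier stop before the checksum field. As a rough sketch of that last piece: Ethernet's checksum is a CRC-32, which `zlib.crc32` computes, and short frames are padded so header plus payload reach 60 bytes. The little-endian byte order used here and the helper names `finish_frame` and `checksum_ok` are assumptions for illustration; in practice the network hardware usually handles this step, so treat it as a model rather than something you would normally send yourself.

```python
import struct
import zlib

MIN_FRAME_WITHOUT_CHECKSUM = 60  # 14-byte header + at least 46 bytes of payload


def finish_frame(header_and_payload: bytes) -> bytes:
    """Pad a short frame with zeros and append a 4-byte CRC-32 checksum."""
    padded = header_and_payload.ljust(MIN_FRAME_WITHOUT_CHECKSUM, b"\x00")
    checksum = zlib.crc32(padded)
    return padded + struct.pack("<I", checksum)  # byte order is an assumption


def checksum_ok(frame: bytes) -> bool:
    """Recompute the CRC-32 over everything but the last 4 bytes and compare."""
    body, received = frame[:-4], frame[-4:]
    return struct.pack("<I", zlib.crc32(body)) == received


header = (bytes.fromhex("FFFFFFFFFFFF")      # Dest MAC (broadcast)
          + bytes.fromhex("A45E60D24F38")    # Source MAC
          + struct.pack("!H", 0x0806))       # Type: ARP
payload = bytes(28)                           # placeholder 28-byte ARP payload
frame = finish_frame(header + payload)

corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]  # flip one bit
print(len(frame), checksum_ok(frame), checksum_ok(corrupted))
```

Flipping a single bit changes the CRC, which is how the receiver detects frames damaged in transit.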
As data goes down the stack, each layer wraps it in another envelope. We've built the outermost one: the Ethernet layer that carries everything else.
We can now send frames on a local network. Your computer can talk to your router. Your router can talk to other devices on your home network.
But there's a catch.
The Limitation: Local Only
Ethernet and ARP only work on the local network. If you want to send data to Google's servers across the internet, Ethernet frames won't get you there. They can't leave your local network.
Think about it: broadcast frames (FF:FF:FF:FF:FF:FF) would be catastrophic at internet scale. Every ARP request would hit millions of devices worldwide.
We need a new layer. One that can route data across multiple networks, from your home to servers on the other side of the world.
That layer is IP (Internet Protocol).
Next, we'll build IP, the layer that goes inside our Ethernet frames and makes global communication possible.