Things no one told me about TCP/IP networking

If you have any experience maintaining computer networks, you probably won't learn anything new from this article. I'm not a system administrator myself, and the following is merely my best attempt at summarizing what I know while keeping some, but not all, of the uncomfortable details. In other words, it's what I wish I'd been given back in college because my curriculum did a really poor job. Be warned: there may be factual errors.

Architectural beauty of TCP/IP and the "OSI model"

Let's say we have a few computers and we want to connect them together to share information between them. Create a network, if you will. In the history of computing, there have been countless ways to do it. Smart people who developed them called these ways protocol suites. Pretty much all of them are now dead, with the exception of TCP/IP which is a fancy way of saying "protocols that the Internet uses, 2 of which are TCP and IP".

If we want to fully admire the beauty of these protocols, we need to first understand their underlying principles that may not be that obvious at a glance:

Plug and play: For the most part, computers get along without announcing their presence or being aware that anyone is present. They send stuff and hope for the best.
Peer to peer: There’s no client and no server. Everything is just a computer.
Unencrypted and untrusted: No one has to obey each other's wishes, and everyone can do wild shit on every step of the process. As you'll see soon, this is not a bug but a feature.

These principles always stay true, and when they don't, it's purely by convention. Computers can agree to follow conventions, but it's not mandatory.

Now here is where your teacher would show you a chart of the "OSI model" to explain what different protocols on different layers do. The chart in question looks something like this:

Layer 1: Physical
Layer 2: Link
Layer 3: Network
Layer 4: Transport
Layer 5: Session
Layer 6: Presentation
Layer 7: Application

That's nice and all, but there's one detail missing here: This model wasn't designed for TCP/IP. It was designed, as the name suggests, for the OSI protocols, which are a completely unrelated suite that competed with TCP/IP in the '80s but which no one uses today. It simply happens that this abstract model sort of works for describing any network, including TCP/IP if you pretend Layers 5 to 7 don't exist (most of the time, they're covered by a single protocol - HTTP) and Layer 3 includes IP and nothing else.

For the sake of simplicity, I'm going to pick 1 most common protocol from each layer, one that you're likely using at home, and describe how things would work for an average person. There are, of course, more protocols but they mostly follow the same principles.

Layer 1: Cat 5e cable

Before we can create a network by wiring several computers together, we should wire at least two. In order to get connected, computers have a piece of hardware called a network interface, also known as a network card or a network adapter. All the network interface does is send and receive bits over some medium. That medium is usually a cable for wired transmission or air for wireless transmission. When 2 computers are physically in contact with each other via either of them, this connection is called a network segment.

The type of cable we're interested in is called Category 5e, usually shortened to Cat 5e. Since virtually every Ethernet network that people usually encounter uses it, it's sometimes also called an "Ethernet cable". The cable itself is 8 copper wires, all wrapped in plastic and connected to a head that can be easily inserted into an interface. When you send data, the interface uses a technique called amplitude modulation which is basically picking 2 voltages and saying that one of them is 1 and the other is 0.

So now we have 2 computers wired together. They can send and receive data, but as you can tell, this won't quite scale well. If we want to have more than 1 segment, we need to agree on what to send over that cable.

Layer 2: Ethernet

The most straightforward way to connect a lot of computers together is by using a device called a switch, which is called this way because it does packet switching - going through every packet of data it receives, one at a time, and resending it wherever it belongs. The device itself isn't anything out of the ordinary - it's a computer, usually underpowered, with a fuckton of network interfaces which is all you need for this job. Your "home router" is really a switch and a router wearing a trenchcoat.

Also, the switch is not the only thing you can use here. After all, if you don't want your device to even think where to send what, you can use a hub which will happily broadcast packets indiscriminately to everyone connected. You can even go one step back and wire your computers in a circle or wire every computer to every other computer. You can then call your masterpiece a network topology so you sound like a true nerd to your friends. The sky is the limit here!

How does the switch understand which computer is trying to talk to which? Using a protocol called Ethernet that gives the bits in the stream of data a meaning. The data is sent in chunks called frames that can have a payload of any size from 46 to 1500 bytes + a bunch of metadata. There are multiple ways to encode all of this, but the only one that actually matters is called Ethernet II. Apart from a funny series of ones and zeroes used to synchronize the transmission, the metadata has stuff like the checksum (that isn't perfect), which Layer 3 protocol is being sent, and the source and destination MAC addresses.
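To make the metadata less abstract, here's a minimal sketch of pulling the addresses and the Layer 3 protocol number out of an Ethernet II frame. It assumes the frame has already been stripped of the synchronization bits and the trailing checksum, which is roughly what an operating system hands you. The function names and the example frame are made up for illustration.

```python
import struct

def format_mac(raw: bytes) -> str:
    """Render 6 raw bytes in the usual colon-separated form."""
    return ":".join(f"{b:02x}" for b in raw)

def parse_ethernet_ii(frame: bytes) -> dict:
    """Split a frame into destination, source, EtherType and payload."""
    # 6 bytes destination MAC, 6 bytes source MAC, 2 bytes EtherType
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return {
        "dst": format_mac(dst),
        "src": format_mac(src),
        "ethertype": ethertype,  # 0x0800 is IPv4, 0x86DD is IPv6, 0x0806 is ARP
        "payload": frame[14:],
    }

# A hand-made broadcast frame carrying 46 zero bytes labeled as IPv4.
frame = bytes.fromhex("ffffffffffff" "020000000001" "0800") + bytes(46)
parsed = parse_ethernet_ii(frame)
print(parsed["dst"], parsed["src"], hex(parsed["ethertype"]), len(parsed["payload"]))
# → ff:ff:ff:ff:ff:ff 02:00:00:00:00:01 0x800 46
```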

As you've probably guessed, the MAC address is the bit we're interested in, as network interfaces use them to identify themselves to each other. The address itself is usually built into the network interface and comes from a block allocated to its manufacturer by an organization called IEEE, which manages them. There are also 2 special addresses: all zeroes (00:00:00:00:00:00), which isn't assigned to any real hardware and is what Linux reports for its loopback interface, the one that's like sending a letter to yourself, and a broadcast (FF:FF:FF:FF:FF:FF) which is like asking your switch to become a hub for a moment.

At this point, you may already see a potential issue with this: If computers identify themselves, what's stopping them from claiming to be someone else? The answer is absolutely nothing! In fact, Android and iOS devices do exactly that when discovering Wi-Fi networks. (Wi-Fi is unrelated to Ethernet but both use MAC addresses.) Additionally, this comes with a bonus: there could be several computers, called nodes in this context, behind a segment, which allows tricks like wiring 2 switches together and making it just work™ with no effort.

But wait, how does a switch figure out which port corresponds to which node? Why, of course, by analyzing the traffic! Since every frame contains its source address, the switch can look at which addresses come from which port and dynamically build a switching table. If the table doesn't have the needed address yet, no problem - it'll broadcast the frame instead. Or I suppose you could configure the table manually if your switch is advanced enough.
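The learning process described above fits in a few lines. This is a toy model, not how any particular switch firmware works: the class name, the port numbering, and the shortened MAC strings are all invented for the example.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"

class LearningSwitch:
    """Toy switch that builds its switching table from observed traffic."""

    def __init__(self, port_count: int):
        self.ports = list(range(port_count))
        self.table = {}  # MAC address -> port it was last seen on

    def handle_frame(self, in_port: int, src: str, dst: str) -> list:
        """Return the list of ports the frame gets resent to."""
        # Every frame teaches us where its sender lives.
        self.table[src] = in_port
        if dst != BROADCAST and dst in self.table:
            out_port = self.table[dst]
            # No point sending a frame back where it came from.
            return [] if out_port == in_port else [out_port]
        # Unknown or broadcast destination: act like a hub for this frame.
        return [p for p in self.ports if p != in_port]

switch = LearningSwitch(port_count=4)
print(switch.handle_frame(0, "aa:aa", "bb:bb"))  # → [1, 2, 3] (bb:bb not learned yet)
print(switch.handle_frame(1, "bb:bb", "aa:aa"))  # → [0]
print(switch.handle_frame(0, "aa:aa", "bb:bb"))  # → [1]
```

Note that the table trusts whatever source address shows up, which is exactly why pretending to be someone else works.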

Now we have several computers, all connected to a switch. One sends a frame to another, and it's successfully received. What can it do with that frame? Theoretically, anything, but realistically, if it sees the destination address is wrong, it'll discard it. But you can't always be 100% sure - a node could be malicious and listening to everything.

Layer 3: IP

Internetworking

For obvious reasons, we can't connect every computer in the world to a single network. We can, however, connect all of the networks to each other and create a big network that consists of networks. An Internet, if you will. IP is the protocol that makes it possible by not only specifying which computer to send data to, but also which network.

The same way switches connect computers, routers, also called gateways in some contexts, connect networks. They usually do it by being part of several of them at once and relaying the packets they receive to the right one. But it's important to remember - they don't really have to do that. Like it happens with governments in real life, all it means to be a router is that other computers recognize you as one. Instead of sending packets to the right network, a router can, for example, purposefully send them to the wrong one, do what you expect but record the data, send the right payload but change the metadata, or even do nothing. "Gaslight, gatekeep, girlboss" truly is the moral imperative here.

And as a fun consequence of that - there may not necessarily be exactly 1 router on a network. There may be 0, in which case you're stuck with a single network, or there may be multiple routers, which is vanishingly unlikely for you to encounter and yet not technically impossible.

IP addresses

Communication over IP is done by sending packets, this time not called something weird like "frames", over whatever Layer 2 protocol is being used, which for us, is Ethernet. Packets can be up to 65535 bytes in size (including all of the metadata), and since that's bigger than an Ethernet frame, they sometimes have to either be fragmented among several of them (IPv4) or just not be that big (IPv6). In order for a fragmented packet to be properly reassembled, all of its fragments get the same Identification field.

As you probably know, there are 2 versions of IP: IPv4, the older one that stuck with us for way too long, and IPv6, the objectively better newer one that, at the time of writing, still only has 38% adoption. For your convenience, I'll be covering both of them. Their packets have quite different layouts, but they both encode pretty much the same stuff like the version of the protocol, how long the headers and the payload are, a few hints to assist delivery, and a thing called time to live (renamed to hop limit in IPv6). The last one instructs routers to drop the packet after it's been resent too many times and accidentally makes traceroute possible. Oh yeah, and also the source and destination IP addresses.

The main difference between the addresses in the 2 versions is their size. IPv4 addresses are 32 bits and are usually written down as 4 decimal numbers, one for each byte, separated by dots - for example, 192.0.2.23. IPv6 addresses are 128 bits which is 4 times as big, so they're instead written down as 8 hexadecimal numbers, one for each 2 bytes, separated by colons - for example, 2001:db8:0:0:0:0:0:23. If you have several zeroes in a row, you can collapse them (but only one such run per address) - the example above becomes 2001:db8::23. And yes, this doesn't only work in the middle:

1:0:0:0:0:0:0:0 = 1::
0:0:0:0:0:0:0:1 = ::1
0:0:0:0:0:0:0:0 = ::
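If you don't feel like collapsing zeroes by hand, Python's standard ipaddress module applies the same rule. A small sketch, where collapse is just a helper name I made up:

```python
import ipaddress

def collapse(text: str) -> str:
    """Return the shortest standard spelling of an IPv6 address."""
    return ipaddress.IPv6Address(text).compressed

for long_form in [
    "2001:db8:0:0:0:0:0:23",
    "1:0:0:0:0:0:0:0",
    "0:0:0:0:0:0:0:1",
    "0:0:0:0:0:0:0:0",
]:
    print(long_form, "=", collapse(long_form))
# → 2001:db8:0:0:0:0:0:23 = 2001:db8::23
# → 1:0:0:0:0:0:0:0 = 1::
# → 0:0:0:0:0:0:0:1 = ::1
# → 0:0:0:0:0:0:0:0 = ::
```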

In general, there are 2 kinds of addresses: globally-routable and reserved. Globally-routable addresses can, as the name suggests, be routed to from anywhere as long as it's on the Internet, if things go according to the plan. But again, they don't have to. Reserved addresses, on the other hand, are used for stuff that isn't on the Internet, like private networks, documentation (hey, I used a couple of them above), denoting the same machine (127.0.0.0/8 and ::1), denoting that we don't know the address (0.0.0.0 and ::), and black magic like translating IPv6 to IPv4. You can read the full list of reserved addresses if you want.

If you've ever noticed how every IPv4 address on a home network starts with 192.168., reserved addresses are the reason why. What's even cooler, however, is that every IPv4 address that starts with 10. is also reserved, which means you can set your home router's address to 10.69.4.20. Now that's what I call a party trick.
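Checking whether an address falls into one of the reserved ranges is just a membership test. The sketch below only knows about the handful of ranges mentioned in this article, so treat the table as a deliberately partial illustration rather than the full registry. The labels and the classify helper are mine.

```python
import ipaddress

# Only the ranges this article talks about; the real list is much longer.
RESERVED = {
    "private network": ["10.0.0.0/8", "192.168.0.0/16"],
    "same machine": ["127.0.0.0/8", "::1/128"],
    "unknown address": ["0.0.0.0/32", "::/128"],
    "documentation": ["192.0.2.0/24", "2001:db8::/32"],
}

def classify(text: str) -> str:
    """Name the reserved range an address belongs to, if we know it."""
    addr = ipaddress.ip_address(text)
    for label, networks in RESERVED.items():
        # An IPv4 address is never "in" an IPv6 network and vice versa.
        if any(addr in ipaddress.ip_network(net) for net in networks):
            return label
    return "not in this partial list"

print(classify("10.69.4.20"))    # → private network
print(classify("2001:db8::23"))  # → documentation
print(classify("::1"))           # → same machine
```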

In case you're wondering who controls who gets which addresses: IANA does. How is it different from ICANN, its parent organization? God fucking knows, but as far as I understand, IANA manages IP addresses and DNS names that aren't gTLDs (stuff like .com) or ccTLDs (stuff like .uk) while ICANN manages gTLDs and gives money and people to IANA so they can do their job. Except IANA doesn't really "control" IP addresses, it gives them to regional registries, and actual control is done via BGP, but let's not get into that.

Routing and subnetting

Alright, say we know which IP address to send data to. There are 2 things we can do (apart from doing nothing): If we know for a fact that wherever that address points to is physically on the same network, then we can send the packets there ourselves. After all, if we know the destination MAC address, we have no use for a router. If we know for a fact it isn't on the same network, then we send it to a router. Assuming everything goes right, the router will be able to send the packets either directly to the destination or another router, and the process will repeat until the data arrives. If the destination isn't valid or any router along the way decides not to cooperate, the data will obviously never arrive.

Keeping in mind that a router is merely a computer with a role: Like with Ethernet frames, a computer is free to do whatever it wants with IP packets it receives, but realistically, it'll look at the destination address and give up trying to process the packet if it doesn't match any of the addresses it's interested in. That is unless it decides to route that packet somewhere else. Linux calls this process "IP forwarding" and lets you enable it per network interface. So if you enable it for, say, enp1s0, all packets received there, that weren't sent directly to your computer, will be routed.

All you really need to know about routing is that it's basically resending packets from one network interface to another. Preferably unchanged, but no one is there to stop the opposite from happening. In order to figure out where to resend, or well, send stuff at all, computers use a thing called a routing table. The table consists of entries that say which network corresponds to which interface and which router to send packets to if the network isn't directly reachable. The router that's used for packets that don't have anywhere else to go is called the default gateway.

Wait, but what the hell does "network" even mean in this context? It's a long story: Since having a separate entry for every single address would be insane, ranges of addresses are used instead. These ranges are formulated by slicing an address in 2 and calling the 2 resulting parts the network, also known as the subnet if it's inside a bigger network, and the host. The number of bits given to the network is called a prefix and is usually written down separated from the address with a slash. This way of writing it is called the CIDR notation. For example, 192.0.2.23/24 means that we should slice the address into a 24 bit network - 192.0.2.x - and an 8 bit host - x.x.x.23 - and think "huh okay, we're referring to the host 23 inside 192.0.2.0/24".

What's the difference between 192.0.2.23/24 and 192.0.2.23/23? There isn't any if neither network is in our routing table. For all we know, they're the same address. What does an address with an all-zeroes host like 192.0.2.0/24 mean? It means we're referring to the network as a whole and not any computer inside it. That's the kind of range I was talking about. What does an address with an all-ones host like 192.0.2.255/24 mean? It means if you send a packet there, the router of that network will broadcast it. What does IPv4 mean by a "subnet mask"? It's a fancy way of writing down the prefix as a binary mask with N ones in the beginning. So a /24 prefix and a 255.255.255.0 mask are the same thing. Why was I forced to manually calculate this shit on paper in college? Because they're sadists.
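For anyone else who had to do this on paper: the slicing is a couple of bitwise operations, and Python's ipaddress module already knows how to do them. A sketch, with describe and mask_from_prefix being names I picked for the example:

```python
import ipaddress

def describe(cidr: str) -> dict:
    """Slice an IPv4 address written in CIDR notation into its parts."""
    iface = ipaddress.ip_interface(cidr)
    net = iface.network
    return {
        "network": str(net),                         # host bits zeroed out
        "host": int(iface.ip) & int(net.hostmask),   # network bits zeroed out
        "netmask": str(net.netmask),
        "broadcast": str(net.broadcast_address),     # host bits all ones
    }

def mask_from_prefix(prefix_len: int) -> str:
    """Write a prefix length as a dotted mask: N ones followed by zeroes."""
    mask = (0xFFFFFFFF << (32 - prefix_len)) & 0xFFFFFFFF
    return str(ipaddress.IPv4Address(mask))

print(describe("192.0.2.23/24"))
# → {'network': '192.0.2.0/24', 'host': 23, 'netmask': '255.255.255.0', 'broadcast': '192.0.2.255'}
print(describe("192.0.2.23/23"))
# → {'network': '192.0.2.0/23', 'host': 23, 'netmask': '255.255.254.0', 'broadcast': '192.0.3.255'}
print(mask_from_prefix(24))  # → 255.255.255.0
```

Same address, different prefix, different idea of where the network ends.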

Got it? Great. So given an address, we look through every entry in the routing table and try to find matching networks. If the network is 192.0.2.0/24, then the IP matches if its highest (leftmost) 24 bits match with the network's. If there are multiple matches, then we find out whether that damn 192.0.2.23 is 192.0.2.23/24 or 192.0.2.23/23 by picking the network with the longest prefix and sending packets to the corresponding interface. Finally.
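Here's roughly what that lookup looks like as code. The routing table below is entirely made up (interface names, networks, and the gateway address are placeholders), and real kernels use smarter data structures than a linear scan, but the longest-prefix rule is the same.

```python
import ipaddress

# Hypothetical routing table: (network, where to send matching packets).
ROUTES = [
    ("0.0.0.0/0", "enp3s0 via default gateway 198.51.100.1"),
    ("192.0.2.0/23", "enp1s0"),
    ("192.0.2.0/24", "enp2s0"),
]

def lookup(destination: str):
    """Pick the matching route with the longest prefix, or None."""
    dest = ipaddress.ip_address(destination)
    matches = []
    for net_text, target in ROUTES:
        network = ipaddress.ip_network(net_text)
        if dest in network:
            matches.append((network.prefixlen, target))
    if not matches:
        return None
    # The most specific network wins.
    return max(matches)[1]

print(lookup("192.0.2.23"))   # → enp2s0 (matches /0, /23 and /24; /24 wins)
print(lookup("192.0.3.5"))    # → enp1s0 (matches /0 and /23)
print(lookup("203.0.113.9"))  # → enp3s0 via default gateway 198.51.100.1
```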

Or do we? We know the destination address but not the source address. So uhhhhh…

How do you get the IP address in the first place?

Well, you don’t really get the IP address, you get an IP address. Not only can a computer have several interfaces connected to different networks, but it can also have more than a single address on the same network. Also remember what I said about having multiple routers on the same network? So yeah, chances are you'll get a couple more of them than you expect.

Anyway, let's get the most obvious solution out of the way first: If you know exactly what network your computer always connects to and how it behaves, you can set your static address and default gateway manually. There's nothing stopping you from doing that, and if your computer is a server that lives in the same datacenter for years at a time, this is your most rational choice. Whether your router is going to like that is a different matter.

If your computer is a phone, however, or you can't ever be bothered configuring anything, then it'd be really nice if it could negotiate a dynamic address for itself. Luckily, that's exactly what DHCP does, and it does that for both IPv4 and IPv6. That is unless you use Android because Google decided you don't need it for IPv6. Read this 11 year old bug report they closed as "won't fix" for more info.

The way DHCP functions is quite simple: There's a client - your computer that wants an address - and a server - usually a program running on your router, but it can be on a separate machine. The client sends a discover message to the server via UDP (more on that later) with stuff like "hey, this is my MAC address and I'd really like to have an IP address, preferably that one", and after a bunch of back and forth offers from the server and requests from the client, they hopefully arrive at an acknowledgement. The client receives the information it needs to send stuff: its own address, how long it can have that address for, the address of the gateway, the prefix/subnet mask so it can populate the routing table with its own network, the address of the DNS server, and possibly other things.

But hold on. The client sends how and where? It knows neither the address of the server nor its own address. Does it send messages to a fixed location? No, it instead screams to everyone on the network with a broadcast frame and hopes someone replies. What does it use as the source/destination addresses? 0.0.0.0 and 255.255.255.255, which is like saying "I don't know" and "anyone who can hear me". The IPv6 version of the protocol does it slightly differently by using a multicast address which is practically the same thing, except the client is only screaming to the computers that are explicitly listening for DHCP messages. Also yes, your intuition is completely right here. A malicious third party can absolutely spoof this.
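To show how little is in that first scream, here's a sketch that builds the bytes of a bare-bones IPv4 DHCP discover message. It only constructs the UDP payload and doesn't send anything; a real client would put more options in (requested address, hostname, and so on). The function name, the MAC address, and the transaction ID are placeholders.

```python
import struct

def build_dhcp_discover(mac: bytes, xid: int) -> bytes:
    """Build a minimal DHCPDISCOVER payload (fixed header + 2 options)."""
    header = struct.pack(
        "!BBBBIHH4s4s4s4s16s64s128s",
        1,         # op: request from a client
        1,         # hardware type: Ethernet
        6,         # hardware address length
        0,         # hops
        xid,       # transaction ID so replies can be matched to us
        0,         # seconds elapsed
        0x8000,    # flags: "please reply via broadcast"
        bytes(4),  # client address: we don't have one yet
        bytes(4),  # "your" address: filled in by the server
        bytes(4),  # server address
        bytes(4),  # relay address
        mac,       # our MAC address, zero-padded to 16 bytes by struct
        bytes(64),   # server name (unused)
        bytes(128),  # boot file name (unused)
    )
    magic_cookie = bytes([99, 130, 83, 99])
    options = bytes([53, 1, 1])  # option 53 (message type), length 1, 1 = discover
    options += bytes([255])      # end of options
    return header + magic_cookie + options

payload = build_dhcp_discover(bytes.fromhex("020000000001"), xid=0x12345678)
# It would travel as UDP from 0.0.0.0 port 68 to 255.255.255.255 port 67.
print(len(payload))  # → 244
```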

And this is all fine and dandy, but DHCP still has a fundamental flaw despite being this simple - it requires a centralized server. Doesn't sound quite "plug and play", does it? Understandably, IPv6 has a solution for that, called SLAAC. That's one hell of a funny name. The way this protocol works is somehow even more straightforward, if you can believe that. In fact, it's so straightforward that IPv4 had its own version of it called IRDP that unfortunately went nowhere. Oh well.

When using SLAAC, your computer starts with outright making up a link-local address for itself by taking a fe80::/64 network prefix and shoving some random bytes into it. The address is already being used by someone else? Eh, just make up a new one. "Link-local", by the way, means that it's a reserved address that can only be used on the current network and which is always present, even if you're completely isolated from the Internet. If you've ever noticed that your computer has an IPv6 address but no IPv6 Internet connectivity, this is why.
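The "make up an address, try again if it's taken" loop is about as simple as it sounds. In this sketch, the in_use set is a stand-in for actually asking the network whether someone has the address, and both function names are mine.

```python
import ipaddress
import secrets

LINK_LOCAL = ipaddress.ip_network("fe80::/64")

def make_link_local() -> ipaddress.IPv6Address:
    """Glue 64 random bits onto the link-local prefix."""
    interface_id = secrets.randbits(64)
    return ipaddress.IPv6Address(int(LINK_LOCAL.network_address) | interface_id)

def pick_address(in_use: set) -> ipaddress.IPv6Address:
    """Keep making addresses up until one isn't taken."""
    while True:
        candidate = make_link_local()
        if candidate not in in_use:
            return candidate

address = pick_address(in_use=set())
print(address, address in LINK_LOCAL)
```

With 64 random bits, the retry branch is mostly there for the sake of completeness.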

After getting a link-local address, your computer, similarly to what it does with DHCP, uses multicast to discover routers, but this time using ICMP Router Solicitation (more on that also later) instead of UDP. In response, the routers on the network, if they feel like it, respond with a Router Advertisement. If the response isn't "fuck off and use DHCP", your computer picks up their addresses, DNS server configuration, and globally-routable prefixes they send back. Now we have everything we need for real.

As you can see, this time, we got a whole prefix instead of one specific globally-routable address. Does this mean your computer can shove some random bytes into it too? Yes, that's exactly what it means! Or well, it can also shove its MAC address, but that's a poor practice since then the Internet servers you connect to will be able to see your MAC address. Fortunately, modern operating systems, including Linux if you use NetworkManager, don't do that anymore. If you're unsure about yours, look for "IPv6 interface identifier generation" or something like that in your network daemon's settings.
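For the curious, this is what "shoving its MAC address" into a prefix looks like, and why it leaks the MAC: the conversion (known as modified EUI-64) is trivially reversible. The sketch assumes a /64 prefix; the function name, the documentation prefix, and the MAC are placeholders.

```python
import ipaddress

def eui64_address(prefix: str, mac: str) -> ipaddress.IPv6Address:
    """Derive an address from a /64 prefix and a MAC address."""
    raw = bytearray.fromhex(mac.replace(":", ""))
    raw[0] ^= 0x02  # flip the "universal/local" bit of the first byte
    # Stuff ff:fe in the middle to stretch 48 bits into 64.
    interface_id = bytes(raw[:3]) + b"\xff\xfe" + bytes(raw[3:])
    network = ipaddress.ip_network(prefix)
    return network.network_address + int.from_bytes(interface_id, "big")

print(eui64_address("2001:db8::/64", "00:11:22:33:44:55"))
# → 2001:db8::211:22ff:fe33:4455
```

Anyone who sees that address can read 00:11:22:33:44:55 right back out of it, on every network you ever join.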

There's only one piece of the puzzle left: How do you find out if an IP address is used by someone on the network? Or for that matter, how do you find the MAC address of the router from its IP address? The answer for both questions is ARP for IPv4 and NDP for IPv6, both of which are stupidly simple protocols.

Their entire principle of operation is broadcasting "anyone know this IP address?" and seeing if anyone sends "yeah, here's the MAC address" back. The only major difference between the two is that ARP is sent directly over, in our case, Ethernet, effectively displacing an IP packet, while NDP is sent via ICMP. And yes, once again, both can be and are spoofed by hackers.
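An ARP request for IPv4 over Ethernet is 28 bytes, which gives you an idea of how simple "stupidly simple" is. The sketch below only builds those bytes; actually sending them would mean wrapping them in a broadcast Ethernet frame with EtherType 0x0806. The function name and the addresses are placeholders.

```python
import ipaddress
import struct

def build_arp_request(sender_mac: bytes, sender_ip: str, target_ip: str) -> bytes:
    """Build the 'anyone know this IP address?' message."""
    return struct.pack(
        "!HHBBH6s4s6s4s",
        1,        # hardware type: Ethernet
        0x0800,   # protocol type: IPv4
        6,        # MAC address length
        4,        # IPv4 address length
        1,        # operation: 1 = request, 2 = reply
        sender_mac,
        ipaddress.IPv4Address(sender_ip).packed,
        bytes(6),  # target MAC: unknown, that's the whole point
        ipaddress.IPv4Address(target_ip).packed,
    )

request = build_arp_request(bytes.fromhex("020000000001"), "192.0.2.23", "192.0.2.1")
print(len(request), request.hex())
```

Nothing in there proves the sender is who it says it is, and the reply has the same problem.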

Layer 4: UDP and... ICMP?

Let's address the elephant in the room: ICMP is not a Layer 4 protocol, even though it's sent on top of an IP packet as payload. Actually, the same applies to ARP - it's not Layer 3, even though it's sent on top of Ethernet. Yes, this makes no fucking sense and is clear evidence that computer people should never be allowed to categorize stuff, but it is technically consistent with the OSI model: ICMP doesn't transport and ARP doesn't network. This fact, however, is completely irrelevant because real protocols don't care. This is the exact kind of OSI and TCP/IP mismatch I was talking about. To make our lives easier, let's pretend it's Layer 4.

ICMP, to put it short, is an error reporting mechanism for IP. Pretty much all it does is send a code to identify what happened to a packet and maybe some additional data. How does it identify the packet? By sending the first bytes of it as the payload, duh! What can it say about the packet? A lot of things, including "I don't want to accept your packets", "your packet was way too big", and "you didn't send any packet, but thanks for making sure I'm still alive; here's your data back". The last one is known as Echo or, more commonly, ping. Plus, as I mentioned before, NDP uses it for MAC address and router discovery while traceroute uses it to trace which addresses a packet travels through.
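As an illustration, here's a sketch of building an IPv4 ping (an Echo request) by hand, including the checksum ICMP uses. It only builds the bytes, since sending raw ICMP usually needs elevated privileges. The identifier, sequence number, and payload are arbitrary values I picked.

```python
import struct

def internet_checksum(data: bytes) -> int:
    """16-bit one's complement sum, as used by IP, ICMP, UDP and TCP."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold the carries back in
    return ~total & 0xFFFF

def build_echo_request(identifier: int, sequence: int, payload: bytes) -> bytes:
    """Build an ICMP Echo request: type 8, code 0."""
    # The checksum is computed with its own field set to zero first.
    draft = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence) + payload
    checksum = internet_checksum(draft)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload

packet = build_echo_request(identifier=0x1234, sequence=1, payload=b"hello")
# A receiver verifies by summing everything, checksum included: it must be 0.
print(internet_checksum(packet))  # → 0
```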

If you want to send proper data, you aren't going to use ICMP unless you're this guy. You're going to use TCP or UDP. I'll only be covering UDP since it's more relevant to the whole routing aspect of TCP/IP, and its basic principles also apply to TCP. But the gist is: TCP is connection-oriented, i.e. the client and the server have to shake hands before sending stuff, and it tries really hard not to lose or reorder anything along the way. UDP doesn't do that. Why would you use it then? It has a lot less overhead, which is good for stuff like video games.

Compared to a plain IP packet, UDP provides 2 additional features: Rudimentary error checking (a checksum) and multiplexing, i.e. letting you send multiple streams of data over one address. As you can probably tell, packets being identified by their address alone doesn't let you do that. UDP does this by providing 2 additional 16 bit port numbers with every packet, one for the source address and one for the destination address.
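The whole UDP header is 8 bytes: the two ports, a length, and the checksum. A quick sketch of reading it, using a datagram I assembled by hand (the port numbers are arbitrary and the checksum is left as zero, which IPv4 treats as "not computed"):

```python
import struct

def parse_udp(datagram: bytes) -> dict:
    """Split a UDP datagram into its 4 header fields and the data."""
    src_port, dst_port, length, checksum = struct.unpack("!HHHH", datagram[:8])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "length": length,      # header + data, in bytes
        "checksum": checksum,
        "data": datagram[8:length],
    }

data = b"hi"
datagram = struct.pack("!HHHH", 54321, 53, 8 + len(data), 0) + data
print(parse_udp(datagram))
# → {'src_port': 54321, 'dst_port': 53, 'length': 10, 'checksum': 0, 'data': b'hi'}
```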

What you need to keep in mind here is that every port is duplex which is a smart way of saying "sends data both ways", so you only need one for two-way communication. In other words, there aren't really "source ports" and "destination ports", strictly speaking. Your computer has 65536 ports total, so whatever port it picks as the source will also be used as the destination by whoever replies. For this reason, if it needs to send something as a client, it'll use a random, ephemeral port.

However, if you're running a server, it's certainly a good idea for it to accept requests on a stable, well-known port. Usually, those ports have a number below 1024 and require root to bind to, but not always. A few examples of such ports are 80 for HTTP, 443 for HTTPS, 53 for DNS, 123 for NTP, and 25565 for Minecraft: Java Edition, the most important piece of networking software. If you want to look through all the standard ports, IANA has got a nice big list for you.

Wait, bind? Yup, it means assign a program on your computer to a port and an address, so it and only it is able to send and receive stuff over it. A security mechanism of sorts. When you send an HTTP request, you bind to a port, preferably an ephemeral one, and when you listen to these requests, you also bind to a port, preferably 443 because not using TLS in the year of our Lord 2024 is a crime.
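You can watch binding and ephemeral ports happen on your own machine. This sketch runs a one-shot UDP exchange over the loopback address; to avoid clashing with anything already running, the "server" asks the OS for any free port instead of a well-known one, which is a shortcut for the demo. The function name is mine.

```python
import socket

def echo_once():
    """Send one UDP message over loopback and return what each side saw."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as server, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        server.settimeout(2)
        client.settimeout(2)

        # The server binds explicitly; port 0 means "any free port".
        server.bind(("127.0.0.1", 0))
        server_port = server.getsockname()[1]

        # The client never calls bind: the OS picks an ephemeral port for it.
        client.sendto(b"ping", ("127.0.0.1", server_port))
        client_port = client.getsockname()[1]

        # The server learns the client's port from the packet and replies to it.
        data, (peer_ip, peer_port) = server.recvfrom(1024)
        server.sendto(data.upper(), (peer_ip, peer_port))

        reply, _ = client.recvfrom(1024)
        return client_port, peer_port, reply

client_port, seen_by_server, reply = echo_once()
print(client_port == seen_by_server, reply)  # → True b'PING'
```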

Layer 3, one more time: NAT

As you surely know, there are 8 billion people on the planet and only 4.2 billion IPv4 addresses, which is really bad. Well, guess what, you're wrong! There aren't 4.2 billion globally-routable addresses. Remember - reserved addresses exist. If we exclude those, including a mysterious 268 million address hole in 240.0.0.0/4 that was reserved for "future use" but never actually used, we only get 3.9 billion addresses. On top of that, most of them are allocated to multinational corporations with big datacenters and big bucks, not residential areas. Now that's bad.

Obviously, you're reading this article on the Internet right now and you're likely using IPv4. Okay, maybe you printed it, but someone had to get it from the Internet before that. And that someone didn't have to wait in a queue for a week to get a turn at using their IP address (hopefully). How come? The answer is a good dose of unholy rituals that your ISP performs, also known as CGNAT or, more generally, NAT.

Here are a few fun observations: You probably aren't hosting your own Internet services at home. Almost everything that you send is either TCP or UDP. You probably aren't using all 65536 ports of either protocol at once. See where I'm going with this? What if I give you and your friends reserved addresses, then track the port numbers you're using, send your stuff as my own using my single globally-routable address with different ports, and then give you the stuff that comes back as if it'd been sent to you? Trust me, bro, you won't notice it™.

Yes, this is pretty much what NAT subjects you to. Lies and deception. You think you have an entire address for yourself, but all you get is a port per connection, which NAT is desperately waiting for you to forget about for a few minutes so it can give it to someone else. You think you can let people send packets to you, but NAT's grim ghostly hand points to a sign that reads "your packet shall go out first" and doesn't even give you a port. Your existence is reduced to nothing but being a pawn in NAT's master plan to conquer the world.

Oh yeah, it also spoofs the Identification field. Fragments aren't aware of ports, after all.
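If you'd like to see the deception spelled out, here's a toy model of the bookkeeping. It ignores everything that makes real NAT messy (timeouts, protocols, port exhaustion), and the class, the method names, and the starting port number are all invented for the example.

```python
class Nat:
    """Toy port-translating NAT: many private addresses behind one public one."""

    def __init__(self, public_ip: str, first_port: int = 40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.outbound = {}  # (private ip, private port) -> public port
        self.inbound = {}   # public port -> (private ip, private port)

    def send_out(self, private_ip: str, private_port: int) -> tuple:
        """Rewrite an outgoing packet's source; remember it for replies."""
        key = (private_ip, private_port)
        if key not in self.outbound:
            self.outbound[key] = self.next_port
            self.inbound[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.outbound[key]

    def receive(self, public_port: int):
        """Find who a reply belongs to, or None if nobody asked for it."""
        return self.inbound.get(public_port)

nat = Nat("203.0.113.7")
print(nat.send_out("192.168.0.10", 51000))  # → ('203.0.113.7', 40000)
print(nat.send_out("192.168.0.11", 51000))  # → ('203.0.113.7', 40001)
print(nat.receive(40001))                   # → ('192.168.0.11', 51000)
print(nat.receive(40002))                   # → None (your packet shall go out first)
```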

Conclusion

Be a good boy/girl/bunny and use IPv6. Please :)