Page requests to the metal - Network stack - From frontend to backend
This post is part of the page requests to the metal series; see the intro/overview page for where the network stack fits in the bigger picture.
The request has to make it from the client computer over to the server. Even if we are running the code locally we still use the networking stack, although a number of layers are bypassed when we use localhost: traffic to localhost goes through a software loopback interface, so it never touches the physical network interface hardware, and operating systems have historically given this path special support because localhost sockets need to be fast. As evidenced by the fact that people build whole careers in network administration and engineering, quite a lot goes on to allow network communication between computers. As this post is mostly about programming we are going to gloss over most of it. The OSI model is a good conceptual summary of how a networking stack is layered, although real-world stacks such as TCP/IP do not map onto its seven layers exactly.
Intermezzo - the internet
One of the amazing things about the Internet is the way in which you can resolve domain names to IP addresses. This takes place via the Domain Name System (DNS). Once you have an IP address you need to get packets from your address to the other address, and this is where routing comes into the picture. Creating connections that span multiple networks and routers requires routing protocols: these are how routers learn the topology of the network and decide where to send each packet next. This is a broad topic, as evidenced by the people whose main profession is dealing with these kinds of things.
If you are interested in understanding how routing works have a look at some of these pages:
- https://en.wikipedia.org/wiki/Distance-vector_routing_protocols
- https://en.wikipedia.org/wiki/Link-state_protocol
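To give a feel for how routers can learn a topology from their neighbours, here is a minimal sketch of the distance-vector idea. The four-router topology, the link costs and the function names are all made up for illustration; real protocols add timers, failure handling and loop prevention that are left out here.

```python
# Hypothetical topology: router pairs and the cost of the link between them.
LINKS = {
    ("A", "B"): 1,
    ("B", "C"): 2,
    ("A", "C"): 5,
    ("C", "D"): 1,
}

def neighbours(links):
    """Build a symmetric adjacency map: router -> {neighbour: link cost}."""
    adj = {}
    for (u, v), cost in links.items():
        adj.setdefault(u, {})[v] = cost
        adj.setdefault(v, {})[u] = cost
    return adj

def distance_vector(links):
    """Repeat Bellman-Ford style updates until no routing table changes.

    Returns router -> {destination: (cost, next_hop)}.
    """
    adj = neighbours(links)
    # At the start each router only knows how to reach itself.
    tables = {router: {router: (0, router)} for router in adj}
    changed = True
    while changed:
        changed = False
        # Snapshot so every router reads its neighbours' previous-round vectors.
        snapshot = {router: dict(table) for router, table in tables.items()}
        for router, nbrs in adj.items():
            for nbr, link_cost in nbrs.items():
                for dest, (cost, _next_hop) in snapshot[nbr].items():
                    candidate = link_cost + cost
                    known = tables[router].get(dest)
                    if known is None or candidate < known[0]:
                        tables[router][dest] = (candidate, nbr)
                        changed = True
    return tables

tables = distance_vector(LINKS)
for dest, (cost, next_hop) in sorted(tables["A"].items()):
    print(f"A -> {dest}: cost {cost} via {next_hop}")
```

Each router only ever talks to its direct neighbours, yet after a few rounds router A has worked out that the cheapest way to C is through B rather than over the direct link.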
TCP/IP
If we were connecting over the internet we would have to actually send the request across the network to the server. In this example we are connecting to localhost, so a lot of the networking stack is bypassed. The name localhost is typically resolved from the local hosts file without a DNS query, so we can gloss over DNS. We can also gloss over routing because 127.0.0.1 (or ::1 in IPv6) is a special loopback address that always refers to the local machine.
Then we have to open a TCP/IP connection to the server's IP address. Again, because we are using localhost we won't need NAT or anything like that; we assume that a direct connection can be made. We also assume that nothing complicated is going on in the connection: no SSL/TLS, no VPNs or similar.
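The special status of the loopback addresses can be checked with Python's standard library. This is only a small sketch: the variable names are illustrative, and what the name localhost resolves to depends on how the machine is configured, so that lookup is wrapped defensively.

```python
import ipaddress
import socket

# The two loopback literals are recognised as special without any lookup.
loopback_flags = {
    literal: ipaddress.ip_address(literal).is_loopback
    for literal in ("127.0.0.1", "::1")
}
print(loopback_flags)

# The whole 127.0.0.0/8 block is reserved for loopback, not only 127.0.0.1.
in_block = ipaddress.ip_address("127.1.2.3") in ipaddress.ip_network("127.0.0.0/8")
print(in_block)

# Resolving the *name* "localhost" depends on local configuration
# (usually the hosts file), so treat the result as machine-specific.
try:
    infos = socket.getaddrinfo("localhost", 80, proto=socket.IPPROTO_TCP)
    resolved = sorted({info[4][0] for info in infos})
except socket.gaierror:
    resolved = []
print(resolved)
```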
Once we open the socket, the connection has to be established before any data flows. There's the SYN, SYN-ACK, ACK three-way handshake, after which the connection is open and we can start sending packets over. You can see all this in something like Wireshark. At this point the data is encoded into a format that can be sent over the wire and then transmitted. Because it's TCP/IP, from an application level there are a lot of things we don't need to concern ourselves with: data will be retransmitted, rerouted or reordered without us needing to know or care (unless something breaks).
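From the application's point of view all of this is hidden behind the socket API. The sketch below opens a TCP connection over loopback and echoes some bytes back. The request bytes and the serve_once helper are illustrative only, not part of the series' actual backend; the handshake itself is carried out by the kernel during connect() and accept().

```python
import socket
import threading

def serve_once(listener):
    """Accept a single connection and echo back whatever arrives."""
    conn, _peer = listener.accept()   # hands us a connection whose handshake is done
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:              # empty read: the client closed its sending side
                break
            conn.sendall(data)

# Bind to the loopback address; port 0 asks the OS for any free port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
port = listener.getsockname()[1]

server = threading.Thread(target=serve_once, args=(listener,))
server.start()

# connect() triggers the three-way handshake; the kernel performs it for us.
with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
    client.sendall(b"GET / HTTP/1.1\r\n\r\n")
    client.shutdown(socket.SHUT_WR)   # tell the server we have finished sending
    chunks = []
    while True:
        data = client.recv(1024)
        if not data:
            break
        chunks.append(data)
    reply = b"".join(chunks)

server.join()
listener.close()
print(reply)
```

Running Wireshark on the loopback interface while this executes should show the handshake, the data segments and the connection teardown.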
Intermezzo - missing steps from the network
We deliberately glossed over a few steps before by using localhost to keep things simpler to follow. If we were using a real interface a few things would happen in between the electromagnetic signal coming across the network into the network adapter and it being available to applications on the server. These include but are not limited to:
- Binary data is encoded and sent across the wire then decoded at the receiving end.
- The NIC reads incoming packets and places contents into the incoming hardware buffer.
- Network card driver:
  - Validates the incoming packet
  - Copies the packet from the hardware buffer into operating system memory
  - Once the packet is complete, wraps it in the appropriate data structure for the operating system (in the case of Linux this is sk_buff; other operating systems have their own data structures) and sends an interrupt to the operating system kernel
- Operating system services interrupt
- The Ethernet layer checks if the packet is valid, then removes the Ethernet header
- The IP layer also checks that the packet is valid by verifying the header checksum. If the packet is to be handled locally, it removes the IP header.
- The TCP layer handles all the aspects required to get a reliable stream of data across, including handling re-ordering and re-transmission of packets.
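The header-stripping steps in the list above can be sketched in a few lines. This is a toy model, not how a kernel does it: the frame is built by hand, the MAC addresses, IP addresses and ports are made-up example values, the TCP checksum is left at zero and never verified, and nothing here deals with re-ordering or re-transmission.

```python
import struct

def ipv4_checksum(header: bytes) -> int:
    """One's-complement sum of 16-bit words, as used for the IPv4 header."""
    total = sum(struct.unpack(f"!{len(header) // 2}H", header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_frame(payload: bytes) -> bytes:
    """Wrap a payload in minimal TCP, IPv4 and Ethernet headers (example values)."""
    eth = struct.pack("!6s6sH", b"\x02\x00\x00\x00\x00\x02",
                      b"\x02\x00\x00\x00\x00\x01", 0x0800)   # EtherType: IPv4
    # src port, dst port, seq, ack, data offset (5 words), flags, window,
    # checksum (left as 0 in this sketch), urgent pointer
    tcp = struct.pack("!HHIIBBHHH", 54321, 80, 1, 1, 5 << 4, 0x18, 65535, 0, 0)
    fields = [0x45, 0, 20 + len(tcp) + len(payload), 0, 0, 64, 6, 0,
              bytes([192, 0, 2, 1]), bytes([192, 0, 2, 2])]
    fields[7] = ipv4_checksum(struct.pack("!BBHHHBBH4s4s", *fields))
    ip = struct.pack("!BBHHHBBH4s4s", *fields)
    return eth + ip + tcp + payload

def receive(frame: bytes) -> bytes:
    """Peel the headers off layer by layer and return the application data."""
    # Ethernet layer: check the EtherType, then strip the 14-byte header.
    _dst, _src, ethertype = struct.unpack("!6s6sH", frame[:14])
    if ethertype != 0x0800:
        raise ValueError("not an IPv4 packet")
    packet = frame[14:]

    # IP layer: a valid header sums to zero when its checksum field is included.
    header_len = (packet[0] & 0x0F) * 4
    if ipv4_checksum(packet[:header_len]) != 0:
        raise ValueError("bad IPv4 header checksum")
    if packet[9] != 6:
        raise ValueError("not a TCP segment")
    total_len = struct.unpack("!H", packet[2:4])[0]
    segment = packet[header_len:total_len]

    # TCP layer: the data offset field says where the payload begins.
    data_offset = (segment[12] >> 4) * 4
    return segment[data_offset:]

frame = build_frame(b"GET / HTTP/1.1\r\n\r\n")
data = receive(frame)
print(data)
```

Corrupting any byte of the IP header, for example the TTL, makes receive() reject the frame at the IP layer before TCP ever sees it.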
If you are interested in more details this article has a more comprehensive treatment of the networking stack.
The individual parts of the stack are quite complicated. For example, TCP as defined by RFC 793 provides highly reliable communication over packet-switched networks but comes with significant complexity. This document has more information on how TCP operates: http://www.medianet.kent.edu/techreports/TR2005-07-22-tcp-EFSM.pdf
After all these steps are done the data is passed over to the application layer. This is what is commonly referred to as the "backend".
This post is part 3 of the "PageRequestsToTheMetal" series:
- Page requests to the metal - introduction
- Page requests to the metal - frontend
- Page requests to the metal - Network stack - From frontend to backend *
- Page requests to the metal - Backend - What happens on the server
- Page requests to the metal - Backend web framework
- Page requests to the metal - Backend implementation
- Page requests to the metal - hardware level