Transmission Control Protocol: we have all heard the term at some point in our software engineering lives, but how does it work? TCP is the underlying protocol for most HTTP requests made over the internet (HTTP/3, which runs over QUIC and UDP, is the exception), along with file transfers, email, and a lot of peer-to-peer communication. We won’t be taking the engine apart in this article, but we will definitely see what is going on under the hood.
When we make a request to a server, let’s say a GET request to retrieve a web page, we need to establish a connection before any information is exchanged. Our HTTP request has a payload, headers, cookies, and more metadata. A large request or response does not fit in a single packet, so the data is split into small packets (TCP calls them segments) that are sent to the server over this connection, where the server assembles them and then processes the request. TCP is the protocol that sets up and manages this connection.
What if one of the packets does not reach the server?
To solve this problem, the server sends an acknowledgement (ACK) back to the client for the data it has received. If the client does not receive an acknowledgement for a packet within a timeout, it sends that packet again. The same mechanism protects the data flowing from the server to the client.
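The retransmission idea can be sketched with a small simulation. This is only an illustration under simplifying assumptions: the function name `send_reliably` and its parameters are made up for this example, packet loss is modelled with a seeded random number generator, and only one packet is in flight at a time (real TCP keeps many packets in flight and adapts its timers).

```python
import random


def send_reliably(packets, loss_rate=0.3, max_attempts=20, seed=0):
    """Simulate resending each packet until it is acknowledged.

    Returns (delivered, attempts), where attempts counts every transmission.
    """
    rng = random.Random(seed)  # seeded so the simulation is repeatable
    delivered = []
    attempts = 0
    for packet in packets:
        for _ in range(max_attempts):
            attempts += 1
            packet_lost = rng.random() < loss_rate
            if packet_lost:
                # No ACK comes back, the sender's timer expires, so it retries.
                continue
            # The receiver got the packet and its ACK reached the sender.
            delivered.append(packet)
            break
        else:
            raise TimeoutError(f"gave up on {packet!r} after {max_attempts} attempts")
    return delivered, attempts
```

Calling `send_reliably(["a", "b", "c"])` returns every packet in order together with the total number of transmissions, which is larger than the number of packets whenever something was lost on the way.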
How does the server know the order in which the packets must be assembled?
Each packet is tagged with a sequence number. The numbering does not have to start at 1: each side picks a random initial sequence number and counts up from there. This randomness makes it harder for an attacker to guess valid sequence numbers and inject forged packets into the connection. Since the starting points are random, the client and the server need to synchronise their sequence numbers before they exchange any data.
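Here is a minimal sketch of how sequence numbers let the receiver restore the original order. The helper names `split_into_segments` and `reassemble` are hypothetical, each segment is tagged with the sequence number of its first byte, and the wrap-around of real 32-bit sequence numbers is ignored to keep the example short.

```python
def split_into_segments(data: bytes, size: int, initial_seq: int):
    """Cut data into chunks, each tagged with the sequence number of its first byte."""
    return [
        (initial_seq + offset, data[offset:offset + size])
        for offset in range(0, len(data), size)
    ]


def reassemble(segments):
    """Rebuild the byte stream from segments that arrived in any order."""
    unique = dict(segments)  # a retransmitted duplicate collapses into one entry
    return b"".join(unique[seq] for seq in sorted(unique))
```

Even if the segments arrive shuffled or duplicated, `reassemble` returns the original bytes, because ordering depends only on the sequence numbers and not on arrival order.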
- When the client first contacts the server, it sends a SYN (synchronise) packet that tells the server the client’s initial sequence number.
- The server replies with a SYN+ACK packet, announcing its own initial sequence number and acknowledging the client’s SYN.
- Lastly, the client sends an ACK for the server’s sequence number. The connection is now established and information can be exchanged.
Note that the SYN+ACK is not the response to the client’s HTTP request, which has not even been sent yet. It only acknowledges that the server received the connection request.
This entire process is known as TCP’s 3 way handshake.
3 Way Handshake of a TCP Connection
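The three packets of the handshake can be written out as plain data to make the sequence number bookkeeping visible. This is a sketch, not a packet capture: the function name `three_way_handshake` and the dictionary layout are invented for illustration. What it does reflect from TCP is that each side acknowledges the other side’s initial sequence number plus one, because a SYN consumes one sequence number.

```python
import secrets

SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def three_way_handshake():
    """Return the SYN, SYN+ACK and ACK packets of one simulated handshake."""
    client_isn = secrets.randbits(32)  # client's random initial sequence number
    server_isn = secrets.randbits(32)  # server's random initial sequence number

    syn = {"flags": "SYN", "seq": client_isn}
    syn_ack = {
        "flags": "SYN+ACK",
        "seq": server_isn,
        "ack": (syn["seq"] + 1) % SEQ_SPACE,  # acknowledges the client's SYN
    }
    ack = {
        "flags": "ACK",
        "seq": syn_ack["ack"],
        "ack": (syn_ack["seq"] + 1) % SEQ_SPACE,  # acknowledges the server's SYN
    }
    return [syn, syn_ack, ack]
```

After the third packet, both sides know the other’s starting number, which is exactly the synchronisation the handshake exists for.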
What happens on the server?
In order to accept TCP connections, the server needs to listen for them. A client can send a connection request if it knows the server’s IP address and the port on which the server is listening. Internally, the Network Interface Controller (NIC) checks whether an incoming packet is addressed to its machine and, if so, passes it on to the OS.
The Operating System allocates 2 queues for the listening socket, a SYN queue and an Accept queue, both of which reside in kernel memory. The application can ask for a queue size through the backlog argument of listen(), and the OS applies its own defaults and limits. On receiving a SYN packet from the NIC, the OS adds it to the SYN queue, immediately sends a SYN+ACK back to the client, and waits to receive the ACK. Meanwhile, the OS keeps adding incoming SYN requests to the SYN queue.
Once a legitimate* client sends its ACK, the OS matches the ACK to the original SYN, removes that entry from the SYN queue, and pushes the connection onto the Accept queue. The connection is finally complete.
Notice that our application still has no idea about the original HTTP request that has been made. All that has happened so far is the establishment of the TCP connection. Our application typically calls accept() in a loop. Each call takes the next connection off the Accept queue, or waits if the queue is empty. The application can then use that connection to read the request and exchange information with the client.
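From the application’s point of view, all of this is hidden behind a handful of socket calls. The sketch below uses Python’s standard `socket` module on the loopback interface. The function names `serve_once` and `accept_demo`, the backlog of 5, and the echo behaviour are arbitrary choices for the example, and port 0 simply asks the OS to pick any free port. A real server would call `accept()` in a loop instead of once.

```python
import socket
import threading


def serve_once(listener):
    """Take one completed connection off the Accept queue and answer it."""
    conn, _addr = listener.accept()  # waits until the kernel has a connection ready
    with conn:
        request = conn.recv(1024)  # only now does the application see any data
        conn.sendall(b"echo: " + request)


def accept_demo():
    """Run a one-shot server and a client in the same process."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS choose a free port
        listener.listen(5)  # start listening, with a requested backlog of 5
        port = listener.getsockname()[1]

        worker = threading.Thread(target=serve_once, args=(listener,))
        worker.start()

        # create_connection() returns once the kernel has finished the handshake.
        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            client.sendall(b"hello")
            reply = client.recv(1024)
        worker.join()
    return reply
```

Note that the handshake itself never appears in the code. The kernel performs it, and `accept()` only hands over connections that are already established.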
If you thought that the 3 way handshake to establish a connection was a long process, then it’s a good time for you to know that the TCP termination process is a 4 way handshake.
- Either side can start the termination, but it is usually the client. It sends a FIN packet to the server and then waits for the server to acknowledge it. This marks the start of the termination process.
- The server sends an ACK to confirm that it has received the FIN.
- The server then sends its own FIN to the client once it has nothing more to send. The gap between the server’s ACK and its FIN gives the server time to send any remaining data and clean up before it closes its side of the connection.
- The client sends an ACK for the server’s FIN, and as soon as the server receives it, the connection is terminated on the server. The client waits a little longer (the TIME_WAIT state) in case that final ACK is lost and has to be sent again.
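The termination steps can be observed from application code, even though the kernel sends the FIN and ACK packets itself. In this sketch, which again uses the standard `socket` module on loopback, the client calls `shutdown(SHUT_WR)` to send its FIN, the server notices the FIN as an empty read, sends its last bytes, and then closes, which sends its own FIN. The function names `server_side` and `graceful_close_demo` and the messages are made up for the example.

```python
import socket
import threading


def server_side(listener, log):
    """Read until the client's FIN, send a final reply, then close."""
    conn, _addr = listener.accept()
    with conn:
        received = b""
        while True:
            chunk = conn.recv(1024)
            if not chunk:  # an empty read means the client's FIN has arrived
                break
            received += chunk
        log.append(received)
        conn.sendall(b"bye")  # the server may still send data after that FIN
    # Leaving the with-block closes the socket, which sends the server's FIN.


def graceful_close_demo():
    """Return (what the server received, what the client received)."""
    log = []
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        port = listener.getsockname()[1]
        worker = threading.Thread(target=server_side, args=(listener, log))
        worker.start()

        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            client.sendall(b"last request")
            client.shutdown(socket.SHUT_WR)  # sends FIN: nothing more to send
            reply = b""
            while True:
                chunk = client.recv(1024)
                if not chunk:  # the server's FIN has arrived
                    break
                reply += chunk
        worker.join()
    return log[0], reply
```

The fact that the client still receives `b"bye"` after sending its FIN shows why the server’s ACK and FIN are separate packets: each direction of the connection is closed independently.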
The entire journey of a TCP connection is a long one with multiple round trips. There are ways to optimise this process, such as a connection pool that reuses existing connections instead of creating new ones, but keeping connections open is still resource intensive. Regardless, TCP is the backbone of the internet and worthy of being studied by every engineer.
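A connection pool can be sketched in a few lines. This is a deliberately simplified illustration: the class name `ConnectionPool` and its methods are hypothetical, `factory` stands for whatever function opens a new connection, and health checks, idle timeouts, and per-host pools that real pools need are left out.

```python
import queue


class ConnectionPool:
    """Hand out idle connections when available; open new ones only when needed."""

    def __init__(self, factory, size):
        self._factory = factory  # callable that opens a new connection
        self._idle = queue.LifoQueue(maxsize=size)  # most recently used first
        self.created = 0  # how many connections had to be opened

    def acquire(self):
        try:
            return self._idle.get_nowait()  # reuse: no new handshake needed
        except queue.Empty:
            self.created += 1
            return self._factory()  # pay for a handshake only when the pool is empty

    def release(self, conn):
        try:
            self._idle.put_nowait(conn)  # keep the connection open for the next caller
        except queue.Full:
            close = getattr(conn, "close", None)
            if close is not None:
                close()  # pool is full, so this connection is closed for real
```

Every `acquire()` that is served from the idle queue skips the 3 way handshake and the later 4 way termination, which is where the savings come from.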
*As you can infer, adding SYN requests to the SYN queue can cause it to overflow, which prevents any more connections from being established. Doing this on purpose is known as SYN flooding: a client with malicious intent sends many SYN packets but never sends the final ACK.
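The two queues and the overflow problem can be modelled with a toy class. This is a simulation of the idea only, not of how any particular kernel is implemented: the class `ListenerQueues`, its method names, and the capacities are invented for the example, and real systems additionally expire half-open entries after a timeout and can defend themselves with techniques such as SYN cookies.

```python
from collections import deque


class ListenerQueues:
    """Toy model of a listening socket's SYN queue and Accept queue."""

    def __init__(self, syn_capacity, accept_capacity):
        self.syn_capacity = syn_capacity
        self.accept_capacity = accept_capacity
        self.syn_queue = {}  # client -> state of the half-open connection
        self.accept_queue = deque()  # fully established connections

    def on_syn(self, client):
        """Handle a SYN. Returns True if a SYN+ACK would be sent back."""
        if len(self.syn_queue) >= self.syn_capacity:
            return False  # queue full: the SYN is dropped
        self.syn_queue[client] = "SYN_RECEIVED"
        return True

    def on_ack(self, client):
        """Handle the final ACK. Returns True if the connection is established."""
        if client not in self.syn_queue:
            return False  # no matching SYN
        if len(self.accept_queue) >= self.accept_capacity:
            return False  # the application is not accepting fast enough
        del self.syn_queue[client]
        self.accept_queue.append(client)
        return True

    def accept(self):
        """What the application sees: the next established connection, if any."""
        return self.accept_queue.popleft() if self.accept_queue else None
```

If three attackers each send a SYN to a listener created with `syn_capacity=3` and never follow up with an ACK, the next `on_syn` call from an honest client returns `False`, which is the denial of service described in the footnote.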