HTTP | Vishal Raman

The HyperText Transfer Protocol (HTTP) is how browsers communicate with web servers.

HTTP is an application-layer protocol (Layer 7) that runs on top of a transport protocol (TCP, or QUIC in HTTP/3).
It follows a request-response model: A client sends a request, the server sends back a response.
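To make the request-response model concrete, here is a minimal sketch of what the two messages look like on the wire in HTTP/1.x. The helper names `build_request` and `parse_response`, and the host `example.com`, are assumptions made up for this example; real clients handle many more cases.

```python
def build_request(method: str, path: str, host: str) -> bytes:
    """Serialize a minimal HTTP/1.1 request: request line, headers, blank line."""
    lines = [
        f"{method} {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",  # the empty line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_response(raw: bytes):
    """Split a raw response into (status code, headers dict, body bytes)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    status = int(status_line.split(" ", 2)[1])  # e.g. "HTTP/1.1 200 OK"
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status, headers, body


request = build_request("GET", "/index.html", "example.com")
status, headers, body = parse_response(
    b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\nContent-Length: 5\r\n\r\nhello"
)
```

The client writes `request` to the connection, and the server answers with a status line, headers, and a body in the same text-based layout.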

HTTP has evolved significantly since its inception. The goal of this blog post is to trace that evolution and describe the features and limitations of each version.

Early HTTP Versions

HTTP/0.9 was introduced by Tim Berners-Lee as part of the World Wide Web. It only supported GET requests and had no headers or status codes (200, 404, 500, etc.). Headers and status codes were introduced in HTTP/1.0, along with the POST and HEAD methods.

The original versions of HTTP did not support persistent connections. Every request required a new connection, and each new connection comes with setup overhead.

In HTTP/1.0, this meant one connection per resource: every image, CSS file, or JavaScript file on a page needed its own connection (and the corresponding handshakes).

HTTP/1.1 (1997)

HTTP/1.1 addressed many of the initial problems with HTTP/1.0, and is still widely used for simple websites.

Improvements

Persistent Connections: Instead of closing the connection after a single response, the server keeps the TCP socket open. This allows the client to send multiple requests through the same pipe without the overhead of repeated handshakes.
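As a rough sketch of connection reuse, the snippet below starts a throwaway HTTP/1.1 server on the loopback interface and sends several requests through one `http.client.HTTPConnection`. The handler class, the `fetch_all` helper, and the asset paths are assumptions for illustration; the server records the client port of each request so we can check that they all arrived over the same TCP connection.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class KeepAliveHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # allows the server to keep the socket open
    seen_client_ports = []         # one entry per request served

    def do_GET(self):
        # The client port identifies which TCP connection carried this request.
        KeepAliveHandler.seen_client_ports.append(self.client_address[1])
        body = self.path.encode("ascii")
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_all(paths):
    """Fetch every path over a single connection; return bodies and client ports."""
    KeepAliveHandler.seen_client_ports.clear()
    server = HTTPServer(("127.0.0.1", 0), KeepAliveHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
    try:
        bodies = []
        for path in paths:
            conn.request("GET", path)
            response = conn.getresponse()
            bodies.append(response.read())  # drain the body before reusing the socket
        return bodies, list(KeepAliveHandler.seen_client_ports)
    finally:
        conn.close()
        server.shutdown()
        server.server_close()


bodies, ports = fetch_all(["/a.css", "/b.js", "/c.png"])
```

If the connection were closed after each response, each request would show up with a different client port; with keep-alive, all three share one.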

Chunked transfer encoding: Servers could send a response as a series of chunks, without knowing its total size up front. This made page rendering faster, since the server didn’t have to wait for the whole response to be ready before it started sending.
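The wire format is simple enough to sketch: each chunk is its size in hexadecimal, a CRLF, the data, and another CRLF, and a zero-length chunk ends the body. The function names below are my own, and this simplified version ignores chunk extensions and trailers.

```python
def encode_chunked(parts):
    """Encode an iterable of byte strings using chunked transfer encoding."""
    out = bytearray()
    for part in parts:
        if not part:
            continue  # a zero-length chunk would terminate the body early
        out += f"{len(part):x}\r\n".encode("ascii") + part + b"\r\n"
    out += b"0\r\n\r\n"  # terminating chunk, no trailers
    return bytes(out)


def decode_chunked(data):
    """Decode a chunked body (chunk extensions are skipped, trailers ignored)."""
    body = bytearray()
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        size_text = data[pos:line_end].split(b";")[0].decode("ascii")
        size = int(size_text, 16)
        pos = line_end + 2
        if size == 0:
            return bytes(body)
        body += data[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF


encoded = encode_chunked([b"Hello, ", b"world!"])
decoded = decode_chunked(encoded)
```

Because each chunk carries its own length, the server can start sending before it knows how long the full response will be.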

Pipelining: Instead of waiting for a response before sending the next request, the client could send multiple requests back to back on the same connection.

Head-of-Line Blocking

Pipelining failed in practice due to head-of-line blocking.

The server must send responses in the same order that the requests were received.
If the first response gets stuck, all subsequent responses are blocked behind it, even if they are ready to be sent.
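A tiny simulation shows the effect. Assume we know when the server finishes preparing each response (the `ready_times` values are invented, and transmission time is ignored); with in-order delivery, no response can leave before the ones ahead of it.

```python
def pipelined_delivery_times(ready_times):
    """Return when each response can be sent if responses must go out in request order.

    ready_times[i] is the moment the server finishes preparing response i.
    """
    delivered = []
    previous = 0.0
    for ready in ready_times:
        previous = max(previous, ready)  # cannot overtake the response ahead of it
        delivered.append(previous)
    return delivered


# The first response takes 5 seconds; the other two are ready almost immediately.
times = pipelined_delivery_times([5.0, 1.0, 2.0])
```

Here `times` is `[5.0, 5.0, 5.0]`: the two fast responses sit idle for several seconds behind the slow one.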

Developers got around this using a few methods:

Domain sharding: Serve static assets from subdomains (with separate connections).
Bundling: Combine assets into fewer files so the page needs fewer requests.

HTTP/2 (2015)

The major goal of HTTP/2 was to reduce latency and address head-of-line blocking at the application layer.

Improvements

Binary Framing: Instead of plain text messages, HTTP/2 splits messages into binary frames.

HEADERS frames carry the metadata for the request/response.
- These are encoded using HPACK: Headers are indexed in a table that both sides maintain. Instead of resending a repeated header, the sender transmits its index and the receiver looks up the corresponding header.
DATA frames contain the actual message body.
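To give a feel for the framing layer, here is a sketch that packs and unpacks a single frame. The 9-byte header layout (24-bit length, 8-bit type, 8-bit flags, one reserved bit plus a 31-bit stream identifier) and the type and flag values come from the HTTP/2 specification rather than from this post; the function names are my own, and the HPACK encoding of HEADERS payloads is left out entirely.

```python
FRAME_DATA = 0x0
FRAME_HEADERS = 0x1
FLAG_END_STREAM = 0x1
FLAG_END_HEADERS = 0x4


def pack_frame(frame_type, flags, stream_id, payload):
    """Prefix a payload with the 9-byte HTTP/2 frame header."""
    if len(payload) >= 2**24:
        raise ValueError("payload does not fit in a 24-bit length field")
    header = (
        len(payload).to_bytes(3, "big")
        + bytes([frame_type, flags])
        + (stream_id & 0x7FFFFFFF).to_bytes(4, "big")  # top bit is reserved
    )
    return header + payload


def unpack_frame(data):
    """Parse one frame from the start of data: (type, flags, stream id, payload)."""
    length = int.from_bytes(data[0:3], "big")
    frame_type = data[3]
    flags = data[4]
    stream_id = int.from_bytes(data[5:9], "big") & 0x7FFFFFFF
    return frame_type, flags, stream_id, data[9:9 + length]


frame = pack_frame(FRAME_DATA, FLAG_END_STREAM, 1, b"hello")
parsed = unpack_frame(frame)
```

The stream identifier in every header is what lets the receiver tell which request or response a frame belongs to.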

Multiplexing: Multiple streams (request/response pairs) are interleaved over a single TCP connection.

This avoids the application-level head-of-line blocking problem. Because frames from separate streams can be interleaved on a single connection, a slow response no longer holds up the others.
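The idea can be sketched with frames modeled as `(stream_id, chunk)` tuples. The round-robin scheduling and the helper names below are simplifying assumptions for illustration; real implementations decide the order of frames in their own ways.

```python
from collections import defaultdict
from itertools import zip_longest


def split_into_frames(stream_id, message, frame_size):
    """Cut one message into (stream_id, chunk) frames of at most frame_size bytes."""
    return [
        (stream_id, message[i:i + frame_size])
        for i in range(0, len(message), frame_size)
    ]


def interleave(streams, frame_size):
    """Round-robin frames from {stream_id: message} onto one ordered connection."""
    per_stream = [
        split_into_frames(sid, msg, frame_size) for sid, msg in streams.items()
    ]
    wire = []
    for round_of_frames in zip_longest(*per_stream):
        wire.extend(f for f in round_of_frames if f is not None)
    return wire


def reassemble(wire):
    """Group frames by stream id to rebuild each message on the receiving side."""
    messages = defaultdict(bytes)
    for stream_id, chunk in wire:
        messages[stream_id] += chunk
    return dict(messages)


streams = {1: b"aaaaaa", 3: b"bb"}
wire = interleave(streams, frame_size=2)
```

The short message on stream 3 is fully delivered after the second frame on the wire, instead of waiting for all of stream 1 to finish.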

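However, we still have TCP-level head-of-line blocking. TCP delivers bytes to the application strictly in order, so if a single packet is lost, every frame behind it, even frames that belong to other streams, must wait until the lost packet is retransmitted.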
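Here is a toy model of a TCP-style receive buffer that makes this visible. Segments are `(sequence_number, stream_id)` pairs, one segment per frame, which is a simplification I am assuming for the example; the function name is also my own.

```python
def in_order_delivery(arrivals):
    """Simulate in-order delivery to the application.

    arrivals lists (sequence_number, stream_id) in the order segments arrive.
    Returns, for each arrival, the stream ids that become readable at that point.
    """
    buffered = {}
    next_seq = 0
    released = []
    for seq, stream_id in arrivals:
        buffered[seq] = stream_id
        batch = []
        while next_seq in buffered:  # release only a gap-free prefix
            batch.append(buffered.pop(next_seq))
            next_seq += 1
        released.append(batch)
    return released


# Segment 0 (stream 1) is lost and only arrives after a retransmission.
released = in_order_delivery([(1, 3), (2, 5), (0, 1)])
```

The result is `[[], [], [1, 3, 5]]`: the data for streams 3 and 5 arrived intact, but the application cannot read any of it until the missing segment for stream 1 shows up.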
