How HTTP Ends

HTTP's blank-line terminator, the chunked transfer-encoding shape, and how both relate to a filesystem's EOF.

published Mar 31, 2019 tags #http #networking

EOF and reliable file systems

Reading a file ends with EOF, a signal that the file’s done. On Linux this is usually not a byte stored in the file: read() simply returns zero bytes once nothing is left. If something is reading the file as a stream, EOF says the stream can be closed too. For a reliable file system, this is a sane convention.
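To make the “EOF means done” idea concrete, here is a minimal Python sketch of a reader that loops until the kernel reports end of file. The helper name read_all and the tiny buffer size are mine, picked only for illustration.

```python
import os
import tempfile


def read_all(fd: int, bufsize: int = 4) -> bytes:
    """Read from a file descriptor until the OS reports end of file."""
    parts = []
    while True:
        block = os.read(fd, bufsize)
        if block == b"":  # zero bytes back: this is how EOF shows up
            break
        parts.append(block)
    return b"".join(parts)


# Demo on a throwaway temporary file.
with tempfile.TemporaryFile() as f:
    f.write(b"hello, eof")
    f.flush()
    f.seek(0)  # rewind so the read starts at the beginning
    data = read_all(f.fileno())

print(data)
```

The loop never looks for a special byte in the content; it stops when a read comes back empty.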

File systems are layered, just like networks. Take Linux as an example (narrowing to files on disk):

  • The application layer sits on top of the kernel.
  • The System Call Interface (SCI) offers stable syscall entry points. Various languages wrap these into their own functions — Node’s fs module, for instance, calls into the SCI.
  • Below the SCI is the Virtual File System (VFS) layer, which exposes a uniform “read a file” API regardless of which file system or device backs the file.
  • Below VFS is the generic block layer, which hides the specifics of a piece of hardware and exposes a virtualized block-device interface. Disk partitions are handled here.
  • Below that is hardware. Out of scope for this post.

It rhymes with network-protocol layering: each layer worries about its own slice. An application calling the file API usually doesn’t have to ask “what if a response is slow, do we retry?”.

What I want to land on: EOF as an application-layer “this file is done” marker is sensible.

HTTP’s terminator

Reasonable question: HTTP is application-layer too — why not just use EOF as the end-of-request marker?

Short answer: HTTP already has its own terminator for the header block: a blank line, i.e. \r\n\r\n (lenient parsers also accept \n\n).

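In a GET request, a double newline after the header block tells the peer “I’m done; you can respond now”. In a POST, the double newline only marks the end of the headers. The body has no terminator of its own: the Content-Length header says how many bytes follow, and the body can contain blank lines freely.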
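Here is a rough Python sketch of how a receiver might find the end of a request. It assumes the whole request is already in memory and that the body length comes from Content-Length. The function name split_request and the sample requests are made up for illustration.

```python
def split_request(raw: bytes):
    """Split raw request bytes into (request_line, headers, body)."""
    head, sep, rest = raw.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("header block is not terminated yet")
    lines = head.decode("iso-8859-1").split("\r\n")
    request_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    # Assumption: no Content-Length means no body (true for a plain GET).
    length = int(headers.get("content-length", "0"))
    if len(rest) < length:
        raise ValueError("body is incomplete")
    return request_line, headers, rest[:length]


get = b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"
post = (
    b"POST /submit HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Content-Length: 6\r\n"
    b"\r\n"
    b"a\r\n\r\nb"  # the body itself contains a blank line
)

print(split_request(get))
print(split_request(post))
```

The POST body includes \r\n\r\n, and the parser does not stop there: only the first blank line is structural, and everything after it is counted in bytes.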

But once the sender doesn’t know the total size up front and needs to send an unknown number of related chunks over the same TCP connection, neither the double newline nor a length declared in advance is enough.

Not much different from a double newline

The HTTP header Transfer-Encoding describes chunked transfer. Transfer-Encoding: chunked means the total body length and the number of chunks aren’t known in advance. The recipient reads the chunks one by one and concatenates them to get the full body.

Per chunk:

  1. A hexadecimal number — the byte count for this chunk.
  2. A newline (\r\n by spec; some implementations like Chrome insert a space 0x20 between the size and the \r\n).
  3. The data.
  4. A trailing newline.
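The four steps above map almost one-to-one onto code. A minimal Python sketch of framing a single chunk follows; encode_chunk is a name I made up, and it leaves out the optional chunk extensions the spec allows.

```python
def encode_chunk(data: bytes) -> bytes:
    """Frame one chunk: hex size, CRLF, data, CRLF."""
    if not data:
        # A zero-size chunk is reserved for marking the end of the body.
        raise ValueError("a data chunk must not be empty")
    size_line = format(len(data), "x").encode("ascii")  # hexadecimal byte count
    return size_line + b"\r\n" + data + b"\r\n"


print(encode_chunk(b"hello"))
print(encode_chunk(b"x" * 26)[:4])
```

A 26-byte chunk gets the size line 1a, not 26, which is easy to trip over when reading a capture by eye.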

The final chunk is an “empty” chunk: its size is 0, it carries no data, it may be followed by trailing headers, and it ends with a blank line. That marks end of transfer.
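Putting it together, here is a small Python sketch of a decoder that reads chunks until it hits the zero-size one. It assumes the whole chunked body is already in memory. decode_chunked and the X-Checksum trailer are hypothetical names for this example, and chunk extensions are skipped rather than interpreted.

```python
def decode_chunked(raw: bytes):
    """Decode a complete chunked body. Returns (body, trailer_lines)."""
    body = bytearray()
    pos = 0
    while True:
        eol = raw.index(b"\r\n", pos)
        # Drop any chunk extension after ';' and tolerate padding spaces.
        size_field = raw[pos:eol].split(b";", 1)[0].strip()
        size = int(size_field.decode("ascii"), 16)
        pos = eol + 2
        if size == 0:  # the zero-size chunk ends the data
            break
        body += raw[pos:pos + size]
        if raw[pos + size:pos + size + 2] != b"\r\n":
            raise ValueError("chunk data is not followed by CRLF")
        pos += size + 2
    # After the last chunk: optional trailer lines, then a blank line.
    trailers = []
    while True:
        eol = raw.index(b"\r\n", pos)
        line = raw[pos:eol]
        pos = eol + 2
        if not line:
            break
        trailers.append(line.decode("ascii"))
    return bytes(body), trailers


stream = (
    b"5\r\nhello\r\n"
    b"7 \r\n, world\r\n"  # size line padded with a space
    b"0\r\n"
    b"X-Checksum: abc\r\n"  # hypothetical trailer
    b"\r\n"
)
decoded = decode_chunked(stream)
print(decoded)
```

The decoder never needs to know the number of chunks or the total length ahead of time. It just keeps going until a size line says 0.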

This pattern is structurally close to EOF or a double newline: each one says “nothing more is coming”. The difference is scope. EOF ends a whole file, while the zero-size chunk ends one message inside a stream that can stay open.

Still application-layer

Net-net, the application layer skips a lot of work — retries, in-flight integrity checks — by leaning on TCP. Things like packet order aren’t HTTP’s problem; TCP already handles them.