Understanding Latency on the Web

Definition

Latency is the delay between a client requesting a page object, such as a file, and the response arriving back from the server. It is determined largely by the physical distance between the client and the server.
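To make this concrete, here is a minimal sketch of timing a TCP handshake using only Python's standard library. The throwaway listener on the loopback interface stands in for a real server (so the measured figure is near zero); pointing the same function at a distant host would show latency in the tens or hundreds of milliseconds.

```python
import socket
import time

def connect_latency(host, port):
    """Time a full TCP handshake to host:port, in seconds."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=5):
        pass
    return time.perf_counter() - start

# Demonstration against a throwaway listener on the loopback interface.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
listener.listen()
host, port = listener.getsockname()
latency = connect_latency(host, port)
listener.close()
```

Note that this measures one round trip of the handshake only; it tells you nothing about how fast data will flow afterwards, which is where TCP's window size comes in.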

TCP and Window Size

TCP, the transport protocol underpinning the web, has the client send acknowledgements so the server knows the transmission is proceeding reliably. TCP starts off cautiously with a small "window size" and only increases it as acknowledgements confirm the path is reliable, a process known as slow start. The window grows with each successful round trip; under typical modern defaults, a server's initial congestion window is about 14 kB (ten segments of roughly 1,460 bytes each).
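As a back-of-the-envelope illustration, the following sketch models idealised slow start, assuming an initial window of 10 segments of 1,460 bytes and a window that doubles every round trip with no losses. It counts how many round trips a transfer of a given size needs:

```python
def round_trips(total_bytes, mss=1460, init_segments=10):
    """Round trips needed to deliver total_bytes under idealised slow start:
    the congestion window starts at init_segments and doubles every RTT."""
    cwnd, sent, trips = init_segments, 0, 0
    while sent < total_bytes:
        sent += cwnd * mss   # one window's worth of data per round trip
        cwnd *= 2            # slow start: window doubles on success
        trips += 1
    return trips

print(round_trips(14_600))    # ~14 kB fits in the very first round trip -> 1
print(round_trips(102_400))   # a 100 kB page needs several round trips -> 4
```

The jump from one round trip to four is why each additional round trip matters so much on a high-latency link: the cost is paid in full latency units, regardless of bandwidth.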

If the connection runs into problems, such as dropped packets or timeouts, the window size shrinks again, which effectively costs extra round trips to re-establish reliable throughput. Once the connection is established, multiple HTTP requests can reuse it, but it is the initial connection that latency affects most, because several round trips may be needed before the window grows large enough for the transfer.
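The shrink-on-loss behaviour can be sketched with a deliberately simplified Reno-style model (real TCP distinguishes duplicate-ACK losses from timeouts and has a fast-recovery phase, all omitted here). The window doubles each round trip until a loss halves it, after which growth becomes linear:

```python
def cwnd_trace(rtts, loss_at=None, init=10):
    """Congestion window (in segments) over successive round trips, under a
    simplified Reno-style model: slow start doubles the window each RTT;
    a loss halves it and switches growth from exponential to linear."""
    trace, cwnd, ssthresh = [], init, float("inf")
    for rtt in range(1, rtts + 1):
        trace.append(cwnd)
        if loss_at is not None and rtt == loss_at:
            ssthresh = max(cwnd // 2, 2)   # remember where trouble started
            cwnd = ssthresh                # multiplicative decrease
        elif cwnd < ssthresh:
            cwnd *= 2                      # slow start: exponential growth
        else:
            cwnd += 1                      # congestion avoidance: linear growth
    return trace

print(cwnd_trace(5))             # lossless: [10, 20, 40, 80, 160]
print(cwnd_trace(5, loss_at=3))  # a loss at RTT 3: [10, 20, 40, 20, 21]
```

After the loss, the window must claw its way back one segment per round trip, which is the "extra round trips" cost described above.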

First Paint and CDNs

This is why many performance articles cite 14kB as an initial performance budget for rendering the "first paint": anything that fits within the first round trip's window can render without waiting on a second round trip, which significantly improves perceived performance. Theoretically, light travelling in optical fibre takes roughly 200ms to loop around the world, since fibre's refractive index slows light to about two-thirds of its vacuum speed. Once you account for delays at the network interfaces along such connections, it becomes clear why latency can run into the hundreds of milliseconds. Hence where you host your content is a controlling factor. Using a CDN which serves your content from a geographically closer origin will not only reduce the time it takes for data to travel from source to destination, it will also mitigate the impact of the extra round trips incurred in growing the window to a reliable transfer rate.
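The geography argument reduces to simple arithmetic. Taking fibre's propagation speed as roughly 200,000 km/s (vacuum speed divided by a refractive index of about 1.5), and using illustrative distances for a far-away origin versus a nearby CDN edge:

```python
# Speed of light in optical fibre: a refractive index of ~1.5 slows light
# from ~300,000 km/s in vacuum to roughly 200,000 km/s.
FIBRE_KM_PER_S = 200_000

def round_trip_ms(distance_km):
    """Best-case, propagation-only round-trip time over fibre, in ms."""
    return 2 * distance_km / FIBRE_KM_PER_S * 1000

print(round_trip_ms(10_000))  # distant origin: 100.0 ms before any data moves
print(round_trip_ms(500))     # nearby CDN edge: 5.0 ms
```

These are lower bounds: real round trips add queueing and routing delays on top. But since slow start pays this round-trip cost several times over, moving the origin closer multiplies the saving.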
