The TLS Handshake Budget: Where Your First Paint Goes

On a cold visit from Frankfurt to a US-east origin, the first byte of HTML doesn't arrive until ~340ms after the click, even when the origin handler returns in 4ms. The rest is TCP, TLS, and the small stack of round-trips the protocols require before HTTP gets to speak.

TLS is the largest line item. And the TTFB optimization industry (image diets, JS chunking, edge functions) almost entirely ignores it.

This post is a budget. We work from the moment a packet leaves the client to the moment the server is allowed to send <html>, and we account for every round-trip TLS spends along the way. Most of the savings the industry is leaving on the floor live here.

#The cold-visit budget

A cold connection from a transatlantic client (typical Frankfurt ↔ us-east RTT: ~85ms) to an origin running TLS 1.2 with no session reuse:

Phase	RTTs	Time at 85ms RTT
TCP three-way handshake	1	85ms
TLS 1.2 handshake	2	170ms
HTTP request → first byte	1	85ms
Total	4 RTTs	340ms

That's the floor before a single application byte ships. The origin's CPU time is rounding error against this stack.

TLS 1.3 (RFC 8446, August 2018) collapses the handshake to 1 RTT — ClientHello carries the key share speculatively, ServerHello returns the chosen one, and the client encrypts application data on the next packet. That's one full RTT (85ms) saved on every cold connection, every time.

0-RTT is a further saving. If the client holds a recent NewSessionTicket from a prior visit, it can send encrypted application data in the same flight as the ClientHello. The handshake costs zero round-trips of negotiation before the request. For that returning visitor, the budget collapses from 4 RTTs to 2 RTTs: TCP plus the request.
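The arithmetic so far is easy to sanity-check. The following is a minimal sketch, assuming the 85ms RTT from the table; the `STACKS` names and the `time_to_first_byte_ms` helper are illustrative and not part of any library.

```python
# Round-trips before the first response byte, using the accounting above.
# Each entry: (TCP handshake, TLS handshake, HTTP request -> first byte).
STACKS = {
    "tls12_full": (1, 2, 1),
    "tls13_full": (1, 1, 1),
    "tls13_0rtt": (1, 0, 1),  # returning visitor holding a valid ticket
}


def time_to_first_byte_ms(stack: str, rtt_ms: float) -> float:
    """Network floor for the first byte, ignoring server processing time."""
    return sum(STACKS[stack]) * rtt_ms


if __name__ == "__main__":
    for name in STACKS:
        print(f"{name}: {time_to_first_byte_ms(name, 85):.0f} ms")
```

Changing the second argument shows how quickly the floor drops when the RTT itself shrinks, which is the lever the termination section later in the post pulls.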

If the connection is HTTP/3 over QUIC, the transport and TLS 1.3 handshakes combine into one (RFC 9000, RFC 9001): by the same accounting, a cold visit becomes 2 RTTs to first byte, and 1 RTT with 0-RTT resumption. That is one round-trip fewer than the TCP-plus-TLS-1.3 stack in either case.

The numbers above describe what's possible. What most production sites ship is closer to the TLS 1.2 row — because TLS 1.3 is misconfigured, because session resumption is broken, or because OCSP isn't stapled. The next four sections are the four common ways this happens.

#0-RTT, ticket rotation, and the replay hole

0-RTT data is the largest single saving available — and the easiest to hold wrong. Two failure modes show up in the field:

Ticket rotation. A session ticket is encrypted with a server-side STEK (Session Ticket Encryption Key) that the server rotates, both for forward secrecy and because the server should not keep accepting old session state indefinitely. If you rotate the STEK every hour but terminate TLS on a load-balancer fleet that doesn't share STEKs across nodes, resumption only works on whichever node minted the ticket. On a 32-node fleet, the other 31 nodes cannot decrypt the ticket, ignore the offered PSK, and fall back to a full handshake, so resumption silently degrades to first-handshake cost. The fix is a shared STEK (sealed in HSM or a coordinated KV) rotated on a known cadence, with the previous STEK accepted for the rotation grace window. Most operators ship default per-process STEKs and never notice the resumption rate is 12%.
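The rotation-with-grace-window logic can be sketched in a few lines. This is a toy under stated assumptions: the `StekRing` class is hypothetical, and tickets are only authenticated with HMAC rather than encrypted with an AEAD as a real TLS stack would do. It is meant to show the key-ring behaviour, not to be used as ticket crypto.

```python
import hashlib
import hmac
import os
from typing import Optional


class StekRing:
    """Toy ticket-key ring: seals with the current key, opens with current or previous."""

    TAG_LEN = 32  # HMAC-SHA256 output size in bytes

    def __init__(self, current: bytes, previous: Optional[bytes] = None):
        self.current = current
        self.previous = previous

    def rotate(self, new_key: bytes) -> None:
        # The outgoing key stays accepted for exactly one more rotation period.
        self.previous, self.current = self.current, new_key

    def seal(self, state: bytes) -> bytes:
        tag = hmac.new(self.current, state, hashlib.sha256).digest()
        return tag + state

    def open(self, ticket: bytes) -> Optional[bytes]:
        tag, state = ticket[: self.TAG_LEN], ticket[self.TAG_LEN:]
        for key in (self.current, self.previous):
            if key is None:
                continue
            expected = hmac.new(key, state, hashlib.sha256).digest()
            if hmac.compare_digest(tag, expected):
                return state
        return None  # caller falls back to a full handshake


if __name__ == "__main__":
    shared = os.urandom(32)
    node_a, node_b = StekRing(shared), StekRing(shared)
    ticket = node_a.seal(b"session-state")
    print(node_b.open(ticket))                    # shared key: the ticket opens
    print(StekRing(os.urandom(32)).open(ticket))  # per-process key: None
```

Two nodes built from the same key can open each other's tickets; a node with its own random key cannot, which is the silent-degradation case described above.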

Replay. 0-RTT data is replayable by definition: the server has nothing fresh to bind it to, because the handshake hasn't completed yet. RFC 8446 §8 describes server-side anti-replay mechanisms but leaves the final responsibility with the application: 0-RTT should only carry requests that are safe to replay. A GET /products is fine; a POST /payments is a duplicated charge. The application has to filter, and most applications don't. The pragmatic posture: disable 0-RTT for state-changing methods at the edge, allow it for everything else, and accept the resumption win on the read path.
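One way to express that edge policy is a small gate that runs before routing. This is a sketch, not a drop-in for any particular proxy: `early_data_decision` is a hypothetical helper, and it assumes the TLS terminator tells you whether a request arrived in early data (RFC 8470 defines the `Early-Data: 1` request header and the 425 Too Early status for exactly this purpose).

```python
from typing import Optional

# Methods defined as safe by HTTP semantics. Whether a given GET is truly
# replay-safe still depends on the application behind it.
SAFE_METHODS = frozenset({"GET", "HEAD", "OPTIONS"})
TOO_EARLY = 425  # RFC 8470: client should retry once the handshake completes


def early_data_decision(method: str, arrived_in_early_data: bool) -> Optional[int]:
    """Return a status code to reject with, or None to process the request."""
    if not arrived_in_early_data:
        return None  # normal 1-RTT data, replay protection already applies
    if method.upper() in SAFE_METHODS:
        return None  # accept the 0-RTT win on the read path
    return TOO_EARLY  # state-changing request: force a retry after the handshake


if __name__ == "__main__":
    print(early_data_decision("GET", True))
    print(early_data_decision("POST", True))
```

A client that receives 425 retries the request after the handshake finishes, so the write path pays one extra round-trip instead of risking a duplicate.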

A correctly-configured TLS 1.3 stack with 0-RTT on idempotent reads, shared STEKs, and a sane rotation cadence runs at ~85% session resumption across cold-but-recent visitors. That takes the full handshake off the large majority of those connections, on the path that actually matters.

#OCSP stapling — the cert lookup nobody admits is theirs

When a browser validates a cert, it asks the issuing CA whether the cert has been revoked. Two designs exist; one is fast and the other is the default.

OCSP without stapling: the browser hits the CA's OCSP responder over plain HTTP, which itself involves a DNS lookup, a TCP connection, and the OCSP request, adding 30–250ms to the first paint depending on the CA's responder latency. Worse, the responder is a single point of failure outside the operator's control. Let's Encrypt's OCSP responder briefly degraded in 2021 and the affected sites' TTFB jumped by ~150ms p50 globally.

OCSP stapling (RFC 6066 §8): the server periodically fetches a signed OCSP response from the CA and staples it into the TLS handshake itself. The client validates the staple offline, and the round-trip to the CA disappears. Cost: ~2 KB of additional handshake bytes. The staple has a nextUpdate field (CA-defined, typically a few days out); the server refreshes before expiry.

The catch: when the staple is missing, most clients silently fall back to non-stapled OCSP, leaking the latency penalty to users without any visible signal. Must-Staple (RFC 7633) lets the cert itself declare "if you don't see a staple, fail closed." Effectively zero major sites use Must-Staple in production — the operational risk of one bad staple breaking 100% of traffic is too high. The realistic posture is to staple aggressively, monitor staple freshness as a first-class metric, and accept that a small sliver of clients will always pay the OCSP round-trip when something upstream goes wrong.
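Monitoring staple freshness comes down to comparing the current time against the staple's thisUpdate and nextUpdate fields. The sketch below assumes those two timestamps have already been extracted from the OCSP response; the `staple_status` helper and the 50% refresh threshold are illustrative choices, not a standard.

```python
from datetime import datetime, timedelta, timezone


def staple_status(this_update: datetime, next_update: datetime,
                  now: datetime, refresh_at: float = 0.5) -> str:
    """Classify a stapled OCSP response as "fresh", "refresh", or "expired".

    refresh_at is the fraction of the validity window after which the
    server should already be fetching a replacement (assumed value).
    """
    if now >= next_update:
        return "expired"  # clients will reject or ignore this staple
    lifetime = next_update - this_update
    elapsed = now - this_update
    if elapsed >= lifetime * refresh_at:
        return "refresh"  # still valid, but a refetch is due
    return "fresh"


if __name__ == "__main__":
    issued = datetime(2026, 1, 1, tzinfo=timezone.utc)
    expires = issued + timedelta(days=7)
    for days in (1, 4, 8):
        print(days, staple_status(issued, expires, issued + timedelta(days=days)))
```

Exporting the count of listeners in each state gives an alert that fires days before clients start falling back to the CA round-trip.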

The OCSP-stapled handshake is one of the highest-leverage TTFB wins available, and is on by default on exactly zero distributions of nginx, HAProxy, or stock Apache. It is one config flag.

#Post-quantum is already in your ClientHello

As of late 2024, Chrome (131+) and Firefox (132+) negotiate X25519MLKEM768 by default, a hybrid post-quantum key exchange that combines classical X25519 with ML-KEM-768 (FIPS 203, August 2024); Chrome had shipped the earlier draft Kyber hybrid by default since version 124. The motivation is "harvest now, decrypt later": adversaries recording today's TLS traffic to decrypt once a quantum computer can break X25519.

The operational consequence: the key_share extension grows from ~32 bytes (classical X25519) to ~1.2 KB. The ClientHello itself grows past the typical Ethernet MTU (1500 bytes) and is split across two TCP segments. On paths where path-MTU discovery is broken (~5% of the internet, conservatively), or where a middlebox or server assumes the whole ClientHello arrives in one packet, the second segment is dropped or ignored and the handshake stalls. Cloudflare reported a ~0.34% increase in handshake failure rate during the early 2024 rollout, which sounds tiny until you multiply by global request volume.
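The size arithmetic is worth making concrete. The sketch below uses the published key sizes (32 bytes for an X25519 public key, 1184 bytes for an ML-KEM-768 encapsulation key); the 400-byte figure for the rest of the ClientHello is an assumption for illustration only, since the real value varies by client and extension set, and TLS record framing is ignored.

```python
import math

X25519_SHARE = 32      # bytes, X25519 public key
MLKEM768_SHARE = 1184  # bytes, ML-KEM-768 encapsulation key (FIPS 203)
TCP_MSS = 1460         # 1500-byte MTU minus 20-byte IPv4 and 20-byte TCP headers


def client_hello_size(base_bytes: int, hybrid: bool) -> int:
    """Approximate ClientHello size: everything else plus the key share."""
    share = X25519_SHARE + (MLKEM768_SHARE if hybrid else 0)
    return base_bytes + share


def segments_needed(client_hello_bytes: int, mss: int = TCP_MSS) -> int:
    """Number of TCP segments required to carry the ClientHello."""
    return math.ceil(client_hello_bytes / mss)


if __name__ == "__main__":
    base = 400  # assumed size of the non-key-share parts of the ClientHello
    for hybrid in (False, True):
        size = client_hello_size(base, hybrid)
        print(f"hybrid={hybrid}: {size} bytes, {segments_needed(size)} segment(s)")
```

Under these assumptions the classical hello fits in one segment and the hybrid hello needs two, which is the point at which single-packet assumptions on the path start to break.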

The fix is on the server side: ensure the listener accepts oversized ClientHellos, verify path MTU on every peering, and treat any uptick in record_overflow alerts as a routing-layer regression. The fix is not to disable post-quantum on the server — that just defers the audit log of "this connection was downgraded by an active adversary on 2026-04-22." Atlas Network Suite terminates with PQ-hybrid by default and treats handshake-failure rate as a first-class operational metric.

This is a 2026 issue — five years ago, neither the algorithm nor the fragmentation behavior was real. Operators who haven't measured handshake failure rate in the last six months are running on a stale model of the wire.

#ECH — what's actually shipping in 2026

Encrypted Client Hello is the ongoing effort to hide the SNI (Server Name Indication) field from the on-path observer — the last cleartext leak in a TLS 1.3 handshake, and the field that lets coffee-shop networks, ISPs, and middleboxes know which website you're visiting on a shared IP.

The specification is still draft (draft-ietf-tls-esni-22 as of early 2026 — no RFC yet), but deployment is live:

Firefox 119+ (October 2023): enabled by default.
Chrome: enabled starting Chrome 117 for users with DNS-over-HTTPS active; otherwise behind chrome://flags.
Cloudflare: deployed in October 2023, briefly disabled in 2024 after middlebox-compat issues with corporate inspection appliances, re-enabled per-zone with an opt-in.
Apple iCloud Private Relay: ECH on by default for the second-hop relay since iOS 17.

ECH is not "encrypted SNI" — the older esni draft was abandoned for a more general scheme that encrypts the whole inner ClientHello. The DNS bootstrap relies on the HTTPS resource record (RFC 9460), which most operators have to publish manually.
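Checking whether a zone actually advertises ECH means looking for the `ech` parameter in its HTTPS record. The sketch below parses the record's presentation-format data as a string; `parse_https_rdata` and `advertises_ech` are hypothetical helpers, the sample record is made up with a placeholder where the base64 ECHConfigList would go, and the parsing is a simplification that does not handle RFC 9460's full escaping rules.

```python
import shlex
from typing import Dict, Tuple


def parse_https_rdata(rdata: str) -> Tuple[int, str, Dict[str, str]]:
    """Split HTTPS record data into (priority, target, params)."""
    fields = shlex.split(rdata)  # handles quoted values such as alpn="h3,h2"
    priority, target = int(fields[0]), fields[1]
    params: Dict[str, str] = {}
    for field in fields[2:]:
        key, _, value = field.partition("=")
        params[key] = value
    return priority, target, params


def advertises_ech(rdata: str) -> bool:
    priority, _, params = parse_https_rdata(rdata)
    # Priority 0 is AliasMode, which carries no service parameters.
    return priority > 0 and bool(params.get("ech"))


if __name__ == "__main__":
    sample = '1 . alpn="h3,h2" ech=PLACEHOLDERBASE64'  # placeholder, not a real config
    print(parse_https_rdata(sample))
    print(advertises_ech(sample))
```

Running a check like this against every published zone is a cheap way to catch the case where the record was never published or lost its `ech` parameter during a DNS migration.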

The honest 2026 posture: ECH is real for browsers, irrelevant for non-browser TLS clients (most SDK HTTP libraries don't implement it yet), and a privacy win that has no measurable latency impact. Ship it. Don't promise it as a security boundary against on-path active attackers — the DNS bootstrap can be observed, and ECH's threat model assumes the resolver is trusted.

#Where Atlas terminates TLS — and why

Atlas terminates TLS at the anycast edge, not at the origin. Three reasons compound:

Round-trip economics. The TLS handshake's RTTs are between the client and whatever terminates TLS. Terminate at the origin, and connection setup pays cross-continental RTT × 2 (TCP + TLS 1.3) or × 3 (TCP + TLS 1.2) before the request is even sent. Terminate at the edge, and the handshake completes against the topologically-nearest POP (often <10ms RTT) while the TCP connection from edge to origin stays warm and reused across many client visits.
Session reuse density. A TLS terminator handling 10K customers' traffic mints 10K× more tickets than per-origin terminators do, which means the resumption hit rate is structurally higher. The same edge that gives Atlas CDN its cache hit rate gives the TLS layer its resumption rate; the geometry is identical.
OCSP and PQ rollout. A handful of edge terminators are easier to keep current on PQ-hybrid, OCSP staple freshness, and ECH config than 1,000 origin servers across a fleet. The control surface is one config repo, deployed to every POP.
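The round-trip economics can be put into numbers. This is a rough model under stated assumptions: TLS 1.3 over TCP, no resumption, an uncached request that the edge forwards to origin over an already-open connection, and zero processing time. Both function names are illustrative.

```python
def ttfb_origin_terminated(client_origin_rtt_ms: float, setup_rtts: int = 2) -> float:
    """TCP + TLS 1.3 setup and the request all cross the long path."""
    return (setup_rtts + 1) * client_origin_rtt_ms


def ttfb_edge_terminated(client_edge_rtt_ms: float, edge_origin_rtt_ms: float,
                         setup_rtts: int = 2) -> float:
    """Setup and request cross the short path; the edge then forwards the
    request to origin over a warm connection (one long round-trip)."""
    return (setup_rtts + 1) * client_edge_rtt_ms + edge_origin_rtt_ms


if __name__ == "__main__":
    # Assumed figures: 85ms client-to-origin, 10ms client-to-edge, and an
    # edge-to-origin RTT approximated as the same 85ms.
    print(ttfb_origin_terminated(85))
    print(ttfb_edge_terminated(10, 85))
```

With those assumed inputs the model gives 255ms for origin termination against 115ms for edge termination, and the gap widens further whenever the edge can answer from cache and skip the origin leg entirely.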

This is the same property that made DDoS absorption work at the edge: every operator-controlled byte filtered upstream is a byte that doesn't have to travel to origin. TLS termination is the first upstream filter; it's just the one nobody markets.

#The take

TTFB is a budget. The largest share of it is TLS, and most of the savings live in three places: TLS 1.3 as the floor, session resumption with shared STEKs, and OCSP stapling on by default. Get those three right and a returning transatlantic visitor lands inside 200ms (2 RTTs at 85ms is 170ms); miss any of them and the floor is 340ms+. The image-diet and JS-chunking work is rounding error against a misconfigured handshake.

The newer concerns — post-quantum fragmentation, ECH bootstrap, the next protocol that bundles TCP and TLS into one round-trip — are 2026's audit checklist, not 2025's. The operators who measure handshake failure rate as a first-class metric will catch these as they happen. The ones who don't will read about them in a blog like this one, six months later.

The TLS handshake hasn't gotten any cheaper since 2018. The wire didn't get faster. The math didn't change. What changed is that the optimizations that were optional in TLS 1.2 are now table stakes — and the cold-visitor floor for the operators who haven't done them is the same 340ms it was when TLS 1.2 shipped. The handshake is the budget. Spend it deliberately or spend it badly.
