Anatomy of a DDoS Attack: From Reflection to Mitigation

Volumetric DDoS isn't 'solved' — it's absorbed. The math behind 51,000× memcached amplification, anycast scrubbing, and the four-class threat model behind modern defense.

Atlas Engineering
Engineering & Operations
14 min read · 2,911 words · Updated May 2, 2026

On February 28, 2018, GitHub absorbed a 1.35 Tbps DDoS — the largest publicly recorded volumetric attack at the time. The botnet behind it was small. The number that matters is 51,000×: every byte the attacker sent produced fifty-one thousand bytes hitting GitHub's edge. That ratio is the only reason the attack was possible at all, and the reason "the size of your botnet" stopped being a useful threat metric a long time ago.

DDoS is not one attack. It is a family of four — volumetric, protocol, reflection-amplified, and application-layer — each targeting a different exhaustion point in the stack, each requiring a different defense, and each sold by a different vendor. Treating them as one concept is how organizations end up paying for protection that doesn't apply to the attack they're actually getting.

This post is the anatomy. We start at the wire and work up.

The four classes

Every DDoS attack lives at one of four layers in the stack. The class determines what gets exhausted, where the defense has to sit, and what it costs to mount the attack in the first place.

| Class | OSI layer | What gets exhausted | Measured in | Defended at |
|---|---|---|---|---|
| Volumetric | L3 | Network pipe / link capacity | bps, pps | Carrier, anycast edge |
| Protocol | L4 | Kernel state tables (conntrack, SYN backlog) | pps, half-open count | Kernel, stateful firewall |
| Reflection-amplified | L3/L4 (technique) | Same as volumetric, but cheaper to mount | Amplification factor (×) | Carrier (BCP38), reflector hardening |
| Application | L7 | Worker threads, DB CPU, cache keys | RPS, query cost | Behavioral classifier |

The first three are about the wire. The fourth is about your code. They are not interchangeable, and the products that defend each layer are not interchangeable either. A WAF will not save you from a memcached reflection. A scrubber will not save you from HTTP/2 Rapid Reset. The selection is the whole game.

Reflection — the multiplier that breaks the budget

The volumetric attacks that show up in headlines are almost never the attacker's own traffic. They are reflected — the attacker spoofs the victim's source IP on packets sent to a high-amplification public service, and the service dutifully replies, at scale, to the victim.

The mechanic:

    attacker --(spoof src=victim)--> reflector --(large reply)--> victim

Three properties make this structurally cheap:

  • The attacker only has to push spoofed packets out their own egress, which can be tiny — typical reflection campaigns originate from sub-100 Mbps botnets.
  • The amplification factor is multiplicative on the attacker's bandwidth budget — every byte the attacker emits becomes N bytes at the victim, where N is set by whichever protocol the reflector speaks.
  • The reflectors are legitimate services running normally. There is nothing for a destination-side filter to drop at the source — the packets are well-formed UDP responses from real servers.
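
The multiplication in the second property can be sketched in a few lines of Python. The function names are illustrative, and the model assumes every spoofed byte reaches a reflector that answers at the full amplification factor, which real campaigns only approximate.

```python
def reflected_bps(attacker_bps: float, amplification: float) -> float:
    """Traffic arriving at the victim for a given attacker egress rate."""
    return attacker_bps * amplification


def required_attacker_bps(victim_bps: float, amplification: float) -> float:
    """Attacker egress needed to land victim_bps on the target."""
    return victim_bps / amplification


if __name__ == "__main__":
    # GitHub, February 2018: 1.35 Tbps at the edge, memcached at up to 51,000x.
    egress = required_attacker_bps(1.35e12, 51_000)
    print(f"{egress / 1e6:.1f} Mbps of spoofed egress")  # prints "26.5 Mbps of spoofed egress"
```

Under those idealized assumptions, the spoofed egress behind a 1.35 Tbps event fits inside a single commodity uplink.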

There is exactly one upstream defense, and it is older than most of the readers of this post: source address validation at every network ingress. RFC 2827, published in 2000 as Best Current Practice 38 ("BCP38"). If every ISP refused to forward packets whose source IP didn't belong to the customer that sent them, reflection would be impossible. The CAIDA Spoofer project's measurements through 2024 still show that roughly a quarter of measured autonomous systems allow some form of spoofed egress. Twenty-four years after the RFC.
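
As a minimal sketch of what source address validation checks, the function below tests whether a packet's source address belongs to one of the prefixes assigned to the customer port it arrived on. The function name and the documentation-range addresses are illustrative assumptions; real deployments do this in router ACLs or uRPF, not in Python.

```python
import ipaddress
from typing import Sequence


def source_is_valid(src_ip: str, customer_prefixes: Sequence[str]) -> bool:
    """BCP38-style ingress check: forward only if the source address
    belongs to a prefix assigned to the customer that sent the packet."""
    src = ipaddress.ip_address(src_ip)
    return any(src in ipaddress.ip_network(prefix) for prefix in customer_prefixes)


if __name__ == "__main__":
    assigned = ["198.51.100.0/24"]                     # hypothetical customer allocation
    print(source_is_valid("198.51.100.7", assigned))   # True: legitimate source
    print(source_is_valid("203.0.113.10", assigned))   # False: spoofed victim address
```

A spoofed packet carrying the victim's address fails the membership test at the first hop, so it never reaches a reflector.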

Reflection works because BCP38 doesn't.

The amplification table

The amplification factor (BAF — bandwidth amplification factor) is the response/query byte ratio for a given reflector. The numbers below are the ones the open-source measurement community has actually observed in the wild — they appear in US-CERT Alert TA14-017A and the steady stream of follow-on advisories every time a new reflector class scales.

| Reflector | BAF | What's exposed |
|---|---|---|
| memcached (UDP/11211) | up to 51,000× | Default-on UDP listener in pre-1.5.6 builds. Single largest amplifier ever observed. |
| NTP monlist (UDP/123) | 556× | Returns up to 600 IPs per request. Disabled in ntp 4.2.7p26+. |
| CharGen (UDP/19) | 358× | A 1980s diagnostic. Still on by default on a non-trivial number of routers. |
| QOTD (UDP/17) | 140× | Same era, same answer. |
| LDAP / CLDAP (UDP/389) | 56–70× | Active Directory exposed to the public internet. |
| DNS ANY w/ EDNS0 | 28–179× | Heavy dependence on response size; ~50× typical. |
| SSDP (UDP/1900) | 30–75× | UPnP discovery on consumer routers. |
| SNMP v2 GETBULK (UDP/161) | 6–10× | Lower factor, near-universal on enterprise gear. |
| Apple Remote Desktop (UDP/3283) | 33× | Surprised everyone in 2019. |
| TFTP (UDP/69) | 60× | Should not exist on the public internet. |

The number that broke the math is memcached. Versions before 1.5.6 listened on UDP/11211 by default, with no authentication. A 60-byte stats command elicited a multi-megabyte response composed of repeated key-value listings, even from a reflector holding only a few hundred populated keys. The attack's amplification ceiling was set by the size of the reflector's working set, not by the protocol.
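
To make the ratio concrete, here is a small sketch of the BAF definition applied to the 60-byte query described above. It counts payload bytes only and ignores IP/UDP headers, which is a simplifying assumption; the function names are illustrative.

```python
def baf(query_bytes: int, response_bytes: int) -> float:
    """Bandwidth amplification factor: response bytes per query byte."""
    return response_bytes / query_bytes


def response_bytes_for(query_bytes: int, factor: float) -> float:
    """Response size implied by a given query size and amplification factor."""
    return query_bytes * factor


if __name__ == "__main__":
    size = response_bytes_for(60, 51_000)
    print(f"{size / 1e6:.2f} MB per 60-byte query")  # prints "3.06 MB per 60-byte query"
```

A 51,000× factor on a 60-byte query corresponds to roughly 3 MB of reply, which is why the ceiling tracks how much data the reflector holds.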

The Spamhaus attacks of 2013 used DNS reflection at 30–50×. The GitHub attack of 2018 used memcached reflection at 51,000×. The same outbound bandwidth went from "annoyance" to "saturate the public internet's spine."

The aftermath was a coordinated cleanup: Tier-1 carriers filtered UDP/11211 at borders, ISPs pushed firmware updates that closed the port at customer edges, and memcached 1.5.6 shipped with UDP off by default. By mid-2018 the population of exposed memcached reflectors was ~95% smaller. The class of attack still exists; it just stopped scaling. The next class will be discovered in some long-forgotten 1990s protocol that someone is still running on a public IP.

Volumetric without amplification — the SYN flood

Not every flood is reflected. The classical protocol attack, still the workhorse against unprotected origin servers, is the SYN flood.

The attacker sends a TCP SYN. The kernel allocates a Transmission Control Block (TCB) tracking the half-open connection and replies SYN-ACK. The attacker never sends the final ACK. The kernel waits, holding the TCB, until either the ACK arrives or tcp_synack_retries exhausts (~63 seconds default on Linux). Multiply by 100,000 packets/second from a botnet and the SYN backlog (tcp_max_syn_backlog) saturates, after which the kernel refuses further connections from anyone — including legitimate clients on existing IPs.
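
The numbers in that paragraph can be checked with a short calculation. This sketch assumes Linux's 1-second initial SYN-ACK timeout with doubling on each retry and the default of 5 retries; the 8,192-entry backlog is an example value, and the function names are illustrative.

```python
def synack_hold_seconds(retries: int = 5, initial_rto: float = 1.0) -> float:
    """Seconds a half-open entry is held: the initial SYN-ACK timeout plus
    `retries` retransmissions, with the timeout doubling each time."""
    return sum(initial_rto * 2**i for i in range(retries + 1))


def seconds_to_fill_backlog(backlog: int, syn_pps: float) -> float:
    """Time for a flood to occupy every backlog slot, ignoring expiry."""
    return backlog / syn_pps


def steady_state_half_open(syn_pps: float, hold_seconds: float) -> float:
    """Entries the flood would pin if the backlog were unbounded
    (arrival rate multiplied by hold time)."""
    return syn_pps * hold_seconds


if __name__ == "__main__":
    hold = synack_hold_seconds()                    # 1+2+4+8+16+32 = 63.0
    print(hold)
    print(seconds_to_fill_backlog(8192, 100_000))   # about 0.08 seconds
    print(steady_state_half_open(100_000, hold))    # 6,300,000 entries wanted
```

At 100,000 SYNs per second the flood asks for millions of slots against a backlog measured in thousands, and fills it in well under a second.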

The defense is SYN cookies, designed by D.J. Bernstein in 1996. Instead of allocating a TCB on the SYN, the kernel encodes a cryptographic cookie into the initial sequence number of the SYN-ACK. The cookie is a hash of (src_ip, src_port, dst_ip, dst_port, MSS, timestamp, secret). If the attacker never returns the ACK, no state was ever allocated. If a real client returns an ACK whose acknowledgment number minus one is a valid cookie, the kernel reconstructs the connection state from the cookie itself.
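
The idea can be sketched in Python. This is a simplified illustration loosely following Bernstein's original layout (5 bits of time counter, 3 bits of MSS index, 24 bits of keyed hash), not the Linux kernel's implementation; the MSS table, the secret, and all function names are assumptions made for the example.

```python
import hashlib
import hmac
from typing import Optional, Tuple

Conn = Tuple[str, int, str, int]        # (src_ip, src_port, dst_ip, dst_port)
MSS_TABLE = (536, 1300, 1440, 1460)     # illustrative; 3 bits allow up to 8 entries
SECRET = b"hypothetical-server-secret"  # placeholder key for the sketch


def _mac24(conn: Conn, counter: int, mss_index: int, secret: bytes) -> int:
    """24-bit keyed hash over the 4-tuple, time counter, and MSS index."""
    msg = repr((conn, counter, mss_index)).encode()
    return int.from_bytes(hmac.new(secret, msg, hashlib.sha256).digest()[:3], "big")


def make_cookie(conn: Conn, mss_index: int, counter: int,
                secret: bytes = SECRET) -> int:
    """Build the 32-bit initial sequence number sent in the SYN-ACK."""
    assert 0 <= mss_index < len(MSS_TABLE)
    return ((counter & 0x1F) << 27) | (mss_index << 24) | _mac24(
        conn, counter, mss_index, secret)


def check_ack(conn: Conn, ack_number: int, now_counter: int,
              secret: bytes = SECRET) -> Optional[int]:
    """Return the MSS if (ack_number - 1) is a valid recent cookie, else None."""
    cookie = (ack_number - 1) & 0xFFFFFFFF
    t, mss_index = cookie >> 27, (cookie >> 24) & 0x7
    if mss_index >= len(MSS_TABLE):
        return None
    for counter in (now_counter, now_counter - 1):  # current or previous tick
        if (counter & 0x1F) == t and _mac24(
                conn, counter, mss_index, secret) == (cookie & 0xFFFFFF):
            return MSS_TABLE[mss_index]
    return None
```

No per-connection state exists between `make_cookie` and `check_ack`: everything the server needs to rebuild the connection rides in the sequence number the client echoes back.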

    # /etc/sysctl.conf: Linux defaults worth confirming
    net.ipv4.tcp_syncookies = 1
    # raise on busy edges
    net.ipv4.tcp_max_syn_backlog = 8192
    # the listen() backlog cap
    net.core.somaxconn = 4096

SYN cookies have one cost: a classic cookie cannot remember the TCP options the client sent on the SYN packet (window scaling, SACK), since there's no allocated state to remember them in. Linux recovers them only when the client also negotiated TCP timestamps, by packing them into the timestamp value it sends back. In practice the loss is a non-issue: the kernel only falls back to cookies under attack, and a few connections without window scaling is a strictly better outcome than the entire listen socket dropping.

HTTP/2 Rapid Reset — when a single connection is the attack

Above L4, attacker creativity ramps. The most important recent example is HTTP/2 Rapid Reset (CVE-2023-44487, disclosed October 2023), which scaled across Google, Cloudflare, and AWS in coordinated rounds peaking at 398 million RPS — at the time the largest L7 attack ever observed.

The mechanic is brutal in its simplicity. HTTP/2 multiplexes streams over a single TCP connection. The protocol allows a client to immediately reset a stream it just opened by sending a RST_STREAM frame. Most server implementations allocated request resources (parsing, routing, sometimes the start of work) before honoring the reset. The cost of opening a stream and immediately resetting it was negligible to the attacker; the server still paid full freight for the work it had begun.

A single HTTP/2 connection could open and reset tens of thousands of streams per second. A small botnet could produce L7 load larger than any prior protocol-layer attack — and it traveled inside legitimate-looking HTTP/2 connections that the typical edge couldn't drop without breaking real users.

Mitigations shipped in the weeks following disclosure were uniform across implementations:

  • Cap the rate of RST_STREAM frames per connection — if a peer exceeds it, treat the connection as malicious and close it with ENHANCE_YOUR_CALM.
  • Bound the number of in-flight streams per connection (SETTINGS_MAX_CONCURRENT_STREAMS) and the rate at which new streams can be opened.
  • Account for cancelled streams against the same per-connection budget as completed ones.
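
The three mitigations above can be sketched as one per-connection accounting object. This is an illustrative model, not any server's actual implementation: the class name, method names, and default thresholds are assumptions, and timestamps are passed in explicitly to keep the example deterministic.

```python
from collections import deque
from typing import Deque, Optional, Set

ENHANCE_YOUR_CALM = 0xB  # HTTP/2 error code defined in RFC 9113


class StreamBudget:
    """Per-connection accounting for stream opens and client resets."""

    def __init__(self, max_concurrent: int = 100, max_resets: int = 50,
                 window_seconds: float = 1.0) -> None:
        self.max_concurrent = max_concurrent
        self.max_resets = max_resets
        self.window_seconds = window_seconds
        self.charged: Set[int] = set()          # streams whose work is unfinished
        self.reset_times: Deque[float] = deque()

    def on_open(self, stream_id: int) -> bool:
        """Return False if the new stream should be refused."""
        if len(self.charged) >= self.max_concurrent:
            return False
        self.charged.add(stream_id)
        return True

    def on_client_reset(self, stream_id: int, now: float) -> Optional[int]:
        """Record a RST_STREAM. The stream stays charged until its work
        finishes. Returns an error code if the connection should close."""
        self.reset_times.append(now)
        while self.reset_times and now - self.reset_times[0] > self.window_seconds:
            self.reset_times.popleft()
        if len(self.reset_times) > self.max_resets:
            return ENHANCE_YOUR_CALM
        return None

    def on_work_done(self, stream_id: int) -> None:
        """Release the slot once the server has actually stopped working."""
        self.charged.discard(stream_id)
```

The design choice that matters is in `on_client_reset`: a reset does not free the concurrency slot, so cancelling a stream no longer buys the attacker a fresh one.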

This is the new shape of L7 attacks: a protocol feature used at a frequency the protocol's author didn't anticipate. Expect more.

Mitigation — where the absorption happens

DDoS defense is layered for the same reason the attacks are. Nothing about a memcached reflector at 51,000× is solved by a Web Application Firewall, and nothing about an HTTP/2 Rapid Reset is solved by an anycast scrubber. The most important property of any production posture is that traffic is filtered as far upstream as possible — every byte that reaches your origin is a byte you've already failed to filter at four cheaper places.

1. Carrier-level scrubbing (BGP). Tier-1 transit providers offer BGP Flowspec (RFC 8955): the customer pushes BGP advertisements that tell the carrier's edge routers to drop traffic matching specific 5-tuples before it reaches the customer's ASN. Common in financial services and large content networks. The nuclear version is RTBH (Remotely Triggered Black Hole, RFC 5635) — announce a /32 with a community string that signals "drop everything to this destination upstream." Used as an active-incident escape valve when a single host is being saturated and you need traffic gone now; the cost is that the host is, by definition, offline until the announcement is withdrawn.

2. Anycast distribution. Every Atlas POP advertises the same IP prefix, and each advertisement reaches the rest of the internet over a different AS path. BGP best-path selection routes each client's packets to the topologically nearest POP. A 1 Tbps attack distributed evenly across 42 anycast sites lands as ~24 Gbps per site, well below the line rate of every modern edge router. The math is the defense. The structural reason volumetric attacks rarely saturate well-built anycast networks is that an attacker has to concentrate traffic from a small enough geographic radius to land most of it at a single POP, and per-POP edge capacity is itself in the hundreds of Gbps. That's a budget no botnet smaller than the GitHub-class operators of 2018 has ever sustained at scale. Atlas CDN operates exactly this anycast topology, which is what makes our cache performance and our DDoS absorption the same product, not two.

3. L4 stateful filtering on the box. SYN cookies, ICMP rate-limits, conntrack tuning (nf_conntrack_max, tcp_max_syn_backlog), per-source UDP rate limits. This is kernel-level work on every machine that terminates traffic — the last line before the application sees anything. It is unglamorous and critical and comes free with a properly tuned Linux box. Most operators leave the defaults; the defaults assume you are not under attack.

4. L7 behavioral classification. Above L4 the defense is application-aware. JA3 / JA4 TLS fingerprinting captures the client-hello signature — the same handshake that owns most of TTFB — and Go-built scrapers do not produce the same fingerprint as Chrome, even when they spoof the user-agent. Header-order signatures catch curl-built attackers who got the headers right but emit them in alphabetical order. Per-cluster rate limits — where the cluster is (JA3, ASN, URI-pattern) — catch coordinated 10K-IP attacks running at 0.5 RPS each, which no per-IP rate limit can ever see. Tarpit routing (a deliberately slow response path that streams a few KB/s for 90 seconds) burns the attacker's connection budget without burning real-user latency, because real users have given up by the 90-second mark anyway.
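
A minimal sketch of why clustering catches what per-IP limits miss, using the (JA3, ASN, URI-pattern) key described in the fourth layer. The `Request` type, the function names, and the thresholds are hypothetical; a production classifier would use streaming counters rather than batch counts.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Set, Tuple


class Request(NamedTuple):
    ip: str
    ja3: str          # TLS client-hello fingerprint
    asn: int          # origin autonomous system number
    uri_pattern: str  # normalized path, e.g. "/search"


ClusterKey = Tuple[str, int, str]


def ips_over_limit(requests: Iterable[Request], window_seconds: float,
                   rps_limit: float) -> Set[str]:
    """Source IPs whose individual request rate exceeds rps_limit."""
    counts = Counter(r.ip for r in requests)
    return {ip for ip, n in counts.items() if n / window_seconds > rps_limit}


def clusters_over_limit(requests: Iterable[Request], window_seconds: float,
                        rps_limit: float) -> Set[ClusterKey]:
    """(JA3, ASN, URI-pattern) clusters whose combined rate exceeds rps_limit."""
    counts = Counter((r.ja3, r.asn, r.uri_pattern) for r in requests)
    return {key for key, n in counts.items() if n / window_seconds > rps_limit}
```

With 10,000 addresses at 0.5 RPS each, the per-IP view shows nothing unusual, while the shared cluster is running at 5,000 RPS and trips any sensible cluster limit.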

The product that "does DDoS protection" is whichever vendor sits at the layer your attacker chose to hit. They are not interchangeable. The marketing copy implies they are.

What "Tbps of absorption" actually measures

Every DDoS-protection vendor publishes a headline absorption number — "5 Tbps mitigated," "12 Tbps capacity," "17 Tbps absorption floor." The number is the steady-state line-rate sum across the vendor's anycast mesh: what the network can sustain before it starts shedding traffic. It is the product of three things, and the moment you understand the three you can read any vendor's number for what it is.

  • Per-POP edge capacity, set by the line cards in the edge routers — typically 200–600 Gbps per modern POP.
  • POP count, set by physical presence — global networks run anywhere from 20 to 300+ sites.
  • Over-provisioning ratio, set by how empty the network runs in steady state. Empty networks absorb attacks; full networks don't.
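
The three factors combine into simple arithmetic, sketched below. Treating the over-provisioning ratio as "one minus steady-state utilization" is an assumption made for this example, as are the function names and the 400 Gbps and 25% figures; the even-spread model also ignores how uneven real anycast catchments are.

```python
def absorption_tbps(per_pop_gbps: float, pop_count: int,
                    steady_state_utilization: float) -> float:
    """Spare capacity across the mesh: the line-rate sum minus the share
    that normal traffic already occupies, in Tbps."""
    return per_pop_gbps * pop_count * (1.0 - steady_state_utilization) / 1000.0


def per_site_gbps(attack_gbps: float, sites: int) -> float:
    """Attack load per POP if traffic spreads evenly across the mesh."""
    return attack_gbps / sites


def share_needed_to_saturate(attack_gbps: float, pop_capacity_gbps: float) -> float:
    """Fraction of the whole attack that must land on one POP to fill it."""
    return pop_capacity_gbps / attack_gbps


if __name__ == "__main__":
    print(absorption_tbps(400, 42, 0.25))         # spare Tbps at 25% utilization
    print(round(per_site_gbps(1000, 42), 1))      # 23.8 Gbps per site for 1 Tbps
    print(share_needed_to_saturate(1000, 400))    # 0.4 of the attack at one POP
```

Read in reverse, the same arithmetic is a sanity check on any headline: divide the published number by the POP count and see whether the implied per-POP capacity is plausible.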

For context against the modern attack distribution:

  • GitHub memcached attack (Feb 2018): 1.35 Tbps — the largest publicly recorded volumetric attack at the time.
  • HTTP/2 Rapid Reset disclosure (Oct 2023): 398 Mrps, the peak Google reported and the largest L7 attack observed at the time of disclosure.

An absorption number one structural step above the largest publicly observed attack is the engineering posture that determines whether the next memcached-class amplifier is an incident or a non-event. There will be a next one, because someone is always operating the next CharGen. Whoever published a 1 Tbps absorption number in 2015 had to publish a 17 Tbps number in 2025, not because the math improved but because the attackers found the next reflector.

The headline number is not the product. The engineering choices that produced it are.

The metrics that tell you whether it's working

Every layer of DDoS defense ships with a metric you can watch on a normal day. If the vendor's product page leads with marketing copy and the metric isn't surfaced anywhere in the dashboard, the metric isn't being tracked — and you only find out the layer was broken when the incident already arrived. Four counters are worth instrumenting before a single attack lands:

  • Handshake failure rate. L7 attacks hide inside long-lived TLS connections that complete normal handshakes; a sudden uptick in record_overflow, unexpected_message, or aborted-mid-handshake events is the earliest signal you'll see. Bake it into the same dashboard your TTFB lives on.
  • Scrub-log volume per ACL. Carrier scrubbing at the edge generates a per-rule drop counter. Most operators install rules and never look at the counters again. A rule that drops zero packets across a quarter is either dead or covering an attack class nobody currently runs; either way it's no longer load-bearing. A rule whose counter just tripled in an hour is the one currently doing the work.
  • Time-to-first-ACL-push. When a reflection vector ramps, the only number that matters is how long passes between "we noticed" and "the carrier dropped it upstream." Sub-five-minute is competent. Sub-thirty-second is a rehearsed pipeline. Nothing in between exists by accident.
  • Origin-side packet rate during an active attack. The whole point of upstream filtering is that origin sees normal traffic during a public incident. If origin's NIC counters spike while the attack is visible at the edge, the layered defense is leaking somewhere — and the third party who can prove it is the customer reporting the symptom. Atlas Network Suite lets that customer run an MTR, traceroute, or packet capture from their own machine, timestamped to the second, and stream the result straight into your dashboard. The side-by-side with your edge counters closes in minutes instead of the hours screenshot-chasing usually takes — and the timestamp is what makes the correlation defensible after the fact.
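
Two of these counters reduce to thresholds that are easy to encode. The sketch below turns the scrub-log and time-to-first-ACL-push rules of thumb into functions; the function names and return labels are hypothetical, and "needs-work" is a label added here for anything slower than five minutes.

```python
def classify_rule(drops_prev_hour: int, drops_this_hour: int,
                  drops_this_quarter: int) -> str:
    """Triage a scrubbing rule from its drop counters."""
    if drops_this_quarter == 0:
        return "dead-or-unused"          # no longer load-bearing
    if drops_this_hour >= 3 * max(drops_prev_hour, 1):
        return "actively-absorbing"      # counter tripled within the hour
    return "steady"


def grade_acl_push(seconds_from_detection_to_drop: float) -> str:
    """Grade the time between noticing a vector and the upstream drop."""
    if seconds_from_detection_to_drop < 30:
        return "rehearsed"
    if seconds_from_detection_to_drop < 300:
        return "competent"
    return "needs-work"
```

Running `classify_rule` over every installed rule once an hour is enough to surface both the dead rules and the one currently doing the work.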

The marketing answer to "is this working?" is the SLA. The engineering answer is the four counters above. They don't lie, and unlike the SLA they tell you which layer is currently doing the work — so you can stop paying for the ones that aren't.

The take

DDoS is a layered problem with layered defenses. The botnet does not care which layer you are soft on; it picks whichever one is cheapest. The job of a competent network operator is to keep every layer above some absorption floor, and to know which layer is which, so the defense for one doesn't get sold as the defense for another. The same principle that makes Providence's anti-cheat decision tree honest (surface the features, let the operator override) applies here. False positives have a cost; black boxes have a worse one.

Volumetric DDoS is not solved. It is absorbed — at the carrier, at the anycast edge, in the kernel, in the classifier, in the application — by people who understand which layer is currently being attacked and which one is doing the work. The number on the marketing page is the floor. The engineering decisions that produced that floor are the actual product.

If your DDoS posture is one product from one vendor at one layer, the math says you are protected against one quarter of the threat surface. The next reflection-amplified attack to scale will be the one whose vector doesn't have a marketing page yet — and the only thing you get to choose, before it lands, is how upstream you've already pushed the work.
