A serious outage at infrastructure provider Cloudflare knocked out large swaths of the internet on November 18, 2025, taking down services such as ChatGPT, Spotify, X (formerly Twitter), and numerous others. According to Cloudflare, the disruption was triggered by a “latent bug” in the company’s bot-mitigation system: a configuration change caused a key internal file to balloon in size and ultimately crash the routing software underpinning its global network. Cloudflare says the outage was not the result of a cyber-attack, that a fix has been implemented (with most traffic restored by mid-morning U.S. time/early afternoon GMT), and that it is now monitoring the network while promising improved resilience going forward.
Sources: Cloudflare, The Guardian
Key Takeaways
– The outage underscores how centralized much of the global internet has become: a failure at one major infrastructure provider can ripple across countless platforms and services.
– While the issue was internal (a “latent bug” in configuration, not an external hack), the event reveals that operational risk remains a significant vulnerability in critical infrastructure.
– For businesses and users alike, this incident reinforces the importance of contingency planning, diversified infrastructure, and robust monitoring: infrastructure that is assumed to “just work” can still fail.
In-Depth
The November 18 outage at Cloudflare, a company believed to power roughly 20 per cent of global websites, is a stark reminder that the backbone of the digital economy remains exposed to non-malicious failures that nonetheless trigger massive downstream effects. According to Cloudflare’s post-mortem blog post, a configuration change to a service underpinning its Bot Management system caused a database to output a “feature file” much larger than it should have been; this oversized file then propagated to machines across the network, exceeded a built-in size threshold, and caused the routing software to fail. The company initially suspected a large DDoS attack but later identified the failure as a latent bug rather than an external intrusion. Traffic started failing around 11:20 UTC, with core services “largely flowing” again by about 14:30 UTC after an earlier, known-good version of the file was reinstated.
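To make the failure mode concrete, here is a minimal, hypothetical sketch (in Python, not Cloudflare’s actual code or file format) of the defensive pattern the incident points toward: a consumer that enforces a hard limit on a machine-generated feature file, but falls back to the last-known-good copy instead of crashing when that limit is exceeded. The file names, the JSON format, and the 200-entry limit are assumptions made for illustration.

```python
# Hypothetical illustration only. The failure mode described above is a
# consumer that treats an oversized, machine-generated "feature file" as a
# fatal error; the safer pattern sketched here rejects the bad file and keeps
# serving from a known-good copy instead of crashing.
import json
from pathlib import Path

MAX_FEATURES = 200  # assumed hard limit, chosen for illustration


def load_feature_file(new_file: Path, last_known_good: Path) -> dict:
    """Load a freshly generated feature file, falling back on failure."""
    try:
        features = json.loads(new_file.read_text())
        if len(features) > MAX_FEATURES:
            # An upstream job produced more entries than the consumer can hold.
            raise ValueError(
                f"{len(features)} features exceeds limit of {MAX_FEATURES}"
            )
        return features
    except (OSError, ValueError) as exc:
        # Fail safe: alert and keep running with the previous known-good file
        # rather than taking down the process that serves live traffic.
        print(f"rejecting new feature file ({exc}); reverting to last known good")
        return json.loads(last_known_good.read_text())
```

The design point is simply that a bad input produced by your own pipeline should degrade quality (stale data), not availability.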
This kind of failure arrives at a moment when many businesses assume their cloud and network dependencies are rock-solid. Yet within hours we saw widely used services — from social media platforms like X, to enterprise tools, to streaming services — show elevated error rates or go completely offline. The crisis also shows how the failure of the “invisible” infrastructure layer can be far more disruptive than user-facing glitches: in effect, one misstep in internal configuration becomes a broad public failure.
Cloudflare’s acknowledgment that this was not an attack is important at a time when many outages are presumed malicious. But that fact doesn’t make the impact any less severe. Operational risk (misconfiguration, code or system failure, human error) now looms as large across the digital economy as malicious hacking. For firms and end users alike, the lesson is clear: reliance on a single provider, especially one so central, without a fallback strategy is a gamble. As Cloudflare’s post-mortem notes, this kind of outage should not be accepted as inevitable; but it should still be prepared for as a realistic threat.
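As a rough illustration of what a “fallback strategy” can mean at the application level, the sketch below retries a request against a second, independently hosted endpoint when the primary one fails. The hostnames are invented for the example and do not refer to any real provider.

```python
# Minimal client-side fallback sketch: try the primary endpoint, then a
# secondary one hosted with a different provider. Hostnames are placeholders.
import urllib.error
import urllib.request

ENDPOINTS = [
    "https://api.primary-cdn.example.com/health",    # hypothetical primary provider
    "https://api.secondary-cdn.example.com/health",  # hypothetical fallback provider
]


def fetch_with_fallback(urls=ENDPOINTS, timeout=3):
    last_error = None
    for url in urls:
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                return resp.read()
        except (urllib.error.URLError, TimeoutError) as exc:
            # Provider-level failure: record it and try the next endpoint.
            last_error = exc
    raise RuntimeError(f"all endpoints failed: {last_error}")
```

Real multi-provider setups usually handle this at the DNS or load-balancer layer rather than in application code, but the principle is the same: no single provider should sit on the only path to the service.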
On the policy and regulatory front, this outage may feed calls for greater scrutiny of the “hyper-scale” infrastructure layer: if a handful of companies carry such a large share of internet traffic across all geographies, the systemic risk is effectively concentrated in those few providers. Cloudflare says it will work to ensure the incident does not happen again, but market watchers and enterprise clients will now want evidence that the promised improvements translate into demonstrable resilience.
In short: the outage is a wake-up call, not because the internet went down (it didn’t entirely), but because the assumption that it can’t is no longer defensible.

