Today will be remembered as the day the internet broke. On Tuesday morning, many of the websites we rely on daily, including Amazon, Reddit, Twitch, Pinterest and, unfortunately, CNET, went offline due to a major outage at a service called Fastly. Everywhere you looked, there were 503 errors and people complaining they couldn’t access key services and sites.
At around 2.58 a.m. PT/5.58 a.m. ET, Fastly noted an error on its status update page that said: “we’re currently investigating potential impact to performance with our CDN [content delivery network] services.” Shortly thereafter, reports of major news publications including the BBC, CNN and the New York Times being offline emerged on Twitter. Twitter itself was still running, although the server that hosted its emojis went down, leading to some odd-looking tweets.
It soon became clear that these were not isolated incidents affecting individual sites: a mass outage had brought much of the internet to its knees. Across the world, people were receiving Error: 503 messages as they tried to access sites, including some vital services, such as the UK government’s gov.uk web properties.
Almost an hour later, at 3.44 a.m. PT/6.44 a.m. ET, Fastly updated its status page again to say the issue had been identified and a fix was being implemented. At 4.10 a.m. PT/7.10 a.m. ET, the company tweeted: “We identified a service configuration that triggered disruptions across our POPs globally and have disabled that configuration. Our global network is coming back online.”
The same message was sent to CNET as a comment by Fastly spokespeople.
What is Fastly?
Fastly is a cloud computing service provider that has been around since 2011. In 2017, it launched an edge cloud platform designed to bring websites closer to the people who use them. Effectively this means that if you’re accessing a website hosted in another country, it will store some of that website closer to you so that there’s no need to waste bandwidth by going to fetch all of that website’s content from far away every time you need it.
This makes for faster website load times, and optimizes images, videos and other high-payload content to show up quickly and smoothly when you land on a web page. Among the boasts on the company’s website, it says it made loading pages on Buzzfeed 50% faster and allowed the New York Times to simultaneously handle 2 million readers on election night. It also performs vital cybersecurity functions, protecting sites from DDoS attacks and bots, as well as providing a web application firewall.
Because Fastly sits between the back-end web servers and the front-facing internet as we see it, any errors on its part can make whole websites unavailable. The localized nature of the edge cloud platform also means that errors don’t affect all regions in the same way at the same time (although people all across the world reported experiencing problems on Tuesday).
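To make the idea concrete, here is a minimal sketch of how an edge cache behaves. It is a toy model for illustration only, not Fastly's actual software: the EdgeCache class, the healthy flag and the origin function are all hypothetical names. It shows a copy being kept close to the reader after the first request, and a broken edge answering with a 503 even though the origin server behind it is fine.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


@dataclass
class EdgeCache:
    """Toy edge node: answers from a local copy, asks the origin only on a miss."""
    fetch_from_origin: Callable[[str], str]
    healthy: bool = True  # hypothetical flag standing in for a valid edge configuration
    store: Dict[str, str] = field(default_factory=dict)
    origin_fetches: int = 0  # counts trips to the far-away origin server

    def get(self, path: str) -> Tuple[int, str]:
        if not self.healthy:
            # The edge sits in front of the origin, so a broken edge
            # hides a perfectly working origin from the reader.
            return 503, "Service Unavailable"
        if path not in self.store:
            # Cache miss: fetch once from the origin and keep a local copy.
            self.store[path] = self.fetch_from_origin(path)
            self.origin_fetches += 1
        return 200, self.store[path]


def origin(path: str) -> str:
    """Hypothetical origin server that is always up."""
    return f"<html>content for {path}</html>"


edge = EdgeCache(fetch_from_origin=origin)
first = edge.get("/news")    # miss: goes to the origin
second = edge.get("/news")   # hit: served from the local copy
edge.healthy = False         # simulate a bad configuration at the edge
third = edge.get("/news")    # the reader now sees a 503
```

In this sketch the second request never reaches the origin, which is where the speed-up comes from, and the third request fails without the origin ever being asked.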
What is a 503 error?
When you see a website displaying a 503 error rather than showing you the page you were expecting, it means the server hosting the website is not ready to handle the request. It also indicates that the problem is temporary and that it will likely be resolved soon.
Commonly, it is caused when a server is down for maintenance, or when a website has been overloaded — for example, if too many people are trying to access it at once.
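Because a 503 signals a temporary problem, software that talks to websites typically waits and tries again rather than giving up. The sketch below shows one common way to choose the wait, under stated assumptions: it honors the standard Retry-After header when the server sends a whole number of seconds, and otherwise doubles the wait on each attempt. The retry_delay function and its base and cap parameters are illustrative names, not part of any particular library.

```python
from typing import Mapping, Optional


def retry_delay(status: int, headers: Mapping[str, str], attempt: int,
                base: float = 1.0, cap: float = 60.0) -> Optional[float]:
    """Return how many seconds to wait before retrying, or None if not a 503.

    `attempt` starts at 0. Header lookup is case-sensitive here because a
    plain mapping is assumed; real HTTP clients usually normalize header names.
    """
    if status != 503:
        return None
    value = headers.get("Retry-After", "").strip()
    if value.isdigit():
        # The server suggested a wait in seconds; respect it up to the cap.
        return min(float(value), cap)
    # No usable hint (missing, or given as a date): exponential backoff,
    # i.e. base, 2*base, 4*base, ... never exceeding the cap.
    return min(base * (2 ** attempt), cap)
```

For example, a 503 that arrives with Retry-After: 30 would lead to a 30-second wait, while one with no hint on the fourth attempt (attempt=3) would lead to an 8-second wait with the default settings.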
Why did Fastly fail on Tuesday?
We know that Tuesday’s internet outage was caused by a “service configuration,” but not much more than that right now. Until Fastly investigates fully, it’ll be hard to declare the root cause of the catastrophic failure. It’s important to note that it’s not necessarily a cyberattack, as many people have speculated on Twitter. There are many technical reasons a CDN can fail, and cyberattacks are just one of them.
Why were so many websites affected by the Fastly outage?
Fastly is widely used by web publishers and services, and exactly how widely became apparent on Tuesday, when vast swaths of the internet were unavailable.
The reason it’s so popular is that many web properties consider the services it provides essential, yet few companies offer them. As such, a vast number of websites rely on a very small group of companies to keep running.
As Corinne Cath-Speth, a Ph.D. candidate at the Oxford Internet Institute and the Alan Turing Institute, pointed out on Twitter, this can mean “a technical hiccup in a single company can have huge ramifications.”
“This in turn — raises major questions about the dangers of (power) consolidation in the cloud market and the unquestioned influence these often invisible actors have over access to information,” she added.