Facebook’s entire network of services went offline just as Antigone Davis was live on CNBC defending the company against a whistleblower’s allegations and its handling of research findings claiming Instagram is harmful to kids.
The disruption began shortly before noon ET and took nearly six hours to resolve. It is Facebook’s largest outage since a 2019 incident that knocked the site offline for more than 24 hours, putting a strain on small businesses and creators who rely solely on these services for a living.
Cause of the Outage Explained
On Monday evening, Facebook attributed the outage to a configuration issue. Its engineering teams found that changes to the configuration of the backbone routers that coordinate network traffic between data centers caused issues that disrupted connectivity. That disruption in network traffic had a knock-on effect on how the data centers communicated, bringing all services to a halt, according to Santosh Janardhan in a Facebook Engineering post.
Tests of ISP DNS servers via DNSchecker.org failed for most of the day. At 5:30 PM ET, a test showed that the majority of those servers were again able to find a route to Facebook.com. Users were able to resume normal Facebook and Instagram use a few minutes later, although the DNS changes could take some time to reach everyone. According to the company, no user data was compromised.
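For readers who want to run a similar check themselves, here is a minimal sketch of a DNS lookup test in Python. It is not how DNSchecker.org works: that service queries many ISP DNS servers, while this sketch only asks whatever resolver the local machine is configured to use. The function names resolve_host and check_hosts are made up for illustration, and the instagram.com hostname is an assumption added alongside the Facebook.com domain mentioned above.

```python
import socket
from typing import Callable, Dict, Iterable, List


def resolve_host(hostname: str) -> List[str]:
    """Return the sorted, unique IP addresses for hostname, or [] on failure.

    Uses the operating system's configured resolver only.
    """
    try:
        infos = socket.getaddrinfo(hostname, 443, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # The resolver could not find an address for this name.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address as a string.
    return sorted({info[4][0] for info in infos})


def check_hosts(
    hostnames: Iterable[str],
    resolver: Callable[[str], List[str]] = resolve_host,
) -> Dict[str, bool]:
    """Map each hostname to True if the resolver returned at least one address."""
    return {name: bool(resolver(name)) for name in hostnames}


if __name__ == "__main__":
    # Hostnames chosen for illustration; instagram.com is an assumption.
    for name, ok in check_hosts(["facebook.com", "instagram.com"]).items():
        print(f"{name}: {'resolves' if ok else 'no DNS answer'}")
```

A successful lookup only shows that a name resolves from one vantage point. During an outage like this one, results can differ between resolvers until the restored records have spread.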
Facebook Executives Were Aware of What Happened
Facebook communications executive Andy Stone said on Twitter that the company was aware that some people were having difficulty accessing its applications and products, and that it was working to get things back to normal as soon as possible. Mike Schroepfer, who will step down as CTO next year, also tweeted that the company was experiencing networking difficulties and that teams were working as quickly as possible to debug and restore service.
The outage disrupted nearly all of the internal services that Facebook staff use to communicate and operate. Several employees told The Verge that they resorted to communicating via their work-provided Outlook email accounts, even though those accounts cannot receive email from external addresses. Employees who were logged into business products like Google Docs and Zoom before the outage could continue to use them, but anyone who needed to log in with a work email was locked out.
According to two sources familiar with the matter, Facebook engineers were dispatched to the company’s US data centers to attempt to resolve the issue. That meant the outage, already Facebook’s worst in years, could stretch on even longer.