Rantburg

2021-10-06 Science & Technology
Facebook outage caused by a single mistake; has huge implications
[9to5mac] Yesterday’s Facebook outage – which took down Facebook Messenger, Instagram, and WhatsApp as well as the main service – resulted from a mistake by the company’s own network engineers.

The mistake led to all of Facebook’s services being inaccessible, with one analogy likening it to a failure in the “air traffic control” services for network traffic …

We reported yesterday on the massive failure.

It’s not just you: Facebook, Instagram, and WhatsApp are all currently down for users around the world. We’re seeing error messages on all three services across iOS applications as well as on the web. Users are being greeted with error messages such as: “Sorry, something went wrong,” “5xx Server Error,” and more.

The outage is affecting every Facebook-owned platform, according to data on Downdetector and Twitter. This includes Instagram, Facebook, WhatsApp, and Facebook Messenger […] While some Facebook, Instagram, and WhatsApp outages only affect certain geographic regions, the services are down worldwide today.

It gradually appeared that the problem might relate to DNS – the domain name servers that tell devices which IP addresses to use to access services – but it was unclear what exactly had happened, and whether this was an external hack, malicious action by an insider, or a catastrophic mistake.

Facebook has now admitted in a blog post that it was a mistake.

Our engineering teams have learned that configuration changes on the backbone routers that coordinate network traffic between our data centers caused issues that interrupted this communication. This disruption to network traffic had a cascading effect on the way our data centers communicate, bringing our services to a halt.

It took a long time to resolve the problem because the inaccessible systems included the servers and tools engineers would normally use to solve the problem remotely. Reports suggest that lower-level employees had to gain physical access to the data centers, and then rely on step-by-step instructions from more senior engineers in order to undo the mistake. Complicating this, the networks being unavailable meant that Facebook’s door access systems were also offline, physically preventing access.
Read the rest at the link
The Times of Israel adds:
After an almost unprecedented six-hour global outage, Facebook restored its services and those of WhatsApp and Instagram on Monday and blamed the fiasco on configuration changes it made to the routers that coordinate network traffic between its data centers.

“This disruption to network traffic had a cascading effect on the way our data centers communicate, bringing our services to a halt,” Facebook vice president of infrastructure Santosh Janardhan said in a post.
Posted by badanov 2021-10-06 00:00

#1 Yeah, DNS is the glue that holds everything together. But nobody pays any attention to it. People let domain names lapse all the time. "Who knew we had to renew it!"
Posted by Blinky Pholuling8616 2021-10-06 00:30

#2 A 'mistake' - that's what they're going with now?
Posted by Raj 2021-10-06 01:45

#3 Mistakes were made....billions lost by the boss...move along nothing happening.
Posted by Joluling Gleque7445 2021-10-06 06:08

#4 Facebook slammed for promoting 1619 Project content: 'Utterly irresponsible'
Posted by Skidmark 2021-10-06 07:49

#5 
the networks being unavailable meant that Facebook’s door access systems were also offline, physically preventing access.

A truly dumbass network "feature"
Posted by Bubba Lover of the Faeries8843 2021-10-06 14:16

#6 Always wanted to replace DNS with GPS, altitude, a random # and provider network.
Posted by 3dc 2021-10-06 20:29
