Amazon Faces Backlash Following Massive AWS Outage

Amazon’s massive outage on Monday knocked thousands of websites and apps offline. The outage originated in the US-EAST-1 region, which is Amazon’s biggest concentration of data centers in America. Dr. Junade Ali, a specialist in cloud computing, attributed it to “faulty automation.” The outage affected more than 1,000 services, from large platforms like Snapchat, Reddit, and Lloyds Bank to small community homepages. The incident has deepened concerns about the resilience of cloud service providers.

The problems were evident within the first hours of Monday morning’s outage. Delays in one process caused a cascading failure throughout Amazon’s services, and as a result the company’s infrastructure could not locate one of its primary systems. Dr. Ali underscored the severity of the issue: “So they couldn’t find one of the other key systems.” The failure paralyzed services across Amazon’s platform and sent shockwaves through the many businesses that depend on its infrastructure.

For some businesses, such as smart mattress company Eight Sleep, the effect of the outage was especially pronounced. The outage led to an estimated two hours of downtime for its services. As a consequence, some mattresses reportedly overheated and became stuck in an inclined position. The incident has made businesses more concerned about the risks of depending heavily on a single public cloud provider.

Amazon Web Services (AWS) has since responded with an apology to affected customers. “We apologise for the impact this event caused our customers,” a spokesperson stated. The company recently settled with the affected communities and is taking steps to ensure this kind of event doesn’t happen again.

Dr. Junade Ali, from the University of Bristol, emphasized the need for companies to make their operations more resilient to these types of disruptions. He recommended that businesses pursue a multicloud strategy, using more than one cloud service provider for at least part of their operations. In doing so, they would avoid the risk of a single point of failure in one geographic area. “In this instance, those who had a single point of failure in this Amazon region were susceptible to being taken offline,” Dr. Ali remarked.
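
To make the idea of avoiding a single point of failure concrete, here is a minimal sketch in Python of one common approach: trying a primary provider first and falling back to a second one if it fails. This is an illustration only, not a description of how any company named in this article runs its systems. The function names and the two stand-in providers are hypothetical, and a real deployment would also need to keep data in sync between providers.

```python
from typing import Callable, List, Sequence, Tuple


class AllProvidersFailed(Exception):
    """Raised when every configured provider has failed."""


def call_with_failover(
    providers: Sequence[Tuple[str, Callable[[], str]]]
) -> Tuple[str, str]:
    """Try each (name, fetch) pair in order.

    Returns (name, result) from the first provider that succeeds.
    """
    errors: List[str] = []
    for name, fetch in providers:
        try:
            return name, fetch()
        except Exception as exc:  # a real system would catch narrower errors
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Hypothetical stand-ins for real service calls (assumptions for illustration).
def primary_us_east_1() -> str:
    raise ConnectionError("region unavailable")


def secondary_other_provider() -> str:
    return "ok"


used, result = call_with_failover([
    ("primary (us-east-1)", primary_us_east_1),
    ("secondary (other provider)", secondary_other_provider),
])
print(f"served by {used}: {result}")
```

In this sketch the primary call fails, so the request is served by the second provider instead of taking the service offline.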

Organizations are reconsidering their cloud strategies in light of this incident. Amazon’s pledge to improve the reliability of its systems will face particular scrutiny. The episode illustrates the serious hazards that accompany reliance on cloud computing and highlights the pressing need for better contingency plans.
