Analysis of Major AWS Outage
Summary
TLDR: In this video, ThousandEyes reviews the Amazon AWS outage of Friday, March 2, focusing on the US East (Ashburn, Virginia) region. An HTTP server test shows localized availability issues starting around 6:25 a.m. Pacific Time, a recovery after about 15 minutes, and a recurrence around 8:15 a.m. Connect errors point to infrastructure problems. The network visualization shows packet loss spikes that line up with the application outages, localizing the problem to Amazon's own network in Ashburn. The video also highlights the impact on applications such as Atlassian's Jira and the value of monitoring services that run on Amazon EC2.
Takeaways
- 📅 The Amazon AWS outage occurred on Friday, March 2.
- 🌐 ThousandEyes ran an HTTP server test against the Amazon EC2 console for the US East (N. Virginia) region.
- 📉 Initial availability issues were detected around 6:25 a.m. Pacific Time, affecting locations near Ashburn, Virginia.
- 🔄 Service recovered after about 15 minutes, but the issues recurred around 8:15 a.m. Pacific Time.
- 📊 ThousandEyes observed connect errors, which suggest infrastructure problems affecting the application or service.
- 📈 Network visualization revealed packet loss spikes corresponding to the application outages.
- 📍 Hotspots of issues were identified within the Amazon network, specifically in Ashburn, Virginia.
- 🔧 The AWS outage was quickly resolved, but some applications like Atlassian's Jira had prolonged downtime.
- 🔗 The outage was linked to AWS Direct Connect, which caused an extended failure.
- 🛠 ThousandEyes offers monitoring for applications running on Amazon EC2 to help detect and diagnose such issues.
Q & A
When did the Amazon AWS outage occur?
-The Amazon AWS outage occurred on Friday, March the second.
What is the ThousandEyes HTTP server test?
-The ThousandEyes HTTP server test checks the availability of the Amazon EC2 console for the US East (us-east-1) region from approximately 20 different cloud agent locations around the world.
What time did the initial availability issues start according to the ThousandEyes test?
-The initial availability issues started at about 6:25 a.m. Pacific Time, which is 9:25 a.m. Eastern Time.
Which locations were primarily affected by the availability issues?
-The availability issues were mostly clustered around the Ashburn, Virginia area, with problems reported from locations like St. Louis and Charlotte, while Chicago and New York were unaffected.
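As a rough illustration of how per-agent results roll up into an availability figure and a list of affected locations, the sketch below uses a hypothetical `AgentResult` record and two helper functions. The location names come from the answer above. The pass/fail values, the four-agent sample, and all function names are assumptions made for illustration, not data from or an API of the monitoring service.

```python
from dataclasses import dataclass


@dataclass
class AgentResult:
    location: str  # agent location, e.g. "Chicago"
    ok: bool       # True if the HTTP test completed from this agent


def availability(results):
    """Return the percentage of agents whose test succeeded."""
    if not results:
        return 0.0
    return 100.0 * sum(r.ok for r in results) / len(results)


def failing_locations(results):
    """Return the sorted locations whose test failed."""
    return sorted(r.location for r in results if not r.ok)


# Illustrative test round (assumed values): failures from St. Louis and
# Charlotte, success from Chicago and New York.
sample = [
    AgentResult("St. Louis", False),
    AgentResult("Charlotte", False),
    AgentResult("Chicago", True),
    AgentResult("New York", True),
]

print(availability(sample))       # 50.0
print(failing_locations(sample))  # ['Charlotte', 'St. Louis']
```

Grouping the failing locations this way is what makes a geographic cluster visible: a global outage would fail from most agents, while a localized one fails only from agents routed through the affected site.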
What does the recovery of the service after 15 minutes indicate?
-The recovery of the service after 15 minutes suggests that the initial issue was localized and was resolved relatively quickly.
What kind of errors were observed during the outage according to the Status by Phase view?
-During the outage, connect errors were observed, which are indicative of infrastructure problems or failures affecting the application or service.
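To make the idea of a per-phase error concrete, here is a minimal sketch of a probe that attempts DNS resolution, the TCP connect, and the TLS handshake in order and reports the first phase that fails. The `probe` function, its parameters, and the phase labels are hypothetical and use only the Python standard library. This is not how the monitoring service itself is implemented.

```python
import socket
import ssl


def probe(host, port=443, timeout=5.0, use_tls=True):
    """Return the first failing phase ("dns", "connect", "ssl") or "ok"."""
    # Phase 1: DNS resolution.
    try:
        addr = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)[0][4]
    except socket.gaierror:
        return "dns"

    # Phase 2: TCP connect. Refusals and timeouts both land here, which is
    # the kind of failure that hints at an infrastructure or network problem.
    try:
        sock = socket.create_connection(addr[:2], timeout=timeout)
    except OSError:
        return "connect"

    # Phase 3: optional TLS handshake on the established connection.
    with sock:
        if not use_tls:
            return "ok"
        try:
            ctx = ssl.create_default_context()
            with ctx.wrap_socket(sock, server_hostname=host):
                return "ok"
        except OSError:  # ssl.SSLError is a subclass of OSError
            return "ssl"
```

A result of `"connect"` means the server was never reached at the transport level, so the fault lies below the application, whereas an HTTP-level error would only appear after all three phases succeed.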
How did the network visualization help in understanding the outage?
-The network visualization showed spikes in packet loss that corresponded with the application outages, indicating network issues within the Amazon networks, particularly in the Ashburn, Virginia location.
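The correlation described above can be sketched as a simple filter over time-aligned samples: flag every interval where packet loss is elevated while availability is below 100%. The function name, the thresholds, and every number in `samples` are made-up values for illustration only. Only the rough timing (a dip near 6:25 and another near 8:15) follows the summary.

```python
def correlated_intervals(samples, loss_threshold=5.0, avail_threshold=100.0):
    """Return time labels where packet loss spikes while availability drops."""
    return [
        t
        for t, loss, avail in samples
        if loss >= loss_threshold and avail < avail_threshold
    ]


# (time label, packet loss %, availability %): hypothetical values
samples = [
    ("06:20", 0.0, 100.0),
    ("06:25", 40.0, 60.0),
    ("06:30", 35.0, 65.0),
    ("06:40", 0.0, 100.0),
    ("08:15", 50.0, 55.0),
]

print(correlated_intervals(samples))  # ['06:25', '06:30', '08:15']
```

When the flagged intervals coincide with the application-level outages, the network layer becomes the likely cause, and the path view can then show which nodes are dropping the packets.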
What was the impact of the Amazon AWS outage on Atlassian's Jira service?
-Atlassian's Jira service was affected by the first outage, had difficulty recovering, and then remained down for an extended period after the second outage.
What is the significance of the Direct Connect reference in the context of the AWS outage?
-The Direct Connect reference highlights that AWS's infrastructure issues triggered an extended outage and failure, emphasizing the impact on services relying on AWS's network.
How can users monitor their services running on Amazon EC2?
-Users can monitor their services running on Amazon EC2 by setting up tests through ThousandEyes, which lets them observe what is happening inside their applications.
What is the website mentioned for users to sign up for ThousandEyes services?
-The website mentioned for users to sign up for ThousandEyes services is www.thousandeyes.com/signalup.