
AWS Suffers Second Major DNS Outage in Three Months, Raising Cloud Reliability Concerns

January 20, 2026

Amazon Web Services experienced a significant service disruption over the weekend of January 18, 2026, affecting DynamoDB and multiple core services in its critical US-EAST-1 region in Northern Virginia. The incident marks the second DNS-related failure in the same region within a three-month period, raising fresh questions about cloud infrastructure reliability as Amazon prepares for its February 5, 2026 earnings report.

The latest disruption affected numerous popular applications and services, including Snapchat and Venmo, leaving users unable to sign in or use core features. The timing proved particularly notable because US stock markets remained closed Monday in observance of Martin Luther King Jr. Day, preventing investors from reacting until trading resumed Tuesday. Amazon shares had closed Friday's session at 239.12 dollars, up 0.39 percent.

The Technical Failure

AWS confirmed on Sunday that it was investigating higher error rates and latency spikes in its DynamoDB database service within the US-EAST-1 region. The company subsequently identified a DNS resolution problem affecting the DynamoDB API endpoint as the root cause of the disruption.

DNS, or Domain Name System, functions as the internet's address book, mapping service names to their correct network locations. When DNS fails to perform this mapping function, applications can hang or become unresponsive even when the underlying servers remain fully operational. This creates a scenario where functioning infrastructure becomes unreachable, similar to a restaurant that's open for business but has lost its street address.
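The distinction between "the name does not resolve" and "the server is down" can be made explicit in client code. The following is a minimal Python sketch, not anything AWS publishes: the helper names `resolve_ipv4` and `classify_lookup` are hypothetical, and the resolver is passed in as a parameter so the failure paths can be exercised without touching the network.

```python
import socket
from typing import Callable, List


def resolve_ipv4(hostname: str) -> List[str]:
    """Resolve a hostname to IPv4 addresses using the system resolver."""
    infos = socket.getaddrinfo(
        hostname, 443, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4, sockaddr is (address, port).
    return sorted({info[4][0] for info in infos})


def classify_lookup(
    hostname: str,
    resolver: Callable[[str], List[str]] = resolve_ipv4,
) -> str:
    """Report whether a name resolves, separately from server health.

    Returns "dns_failure" if the lookup raises, "empty_record" if it
    succeeds but yields no addresses, and "resolved" otherwise.
    """
    try:
        addresses = resolver(hostname)
    except socket.gaierror:
        # The servers may be healthy; the client simply cannot find them.
        return "dns_failure"
    if not addresses:
        return "empty_record"
    return "resolved"
```

A caller that sees "dns_failure" or "empty_record" knows that retrying the same hostname is unlikely to help until the record is repaired, which is a different situation from a server returning errors.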

Pattern of DNS Failures

The weekend's incident follows a major October 2025 outage that proved far more severe and prolonged. That disruption was caused by what Amazon described as a latent race condition in DynamoDB's automated DNS management system, which produced an empty DNS record for the service's regional endpoint. The result was cascading failures across 140 AWS services that persisted for over 14 hours.
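To illustrate the general class of bug, here is a small Python sketch of one common defence against this kind of race: a record store that only accepts an update if it is newer than the current one and is not empty. This is an illustrative assumption, not a description of AWS's actual DNS management system; the `GuardedRecord` class and its version-number scheme are hypothetical.

```python
import threading
from typing import List


class GuardedRecord:
    """A DNS-style record that rejects stale or empty updates."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.version = 0
        self.addresses: List[str] = []

    def apply(self, plan_version: int, addresses: List[str]) -> bool:
        """Apply an update plan; return True only if it was accepted."""
        # The lock makes the check and the write a single atomic step,
        # so two writers cannot interleave between them.
        with self._lock:
            if plan_version <= self.version:
                return False  # a delayed writer carrying an older plan
            if not addresses:
                return False  # never publish a record with no addresses
            self.version = plan_version
            self.addresses = list(addresses)
            return True
```

With this guard, a slow worker that finishes late with an outdated plan is turned away instead of overwriting the newer record, and no code path can leave the endpoint with an empty answer.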

The October outage affected a wide range of platforms, from social media services like Snapchat and gaming platforms like Roblox to financial services including Coinbase and Venmo. The breadth of impact demonstrated how deeply interconnected modern cloud services have become, with a single point of failure capable of disrupting services across multiple industries and use cases.

Following the October incident, AWS introduced new DNS resiliency features designed to achieve a 60-minute recovery time objective during service disruptions in the US-EAST-1 region. However, the weekend's outage suggests these measures were insufficient to prevent DNS-related failures from recurring.

Strategic Importance of US-EAST-1

The US-EAST-1 region in Northern Virginia represents one of AWS's largest data centre clusters and serves as the default region for many AWS services. This outsized importance means disruptions in this region can propagate globally, affecting applications that are geographically distant from Virginia.

The region's centrality to AWS infrastructure makes it a critical component of the company's operations, but also creates a potential single point of failure. Issues that arise in US-EAST-1 do not stay localized; they can cascade across AWS's global network, impacting services and customers worldwide.
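One way application teams reduce their exposure to a single region is to keep an ordered list of regions and fall back when the preferred one is unhealthy. The sketch below shows that idea in Python under stated assumptions: the function names and the choice of us-west-2 as a secondary are hypothetical, the health probe is supplied by the caller, and the approach only works if the data is already replicated to the fallback region.

```python
from typing import Callable, Iterable, Optional


def regional_endpoint(region: str) -> str:
    """Build the regional DynamoDB endpoint hostname for a region."""
    return f"dynamodb.{region}.amazonaws.com"


def pick_region(
    preferred: Iterable[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first region whose probe succeeds, or None if none do."""
    for region in preferred:
        if is_healthy(region):
            return region
    return None
```

In use, the caller would pass something like `["us-east-1", "us-west-2"]` along with a probe that performs a DNS lookup and a lightweight request against `regional_endpoint(region)`, then direct traffic to whichever region is returned.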

Financial Stakes and Competitive Pressure

The reliability incidents come at a particularly sensitive time for Amazon as the company prepares for its February 5, 2026 earnings report. AWS remains Amazon's primary profit engine, having generated 33 billion dollars in revenue during the third quarter of 2025, representing a 20 percent year-over-year increase. The cloud division produced operating income of 11.4 billion dollars in that quarter, with CEO Andy Jassy noting that AWS is growing at a pace not seen since 2022.

AWS faces intensifying competition for enterprise cloud spending from Microsoft Azure and Google Cloud Platform, with stakes elevated by rapid AI infrastructure investment. Amazon's capital expenditures reached approximately 125 billion dollars in 2025, with company executives indicating plans for even higher spending in 2026, largely driven by AI infrastructure buildout.

Reliability issues threaten AWS's competitive position at a time when enterprise customers are making long-term cloud infrastructure decisions worth billions of dollars. Service disruptions can influence these decisions, potentially shifting market share to competitors who can demonstrate greater stability.

Broader Cloud Infrastructure Questions

The recurring DNS failures at AWS raise fundamental questions about cloud infrastructure design and the risks of centralization. As more of the internet's services consolidate onto a handful of major cloud providers, the impact of each outage grows with them. When a single cloud region can disrupt services used by billions of people globally, it highlights the vulnerability created by these architectural dependencies.

The incidents also underscore the challenges cloud providers face in managing increasingly complex systems at massive scale. Race conditions, DNS management issues, and cascading failures represent the kinds of subtle, interconnected problems that can be difficult to predict and prevent, even with substantial engineering resources and sophisticated monitoring systems.

As cloud computing continues to underpin an ever larger portion of critical internet services, from social media and entertainment to financial services and communications, the pressure on providers to achieve near-perfect reliability intensifies. The AWS outages demonstrate that even industry leaders with vast resources and expertise continue to grapple with these fundamental challenges.

Published January 20, 2026 at 3:35am