The allure of new technology, and of established companies dabbling in it, can be too much for some companies to resist. That has been the case with Amazon Web Services, known as AWS. A large number of companies, some that should really know better, have come to rely on AWS EC2, Amazon's cloud server platform. Companies like Foursquare, Pinterest and even Netflix have moved their services onto the EC2 platform.
One of the problems with new technology is a lack of preparation from the companies that buy into it. Many of Amazon's cloud-server customers are now realizing this. Over the past few months, AWS has experienced a series of failures that left customers in the dark. Last weekend's failure took all three of the major companies mentioned above entirely offline. Every outage costs Netflix money, so this is a pretty massive deal for everyone involved.
So, what caused the problem that took down the largest user of Internet bandwidth on the planet? Hit the break to find out.
One of the most important things that a server farm needs to focus on is fail-over redundancy. For big players, or even most medium-sized players, this means having at least two servers for every one actually needed. For example, one set of servers might sit in Virginia and another in Berlin, with one constantly replicating its data to the other. Obviously they don't need to be on separate continents, but the example does emphasize the point. Data needs to be replicated over large distances to prevent service failure due to regional disasters.
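For the technically inclined, here is a minimal sketch of that fail-over idea in Python. It is only an illustration, not how Amazon or any of its customers actually route traffic: the `RegionDown` exception, the `serve_with_failover` function and the `virginia` and `berlin` handlers are all hypothetical names made up for this example, and the "outage" is simulated.

```python
from typing import Callable, List, Sequence, Tuple


class RegionDown(Exception):
    """Raised by a (hypothetical) region handler when that region is unreachable."""


def serve_with_failover(
    request: str,
    regions: Sequence[Tuple[str, Callable[[str], str]]],
) -> str:
    """Try each region in order and return the first successful response."""
    errors: List[str] = []
    for name, handler in regions:
        try:
            return handler(request)
        except RegionDown as exc:
            # Remember why this region failed, then fall through to the next one.
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all regions failed: " + "; ".join(errors))


def virginia(request: str) -> str:
    # Simulated primary region that has lost power (assumption for the demo).
    raise RegionDown("power outage")


def berlin(request: str) -> str:
    # Simulated secondary region holding a replicated copy of the data.
    return f"berlin served {request}"


if __name__ == "__main__":
    print(serve_with_failover("GET /home", [("virginia", virginia), ("berlin", berlin)]))
```

The point of the design is that the caller never needs to know which region answered. As long as one replica is up, the request succeeds, and only when every region is down does the user see a failure.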
Amazon, somehow, has managed to attract huge customers seemingly without having this very simple scenario in place. When massive storms hit the east coast last weekend, brought on by Tropical Storm Debby, large parts of the region lost power, including our own studios, and Amazon's EC2 servers went offline. In a real data center, the backups on the other side of the country, or the world, would kick in and no one would even know there was an issue. Instead, Amazon's customers experienced a total shutdown of all of their services, meaning massive losses of revenue and possibly lost customers.
Being a data provider myself, I have looked into Amazon's EC2 servers for hosting our own massive amounts of show data, but declined the switch because of the lack of dedication to the platform. Have you also considered AWS and gone with someone else, or are you using them now? If so, did you lose service last weekend? Let us know in the comments.