As we approach our 5th day without power here in lower Manhattan, I wanted to update our users on how we’ve fared here at Octopart during Hurricane Sandy.
Our team members are all safe and doing well. Watching the images of destruction puts into perspective the small inconvenience of walking uptown to charge my phone battery. Our hearts go out to all who were affected.
As the storm approached on Sunday evening, this image made a strong impression.
We use Amazon Web Services to power Octopart, in particular the us-east-1 region, which is located in Northern Virginia. Although we spread our services across availability zones within the region, the swath of red on the map above made it clear that those preparations were not enough.
Testing our ability to migrate to a different AWS region had long been high on our to-do list. We quickly decided to run that test.
We immediately began migrating databases, spinning up web servers, and moving search indexes across the country to the us-west-2 region in Oregon. We ran into a number of stumbling blocks that we figured would be useful to share:
EBS snapshots cannot be moved between regions
UPDATE: AWS now supports cross-region copying of EBS snapshots
This meant that we could not simply snapshot our existing webserver EBS volumes in us-east-1 and build new EBS volumes from them in us-west-2. Instead we had to rebuild the servers from our puppet definitions, which takes longer. This surprised us because EBS snapshots are stored on S3, in buckets that users cannot view. Since S3 is available from any region, we (incorrectly) assumed that we would be able to access EBS snapshots from any region.
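Given the update above, a cross-region copy can now be scripted. The following is a minimal sketch, assuming the boto3 SDK (which is not what we used during the storm); the helper name copy_snapshot_to_region and the snapshot ID in the usage comment are hypothetical. The one detail worth knowing is that the copy is requested from a client bound to the destination region, not the source.

```python
def copy_snapshot_to_region(dest_ec2, source_region, snapshot_id, description=""):
    """Request a copy of an EBS snapshot into the region dest_ec2 is bound to.

    dest_ec2 is assumed to be an EC2 client for the DESTINATION region,
    for example boto3.client("ec2", region_name="us-west-2").
    Returns the ID of the new snapshot in the destination region.
    """
    response = dest_ec2.copy_snapshot(
        SourceRegion=source_region,
        SourceSnapshotId=snapshot_id,
        Description=description or f"Copy of {snapshot_id} from {source_region}",
    )
    return response["SnapshotId"]


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   west = boto3.client("ec2", region_name="us-west-2")
#   new_id = copy_snapshot_to_region(west, "us-east-1", "snap-0123456789abcdef0")
```

The copy runs asynchronously, so the returned snapshot is not necessarily ready to build volumes from right away.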
RDS database snapshots cannot be moved between regions
RDS database snapshots are stored the same way as EBS snapshots and have the same limitations. To work around this we loaded database dumps, a time-consuming process.
Not all AWS services are available outside of the us-east-1 region
In particular, we use SES (Simple Email Service), which is only available in us-east-1. Amazon has a page describing which services are available in which region.
Service settings are region specific
EC2 security groups, RDS security groups, RDS parameter groups, and other settings need to be set up in each region separately, a time-consuming process. Since this can be done without cost, you should have these set up ahead of time.
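Setting these up ahead of time is easy to script. Here is a rough sketch for EC2 security groups only, again assuming boto3 clients are passed in; the function name replicate_security_group is hypothetical. It copies only CIDR-based ingress rules, because rules that reference other security groups point at IDs that do not exist in the destination region and would need to be remapped by hand.

```python
def replicate_security_group(src_ec2, dst_ec2, group_name):
    """Recreate a security group's CIDR-based ingress rules in another region.

    src_ec2 and dst_ec2 are assumed to be EC2 clients for the source and
    destination regions. Looking a group up by name works for groups in a
    default VPC; other setups would need a Filters-based lookup instead.
    """
    found = src_ec2.describe_security_groups(GroupNames=[group_name])
    group = found["SecurityGroups"][0]

    created = dst_ec2.create_security_group(
        GroupName=group["GroupName"],
        Description=group["Description"],
    )

    rules = []
    for perm in group["IpPermissions"]:
        ip_ranges = perm.get("IpRanges", [])
        if not ip_ranges:
            continue  # skip rules that only reference other security groups
        rule = {"IpProtocol": perm["IpProtocol"], "IpRanges": ip_ranges}
        if "FromPort" in perm:  # absent when the protocol is "-1" (all traffic)
            rule["FromPort"] = perm["FromPort"]
            rule["ToPort"] = perm["ToPort"]
        rules.append(rule)

    if rules:
        dst_ec2.authorize_security_group_ingress(
            GroupId=created["GroupId"], IpPermissions=rules
        )
    return created["GroupId"]
```

RDS security groups and parameter groups would need their own equivalents; this sketch does not cover them.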
Aside from these challenges, most things worked well.
Data transfer between regions was reasonably fast
We achieved roughly 9 MB/s between m1.large instances in us-east-1 and us-west-2. Your mileage may vary.
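To get a feel for what that rate means when planning a migration, here is a back-of-the-envelope calculation. The 9 MB/s default is the figure we measured; the data size in the usage comment is a hypothetical value for illustration, not the size of our data, and the helper name transfer_hours is made up for this sketch.

```python
def transfer_hours(size_gb, rate_mb_per_s=9.0):
    """Estimate hours needed to move size_gb of data at a sustained rate.

    Treats 1 GB as 1024 MB and assumes the rate holds for the whole
    transfer, which real links rarely manage.
    """
    if rate_mb_per_s <= 0:
        raise ValueError("rate_mb_per_s must be positive")
    seconds = (size_gb * 1024) / rate_mb_per_s
    return seconds / 3600


# Hypothetical usage: a 100 GB dataset at the measured rate
#   hours = transfer_hours(100)
```

At the measured rate, a hypothetical 100 GB would take a little over three hours over a single stream, which is worth knowing before a storm rather than during one.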
We were able to provision the machines we needed in us-west-2
Keep in mind, though, that not all instance types are available outside of us-east-1. Amazon has information about the regional availability of instances in their FAQ. Had there been a complete outage of us-east-1, thousands of AWS customers would have been scrambling to provision machines in other regions. It’s unclear whether enough capacity exists in other regions to handle a mass migration out of us-east-1, Amazon’s largest region. It’s also not clear whether the AWS API could handle the request volume that would accompany a large-scale exodus.
Ultimately, we were able to work around all of the issues above, and had us-east-1 gone down completely, we were ready to make the switch to us-west-2.
My battery is running low and my fingers are cold so it’s time to hike uptown to plug in and publish.