Suppose you want to host a webserver on AWS. You don’t need to load-balance over multiple instances; one will work fine. What’s the best way to set it up?
Normally you will want or need to assign a domain name in DNS - say, www.example.com. A potentially simple way is to just start an EC2 instance. AWS will assign a name accessible over the public internet. This is based on the IP address, typically something like ec2-50-18-28-125.us-west-1.compute.amazonaws.com. You can then create a CNAME resource record specifying your desired name as an alias.
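To make the naming pattern concrete, here is a minimal Python sketch. It assumes the public name always follows the shape of the example above (the IP with dashes, then the region), which may not hold for every region, so treat it as an illustration rather than a lookup. The helper names ec2_public_dns and cname_record, and the 300-second TTL, are made up for this post.

```python
import ipaddress


def ec2_public_dns(ip: str, region: str) -> str:
    """Build a public DNS name in the style of the example above.

    Assumption: the name is "ec2-" + the IPv4 address with dashes,
    followed by the region. Some regions use a different suffix.
    """
    # IPv4Address raises ValueError if `ip` is not a valid IPv4 address.
    address = ipaddress.IPv4Address(ip)
    dashed = str(address).replace(".", "-")
    return f"ec2-{dashed}.{region}.compute.amazonaws.com"


def cname_record(alias: str, target: str, ttl: int = 300) -> str:
    """Format a CNAME resource record as a zone-file line."""
    return f"{alias}. {ttl} IN CNAME {target}."


if __name__ == "__main__":
    target = ec2_public_dns("50.18.28.125", "us-west-1")
    print(cname_record("www.example.com", target))
```

In practice you would read the real public name from the EC2 console or API rather than compute it; the point is only that the name is tied to the IP address.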
In practice, there is a complicating factor: EC2 instances are ephemeral. That means once you start an EC2 instance, it may be terminated by AWS without warning, at any time. That, of course, will bring down your web server.
If you noticed this immediately and recreated the instance, it would almost certainly have a different IP address and domain name. Now your CNAME record is out of date!
There are two ways to improve on this situation, with different benefits and costs.
Improvement #1: Static IP
The DNS woes can be solved by adding an elastic IP address. Basically, Amazon lets you create a static IP that you can count on always being available.
This elastic IP address (hereafter, EIP) can be associated, detached, and re-associated with any running EC2 instance. If the attached instance terminates for any reason, you can re-assign the same IP to a new one.
So instead of creating a CNAME record, you will create a DNS A record mapping your desired domain name to the static EIP. When and if the instance dies, you can start a new instance and associate it with the IP. The server will then be instantly accessible via your domain name, since DNS server caches around the world already resolve it to that IP address.
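The recovery step can be sketched in Python as follows. This is not the run-and-associate.sh script, just a rough equivalent under some assumptions: the ec2 argument is a boto3 EC2 client (for example, boto3.client("ec2")), the EIP is identified by an allocation ID, and the function name and default instance type are placeholders.

```python
def replace_instance(ec2, ami_id, allocation_id, instance_type="t3.micro"):
    """Launch one instance from `ami_id` and attach the elastic IP to it.

    `ec2` is assumed to be a boto3 EC2 client; `allocation_id` is the
    allocation ID of an elastic IP you already own.
    """
    reservation = ec2.run_instances(
        ImageId=ami_id,
        InstanceType=instance_type,
        MinCount=1,
        MaxCount=1,
    )
    instance_id = reservation["Instances"][0]["InstanceId"]

    # Wait until the instance is running before associating the address.
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])

    ec2.associate_address(AllocationId=allocation_id, InstanceId=instance_id)
    return instance_id
```

Passing the client in as an argument keeps the function easy to try against a stub before pointing it at a real account. A call would look like replace_instance(boto3.client("ec2"), "ami-...", "eipalloc-..."), with your own AMI and allocation IDs filled in.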
See the script run-and-associate.sh for an example of how to automate this. Its source is also embedded at the end of this post.
The added cost is minimal. (Caveat: as of this writing. Please verify current pricing with Amazon.) If an EIP is associated with a running EC2 instance, the cost is literally zero: nothing extra is charged. Only if your instance is terminated, and the IP is left unassociated with any instance, are you charged (last I checked) one cent per hour. And that is only until you bring up the server and reconnect it with the elastic IP.
Which brings us to the downside: your EC2 instance can still go down, without warning! If high availability is not a concern, this might be fine. For example, you might be running some kind of test or development server, or some web service for occasional, personal, non-critical use.
For anything customers are actually paying for, you will need something more robust.
Improvement #2: Load-balancing An Army Of One
Ideally, what you want is not only a consistent DNS resolution, but to have AWS automatically and immediately recreate the instance when necessary, and instantly associate it with your preferred domain name. It turns out there is an extremely reliable way to do this with the AWS toolset, though it has a lot of moving parts.
Essentially you will create a load-balanced, auto-scaling group of EC2 instances, with the minimum and maximum server count set to one. If your instance vanishes, the health sensor detects the server count has dropped to zero, and quickly starts another one to pick up the slack.
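The "group of one" idea boils down to a small piece of arithmetic. The toy model below is not how AWS implements auto-scaling; it is only a sketch of the decision being made, and the function name is invented for illustration.

```python
def instances_to_launch(healthy: int, min_size: int = 1, max_size: int = 1) -> int:
    """Toy model: how many instances a group must start to reach min_size.

    With min_size == max_size == 1, the answer is 1 when the only
    instance is gone and 0 otherwise.
    """
    if healthy < 0:
        raise ValueError("healthy count cannot be negative")
    if min_size > max_size:
        raise ValueError("min_size cannot exceed max_size")
    return max(0, min_size - healthy)


if __name__ == "__main__":
    print(instances_to_launch(healthy=0))  # the instance vanished
    print(instances_to_launch(healthy=1))  # everything is fine
```

Setting both bounds to one means the group never grows past a single server, but also never stays empty for long.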
There’s a prerequisite here: you must create a custom AMI that can hit the ground running, so to speak, as soon as it boots. This is easier if your web service has no long-term persistent state, such as a database. For the purposes of this article, let’s assume you can create an S3-backed AMI that can spawn your needed web server. Connecting it to more persistent storage is possible too.
The pieces involved include:
- An AMI that can be launched to immediately provide your web service.
- An elastic load balancer, which lives eternally - or until you are done with the web service, whichever comes first.
- A launch configuration, defining how AWS will launch replacement instances.
- An auto-scaling group, which is AWS’ concept of the group of instances. (A group of one, in this case.)
- Scaling policies - basically, the trigger for creating a new instance.
(The EC2 instance isn’t in this list; it’s implicitly created by all the above.)
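As a rough sketch of how the pieces in the list fit together, here is one way to wire them up in Python. Several things here are assumptions, not part of the original setup: elb and autoscaling are taken to be boto3 clients for the classic load balancer ("elb") and auto-scaling ("autoscaling") services, the service listens on plain HTTP port 80, one name is reused for all three resources, and the function name, instance type, and grace period are placeholders. It leaves out scaling policies and the AMI itself, and AWS now steers users toward launch templates rather than launch configurations, so check what your account supports.

```python
def create_group_of_one(elb, autoscaling, name, ami_id, zones,
                        instance_type="t3.micro"):
    """Create a load balancer plus an auto-scaling group pinned to one instance.

    Returns the DNS name AWS assigned to the load balancer, which is
    what a CNAME record would point at.
    """
    # The load balancer: forwards port 80 to port 80 on the instance.
    lb = elb.create_load_balancer(
        LoadBalancerName=name,
        Listeners=[{
            "Protocol": "HTTP",
            "LoadBalancerPort": 80,
            "InstanceProtocol": "HTTP",
            "InstancePort": 80,
        }],
        AvailabilityZones=zones,
    )

    # The launch configuration: how replacement instances are started.
    autoscaling.create_launch_configuration(
        LaunchConfigurationName=name,
        ImageId=ami_id,
        InstanceType=instance_type,
    )

    # The group itself: minimum and maximum both set to one.
    autoscaling.create_auto_scaling_group(
        AutoScalingGroupName=name,
        LaunchConfigurationName=name,
        MinSize=1,
        MaxSize=1,
        AvailabilityZones=zones,
        LoadBalancerNames=[name],
        HealthCheckType="ELB",
        HealthCheckGracePeriod=300,  # seconds to let a new instance boot
    )
    return lb["DNSName"]
```

The two clients are passed in rather than created inside the function, so the wiring can be exercised with stand-in objects before it touches a real account.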
The DNS problem is solved differently here. The endpoint for DNS resolution purposes is your elastic load balancer, which will have its own, randomly assigned domain name set by AWS. Rather than an A record pointing your human-readable name to a static IP, you create a CNAME again, defining your desired name as an alias for the load balancer’s canonical name.
For this reason, you want to be very careful not to terminate the load balancer itself, until you are absolutely certain you want to permanently discontinue the service. As of today, you cannot control what domain name will be assigned. (That would be a great feature for Amazon to add in a future release.) If you terminate then re-create the load balancer, you’ll have to update the CNAME record as well… meaning your service will be unreachable for a day or three, while that propagates around the world.
(Side note: at my mobile web design company, we guard against accidents by having different AWS accounts with different permissions. The normal production user doesn’t have the power to terminate a load balancer even if it tries; the best it can do is change the maximum server count down to zero, leaving a bare LB with no instances. I’m planning to cover this kind of account organization in detail in a future article.)
In a planned future update, we’ll go into more detail on how this works, including some demo code.