
11/06/2013

AWS: 10 Things You're Probably Doing Wrong as an Architect

Amazon Web Services is a powerful and mature cloud platform that keeps growing rapidly. Quite often, new features are added at such a pace that many developers and architects barely have a chance to experiment with all of them on real projects. However, as with any cloud platform, the devil is in the details. If you want to make the most of your cloud services, you must know how to use them wisely, and, more importantly, how NOT to use them.
Based on what I've seen in my experience of training, consulting, designing and implementing cloud solutions, I've compiled a list of common 'gotchas' that might help you avoid many pitfalls on your own projects.
Update: Shortly after writing this blog post, I was excited to discover a very recent HighScalability article by Chris Fregly, a former Netflix Streaming Platform Engineer, which describes similar interesting facts you might encounter with AWS. Make sure to check it out as well!

1. Assuming Availability Zone names refer to the same physical locations across different AWS accounts. 

When you launch AWS resources in a specific AZ, you refer to this AZ by name, such as us-east-1a or us-east-1b. However, these names are purely virtual: when you create a new AWS account and use the us-east region (which has 5 different AZs at the time of this writing), AWS randomly selects 3 physical AZs in this region and assigns the letters a, b, and c to them. The primary purpose of this design decision is to balance the load across multiple AZs in the same region. Otherwise, users would be more likely to select the first option, us-east-1a, causing an uneven load distribution across the AWS infrastructure. As a result, us-east-1a in one AWS account is probably a completely different AZ than us-east-1a in another account.

2. Referring to Elastic Load Balancers (ELB) using IP addresses.

The golden rule of designing highly available and fault tolerant systems in the cloud is: do not assume fixed location, health or availability of individual components. The same applies to ELBs: as they scale to accommodate the traffic of your application, AWS may need to relocate or replace them, causing the IP address to change. Therefore, when dealing with ELBs, always refer to them by their DNS names. If you use a custom public domain for your application, create a CNAME record pointing to the DNS of the ELB.
If you use Route 53, there is a special kind of A record available to you: the Alias record. With it, you can create an A record for your domain that points to the name of the ELB instead of its IP address; behind the scenes, AWS performs a double DNS resolution and substitutes the alias with the current IP address of the ELB.
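For illustration, here is a sketch of a change batch in the form accepted by the aws route53 change-resource-record-sets CLI command. All identifiers below are placeholders; note that the HostedZoneId inside AliasTarget must be the ELB's own hosted zone ID (which AWS publishes per region), not the ID of your domain's zone:

```json
{
  "Changes": [
    {
      "Action": "CREATE",
      "ResourceRecordSet": {
        "Name": "www.example.com.",
        "Type": "A",
        "AliasTarget": {
          "HostedZoneId": "ZELBEXAMPLE",
          "DNSName": "my-elb-123456789.us-east-1.elb.amazonaws.com.",
          "EvaluateTargetHealth": false
        }
      }
    }
  ]
}
```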

3. Performing load testing without pre-warming the ELB.

Behind the scenes, ELB is load balancing software, most likely running on instances similar to EC2. And, like any other instance in your infrastructure, it takes time to scale up and down to accommodate changes in traffic. Therefore, if you plan to perform load testing on your application, or if you expect sudden significant load spikes, you should contact AWS support to "pre-warm" your ELBs ahead of time.
Update: Chris Fregly describes the DNS side of this phenomenon in his recent post on HighScalability.

4. Hard-coding the API keys inside AMIs or instances.

When you deploy applications to EC2 instances, at some point they will inevitably need to use other AWS services (S3, SQS, DynamoDB, CloudWatch, you name it). In order to authenticate to these services, you'll need an IAM key pair: an Access Key ID and a Secret Key. Naturally, the problem is: how does an EC2 instance acquire these keys? An obvious (but very suboptimal) solution would be to store the keys in a configuration file somewhere on the storage volume. It would work, but you'd have a lot of trouble securing and rotating these keys.
So there's a better way: use the EC2 Metadata Service. Using this HTTP-based service, you can query many interesting details from within the instance: its public and private IP, name of its security group(s), region and availability zone, etc. The same applies to API keys: if you assign an IAM role to your EC2 instance, you can get the API credentials in your bootstrap script using a simple query:

$ curl http://169.254.169.254/latest/meta-data/iam/security-credentials/role-name

Note that 169.254.169.254 is the metadata service address that is the same for all EC2 instances.
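The credentials come back as a JSON document containing a temporary access key, secret key, session token, and expiration time. Below is a minimal sketch of fetching and parsing it; the URL path is the documented metadata endpoint, while the role name and the sample payload values are made up:

```python
import json
import urllib.request

METADATA_URL = "http://169.254.169.254/latest/meta-data/iam/security-credentials/"

def fetch_role_credentials(role_name):
    """Query the EC2 metadata service for temporary role credentials.
    Only works from inside an EC2 instance with an IAM role attached."""
    with urllib.request.urlopen(METADATA_URL + role_name, timeout=2) as resp:
        return parse_credentials(resp.read().decode("utf-8"))

def parse_credentials(document):
    """Extract the fields an AWS client needs from the metadata JSON."""
    data = json.loads(document)
    return {
        "access_key": data["AccessKeyId"],
        "secret_key": data["SecretAccessKey"],
        "token": data["Token"],        # session token; rotated automatically by AWS
        "expires": data["Expiration"],
    }

# Sample payload in the shape the metadata service returns (values made up):
sample = """{
  "Code": "Success",
  "AccessKeyId": "ASIAEXAMPLE",
  "SecretAccessKey": "secret",
  "Token": "token",
  "Expiration": "2013-11-06T12:00:00Z"
}"""
```

Because the credentials expire and are rotated by AWS, your application should re-query the endpoint rather than cache the keys forever.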

5. Launching EC2 instances outside VPCs.

According to the release notes, new AWS accounts are now VPC-enabled by default, which means that every EC2 instance should be associated with a particular VPC at launch time. While VPC is one of the more complicated AWS features, launching EC2 instances inside VPC provides numerous benefits, particularly:
  • You can assign persistent static IP addresses that 'survive' stop and reboot operations. You can also assign multiple IPs and network interfaces to a single instance.
  • You get a more fine-grained control over the security configuration. In VPC, the security groups involve both egress and ingress filtering, compared to classic security groups where you can only configure ingress filtering.
  • You can run your instance on a single-tenant hardware.
  • You can launch resources in private subnets, which is particularly useful for database servers and private ELBs.
So, if you're designing a new solution on an old AWS account, consider using VPC for all your instances. Probably the only case when you wouldn't want to do so is when you have a large legacy system built on top of EC2-Classic instances that would involve a lot of migration effort.

6. Performing port scanning or penetration testing without being authorized.

Doing such things without authorization is a violation of the AWS Terms of Service. If you need to test your cloud applications for vulnerabilities, contact AWS support beforehand so that they can temporarily disable their intrusion detection systems for the machines and services in question.

7. Using S3 ACL for managing access to S3 buckets.

Amazon S3 is one of the oldest AWS services. When it appeared on the market, there was no IAM yet. As a result, to manage access to S3 buckets, S3 implemented its own security mechanism, known as S3 ACLs. Nowadays, the preferred way of managing access to S3 is using IAM policies. Not only do they offer a more fine-grained control over access to S3 operations, but they also provide a unified way of managing access to other AWS services.
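To make this concrete, here is a minimal illustrative IAM policy granting an application read/write access to a single bucket; the bucket name is a placeholder:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": "arn:aws:s3:::my-app-bucket/*"
    },
    {
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::my-app-bucket"
    }
  ]
}
```

Attached to an IAM user, group, or role, a policy like this replaces bucket ACLs and can be audited and managed in the same place as the rest of your permissions.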

8. Involving your application servers for transferring data from browser to S3.

S3 supports direct browser-to-bucket uploads. All you need to do is generate a couple of hidden fields in the HTML (notably the policy document and its signature) to authenticate the client properly. If you define your form as <form enctype="multipart/form-data">, the browser will even perform a multipart upload for large files.
Also, if you develop mobile applications that upload data to S3, you can use the AWS Security Token Service (STS) to generate temporary access keys for S3 and upload the data directly to buckets without proxying it through your servers.
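The two hidden fields are produced by base64-encoding a policy document and signing it with your secret key (the legacy S3 POST signature scheme: base64 of an HMAC-SHA1 over the encoded policy). A minimal sketch; the bucket name, expiration, and secret key below are made up:

```python
import base64
import hashlib
import hmac
import json

def sign_post_policy(policy, secret_key):
    """Return the 'policy' and 'signature' hidden-field values for a
    browser-based S3 POST upload (legacy signature version 2 scheme)."""
    policy_b64 = base64.b64encode(json.dumps(policy).encode("utf-8"))
    digest = hmac.new(secret_key.encode("utf-8"), policy_b64, hashlib.sha1).digest()
    return policy_b64.decode("ascii"), base64.b64encode(digest).decode("ascii")

# A made-up policy restricting where and what the browser may upload:
policy = {
    "expiration": "2013-12-01T12:00:00Z",
    "conditions": [
        {"bucket": "my-app-bucket"},
        ["starts-with", "$key", "uploads/"],
        {"acl": "private"},
        ["content-length-range", 0, 10485760],  # cap uploads at 10 MB
    ],
}

policy_field, signature_field = sign_post_policy(policy, "my-secret-key")
```

The resulting values go into the form's hidden policy and signature inputs, alongside your Access Key ID; S3 recomputes the signature server-side and rejects uploads that violate the policy's conditions.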

9. Waiting for the EBS snapshot operation to complete before unfreezing the filesystem.

If you need to perform hot backups of your production environment, you must ensure that the backup captures the filesystem in a consistent state. To do this, you should freeze the filesystem (e.g., with xfs_freeze) and then issue the snapshot creation command. However, you don't need to wait for the entire copying operation to finish before unfreezing the FS. Since the backup operation captures a point in time, it takes only a matter of seconds for the EBS service to register the blocks that need to be persisted. Therefore, the right way is to freeze the FS, issue the snapshot command, then unfreeze the FS and let the copy operation complete asynchronously. This helps you ensure minimum downtime for your environment.
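The sequence above can be sketched as a short script. This assumes an XFS filesystem mounted at /data and uses the AWS CLI; the mount point and volume ID are placeholders, and any snapshot tool would do in place of the CLI:

```sh
#!/bin/sh
# Sketch: consistent EBS snapshot with minimal freeze time.

xfs_freeze -f /data                       # flush and block new writes

# Issue the snapshot; the command returns once the point-in-time state
# is captured -- no need to wait for the copy to S3 to finish.
aws ec2 create-snapshot --volume-id vol-12345678 \
    --description "hot backup $(date -u +%FT%TZ)"

xfs_freeze -u /data                       # unfreeze immediately
```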

10. Using On-Demand EC2 instances for every case.

An important aspect of designing solutions for the cloud is not only availability, scalability, security and fault tolerance, but also cost-effectiveness. AWS provides you with a variety of cost-saving options for particular use cases. For example, if you expect to use a fixed number of EC2 instances for a long period of time, consider using Reserved Instances. If you need a fleet of short-lived instances whose outages can be tolerated, consider using Spot Instances (check out the blog post on Spot Instances by our engineer, Taras Kushnir). Choosing the right type of instance can dramatically improve the TCO of your infrastructure and give you a much bigger bang for your buck.

About the author: Yuriy Guts is an AWS Certified Solutions Architect with several years of experience with architecting cloud solutions for various companies across the globe. He is responsible for the cloud solutions stream at ELEKS R&D and specializes in AWS and Windows Azure.

11/04/2013

Getting more from AWS: spot instances (via the SDK for .NET)

Intro

We discussed the basic principles of working with On-Demand AWS instances in the previous post. On-Demand instances are quite straightforward and behave predictably. But... what about something more stochastic and thus more interesting?

Amazon EC2 instance types

Amazon offers several types of compute instances: on-demand, reserved and spot instances. You can read the full article on the AWS website, but to make a long story short:
  • On-Demand instances are billed at a fixed hourly rate and are almost always available; they are often used for applications that need basic guarantees about when an instance can start
  • Reserved instances can (as the name suggests) be reserved for some period of time, and they will always be available within that reserved period
  • Spot instances are the most interesting kind: we act like auction players and set a price we are willing to pay to run such an instance. If our bid beats those of other customers at the moment, we are able to launch the requested instances for some period of time. If the current spot price moves higher than ours, the Amazon EC2 service shuts down our instance
More about Spot instances

Spot instances are tricky: they are not guaranteed to run when we request them. We have to place a bid and wait until it wins (if it ever does) in order to run a spot instance. To do this, we send a spot instance request with our suggested price to the Amazon EC2 service. Each request has a state which indicates whether our bid won. Initially the request is in the "open" state. It becomes "active" once our bid wins, and we can then see new running instances linked to our request through the SpotInstanceRequestId property.

Describing spot requests

A DescribeSpotInstanceRequestsRequest is used to describe the state of our spot requests. We can query open and active spot requests using the filter property. For example, this is how one can query open spot requests:

var describeSpotRequestsRequest = new DescribeSpotInstanceRequestsRequest();

// A filter matches a name against a list of values.
describeSpotRequestsRequest.Filter.Add(new Filter { Name = "state", Value = new List<string> { "open" } });

var describeSpotRequestsResponse = ec2Client.DescribeSpotInstanceRequests(describeSpotRequestsRequest);
var openedRequests = describeSpotRequestsResponse.DescribeSpotInstanceRequestsResult.SpotInstanceRequest;


Running spot instances

To launch Spot instances we have to send a RequestSpotInstances request and pass our price, the maximum desired number of spot instances, and the usual launch parameters like image ID, instance type, key pair name, security groups, etc. If our price wins, AWS will try to launch as many instances as it can, limited by our desired maximum number.

var ec2Client = new AmazonEC2Client(accessKey, secretKey);
var spotRequest = new RequestSpotInstancesRequest();

spotRequest.SpotPrice = "2.0";
spotRequest.InstanceCount = maxInstancesCount;
spotRequest.LaunchSpecification = new LaunchSpecification() {ImageId = imageID};

var spotResponse = ec2Client.RequestSpotInstances(spotRequest);
var spotResult = spotResponse.RequestSpotInstancesResult;
var placedSpotRequests = spotResult.SpotInstanceRequest.Select(rq => rq.SpotInstanceRequestId);

A spot request can throw several exceptions, one of which is AmazonEC2Exception with the error code "MaxSpotInstanceCountExceeded". This exception means we requested more instances than our account allows us to launch.

Requesting spot instances can also fail through no fault of our own, when Amazon is not able to provide that many EC2 instances in our region at the moment. In this case an AmazonEC2Exception is thrown with the "InsufficientInstanceCapacity" error code.

Requesting current spot price

It is useful to know the current price of spot instances when placing your own bid. The AWS SDK has DescribeSpotPriceHistoryRequest for this purpose.

var ec2Client = new AmazonEC2Client(accessKey, secretKey);

var spotPriceHistoryRequest = new DescribeSpotPriceHistoryRequest();
spotPriceHistoryRequest.StartTime = DateTime.UtcNow.ToString("yyyy-MM-ddTHH:mm:ss.000Z");
spotPriceHistoryRequest.EndTime = spotPriceHistoryRequest.StartTime;

var spotPriceHistoryResponse = ec2Client.DescribeSpotPriceHistory(spotPriceHistoryRequest);
var priceHistory = spotPriceHistoryResponse.DescribeSpotPriceHistoryResult.SpotPriceHistory;

var currentPrice = priceHistory.First().SpotPrice;


A note: as it is said in the official documentation: “You can view the Spot Price history over a period from one to 90 days based on the instance type, the operating system you want the instance to run on, the time period, and the Availability Zone in which it will be launched.”

Tagging spot instances

There is a problem with tagging spot instances: we cannot tag them right after start, because we don't know when they will start. But there is a simple workaround: when we place spot requests, we can tag the requests themselves, and when the actual spot instances are launched, we can match each running instance to its spot request via the SpotInstanceRequestId property and apply the corresponding tags.
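The workaround can be sketched with the same SDK style used in the snippets above. The property and method names are assumed from those snippets, placedSpotRequests is the collection of request IDs obtained when placing the requests, and the tag key/value are made up:

```csharp
// Find the instances launched from our spot requests and tag them.
var describeResponse = ec2Client.DescribeInstances(new DescribeInstancesRequest());

var ourInstanceIds = describeResponse.DescribeInstancesResult.Reservation
    .SelectMany(r => r.RunningInstance)
    .Where(i => placedSpotRequests.Contains(i.SpotInstanceRequestId))
    .Select(i => i.InstanceId)
    .ToList();

var createTagsRequest = new CreateTagsRequest();
createTagsRequest.ResourceId = ourInstanceIds;
createTagsRequest.Tag.Add(new Tag { Key = "role", Value = "worker" });

ec2Client.CreateTags(createTagsRequest);
```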

Stopping spot instances

Just as when launching spot instances, to stop them we have to work with the SpotInstanceRequest rather than with the running spot instance itself. CancelSpotInstanceRequestsRequest does the job, taking the IDs of the SpotInstanceRequests to cancel as parameters.

var ec2Client = new AmazonEC2Client(accessKey, secretKey);

var cancelSpotRequestsRequest = new CancelSpotInstanceRequestsRequest();
cancelSpotRequestsRequest.SpotInstanceRequestId.Add(spotRequestId);

var cancelSpotInstancesResponse = ec2Client.CancelSpotInstanceRequests(cancelSpotRequestsRequest);
var cancelledSpotRequests = cancelSpotInstancesResponse.CancelSpotInstanceRequestsResult.CancelledSpotInstanceRequest.Select(csr => csr.SpotInstanceRequestId);


Note that cancelling a spot request does not automatically terminate a running Spot instance that was already launched from it; such instances keep running (and billing) until you terminate them explicitly or the spot price rises above your bid.

Conclusion

Amazon's Spot instances service is interesting and tricky, yet powerful. It allows you to use Amazon's unused capacity at a lower price. You purchase these instances by placing a bid stating what you are willing to pay, and for as long as your bid stays above the spot price, you get them. Spot instances are useful in a number of situations where losing partial work is acceptable, like cost-driven batch workloads or application testing. Used wisely, they help you save money.