J Cole Morrison

Developer Advocate @HashiCorp, DevOps Enthusiast, Startup Lover, Teaching at awsdevops.io


The Hitchhiker's Guide to AWS ECS and Docker

Posted by J Cole Morrison on .


In this guide we're going to discuss the major components of AWS EC2 Container Service (ECS), what they are conceptually and how they work together.

The prime directive - understanding how hosting, scaling and load balancing an application with Docker and ECS works. What are the primary pieces? How do we put the puzzle together? Does it interfere with internal development of alien civilizations?

This is a conceptual guide. Not a technical step-by-step. If you're looking for that, I have a full one of those here:

Guide to Fault Tolerant and Load Balanced AWS Docker Deployment on ECS

There are plenty of piecemeal step-by-step guides out there, including mine, so it didn't seem necessary to create another one.

Instead, we're looking for a mental framework for how to think about it. Without a conceptual understanding of our tools and systems, our problem-solving ability is limited. And that limit usually winds up being "what technical step-by-step guides can I find?"

Check out the entire guide, with extras, in a 10 Part Video Series if you prefer to watch over read:

The Hitchhiker's VIDEO Guide to AWS ECS and Docker watch here

Table of Contents

  1. Overview
  2. The Analogy We'll Use
  3. Docker Images and Containers Overview
  4. Summary of Docker Analogy
  5. Challenges with Managing Docker Containers
  6. AWS EC2 Container Service "ECS"
  7. Clusters and the ECS Container Agent
  8. Task Definitions
  9. Running a Task
  10. Services
  11. Application Load Balancers
  12. Launch Configurations and AutoScaling Groups
  13. Summary
  14. Final Thoughts
  15. Image Accreditation


So what's our agenda here?

1) A conceptual overview of Docker Images and Containers

We'll discuss some of the problems it solves and build a visual analogy.

I'm including this because there seems to be a severe lack of "Dude, this is just wtf it is and what it solves - in plain English." Instead, there's just tons of marketing content, technical speculation and "reasons to use."

Also, we'll build on the analogy we set up here when diving into ECS concepts. I'll put a tl;dr at the end of the Docker section just in case you do have a good grasp of these concepts. That way you'll be caught up on the analogy.

2) The core components of ECS (EC2 Container Service) and how they're connected

What are they?

  • Cluster
  • Container Agent
  • Container Instances
  • Task Definition
  • Task
  • Service

The most helpful naming conventions in the world.

3) The supporting components to ECS and how they're connected

Outside of the core components, there are supporting ones that are practically required to do anything useful:

  • Application Load Balancers
  • Launch Configurations
  • AutoScaling Groups
  • CloudWatch Alarms (for Auto Scaling)

EC2 is also a critical piece of ECS (since it's in the name), but we're not going to dive into it. That would be a deep dive and we might drown in text because of how long it would be. If you're unfamiliar with EC2, you should probably go learn about that first before continuing with ECS. At least know "enough to be dangerous."

Also, the more you understand about VPCs and IAM, the better your overall architecture and design will be.

One more note: even though these are "supporting" they aren't something you can skip. They're about as much "supporting" as buns are to a sandwich. They're not the meat, but ya need them.

Now, when looking at the above, the list may seem pretty extensive. And I won't lie, there is a lot to it. However, the "supporting" concepts are going to be necessary for anything serious you might set up on AWS. Even Kubernetes.

On the other hand, implementing this is easy as pie. A few clicks or a couple of CLI calls. It's understanding it that's a bit tricky.

The Analogy We'll Use

The year is 21XX. We've discovered alien life, and they're just as much of consumers as we are. Because of this, the new frontier has merchants and explorers journeying out into space. To go where no merchant has gone before. And to sell their wares at great prices once they get there.

So businesses left and right begin traveling "beyond the clouds." Beyond Cloud providers like AWS, Google, etc are leasing out spaceships. Companies can set up their intergalactic shops on these ships and begin their star-bound commerce.

A fictional company we'll call "Edges Group" begins ideating how they can bring physical books to the galaxy...

(Warning: I use Star Trek references ahead. It fit the analogy better despite the title - although the thought of a lone surviving developer wandering with an alien documenting AWS was a fun idea.)

Docker Images and Containers Overview

The way we'll structure our analogy: first we'll talk about a futuristic physical spaceship shop; second we'll compare that to a modern day software company and their application.


The Bookstore - Setting up in a spaceship

Edges Group is looking to build a bookstore that can travel the galaxy. They'll name it "Edges." There's an idea for the layout, what it will sell, the interior furniture needed, the level of electricity, etc. This is the blueprint for their shop.

With the blueprint in-hand, they need to set up the bookstore in a spaceship. Let's assume they're renting / leasing space (vs. brand new construction).

Challenges here:

1) Finding a spaceship that suits the needs of the bookstore

2) Tweaking the blueprint and bookstore to fit the uniqueness of the spaceship

3) Optimizing square footage price by finding a spaceship with just enough space for their bookstore.

Let's say Edges Group solves these challenges for setting up one of their Edges bookstores. Guess what? Now they've created new challenges if they want to build another Edges bookstore:

1) The blueprint was tweaked to fit the first store's spaceship.

2) Because of #1 we'd need to find an identical spaceship to reuse our blueprint

3) If we can't do #2 we'll have to get a different spaceship. We'll have to tweak our blueprint to fit this spaceship.

So if we get all the way to #3, which is likely, we'll have 2 different blueprints and won't be able to "standardize."

Unused server space or reconfiguration may be needed for other services

For example, let's say our original blueprint for the Edges Bookstore was for a 10000 square foot spaceship. If our second spaceship is 8000 sq. ft., we won't be able to use the exact same layout as the 10000 sq. ft. spaceship.

What if we get a 15000 sq. ft. spaceship? What do we do with the extra 5000?

We could try renting it to other tenants or set up a different shop. BUT. If we've specially modified our spaceship just for our Edges Bookstore, new tenants/shops/centers would have to be compatible. If that extra 5000 sq. ft. is disco tiling, other tenants would need to be okay with that OR we'd need to renovate that area.

Let's cross over to the modern day software part now (And therefore talk about Docker).

The Application - Setting up on a server

Edges Group is looking to host an Application, "Edges", for their bookstores. It has a set of requirements in order to run. OS, CPU, access to other services, file system, software dependencies etc. From our space bookstore scenario above, the Application is the space bookstore. This list of dependencies, etc. is the blueprint.

To deploy the app, we need to provision a server that meets these requirements. Let's assume that we're "renting" servers. So like EC2 instances.

In comparison to above, the servers are like the spaceships for the shop. Similarly, just putting our app straight on a server creates some challenges.


1) Finding a server that fits our app's needs

2) Tweaking our app to fit the nuances of the server we select

3) Optimizing for cost to provision a server that provides just enough power and resources for our app and traffic.

But Edges Group is tenacious. So they get their app hosted on a server. Well, as with above, there are new challenges involved with scaling out:

1) Our app has been tweaked to fit the server that it's on.

2) If we do want to straight copy our app, we need a server identical to the one in #1.

3) If the additional servers are different, we may need to tweak our app again. And thus manage 2 versions.

Also, what do we do with the unused computing power and memory our servers might have? It... just... goes idle?

Yes, we can put other apps and processes on there and expose them. However, it will require tweaking to both the server and the app. Especially if we have many environment specific modifications on our server just for our main "Edges" app.

Now of course, the problem we've described above has been solved in a variety of ways with a variety of different technologies. We're interested in how to do it with Docker.

The Bookstore - Space Containers

There's a new construction trend going around. Instead of worrying about spaceship fit, companies can make a big boxed-space. These big boxed-spaces are called space-containers.

Inside of these boxed-spaces, or space-containers, they can create their shops, retail, whatever, exactly how they would want it to be.

On the outside of these are utility hookups for electricity, plumbing, internet, robotics, etc. These utilities are then made available to whatever is being built within the space-container.

From the outside, these boxed-spaces, or space-containers, look more or less the same. Although some are large and others are small, they all just look like boxes with hookups on the outside. There can be anything on the inside though!

In our company's case, there's an entire "Edges" bookstore. Every single thing it needs for its shop is there - registers, shelves, lights, robots, etc. Of course, it's not live until we have it hooked up to those utilities.

We can also define blueprints for these space-containers. In this blueprint we'll define every single thing required by one of our "Edges" bookstores. These blueprints are called space-images, because we wouldn't want to confuse anyone with terminology...

Usable server space through space containers

These space-containers have a number of advantages:

1) Any spaceship will do, as long as it has hookups for those utilities

2) The inside "Edges" shop never needs to be modified based on the spaceship

3) Any unused space in a spaceship can EASILY be rented out or used by other shops that are in space-containers

4) Because of #3, our company can easily repurpose unused space for other space-containers!

The Application - Containers

Similar to the above scenario - instead of worrying about app-to-server fit, we can create a Docker Container. Inside of it will be everything we need to run our application - the OS libraries and tools, the different software dependencies, etc. (The one thing it borrows from the server is the kernel.)

On the outside, all of these Docker Containers just look like another Docker Container. Although some are larger in size and others smaller, they're all just Docker Containers that interface with Docker. There could be anything on the inside though!

In our company's case, there's an entire "Edges" application with all the needed dependencies! Of course this container isn't live until we have it hooked up and running in context of Docker.

Also, instead of having to define each Container individually, we can create a Docker Image. This is essentially a blueprint for our Containers. We use a "Dockerfile" to list all of our needs for our Docker Container. This way, spinning up Containers is a cinch.

Using Docker we can take advantage of all of a server's resources

Docker Containers come with a number of advantages:

1) Any server with Docker will do.

2) The "Edges" application never needs to be modified based on the server.

3) Any unused server power/resources can easily be repurposed for other Docker Containers.

4) Because of #3 we can easily take advantage of extra server space for other Docker Containers!

Summary of Docker Analogy

Docker Containers are like the boxed-spaces where we can build anything inside. We called them "space-containers." On the outside there are hookups for utilities.

Docker Images are like the boxed-space blueprints.

Servers set up for Docker are like spaceships set up to plug-n-play these boxed-spaces.

Challenges with Managing Docker Containers

The Bookstore

With this concept of boxed-spaces, or space-containers, in hand, the company leases 5 spaceships that can house them. The spaceships all have the correct utility hookups for the space-containers. These spaceships will travel together in a fleet.

They now have some new challenges to face:

1) Monitoring each of the space-containers in each of the spaceships

2) Monitoring the spaceships themselves

3) Managing space and utility usage in the spaceships

4) Directing traffic to the most appropriate spaceship

5) Adding/removing space-containers if traffic levels demand it

6) Updating all of the space-containers if the blueprint changes

The company could solve all of these manually if they wanted. But there's probably a better way.

The Application

The company provisions (leases) 5 EC2 instances from AWS to run Docker Containers. The servers all have Docker and all the supports for running the Containers. The servers are all in different AWS availability zones in the same region.

Similar to our space commerce company, our tech company has challenges to face:

1) Monitoring each of the Docker Containers on the different servers

2) Monitoring the servers themselves

3) Managing usage of space, resources, power, etc of the servers

4) Directing traffic to the most appropriate Container within the most appropriate server

5) Adding/removing Containers if traffic levels demand it

6) Updating all of the Containers if the Docker Image changes

So we could manually solve all of this if we wanted. However, there is a better way.

AWS EC2 Container Service "ECS"

All ECS does is solve the previously mentioned problems. In a simple phrase: "ECS manages and deploys Docker Containers." That's really it.

Sure I could rattle off the other hundred things it does, but it does all of those things to manage and deploy Docker Containers.

For our analogy: it's a service where instead of leasing our spaceships and being on our own, we get a leadership team to manage the fleet. Each of our ships will also get a captain to help coordinate with each other and manage space-containers on board. All the captains will coordinate with an admiral who oversees the entire fleet. The admiral reports to us.

We'll still use the analogy when it helps, but some of these concepts seemed to get even more confusing with it. Therefore for a few of them we'll just look at them 100% in the real world.

Clusters and the ECS Container Agent

A Cluster is just a group of EC2 instances that each have an ECS Container Agent on them, configured to point to the same Cluster.

Clusters, ECS Container Agent and servers aka Container Instances in ECS

For our analogy...

The EC2 Instances are like our spaceships.

The Cluster is like our spaceship fleet's Admiral.

The ECS Container Agent is like the captain of each ship, that reports back on the status of the ship itself and the space-containers.

Let's cover each a bit more.

The Cluster is like the admiral of all the spaceships Edges Group has leased. It receives information from each of the spaceship captains and coordinates them. It also reports back directly to us, versus us having to check in with each captain.

The ECS Container Agent is like the captain of each ship, that reports back on the status of the ship itself and the space-containers.

An EC2 instance that has a Container Agent and is part of a Cluster is referred to as a Container Instance. This is like a spaceship with a captain that knows it belongs to a specific admiral's fleet.

If we have a Cluster named "Luster" and 4 EC2 instances, each with a Container Agent configured to point to "Luster", then our Cluster is coordinating with 4 instances. In other words, our Cluster has 4 instances.
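If it helps to see that relationship as code, here's a toy model of it. This is not the ECS API: the class and field names (`Cluster`, `ContainerInstance`, `agent_cluster`) and the instance IDs are made up for illustration. The only idea borrowed from ECS is that an instance joins whichever Cluster its agent is configured to point to.

```python
from dataclasses import dataclass, field


@dataclass
class ContainerInstance:
    """An EC2 instance running the ECS Container Agent (hypothetical model)."""
    instance_id: str
    agent_cluster: str  # the Cluster name the agent is configured to point to


@dataclass
class Cluster:
    """In this sketch, as in the text, a Cluster's only real property is a name."""
    name: str
    instances: list = field(default_factory=list)

    def register(self, instance: ContainerInstance) -> bool:
        # An instance only joins the Cluster its agent points to.
        if instance.agent_cluster != self.name:
            return False
        self.instances.append(instance)
        return True


luster = Cluster(name="Luster")
fleet = [ContainerInstance(f"i-{n}", agent_cluster="Luster") for n in range(4)]
stray = ContainerInstance("i-stray", agent_cluster="SomeOtherCluster")

# Four instances point at "Luster"; the stray one points somewhere else.
joined = [luster.register(instance) for instance in fleet + [stray]]
instance_count = len(luster.instances)
print(instance_count)
```

The stray instance never shows up in the count because nothing on it points to "Luster". The Cluster itself does nothing to go out and find instances.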

Sound complex? Nope, simple as pie actually. In fact creating a Cluster is the easiest part.

It consists of 1 thing. A name! Of course the majority of a Cluster is the supporting components. So it's almost like a club. The "club" doesn't get things done, its members do. But we still say the "club" did it.

So, let's step through ALL the things needed to set up a complete Cluster.

1) Create a Cluster.

Creating a Cluster is literally nothing more than navigating to the ECS console and creating an EMPTY Cluster. With a name. That's it. Nothing else.

On the CLI? Same. We create a Cluster with a name.

$ aws ecs create-cluster --cluster-name "luster"

The console launch wizard makes it look like there's a lot more to creating a Cluster. It asks for instances, instance types, VPCs, security groups, blah blah. Those are things we use with a Cluster, but they're not a Cluster. That's actually AWS trying to help us out by using CloudFormation to create all the supporting components.

What supporting components? Instances, VPCs, EBS volumes, IAM roles, etc. But these things aren't directly a part of ECS. We still need them, and need to set them up, but they are not a part of ECS.

With our spaceship analogy - the admiral and captains manage the spaceships. But the spaceships are not a part of the admiral and captains.

Okay, I could be super literal and stop here. But let's walk through those supporting components.

2) Create the IAM Role for the EC2 Instances to be used in the Cluster

EC2 instances that hook up with ECS need an IAM role with the AmazonEC2ContainerServiceforEC2Role policy. This is actually a managed policy, meaning that it's pre-made. So all that's needed to create this role is to:

a) Create a new IAM role with the type: Amazon EC2 Role for EC2 Container Service

b) Select the AmazonEC2ContainerServiceforEC2Role policy

And that's it. If you'd like to learn more about the wizardry surrounding IAM policies, I have an in-depth write-up about it here:

AWS IAM Policies in a Nutshell

3) Set up EC2 instances with the ECS Container Agent

While you can manually install this on instances yourself, AWS has an ECS Optimized AMI for us to use as well. Therefore the most straightforward way to create an instance that's ready to hook up with ECS is to just use the AMI:

ECS Optimized AMI

There's nothing crazy in it, so we'll still be able to customize it further if need be (and make more decorated AMIs). The list of what's in it is:

  • The latest minimal version of the Amazon Linux AMI
  • The latest version of the Amazon ECS Container Agent
  • The recommended version of Docker for the latest Amazon ECS Container Agent
  • The latest version of the ecs-init package to run and monitor the Amazon ECS agent

And we can find a list of all the latest optimized AMI's here:

List of the Latest ECS-Optimized AMIs by Region

Remember, instances with the Container Agent that are a part of an ECS Cluster are referred to as "Container Instances." I keep reiterating this because at first look I thought it was some play on words for it being an instance of our Containers or Images.

We'll come back and talk about strategies for launching and managing Container Instances in the Launch Configurations and AutoScaling Groups section.

4) Point the ECS Container Agent on the instances to the Cluster we want them to join

If we're using the ECS Optimized AMI, just write:

ECS_CLUSTER=YOURCLUSTERNAME

to the ECS config file at /etc/ecs/ecs.config. We can do this using a user data script. The script can either write the ECS_CLUSTER variable itself or we can keep our config file in a secure S3 bucket and pull it in.
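As a rough sketch of the user data route, here's a small Python helper that renders such a script. `render_user_data` is a hypothetical name of my own and "luster" is just the example Cluster from earlier. The script it produces does one thing: append the ECS_CLUSTER line to /etc/ecs/ecs.config on boot.

```python
def render_user_data(cluster_name: str) -> str:
    """Return a bash user data script that points the agent at a Cluster."""
    # Guard against values that would break the single-line config entry.
    if not cluster_name or any(ch.isspace() for ch in cluster_name):
        raise ValueError("cluster name must be non-empty and contain no whitespace")
    return (
        "#!/bin/bash\n"
        f"echo ECS_CLUSTER={cluster_name} >> /etc/ecs/ecs.config\n"
    )


user_data = render_user_data("luster")
print(user_data)
```

The rendered text is what we'd paste into (or pass as) the instance's user data when launching it.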

If we're doing it the manual way, just set an environment variable on the Docker run command:

$ docker run --name ecs-agent --env=ECS_CLUSTER=YOURCLUSTERNAME

That's only the part that sets the Cluster. You'll need to put in more options to the Docker run command if you're doing it manually (the agent image and its volume mounts, for starters).

5) Launch the instances!

The instances will automatically "join" the Cluster if you've done all of the above.

Obviously there's more to launching instances beyond this. Instance type, Security Groups, etc etc. I'd say that's one of the biggest off-puts to ECS for those unfamiliar with AWS - it assumes that we know a metric ton about AWS.

A Cluster can also manage these instances across availability zones within a region. That's right. It can work with all of the great fault tolerance tools that AWS provides pretty simply. If our instances are spread across the different AZs, our Docker Containers will be as well.

Clusters work with Instances across availability zones

To do so, we'd need to launch the instances into differing VPC subnets. If you'd like to use a non-default VPC, I have a great write-up here on it:

AWS VPC Core Concepts in an Analogy and Guide

We'll also need to make use of an Application Load Balancer which we'll talk about after the rest of the primary components.

And That's the Cluster

Just a bunch of Instances with the ECS Container Agent, pointing to the same Cluster. Although there are many supporting concepts, it has only one true property - a name. It's the Admiral of our fleet of Container Instances.

So our servers are all linked up and ready to coordinate. Now we need to actually set up our containers on them. Before we can do that though, we need a way to tell ECS how to set up our Containers.

Task Definitions

In context of our spaceship and space-containers analogy, a Task Definition is a specification of exactly what's needed to set up our space-container(s) in our spaceships. Note the (s) there. In a "Task Definition" we can say that the "Task it's defining" consists of multiple space-containers.

(also note that Task Definition is both the real name of the resource AND what I'm calling it in the analogy.)

For example, our Edges Bookstore "Task Definition" might consist of:

1) the blueprint for our bookstore "Edges" space-container

2) a blueprint for a coffee shop space-container

3) a list of utilities, space, power, plumbing, etc levels for each space-container

If we created one "Task" from this "Task Definition" we'd have a bookstore and coffee shop. They'd be linked in a specified amount of space in one of our spaceships. It'd use up certain amounts of plumbing, electricity, etc.

A Task Definition specifies what's needed to set up Containers on a Container Instance

In AWS ECS, this comparison carries over. A Task Definition is a specification of exactly what's needed to set up our Container(s) on our Container Instances. Note the (s) there. In a "Task Definition" we can say that the "Task it's defining" consists of multiple Docker Containers.

For example, our Edges App "Task Definition" might consist of:

1) a Docker Image for our Edges web app

2) a Docker Image for a video processing app

3) the cpu, memory, port mappings, entry points, etc for each to-be-made Container

The Docker Images we specify in 1 and 2 are what we make the Containers from. In each, we specify the levels of cpu, memory, etc. If you've done anything with Docker, these properties should ring a bell. Most of the options we pass to our "Task Definition" about Containers are the options we can pass to Docker when creating Containers. Like PortBindings, MemoryReservation, and the like.

If we created one "Task" from this "Task Definition" we'd have a web app and video processing app on one server. It'd reserve a certain amount of cpu, memory, bind to certain ports, etc.

A Task Definition can be defined in the AWS console through the usual console UI OR through a JSON format. The CLI works with the same JSON format as well. Here's the sample template straight from AWS:

The AWS Task Definition Template.

Here's the specific area that covers all the specific options and properties that can be used in a task definition:

Task Definition Parameters

The real meat of the template is the containerDefinitions property. This is where we define the Containers and their needs. In a containerDefinition, the image property is where we point to the Docker Image we'd like used for that specific Container. We can define multiple Containers all using different Images.
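To make that concrete, here's a minimal sketch of a two-Container Task Definition built as a Python dict and dumped to JSON. The property names (family, containerDefinitions, name, image, cpu, memory, essential, portMappings, links) come from the Task Definition format. Everything else is an assumption for illustration: the family name, the image names, the ports and the resource numbers are all placeholders.

```python
import json

task_definition = {
    "family": "edges",  # hypothetical name for this Task Definition
    "containerDefinitions": [
        {
            "name": "edges-web",
            "image": "example/edges-web:latest",  # placeholder Docker Image
            "cpu": 256,      # CPU units (1024 units = 1 vCPU)
            "memory": 512,   # hard memory limit in MiB
            "essential": True,
            "portMappings": [{"containerPort": 3000, "hostPort": 80}],
            "links": ["video-processor"],
        },
        {
            "name": "video-processor",
            "image": "example/video-processor:latest",  # placeholder Docker Image
            "cpu": 512,
            "memory": 1024,
            "essential": False,
        },
    ],
}

# Each containerDefinition points at the Image its Container is made from.
images = [c["image"] for c in task_definition["containerDefinitions"]]
task_definition_json = json.dumps(task_definition, indent=2)
print(task_definition_json)
```

Saved to a file, JSON shaped like this is the kind of input the CLI's register-task-definition command can take through its --cli-input-json option.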

Setting up a Task Definition?

Again, we're not diving too deep here because I have another guide that shows exactly how to set one up. This is more the thought process of creating one.

Why is that?

Because as soon as you see how many options there are for a Task Definition, you're likely to be overwhelmed. This is especially true if you haven't done much directly with Docker.

For the Console

1) In the AWS ECS console, go to the Task Definitions and create a new one.

2) Follow through the instructions and cross reference each of the fields it asks for with the Task Definition Parameters documentation.

For the CLI or JSON Format in the Console

1) Start with The AWS Task Definition Template.

2) Begin modifying each of the properties and also cross reference it with Task Definition Parameters documentation.

How do I know what properties to use?

This is where you'll need to read up on what each of the properties does. There aren't any real shortcuts here folks. You've either profiled your Containers and know the needed memory or not. Same for a lot of the other properties.

Some notes about the process:

  • Properties of containerDefinitions like cpu and memory aren't just ECS things. Those are Docker concepts. You'll need to figure out what each of your containers needs and uses.
  • The essential property on a containerDefinition means that if that container goes down, all the other containers in the Task are stopped too.
  • entryPoint and command just override whatever you have in the image. If the image already defines them, you don't have to re-define them here.
  • links is how you specify that containers can communicate with each other. It's like using the --link option in docker run.
  • Many of the properties have defaults, allowing us to leave them blank.
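The essential flag is the one that tends to trip people up, so here's a tiny sketch of the rule. The function name and the "running" field are my own invention for illustration, not anything ECS exposes. The behavior it models is the documented one: a Task keeps going only while every essential Container is up, and a Container with no essential value is treated as essential.

```python
def task_keeps_running(containers: list) -> bool:
    """A Task survives only while every essential Container is running."""
    return all(
        container["running"]
        for container in containers
        if container.get("essential", True)  # omitted means essential
    )


# The non-essential video Container died: the Task carries on.
video_down = [
    {"name": "edges-web", "essential": True, "running": True},
    {"name": "video-processor", "essential": False, "running": False},
]

# The essential web Container died: the whole Task goes down with it.
web_down = [
    {"name": "edges-web", "essential": True, "running": False},
    {"name": "video-processor", "essential": False, "running": True},
]

survives_video_down = task_keeps_running(video_down)
survives_web_down = task_keeps_running(web_down)
print(survives_video_down, survives_web_down)
```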

So I'm Done Right?


Remember, Task Definitions are just a set of instructions. They're independent of Clusters. Meaning that we can use them in any Cluster (assuming they have the resources).

In order to use the Task Definitions, we have 2 options:

a) Running a Task


b) Creating a Service

The best visual here is that we have a set of instructions on how to build our space-containers sets (Task Definitions). Now we're handing it to the admiral (Cluster). The admiral knows we want it set up in the spaceships, but needs to know how (and how many).

Agents coordinate with the Cluster and Each other to find the best fit for Tasks

So of course the next step is for the admiral (Cluster) to coordinate with the captains (ECS Container Agents) and deploy it to the proper ships (Instances).

Let's start with Running a Task.

Running a Task

We've defined what's needed to set up a full-blown "Edges" shopping experience with our Containers. We did so through the aforementioned "Task Definition." We can hand this to our spaceships' admiral (Cluster) and ask them to create and run a "Task."

The process of running a task goes along the lines of something like:

a) The admiral (Cluster) takes the "Task Definition"

b) Coordinates with all of the spaceship captains (ECS Container Agents)


c) Finds which spaceship (Server) is the best fit.

When it knows what spaceship is the best fit, it will set the space-containers up there - in our case a bookstore and coffee shop. These shops will have all the required utilities, like electricity and plumbing, needed to run.

In ECS, in our Cluster, this is like choosing to Run a Task. We specify the Task Definition and the Cluster will create all the Containers we've specified within it. It will coordinate with the different Container Instances (EC2 servers with the container agents) and find the ones that can best fit the entire "Task."

When running a Task, we can specify to run more than 1 Task. This is the equivalent of saying:

"Given my blueprint (Task Definition), make 3 Tasks from it."

Therefore, if we had a Task Definition that called for 1 web app Container and 1 video app Container linked, and we wanted 3 Tasks from it: then we'll wind up with 3 pairs of those apps.
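In code terms, the "3 pairs" idea looks something like the sketch below. It's a toy, not the ECS API: `run_task` here is a plain Python function of my own, and the Container names are invented. It only shows that every Task gets its own copy of each Container in the Task Definition.

```python
def run_task(task_definition: dict, count: int = 1) -> list:
    """Create `count` Tasks, each with one Container per container definition."""
    tasks = []
    for task_number in range(1, count + 1):
        containers = [
            f"{definition['name']}-{task_number}"
            for definition in task_definition["containerDefinitions"]
        ]
        tasks.append({"task": task_number, "containers": containers})
    return tasks


# A Task Definition calling for 1 web app Container and 1 video app Container.
edges = {"containerDefinitions": [{"name": "web-app"}, {"name": "video-app"}]}

tasks = run_task(edges, count=3)
for task in tasks:
    print(task)
```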

Any Containers specified in a Task Definition are created with the correct resources

Spreading Tasks Across Availability Zones (AZs)

Obviously, we want some say in how our Tasks are placed on our Instances. This is where Task Placement strategies come into play. We tell our Cluster how we want them spread across our Container Instances.

Assuming that our Instances are being launched into different Availability Zones...

...then the only thing we have to do to spread Tasks across zones is select a strategy. There are 5 main strategies we get to select from:

1) AZ Balanced Spread - Spreads Tasks evenly across AZ's. Within an AZ, it spreads Tasks evenly among Instances.

2) AZ Balanced Binpack - Spreads Tasks evenly across AZ's. Within an AZ, it tries to use the least amount of Instances. How? By prioritizing Instances with the least amount of available CPU or memory.

3) Binpack - Places Tasks on the Instances with the least amount of available CPU or memory. Doesn't care about AZ spread.

4) One Per Host - One per Instance.

5) Custom - Allows the user to customize the placement from a set of options. This allows us to base the spread on something like OS type or AMI ID as well. The docs are basically non-existent on how they work though.
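To get a feel for how differently spread and binpack behave, here's a simplified simulation. It's my own toy model, not how ECS actually implements placement: it only looks at CPU (real binpack can use CPU or memory), and the instance IDs, AZ names and numbers are made up.

```python
def fits(instance: dict, cpu: int) -> bool:
    return instance["free_cpu"] >= cpu


def pick_binpack(instances: list, cpu: int):
    """Binpack: the Instance that fits and has the least CPU left."""
    candidates = [i for i in instances if fits(i, cpu)]
    return min(candidates, key=lambda i: i["free_cpu"]) if candidates else None


def pick_az_balanced_spread(instances: list, cpu: int):
    """Spread: least-loaded AZ first, then least-loaded Instance in it."""
    candidates = [i for i in instances if fits(i, cpu)]
    if not candidates:
        return None
    az_load = {}
    for i in instances:
        az_load[i["az"]] = az_load.get(i["az"], 0) + i["tasks"]
    return min(candidates, key=lambda i: (az_load[i["az"]], i["tasks"]))


def place(instances: list, cpu: int, count: int, pick) -> list:
    """Place `count` Tasks one at a time and return the chosen Instance IDs."""
    placements = []
    for _ in range(count):
        chosen = pick(instances, cpu)
        if chosen is None:
            break  # nothing has room left
        chosen["free_cpu"] -= cpu
        chosen["tasks"] += 1
        placements.append(chosen["id"])
    return placements


def fleet() -> list:
    # Two Instances in each of two AZs, all empty to start with.
    return [
        {"id": "a1", "az": "us-east-1a", "free_cpu": 1024, "tasks": 0},
        {"id": "a2", "az": "us-east-1a", "free_cpu": 1024, "tasks": 0},
        {"id": "b1", "az": "us-east-1b", "free_cpu": 1024, "tasks": 0},
        {"id": "b2", "az": "us-east-1b", "free_cpu": 1024, "tasks": 0},
    ]


spread = place(fleet(), cpu=256, count=4, pick=pick_az_balanced_spread)
binpack = place(fleet(), cpu=256, count=4, pick=pick_binpack)
print("spread: ", spread)
print("binpack:", binpack)
```

With four small Tasks, spread ends up with one Task per Instance across both AZs, while binpack crams all four onto a single Instance and leaves the rest empty. That's the trade-off in a nutshell: fault tolerance versus using as few Instances as possible.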

The official docs on this are here:

Amazon ECS Task Placement Strategies

Unfortunately, these aren't the most helpful because they don't explain the options from the console. Instead they just give the values of the API's type and field parameters. When it comes down to it, the above 5 options we covered are just a mix-and-match of 2 things:

a) ECS Constraint Types and Attributes


b) The Cluster Query Language

For my own sanity, I'm not going to dive into those just yet. I've experimented with them some, but since the docs are so sparse it's hard to know what's happening. The default 5 strategies seem pretty solid.

Let's Not Get Distracted From The Fact...

That running a Task is simply:

Giving our Cluster a Task Definition and telling it to put Tasks on our Container Instances.

Yes there's a lot of details we can configure, but they still all head towards that same purpose.

However, the deal ends here. If one of our Tasks goes down, that's the end. The Cluster won't try to put it back up. It won't give us detailed metrics on them either. Oh, also, how do we do service discovery? How do we load balance between Tasks??

Given these issues, running a Task (or Tasks) is going to be limited. If we want something like a web server, we need a different option. That's where Services come into play. Don't you love these amazingly descriptive names?

Services


As we just covered, Running a Task is a very "one and done" type of deal. To remember it, think "one and done" because it's "run and done."

Creating a Service tells our Cluster to go beyond just running Tasks - it manages them. We create a Service and hand it a Task Definition. It takes the Task Definition and does a number of things for us. Specifically:

1) Sets up our Tasks from our Task Definitions on the best suited Container Instances (same as running a Task).

2) Monitors our Tasks and reports back metrics

3) Keeps the number of Tasks we specify always up and running

4) Updates our Tasks by handing them an updated Task Definition

5) Optionally scales out/in our Tasks based on customer demand (traffic)

6) Distributes our customer demand evenly to all Tasks via Load Balancer (if we hook one up)

It becomes a little unwieldy for me to continue with the analogy we've set up here. If you're interested in keeping the visual though just think of it as us asking the admiral (Cluster) to also...

  • To manage our space-container sets (Tasks).
  • To monitor and report back to us about them.
  • If one of our space-container sets (Tasks) burns down, to put them back up.
  • To put a traffic guide outside of the spaceships and have them direct it to the space-container set best suited for traffic.
  • To build/remove more space-container sets based on traffic demand.

Like I said, it gets a bit odd.

Let's dive into each of the things a Service does for us in detail now.

1. Set up of our Tasks on the best suited Container Instances

This is what plain running a Task does. Services also do this. They also give us the same option of selecting a Task Placement strategy: the options of things like AZ Balanced Spread, Bin Pack, etc.

So they do everything running a Task does PLUS the other 5 things we're going to step through.
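To make strategies like Bin Pack and AZ Balanced Spread a bit more concrete, here's a rough Python sketch of the idea behind each one. This is not how ECS implements placement. The instance dictionaries and the function names `place_binpack` and `place_az_spread` are made up purely for illustration.

```python
def place_binpack(instances, needed_memory):
    """Bin Pack idea: fill up the fullest instance that can still fit the Task."""
    candidates = [i for i in instances if i["free_memory"] >= needed_memory]
    if not candidates:
        return None  # nothing in the Cluster has room
    return min(candidates, key=lambda i: i["free_memory"])["id"]


def place_az_spread(instances, needed_memory):
    """AZ spread idea: prefer the availability zone running the fewest Tasks."""
    candidates = [i for i in instances if i["free_memory"] >= needed_memory]
    if not candidates:
        return None
    tasks_per_az = {}
    for i in instances:
        tasks_per_az[i["az"]] = tasks_per_az.get(i["az"], 0) + i["tasks"]
    # Tie-break inside an AZ by picking the instance with the fewest Tasks.
    best = min(candidates, key=lambda i: (tasks_per_az[i["az"]], i["tasks"]))
    return best["id"]


# Hypothetical Container Instances in two availability zones.
INSTANCES = [
    {"id": "a1", "az": "us-east-1a", "free_memory": 512, "tasks": 3},
    {"id": "a2", "az": "us-east-1a", "free_memory": 2048, "tasks": 0},
    {"id": "b1", "az": "us-east-1b", "free_memory": 1024, "tasks": 1},
]

print(place_binpack(INSTANCES, 300))    # a1
print(place_az_spread(INSTANCES, 300))  # b1
```

Same Cluster, same Task, two different answers. That's really all a placement strategy is: a rule for picking the "best suited" Container Instance.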

2. Monitors our Tasks and reports back metrics

Services give us monitoring and metrics about their used resources

When we have a Service, we can get specific metrics about it. How much of our allotted CPU and Memory are our Service's Tasks taking up? How much is reserved? How much is being utilized?

The docs on the different metrics are here. The two for Services are CPU and Memory Utilization. Note that there are also metrics available for the Cluster as a whole.

It also reports back to us a detailed list of "events." The events are simply a play-by-play of what ECS is doing with the Tasks. Is it launching a Task? Is it removing a Task? Is it updating a Task? At what times are these things happening? Are they in a steady state?

Lots of extra info that goes far beyond just whether or not a Task is Running, Pending, or Stopped.

The power of these metrics really comes into play when utilizing CloudWatch alarms. We can pick a metric from ECS, either Cluster wide or Service wide metrics, and watch them. If they go above or below thresholds we've set, we can respond to them.

CloudWatch alarms are one of the "supporting components." They're not so complex that they require their own section here. At the same time, there's so much to them that if we did make a section, it'd be huge. To create an alarm we just pick a CloudWatch metric and set a threshold. If that threshold is crossed the alarm sounds.

Check out more about CloudWatch in the Docs.

3. Keeps Tasks up and running

When creating a Service we can specify the "Number of Tasks" we want to keep alive in our Cluster. The Service will always make sure that the number of Tasks we specify will be up. If one fails, the Service will create a new one in its place.
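The "keep the number of Tasks up" behavior boils down to comparing what we asked for against what's actually alive. Here's a minimal sketch of that comparison. The function name `tasks_to_start` and the Task ids are hypothetical, and the real Service scheduler does far more than this.

```python
def tasks_to_start(desired_count, task_states):
    """Return how many replacement Tasks are needed to reach the desired count."""
    # Tasks that are running, or on their way up, count toward the total.
    alive = [t for t, state in task_states.items() if state in ("RUNNING", "PENDING")]
    return max(desired_count - len(alive), 0)


# Hypothetical snapshot: we asked for 4 Tasks and one of them has stopped.
snapshot = {"t1": "RUNNING", "t2": "STOPPED", "t3": "RUNNING", "t4": "PENDING"}
print(tasks_to_start(4, snapshot))  # 1
```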

We can also set a maximum percent of Tasks and a minimum healthy percent of Tasks. These percents, applied to our "Number of Tasks," give the range of Tasks that our Service can have live at any given time. They're used in updating Tasks when we revise our Task Definitions. For example, we might update our Task Definition to use an updated Docker Image for the Containers.

The maximum and minimum percents are used to deploy updated Tasks without ever having a service outage.

That brings us to the next benefit:

4. Updates our Tasks by handing them an updated Task Definition

... AND it can update them without ever having a service outage. How? By deploying new Tasks incrementally and taking down old Tasks incrementally. It uses the maximum percent and minimum healthy percent of Tasks parameters to achieve this.

Handing a revised Task Definition to a Service will update all associated Tasks

For example, if we set our service up with the following settings:

1) Number of Tasks: 4

2) Minimum Healthy Percent: 50%

3) Maximum Percent: 200%  

Then that means we can have as few as 2, but as many as 8.
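The arithmetic behind those two numbers is small enough to sketch. The function name `task_bounds` is made up here, and the rounding (lower bound rounded up, upper bound rounded down) is my understanding of how ECS treats fractional results, so treat it as an assumption.

```python
def task_bounds(desired_count, min_healthy_percent, max_percent):
    """Return (fewest, most) Tasks allowed to be live during a deployment."""
    # Assumption: the lower bound rounds up, the upper bound rounds down.
    fewest = -(-desired_count * min_healthy_percent // 100)  # ceiling division
    most = desired_count * max_percent // 100                # floor division
    return fewest, most


print(task_bounds(4, 50, 200))  # (2, 8)
```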

What does this have to do with updates? Well it determines how ECS will update everything.

First off, the way an update works with a Service is:

a) Make a revision of your Task Definition

This means, make a new version of it. We don't "update" an existing one. We create a revision of an old one.

b) Update the Service with the revised Task Definition

Which is either done by a couple of simple clicks in the console, or passing up the JSON formatted Task Definition via CLI.

When we do this, we'll trigger an update to all Tasks being managed by the Service. The Service won't just remove all of the current ones and then add all of the new ones. Instead it'll follow some rules based on our maximum percent and minimum healthy percent values we set:

a) Any old Tasks that are currently receiving traffic will be "drained".

In other words, it will allow existing requests to complete, but will deny new ones. This is related to Elastic Load Balancers, which we'll get into in a bit.

b) Deployment of the new Tasks is based on our Minimum Healthy and Maximum percents.

Since we have our Maximum Percent at 200%, it will add 4 of the new revisions first and then remove the 4 old Tasks.

Updating a Service's Tasks with a maximum percent of 200

This allows our Service to update without ever having to be down.

Let's look at another example:

Number of Tasks: 4

Minimum Healthy Percent: 50%

Maximum Percent: 100%  

Because 4 is our maximum, if we update, it won't add 4 new Tasks and remove 4 old Tasks. Instead it will...

1) Wait for any existing traffic to drain

2) Remove 2 old Tasks, once drained.

Now we're at 50% of our tasks, which is still healthy.

3) Add 2 new Tasks

Now we're at that max of 100%. Once these are up and live...

4) Remove the last 2 old Tasks, once drained.

5) Add the final 2 new Tasks.

Updating a Service's Tasks with a maximum percent of 100

Yep. And we don't have to deal with any of the headaches of doing this manually!
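If you'd like to poke at the two examples above, here's a toy simulation of the rollout. It's a simplified sketch and not the real ECS scheduler: it ignores draining, treats new Tasks as healthy the instant they start, and assumes the same rounding as before. The name `rolling_update` is invented for this example.

```python
def rolling_update(desired, min_healthy_percent, max_percent):
    """Simulate a deployment; return a list of (action, count, old, new) steps."""
    fewest = -(-desired * min_healthy_percent // 100)  # round up (assumption)
    most = desired * max_percent // 100                # round down (assumption)
    old, new = desired, 0
    steps = []
    while old > 0 or new < desired:
        # Start as many new Tasks as the maximum percent leaves room for.
        can_start = min(desired - new, most - (old + new))
        if can_start > 0:
            new += can_start
            steps.append(("start", can_start, old, new))
            continue
        # Otherwise stop old Tasks without dipping below the minimum healthy count.
        can_stop = min(old, (old + new) - fewest)
        if can_stop > 0:
            old -= can_stop
            steps.append(("stop", can_stop, old, new))
            continue
        raise ValueError("these percents leave no room to deploy")
    return steps


for step in rolling_update(4, 50, 200):
    print(step)
# ('start', 4, 4, 4)
# ('stop', 4, 0, 4)

for step in rolling_update(4, 50, 100):
    print(step)
# ('stop', 2, 2, 0)
# ('start', 2, 2, 2)
# ('stop', 2, 0, 2)
# ('start', 2, 0, 4)
```

Notice that a minimum healthy percent of 100 combined with a maximum percent of 100 raises an error in this sketch. There's no room to add a Task and no permission to remove one, so the rollout can't move.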

5. Optionally scale out or in our Tasks based on traffic

Services can automatically add and remove Tasks to our Cluster. The flow of auto scaling is the general AWS responsive workflow:

1) Pick a metric to watch in CloudWatch

2) Set a CloudWatch alarm to trigger when the metric goes above or below thresholds

3) Respond to the CloudWatch alarm with an action

It's a very simple process. Any service that feeds data into CloudWatch can be measured, monitored and responded to. ECS feeds two sets of metrics in:

a) Cluster Metrics - usage / reservation of resources in context of our entire cluster

b) Service Metrics - usage / reservation of resources in context of our Service

We would pick a metric in CloudWatch, like "CPU Utilization" of our Service. We'd set a CloudWatch alarm with a threshold like "When it gets above 50% of the total available CPU."

With the alarm set up, we'd hand that to our ECS Service. We'd create "Scaling Policies", which are actions to take when a specified CloudWatch alarm is triggered. Our Scaling Policy may be to "Add 1 Task."
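The metric, alarm, action flow can be sketched in a few lines of plain Python. Nothing here talks to CloudWatch or ECS. The 50% scale-out threshold comes from the example above, while the 20% scale-in threshold, the Task limits, the CPU readings and the name `next_task_count` are all assumptions for the sake of illustration.

```python
def next_task_count(current, cpu_percent, scale_out_above=50, scale_in_below=20,
                    min_tasks=1, max_tasks=10):
    """Apply an 'Add 1 Task' / 'Remove 1 Task' policy to a single metric reading."""
    if cpu_percent > scale_out_above:   # high-CPU alarm fires
        return min(current + 1, max_tasks)
    if cpu_percent < scale_in_below:    # low-CPU alarm fires
        return max(current - 1, min_tasks)
    return current                      # no alarm, nothing to do


def simulate(start, readings):
    """Feed a series of CPU readings through the policy and record Task counts."""
    counts = []
    current = start
    for cpu in readings:
        current = next_task_count(current, cpu)
        counts.append(current)
    return counts


print(simulate(2, [30, 65, 70, 40, 10]))  # [2, 3, 4, 4, 3]
```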

In terms of setting up these resources - watching a metric and setting up an alarm are all done through CloudWatch. The Scaling Policy is done through ECS in the console. The CLI and CloudFormation use a variety of separate concepts, in combo with ECS, to set this up:

  • The Application AutoScaling API
  • Application AutoScaling Scalable Targets (what are we scaling?)
  • Application AutoScaling Scaling Policy (rules for scaling)

Again, we're focusing on concepts here. There's an entire list of tech steps here for setting up Service Auto Scaling.

If you're using CloudFormation, scaling down is unfortunately broken. I've asked multiple times about it.

6. Distributes our Traffic evenly to all Tasks via Load Balancer (if we hook one up)

Services allow us to hook up Elastic Load Balancers and spread traffic to our Tasks. Specifically, we use the Version 2 Load Balancer known as the "Application Load Balancer." We'll dive into these more in the next section.

These are what we use for service discovery as well (note the lower case "s"). We can launch multiple Services into a single Cluster. For example, we might have a Book Service, that launches Tasks for a book app. We might have a Movie Service that launches Tasks for a movie app.

Using an Application Load Balancer, we can route to those differing Services using a variety of rules. The Book Service might be available at /edges. And our Movie Service might be available at /cubebuster. But they'd all be sharing the same set of Container Instances!

In simple terms though, an Application Load Balancer saves us from a major pain:

Having to load balance traffic across servers and then across Tasks on a server.

An application load balancer will work to balance traffic across our Service's Tasks

Hooking one up to a Service is as simple as:

a) Upon creation of the Service, choosing to configure an ELB

b) Specifying the Application Load Balancer

c) Selecting the Target Group

Or for the CLI or CloudFormation

1) Specifying the Load Balancer and associated Target Group
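For a feel of what "specifying the Load Balancer and associated Target Group" looks like outside the console, here's a sketch that only builds the request as a Python dictionary. The keys mirror what I understand boto3's ECS `create_service` call to accept, so double check them against the current docs before relying on this. The helper name `build_service_request`, the Cluster, Service and Container names, and the ARN are all placeholders.

```python
def build_service_request(cluster, service_name, task_definition, desired_count,
                          target_group_arn, container_name, container_port):
    """Assemble the keyword arguments for creating a load balanced Service."""
    return {
        "cluster": cluster,
        "serviceName": service_name,
        "taskDefinition": task_definition,
        "desiredCount": desired_count,
        # Which Target Group the Service should register its Tasks with,
        # and which Container/port inside the Task receives the traffic.
        "loadBalancers": [
            {
                "targetGroupArn": target_group_arn,
                "containerName": container_name,
                "containerPort": container_port,
            }
        ],
        "deploymentConfiguration": {
            "minimumHealthyPercent": 50,
            "maximumPercent": 200,
        },
    }


# Placeholder values only; none of these resources exist.
request = build_service_request(
    cluster="edges-cluster",
    service_name="edges-service",
    task_definition="edges-task:1",
    desired_count=4,
    target_group_arn="arn:aws:elasticloadbalancing:us-east-1:123456789012:"
                     "targetgroup/edges-targets/0123456789abcdef",
    container_name="edges-web",
    container_port=80,
)
print(request["loadBalancers"][0]["containerPort"])  # 80
```

With boto3 installed and credentials configured, a dictionary like this would typically be unpacked into the client call. Depending on the networking mode, an IAM role for the Service may also be required.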

NOW. Obviously, this assumes that we have an Application Load Balancer created. "ALBs" are one of the supporting concepts. They're practically required - how else will traffic reasonably reach our Containers? But the docs and most guides out there assume knowledge of Elastic Load Balancing. Well that's BS, so let's actually learn about them.

This transitions us right into the next section.

Application Load Balancers

Traffic obviously needs to reach our Tasks and the Containers within them. How do we do that? Right now they're all on a Container Instance, but there's not a practical way to reach them.

This is where our Elastic Application Load Balancers come into play. These accept incoming traffic on a protocol (e.g. HTTP) and port (e.g. 80) we specify. They then route the traffic to Target Groups, based on the path (e.g. "/").

A Target Group, in context of ECS, is where our Service will register its Tasks. Once these Tasks are registered, our Load Balancer knows to spread traffic between them.

If you've worked with Elastic Load Balancers, this should point out the differences between Classic and Application. Classic Load Balancers spread traffic between servers. Application Load Balancers spread traffic between applications. In this case, the applications are our ECS Tasks.

Classic ELBs go to Instances.  Application ELBs go to apps or Tasks in our case.

It's a pretty neat concept where we just quit caring about balancing between servers and subsequently the apps within them. We do away with that and say:

"Hey, just tell me where all the live Applications are. I'll spread traffic between those."

Application Load Balancers have a few primary components:

a) The Load Balancer itself - from a CLI and API standpoint this is almost just a central point to attach options and configurations to.

b) Target Groups - a common group of apps, in our case Tasks, that will receive load balanced traffic.

c) Listeners - what ports and protocols is our Load Balancer listening on? e.g. Port 80, Protocol HTTP

d) Listener Rules - when traffic arrives on a particular Listener, what do we do? e.g. Send it to a particular Target Group when the path is /

Conceptually, a load balanced setup for one of our "Edges" Applications would be:

1) We have an Edges Task Definition

2) We create an Edges Service using the Task Definition

3) We have a Target Group called Edges Targets

4) We hand the Target Group, Edges Targets, to our Service

5) Our Service registers ALL of our Edges Tasks in our Service with the Target Group

6) We set up a Listener for HTTP requests on Port 80

7) We create a Listener Rule that says, when the Path is / we send it to our Edges Target Group.

Our Application Load Balancer can now receive traffic and spread it to all the Tasks being managed by our Edges Service.
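Steps 4 and 5 above, plus the "spread traffic" part, can be pictured with a tiny model of a Target Group. This is a toy: the `TargetGroup` class and the Task ids are invented, and plain round robin is just the simplest way to show the idea of spreading requests across registered Tasks.

```python
class TargetGroup:
    """A named set of registered Tasks that takes turns receiving requests."""

    def __init__(self, name):
        self.name = name
        self.targets = []
        self._next = 0

    def register(self, task_id):
        # In ECS, the Service does this for us whenever it launches a Task.
        self.targets.append(task_id)

    def deregister(self, task_id):
        # ...and this whenever a Task is stopped or replaced.
        self.targets.remove(task_id)

    def pick(self):
        """Return the Task that should receive the next request."""
        if not self.targets:
            raise LookupError("no Tasks registered with " + self.name)
        target = self.targets[self._next % len(self.targets)]
        self._next += 1
        return target


edges_targets = TargetGroup("Edges Targets")
for task_id in ("edges-task-1", "edges-task-2", "edges-task-3"):
    edges_targets.register(task_id)

picked = [edges_targets.pick() for _ in range(4)]
print(picked)
# ['edges-task-1', 'edges-task-2', 'edges-task-3', 'edges-task-1']
```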

More than 1 Target Group and Service

As mentioned earlier, we're also not limited to one Target Group and Service in our Cluster. We can launch as many as we have room for. For example, we could do the following:

1) Create an entirely new set of Docker Images and Containers

2) Create a Task Definition with the new Images. We'll call it the CubeBuster Task Definition.

3) Create a new Service out of the CubeBuster Task Definition. Launch it into the same Cluster we've been using (with the Edges Service).

4) Create a new Target Group called CubeBuster Targets

5) Hand the CubeBuster Target Group to our CubeBuster Service

6) Our CubeBuster Service registers ALL of our CubeBuster Tasks managed by our CubeBuster Service with the Target Group

7) Change up our Load Balancer Listener Rules to be:

Listener for HTTP 80 Traffic

Listener Rule 1 - Send to CubeBuster Targets:

Target Group: CubeBuster Targets  
Path: `/cubebuster`  
Priority: `1`

Listener Rule 2 - Send to Edges Targets:

Target Group: Edges Targets  
Path: `/`  
Priority: `2`  

Priority is just which rule takes precedence - The lower the number, the higher the priority. Brilliant inverse relationship AWS.

With our new configuration, if a request comes in for /cubebuster, it will go to our CubeBuster Targets.
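Here's a small sketch of how priority-ordered Listener Rules pick a Target Group. It only models the idea: the `route` function is hypothetical, and the matching uses Python's `fnmatchcase` as a stand-in for the load balancer's own path pattern matching, which may differ in the details.

```python
from fnmatch import fnmatchcase

# The two Listener Rules from the example above.
RULES = [
    {"priority": 2, "path": "/", "target_group": "Edges Targets"},
    {"priority": 1, "path": "/cubebuster", "target_group": "CubeBuster Targets"},
]


def route(path, rules, default_target_group=None):
    """Return the Target Group of the first matching rule, lowest priority number first."""
    for rule in sorted(rules, key=lambda r: r["priority"]):
        if fnmatchcase(path, rule["path"]):
            return rule["target_group"]
    return default_target_group  # nothing matched


print(route("/cubebuster", RULES))  # CubeBuster Targets
print(route("/", RULES))            # Edges Targets
print(route("/nope", RULES))        # None
```

In this sketch a path like /cubebuster/list matches neither rule, since both patterns are exact. A wildcard pattern such as /cubebuster* would be needed to catch everything under that path.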

There's A Lot Going On Here, Let's Simplify

So this got pretty out of hand. When it comes down to it, hooking up your Service to an Application Load Balancer is just:

1) Creating a Target Group.

Note here - don't register targets manually if you're working with ECS. ECS will do that for us.

2) Creating your Application Load Balancer, and making a Listener and Listener Rule that points to the Target Group.

3) Register your Application Load Balancer and Target Group with the ECS Service.

And that's it. This is even easier in the console:

a) Create your Application Load Balancer

In the console wizard, it'll have you walk through and create the Target Group, Listener and Listener Rules. Just remember, don't register any targets manually. ECS does that.

b) Create the ECS Service and Register the Load Balancer

In the console, when creating a Service you Configure ELB. In this page you...

  1. select the Load Balancer that you've created
  2. select which Task to load balance (although the console says container)
  3. select or create the listener you've set up (e.g. 80:HTTP)
  4. select or create a Target Group

And that's it. Then everything is good to go.

CLI and CloudFormation require all of the above concepts to be created individually. After creating them, we then have to configure them to point to and reference each other. I plan on releasing more on this aspect, but it's a pretty heavy topic to dive into.

We've discussed scaling and load balancing our Tasks (and thus Containers), but we're still missing something. We've discussed how to set up an EC2 instance, but we probably don't want to set them up individually each time. This is where Launch Configurations and AutoScaling Groups come into play.

Launch Configurations and AutoScaling Groups

Review: Container Instances are just EC2 instances that have the ECS Container Agent on them.

When EC2 Instances have the agent on them configured to point to a cluster, they "join" the cluster. At this point they've fulfilled their entire destiny and are full-fledged Container Instances.

How do we make these Container Instances?

Well, we've already explained how to make one of them above. We either use the ECS-Optimized AMI or install/configure the ECS Container Agent + Docker.

What about making many of them? We have 2 primary options:

1) Launching All of them Individually.

This obviously isn't the route you'll likely want to take. Too much manual labor. To speed things up, you could always get an instance exactly how you want it and then create an AMI from it. This way, every time you make a new instance, there's no extra configuration.

The better way...

2) Launch Configurations and AutoScaling Groups

These allow us to automate and manage many Instances at once. Like the Application Load Balancer, it's a "supporting concept." These aren't directly related to ECS, and are used in almost anything that leverages EC2. Since they deal with EC2, which is the bread and butter of everything server related, let's dive in a bit.

Launch Configurations and AutoScaling Groups

A Launch Configuration is just a blueprint for an EC2 instance. In terms of creating one, it's exactly like creating an EC2 instance. You still select an AMI, pass it user data scripts, etc. At the end of the process you have a Launch Configuration though, and not a live Instance.

With the Launch Configuration in hand, we now create an AutoScaling Group. These take a Launch Configuration and automate / manage the creation and scaling of instances.

Launch Configurations are blueprints for EC2 Instances.  Auto Scaling Groups use these to create and manage EC2 instances.

They're as simple as:

a) selecting the desired launch configuration

b) selecting which VPC and Subnets to launch instances into

c) determining how many instances you'd like

d) configuring scaling actions based on CloudWatch alarms (optional)

e) setting up notifications for events

Getting these Launch Configurations and AutoScaling Groups working with ECS involves exactly 2 steps:

1) Setting up the Launch Configuration to use the ECS Optimized AMI; or using a user data script to manually configure the ECS Container Agent.


2) Pointing the container agent to the Cluster you'd like them to join. Which we covered in the Cluster section.

If your Cluster is live, the Launch Configuration is set up to either use the ECS Optimized AMI or install the ECS Container Agent, and you've also configured the Launch Configuration to point to the correct Cluster, then BOOM. That's all you have to do to launch instances into your Cluster.
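As a sketch of step 2, here's a small Python helper that renders a user data script for a Launch Configuration. It assumes the ECS Optimized AMI, where the agent reads its settings from /etc/ecs/ecs.config and the `ECS_CLUSTER` value names the Cluster to join. The helper name `ecs_user_data` and the Cluster name are placeholders.

```python
import re


def ecs_user_data(cluster_name):
    """Render a user data script that points the ECS Container Agent at a Cluster."""
    # Keep the name to simple characters so it's safe to drop into a shell script.
    if not re.fullmatch(r"[A-Za-z0-9_-]{1,255}", cluster_name):
        raise ValueError("unexpected characters in cluster name")
    return (
        "#!/bin/bash\n"
        f"echo ECS_CLUSTER={cluster_name} >> /etc/ecs/ecs.config\n"
    )


print(ecs_user_data("edges-cluster"))
```

Every instance the AutoScaling Group launches from that Launch Configuration runs this script on boot, which is how a whole fleet ends up in the same Cluster without us touching a single server.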

One More Note About Container Instances

In order to work with ECS they need access to the internet. If you've not done much work with VPCs, this is probably confusing. This means that whatever VPC they reside in needs to be attached to an internet gateway. The subnet in the VPC then needs to either:

a) Point directly to the internet gateway


b) Point to a NAT Gateway or NAT Instances in a subnet pointing to an internet gateway

If this sounds like the incoherent rants of a madman, check out my guide here on VPCs.


Summary

AWS ECS just helps us manage and deploy our Docker Containers across EC2 instances. That's really it. We can bog it down with 1000 other supporting things, but it's relatively sparse.

The most confusing part of grasping ECS is due to the number of assumed supporting components. Like Application Load Balancers, AutoScaling Groups, Launch Configurations, VPCs, etc. These things are definitely used with ECS, but they're not ECS.

ECS consists of just a few basic concepts:

Task Definition - Everything your Docker Container(s) need(s) to persist on a server. CPU, Memory, volumes, Docker Images, etc. Think blueprints.

Task - instantiating a Task Definition.

EC2 Instance - the servers. These are the spaceships in our analogy.

ECS Container Agent - software installed on an EC2 instance that helps coordinate with other Agents, monitor local Docker Containers and communicate with the Cluster. These are the captains of each spaceship.

An EC2 instance with a Container Agent that belongs to a Cluster is referred to as a "Container Instance."

Cluster - The parent to which our Tasks and Agents belong. It's a very ambiguous concept that's more just there to represent membership.

For our analogy, think of an admiral. It's in charge of the fleet, all the captains report back to it. It reports to us so that we don't have to check in with each captain individually.

Running a Task - we hand our Cluster a Task Definition; it creates a Task and places it on the best suited server; the best suited server is found by coordinating with the ECS Container Agents.

Creating a Service - we hand our Cluster a Task Definition; it does the same thing as running a Task PLUS:

1) Monitors our Tasks and reports back metrics

2) Keeps the number of Tasks we specify always up and running

3) Updates our Tasks by handing them an updated Task Definition

4) Optionally scales out/in our Tasks based on customer demand (traffic)

5) Distributes our customer demand evenly to all Tasks via Load Balancer (if we hook one up)

The confusing aspect of ECS is that it practically requires a TON of supporting concepts. And unfortunately the docs just kind of mention them in passing. Don't mistake this requirement as an "ECS is just complicated" thing. Anything beyond playing around will require these supporting components.

What are they?

Application Load Balancers are needed to direct traffic to Containers. They allow for a variety of powerful concepts like name-spacing sets of containers as "Target Groups." This allows us to direct traffic to different sets based on rules like the request Protocol or Path.

While not required, the following help bolster our ECS setup:

CloudWatch alarms and metrics to respond to events in our Clusters and Services. This is how we achieve auto scaling.

Launch Configurations and Auto Scaling Groups to manage sets of Instances vs. setting them up piecemeal.

VPCs to create varying subnets to allow for multiple availability zone deploys

IAM to create the proper roles for your Instances, Services and Tasks

Caffeine to make sure that you don't fall asleep while reading through 100s of documentation pages.

Final Thoughts

The obvious next steps need to be an actual implementation of everything. However, now that you understand the concepts, you won't be clicking in the dark! If you've done an implementation before, hopefully this has shed some light on why things worked as they did.

I feel the need to mention this one more time, if you're looking for a step-by-step of implementation, I have a full one here:

Guide to Fault Tolerant and Load Balanced AWS Docker Deployment on ECS

The reason I opted to make this long conceptual guide is because implementing ECS is the easy part. Understanding it is the tricky part. It's just a few clicks or CLI calls. So "80% mental, 20% mechanical" wound up being true here as well.

In other news, it turns out the rest of the galaxy didn't care about physical books. Or physical movie rentals. So Edges and CubeBuster were shut down.

Image Accreditation

The majority of imagery is derived from or "inspired" directly by AWS's own ECS theme. Hopefully no one takes offense.

Also, I used a lot of Star Trek references. Hopefully they're also not offended.

Finally, the planets come from Freepik and are apparently through www.flaticon.com. The little ships that I pulled down and modified also come from Freepik.

As always, please leave me a comment if you find any typos or technical glitches! Thanks!
