Written by Damian Tykałowski
Published June 11, 2018

AWS ECS — quickly create environment for your dockerized apps

Learn a few tricks that help you quickly create an environment for your dockerized apps.

Go to GitHub, clone the repository, change the variables, and run it.

https://github.com/d47zm3/devops/tree/master/aws/ecs-cluster

You will need AWS CLI, JQ and ECS-CLI (links are in the script).
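A small check at the top of the script can fail early when one of these tools is missing. This is only a minimal sketch; the function names `missing_tools` and `check_prerequisites` are made up for illustration and are not taken from the repository.

```shell
# Print the name of every given tool that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "${tool}" >/dev/null 2>&1 || echo "${tool}"
  done
}

# Check the three tools this setup relies on; return non-zero if any is missing.
check_prerequisites() {
  missing=$(missing_tools aws jq ecs-cli | tr '\n' ' ')
  if [ -n "${missing}" ]; then
    echo "[!] Missing tools: ${missing}" >&2
    return 1
  fi
  echo "[*] All required tools found"
}
```

Calling `check_prerequisites || exit 1` before the first AWS call stops the run before any resources are created.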

The first part (VPC creation) comes from the article below; I modified it a bit to add another subnet for HA and open more ports: https://medium.com/@brad.simonin/create-an-aws-vpc-and-subnet-using-the-aws-cli-and-bash-a92af4d2e54b

Since the whole setup consists of a few files, I won’t paste them all here; they are all available in my GitHub repository. Here I will discuss the more important parts and how they all run together. Let’s start with the VPC. I took this part from Brad with his permission; I just added another subnet for HA and a rule for the load balancer, as well as some other small tweaks.

It’s time to explain everything step by step. First we need a VPC to hold the EC2 instances of our ECS cluster. Underneath, quite a few resources have to be created so that everything connects properly. The VPC will have two subnets for HA, located in two separate availability zones. Tags are added throughout, so it’s easy to identify ECS resources, along with all required routes, an Internet gateway, firewall rules, all that jazz. Besides the obviously required AWS CLI, we will use two more tools: JQ to parse responses from the AWS CLI (to save IDs) and ECS-CLI to manage the ECS cluster. There are a couple of variables to set here. Apart from the first few, like the name, region or SSH keypair (which you will have to create yourself if you want to use one), there is no need to change the network settings. In any case, I added comments for variables that might be confusing; hopefully the rest is self-explanatory.

cluster_name="ecs-medium"
# where the cluster will store logs; can take the name of the cluster or application
log_group_name="${cluster_name}-log-group"
region="eu-west-2"
# keypair if you want to SSH into instances
keypair="d47zm3"
service_name="${cluster_name}-service"

# on what port on container your application listens
target_port=80

# on what port will loadbalancer listen
listener_port=80

# how container (task) should be named in ecs, web, api, proxy?
container_name="web"

name="ECSMedium"
vpc_name="${name} VPC"

profile_name="ecs-medium"
loadbalancer_name="${cluster_name}-loadbalancer"
loadbalancer_targets_name="${cluster_name}-targets"

role_name="ECSMediumRole"
tier_class="t2.small"

availability_zone_1="eu-west-2a"
availability_zone_2="eu-west-2b"

subnet_name_1="${name} Subnet 1"
subnet_name_2="${name} Subnet 2"

gateway_name="${name} Gateway"
route_table_name="${name} Route Table"
security_group_name="${name} Security Group"
vpc_cidr_block="10.0.0.0/16"
subnet_cidr_block_1="10.0.1.0/24"
subnet_cidr_block_2="10.0.2.0/24"

# allow traffic on these ports from anywhere
port_cidr_block_22="0.0.0.0/0"
port_cidr_block_80="0.0.0.0/0"
port_cidr_block_443="0.0.0.0/0"

# allow traffic out anywhere
destination_cidr_block="0.0.0.0/0"
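To give an idea of how the network variables are used, here is a rough sketch of creating the VPC and the two subnets. It is not the code from the repository: the function name `create_network` is hypothetical, it uses the AWS CLI `--query` option instead of JQ to extract the IDs, and it leaves out the tags, Internet gateway, route table and security group that the full script also creates.

```shell
# Create a VPC with two subnets in different AZs and print the three IDs.
# Arguments: region, VPC CIDR, subnet 1 CIDR, AZ 1, subnet 2 CIDR, AZ 2.
create_network() {
  local region="$1" vpc_id subnet_id_1 subnet_id_2

  vpc_id=$(aws ec2 create-vpc --region "${region}" --cidr-block "$2" \
    --query 'Vpc.VpcId' --output text) || return 1

  subnet_id_1=$(aws ec2 create-subnet --region "${region}" --vpc-id "${vpc_id}" \
    --cidr-block "$3" --availability-zone "$4" \
    --query 'Subnet.SubnetId' --output text) || return 1

  subnet_id_2=$(aws ec2 create-subnet --region "${region}" --vpc-id "${vpc_id}" \
    --cidr-block "$5" --availability-zone "$6" \
    --query 'Subnet.SubnetId' --output text) || return 1

  echo "${vpc_id} ${subnet_id_1} ${subnet_id_2}"
}
```

With the variables above it could be called as `create_network "${region}" "${vpc_cidr_block}" "${subnet_cidr_block_1}" "${availability_zone_1}" "${subnet_cidr_block_2}" "${availability_zone_2}"`.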

After the first part you will see a message like this.

You can check out your AWS console to see the newly created VPC, ready with all its dependencies.

After the VPC is created, it’s time for some IAM magic. We need an IAM role with permissions that let your ECS instances pull images from your private ECR registry (a private Docker registry hosted on AWS). To do this, we use some external files with permission definitions; there is not much to discuss here.
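For orientation, the IAM step could look roughly like the sketch below. It is an assumption about the general shape, not the files from the repository: the function names are hypothetical, and instead of custom permission files it attaches the AWS managed policy `AmazonEC2ContainerServiceforEC2Role`, which includes read access to ECR.

```shell
# Write the trust policy that lets EC2 instances assume the role.
write_trust_policy() {
  cat > "$1" <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ec2.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

# Create the role, attach the managed ECS instance policy, and wrap the role
# in an instance profile of the same name so EC2 instances can use it.
create_instance_role() {
  local role_name="$1" policy_file="$2"
  write_trust_policy "${policy_file}" || return 1
  aws iam create-role --role-name "${role_name}" \
    --assume-role-policy-document "file://${policy_file}" >/dev/null || return 1
  aws iam attach-role-policy --role-name "${role_name}" \
    --policy-arn arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceforEC2Role || return 1
  aws iam create-instance-profile --instance-profile-name "${role_name}" >/dev/null || return 1
  aws iam add-role-to-instance-profile --instance-profile-name "${role_name}" \
    --role-name "${role_name}"
}
```

A call such as `create_instance_role "${role_name}" trust-policy.json` would create the role under the name set in the variables.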

Now it’s time to get started with the ECS part itself. You could start up an ECS cluster without all these options, and it would create the VPC, EC2 instances etc. for you, but you could quickly hit the VPC limit when managing multiple clusters, and it is easy to make a mess that way. It’s better to have one (or more, your choice) dedicated VPC for ECS cluster instances so that you keep control over them (and can customize them along the way). Again, the comments explain the steps one by one.

echo "[*] [$( date +'%H:%M:%S')] Configure ECS profile..."
ecs-cli configure profile --profile-name ${profile_name} --access-key ${AWS_ACCESS_KEY_ID} --secret-key ${AWS_SECRET_ACCESS_KEY}

echo "[*] [$( date +'%H:%M:%S')] Configure ECS cluster before launch..."
ecs-cli configure --cluster ${cluster_name} --region ${region} --config-name ${profile_name} --default-launch-type EC2

echo "[*] [$( date +'%H:%M:%S')] Bring up EC2 instance..."
# bring up cluster
ecs-cli up  --size 2 --instance-type ${tier_class} --vpc ${vpc_id} --cluster-config ${profile_name} --subnets ${subnet_id_1},${subnet_id_2} --security-group ${group_id} --instance-role ${role_name} --keypair ${keypair} --ecs-profile ${profile_name}

echo "[*] [$( date +'%H:%M:%S')] Wait until EC2 instances are registered..."
sleep 60

After this part runs, you will see your cluster created. Since the machines are in different AZs, it’s highly available.
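The fixed `sleep 60` is simple and worked for me, but polling the cluster is an alternative that neither waits longer than needed nor continues too early. A minimal sketch, assuming a hypothetical helper named `wait_for_instances`; it reads `registeredContainerInstancesCount` from `aws ecs describe-clusters`.

```shell
# Poll until at least <expected> container instances are registered.
# Arguments: cluster, region, expected count, max attempts (default 30),
# delay between attempts in seconds (default 10).
wait_for_instances() {
  local cluster="$1" region="$2" expected="$3" attempts="${4:-30}" delay="${5:-10}"
  local count i=0
  while [ "${i}" -lt "${attempts}" ]; do
    count=$(aws ecs describe-clusters --region "${region}" --clusters "${cluster}" \
      --query 'clusters[0].registeredContainerInstancesCount' --output text 2>/dev/null)
    # A non-numeric answer (for example "None") simply counts as "not yet".
    if [ "${count:-0}" -ge "${expected}" ] 2>/dev/null; then
      echo "[*] ${count} instances registered"
      return 0
    fi
    i=$((i + 1))
    sleep "${delay}"
  done
  echo "[!] Timed out waiting for ${expected} instances" >&2
  return 1
}
```

In this setup the call would be `wait_for_instances "${cluster_name}" "${region}" 2`, matching the `--size 2` passed to `ecs-cli up`.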

Now we’ve got EC2 machines with Docker configured that we can control from the ECS dashboard. We can set up services and tasks on them, but what is a service and what is a task? A task starts from a task definition, which here is described in docker-compose.yml. It specifies the image to use, mounts, port forwards, logging configuration and so on. A task is simply a running instance of a task definition: a container started with all the settings specified. A service, on the other hand, makes sure the proper number of tasks is running (if one stops, it starts a replacement), ties in the load balancer so that it points to your running tasks, applies the placement strategy so that containers are spread out, and so on. Find more info here: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/service_definition_parameters.html

But before we define our tasks and services, let’s create a load balancer for them. Note that once you create a service tied to a load balancer, you cannot change that load balancer; you will need to re-create the service. For our purposes Amazon offers two relevant load balancer types, the Classic Load Balancer and the Application Load Balancer. What matters for us is that with the Classic Load Balancer we won’t be able to set up dynamic port forwarding and we cannot do no-downtime deployments. So we will use the Application Load Balancer, which supports this feature.

echo "[*] [$( date +'%H:%M:%S')] Create Application Load Balancer in chosen subnets..."
alb_response=$( aws elbv2 --region "${region}" create-load-balancer --name "${loadbalancer_name}" --subnets "${subnet_id_1}" "${subnet_id_2}" --security-groups "${group_id}" )
# jq -r prints raw strings, so there is no need to strip quotes with tr
alb_arn=$( printf '%s\n' "${alb_response}" | jq -r '.LoadBalancers[] | .LoadBalancerArn' )
alb_dnsname=$( printf '%s\n' "${alb_response}" | jq -r '.LoadBalancers[] | .DNSName' )

echo "[*] [$( date +'%H:%M:%S')] Create Target Group for ECS instances (targeting port ${target_port})..."
target_group_response=$( aws elbv2 --region "${region}" create-target-group --name "${loadbalancer_targets_name}" --protocol HTTP --port "${target_port}" --vpc-id "${vpc_id}" )
target_group_arn=$( printf '%s\n' "${target_group_response}" | jq -r '.TargetGroups[] | .TargetGroupArn' )

echo "[*] [$( date +'%H:%M:%S')] Create Listener that will forward traffic to registered instances..."
listener_response=$( aws elbv2 --region "${region}" create-listener --load-balancer-arn "${alb_arn}" --protocol HTTP --port "${listener_port}" --default-actions "Type=forward,TargetGroupArn=${target_group_arn}" )

Crucial for dynamic port forwarding is this “strange” mapping of port 0 in docker-compose.yml:

version: '2'
services:
  web:
    image: httpd:2.4.32
    ports:
     - "0:80"
    logging:
      driver: awslogs
      options:
        awslogs-group: ecs-medium
        awslogs-region: eu-west-2
        awslogs-stream-prefix: example

This means a random port on the host is mapped to port 80 in the container. ECS (or rather the service) takes care of registering this random port correctly in the target group. Other than that, we specify the log group, which will be created, and a stream prefix, which lets us quickly recognize which container a log stream comes from. Now we are ready to define our task and start the service. Below I present the output from the script and the final result. From the timestamps you can see we managed to finish in about 5 minutes, with the whole environment ready!

[*] [13:45:42] Creating VPC...

[*] [13:46:01] VPC created, VPC ID: vpc-5f6eb437
[*] [13:46:01] Use Subnet ID subnet-fb4ab481 and subnet-8a5659c7, for Security Group ID sg-ebb61580
[*] [13:46:01] AWS resources will be created in eu-west-2, and in these AZs: eu-west-2a, eu-west-2b
[*] [13:46:01] Dumping values for future usage to ecs.values file...
[*] [13:46:01] Creating IAM role for ECS instances... (gives you possibility to use your ECR - AWS private registry for example
[*] [13:46:05] Configure ECS profile...
INFO[0000] Saved ECS CLI profile configuration ecs-medium.
[*] [13:46:05] Configure ECS cluster before launch...
INFO[0000] Saved ECS CLI cluster configuration ecs-medium.
[*] [13:46:05] Bring up EC2 instance...
INFO[0000] Created cluster                               cluster=ecs-medium region=eu-west-2
INFO[0001] Waiting for your cluster resources to be created...
INFO[0001] Cloudformation stack status                   stackStatus=CREATE_IN_PROGRESS
INFO[0061] Cloudformation stack status                   stackStatus=CREATE_IN_PROGRESS
INFO[0122] Cloudformation stack status                   stackStatus=CREATE_IN_PROGRESS
Cluster creation succeeded.
[*] [13:48:38] Wait until EC2 instances are registered...
[*] [13:49:38] Create Application Load Balancer in chosen subnets...
[*] [13:49:40] Create Target Group for ECS instances (targeting port 80)...
[*] [13:49:41] Create Listener that will forward traffic to registered instances...
[*] [13:49:41] Create Service in ECS cluster using created Target Group...
DEBU[0000] Parsing the compose yaml...
DEBU[0000] Opening compose files: docker-compose.yml
DEBU[0000] [0/1] [web]: Adding
DEBU[0000] [0/1] [default]: EventType: 32
DEBU[0000] Parsing the ecs-params yaml...
DEBU[0000] Transforming yaml to task definition...
DEBU[0000] Finding task definition in cache or creating if needed  TaskDefinition="{\n  ContainerDefinitions: [{\n      Command: [],\n      Cpu: 0,\n      DnsSearchDomains: [],\n      DnsServers: [],\n      DockerLabels: {\n\n      },\n      DockerSecurityOptions: [],\n      EntryPoint: [],\n      Environment: [],\n      Essential: true,\n      ExtraHosts: [],\n      Image: \"httpd:2.4.32\",\n      Links: [],\n      LinuxParameters: {\n        Capabilities: {\n\n        }\n      },\n      LogConfiguration: {\n        LogDriver: \"awslogs\",\n        Options: {\n          awslogs-stream-prefix: \"example\",\n          awslogs-group: \"ecs-medium\",\n          awslogs-region: \"eu-west-2\"\n        }\n      },\n      Memory: 512,\n      MountPoints: [],\n      Name: \"web\",\n      PortMappings: [{\n          ContainerPort: 80,\n          HostPort: 0,\n          Protocol: \"tcp\"\n        }],\n      Privileged: false,\n      ReadonlyRootFilesystem: false,\n      Ulimits: [],\n      VolumesFrom: []\n    }],\n  Cpu: \"\",\n  ExecutionRoleArn: \"\",\n  Family: \"ecs-medium-service\",\n  Memory: \"\",\n  NetworkMode: \"\",\n  RequiresCompatibilities: [\"EC2\"],\n  TaskRoleArn: \"\",\n  Volumes: []\n}"
INFO[0000] Using ECS task definition                     TaskDefinition="ecs-medium-service:3"
WARN[0000] Failed to create log group ecs-medium in eu-west-2: The specified log group already exists
INFO[0000] Created an ECS service                        service=ecs-medium-service taskDefinition="ecs-medium-service:3"
WARN[0001] Failed to create log group ecs-medium in eu-west-2: The specified log group already exists
DEBU[0001] Updated ECS service                           count=1 service=ecs-medium-service
INFO[0001] Updated ECS service successfully              desiredCount=1 serviceName=ecs-medium-service
INFO[0016] (service ecs-medium-service) has started 1 tasks: (task 89307d75-3001-496a-9117-85290f46d053).  timestamp="2018-05-10 11:49:50 +0000 UTC"
INFO[0031] Service status                                desiredCount=1 runningCount=1 serviceName=ecs-medium-service
INFO[0031] (service ecs-medium-service) registered 1 targets in (target-group arn:aws:elasticloadbalancing:eu-west-2:506673647732:targetgroup/ecs-medium-targets/4e4a92c73085203c)  timestamp="2018-05-10 11:50:03 +0000 UTC"
INFO[0031] ECS Service has reached a stable state        desiredCount=1 runningCount=1 serviceName=ecs-medium-service
[*] [13:50:13] Scale Service to 2 replicas...
DEBU[0000] Parsing the compose yaml...
DEBU[0000] Opening compose files: docker-compose.yml
DEBU[0000] [0/1] [web]: Adding
DEBU[0000] [0/1] [default]: EventType: 32
DEBU[0000] Parsing the ecs-params yaml...
DEBU[0000] Transforming yaml to task definition...
DEBU[0000] Updated ECS service                           count=2 service=ecs-medium-service
INFO[0000] Updated ECS service successfully              desiredCount=2 serviceName=ecs-medium-service
INFO[0000] Service status                                desiredCount=2 runningCount=1 serviceName=ecs-medium-service
INFO[0015] (service ecs-medium-service) has started 1 tasks: (task 087bb880-3fd1-4228-b994-d589d171a75c).  timestamp="2018-05-10 11:50:22 +0000 UTC"
INFO[0030] Service status                                desiredCount=2 runningCount=2 serviceName=ecs-medium-service
INFO[0030] ECS Service has reached a stable state        desiredCount=2 runningCount=2 serviceName=ecs-medium-service
[*] [13:50:43] Finished! Reach your service at ecs-medium-loadbalancer-1120207948.eu-west-2.elb.amazonaws.com !
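The two steps logged above as "Create Service in ECS cluster using created Target Group..." and "Scale Service to 2 replicas..." could look roughly like the sketch below. The exact commands live in the repository; here the helper names `run` and `start_service` are hypothetical, and the `DRY_RUN` switch is only added so the commands can be printed without touching AWS.

```shell
# Print the command instead of running it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Create the service behind the target group, then scale it.
# Arguments: ECS CLI config name, service name, target group ARN,
# container name, container port, number of replicas.
start_service() {
  local profile="$1" service="$2" target_group_arn="$3"
  local container="$4" port="$5" replicas="$6"

  run ecs-cli compose --file docker-compose.yml --cluster-config "${profile}" \
    --project-name "${service}" service up \
    --target-group-arn "${target_group_arn}" \
    --container-name "${container}" --container-port "${port}" || return 1

  run ecs-cli compose --file docker-compose.yml --cluster-config "${profile}" \
    --project-name "${service}" service scale "${replicas}"
}
```

With the variables from the beginning, the call would be `start_service "${profile_name}" "${service_name}" "${target_group_arn}" "${container_name}" "${target_port}" 2`.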

 

We can see our service running with two tasks (inside the script it’s scaled to two instances). By default, the placement strategy for tasks is spread, meaning tasks will be spread across the available ECS instances. As you can see, basic metrics such as CPU and memory utilization are available at hand.

To check logs from this cluster and its containers, choose any of the tasks, expand the container view (in my case web) and click on “View logs in CloudWatch”.

Note that each container has its own log stream.
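The same logs can be read from the command line. With the awslogs driver and a stream prefix, the stream name has the form prefix/container-name/task-id, so it can be built from values we already know. A small sketch; the function names are hypothetical and the limit of 20 events is an arbitrary choice.

```shell
# Build the CloudWatch log stream name for a container of a given task.
# Arguments: stream prefix, container name, ECS task ID.
log_stream_name() {
  echo "$1/$2/$3"
}

# Print the most recent log messages of one stream.
# Arguments: region, log group name, log stream name.
show_task_logs() {
  local region="$1" group="$2" stream="$3"
  aws logs get-log-events --region "${region}" \
    --log-group-name "${group}" --log-stream-name "${stream}" \
    --limit 20 --query 'events[].message' --output text
}
```

For the first task from the output above, that would be `show_task_logs eu-west-2 ecs-medium "$(log_stream_name example web 89307d75-3001-496a-9117-85290f46d053)"`.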

What to do if you want to roll out a new image without downtime? It’s easy: specify the new version in the docker-compose.yml file and run the deploy script below, remembering to specify the same values. Also add the timeout flag, since even the smallest deployment takes a little longer than 5 minutes and the command would otherwise time out.

#!/bin/bash

cluster_name="ecs-medium"
service_name="${cluster_name}-service"
profile_name="ecs-medium"

ecs-cli compose --verbose --file docker-compose.yml --cluster-config ${profile_name} --project-name ${service_name} service up --timeout 10

 

And here’s the output from the deployment.

12:26:52 SEALS/ECS-Project-Base [~d47zm3@w0rk~] » (⎈ |gke:default) ./deploy.sh
DEBU[0000] Parsing the compose yaml...
DEBU[0000] Opening compose files: docker-compose.yml
DEBU[0000] [0/1] [web]: Adding
DEBU[0000] [0/1] [default]: EventType: 32
DEBU[0000] Parsing the ecs-params yaml...
DEBU[0000] Transforming yaml to task definition...
...
INFO[0000] Using ECS task definition                     TaskDefinition="ecs-medium-service:1"
DEBU[0000] Updated ECS service                           count=2 service=ecs-medium-service taskDefinition="ecs-medium-service:1"
INFO[0000] Updated the ECS service with a new task definition. Old containers will be stopped automatically, and replaced with new ones  desiredCount=2 serviceName=ecs-medium-service taskDefinition="ecs-medium-service:1"
INFO[0000] Service status                                desiredCount=2 runningCount=2 serviceName=ecs-medium-service
INFO[0030] Service status                                desiredCount=2 runningCount=4 serviceName=ecs-medium-service
INFO[0030] (service ecs-medium-service) has started 2 tasks: (task 0eded47d-82c1-4862-ac22-66912a3150ff) (task 039c6cb2-dcf4-43ff-95fa-331ca1d4038a).  timestamp="2018-05-10 10:27:13 +0000 UTC"
INFO[0045] (service ecs-medium-service) registered 2 targets in (target-group arn:aws:elasticloadbalancing:eu-west-2:506673647732:targetgroup/ecs-medium-targets/febcb7fe7daf5c72)  timestamp="2018-05-10 10:27:24 +0000 UTC"
INFO[0060] (service ecs-medium-service) deregistered 2 targets in (target-group arn:aws:elasticloadbalancing:eu-west-2:506673647732:targetgroup/ecs-medium-targets/febcb7fe7daf5c72)  timestamp="2018-05-10 10:27:46 +0000 UTC"
INFO[0060] (service ecs-medium-service) has begun draining connections on 2 tasks.  timestamp="2018-05-10 10:27:46 +0000 UTC"
INFO[0362] Service status                                desiredCount=2 runningCount=2 serviceName=ecs-medium-service
INFO[0362] (service ecs-medium-service) has stopped 2 running tasks: (task 8202859b-ef90-4d60-bae5-1d1cf9b935e8) (task 8fd1903b-51ce-4750-98c7-4112d6fca84a).  timestamp="2018-05-10 10:32:52 +0000 UTC"
INFO[0377] ECS Service has reached a stable state        desiredCount=2 runningCount=2 serviceName=ecs-medium-service
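In the output above, draining starts at about 60 seconds and the old tasks stop at about 362 seconds. That gap of roughly 300 seconds matches the default deregistration delay of a target group (300 seconds), which is most likely why even a tiny deployment needs more than 5 minutes. If your application does not hold long-lived connections, lowering the delay is one option. A sketch, with a hypothetical function name:

```shell
# Set how long the load balancer keeps draining a deregistered target.
# Arguments: region, target group ARN, delay in seconds.
set_deregistration_delay() {
  local region="$1" target_group_arn="$2" seconds="$3"
  aws elbv2 modify-target-group-attributes --region "${region}" \
    --target-group-arn "${target_group_arn}" \
    --attributes "Key=deregistration_delay.timeout_seconds,Value=${seconds}"
}
```

For example, `set_deregistration_delay "${region}" "${target_group_arn}" 30` would cut the draining phase to about 30 seconds; the value 30 is only an illustration.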

 

That’s it, you have a ready infrastructure for your containerized application, with logging and monitoring, which are a must for success. Also, if you are just playing with it, do not forget to clean up the resources, or it will eat your money quickly!
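A cleanup could go roughly in the reverse order of creation. The sketch below is an assumption, not a script from the repository: the helper names `run` and `cleanup` are hypothetical, `DRY_RUN=1` only prints the commands, and the VPC, IAM role and log group are not removed here and would need their own steps.

```shell
# Print the command instead of running it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Remove the service, the load balancer with its target group, and the cluster.
# Arguments: region, ECS CLI config name, service name, ALB ARN, target group ARN.
cleanup() {
  local region="$1" profile="$2" service="$3" alb_arn="$4" target_group_arn="$5"

  run ecs-cli compose --file docker-compose.yml --cluster-config "${profile}" \
    --project-name "${service}" service rm || return 1

  # Deleting the load balancer also deletes its listeners.
  run aws elbv2 delete-load-balancer --region "${region}" \
    --load-balancer-arn "${alb_arn}" || return 1
  run aws elbv2 wait load-balancers-deleted --region "${region}" \
    --load-balancer-arns "${alb_arn}" || return 1
  run aws elbv2 delete-target-group --region "${region}" \
    --target-group-arn "${target_group_arn}" || return 1

  # Tears down the cluster and the EC2 instances created by "ecs-cli up".
  run ecs-cli down --force --cluster-config "${profile}"
}
```

Running it once with `DRY_RUN=1` first shows exactly what would be deleted.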
