Traditionally, a three-tier application is made up of three layers, usually backed by one or more databases:

  • Presentation layer: where users interact with the application.
  • Business logic layer: where functional components reside.
  • Persistence layer: which connects to the databases.

In a monolithic app, all of these layers are built into a single artifact. Updating and scaling such an application introduces many challenges.

In a microservices architecture, however, the application is decomposed into small, independent services that communicate over network protocols. In reality, a large application might be decomposed into dozens, or even hundreds, of microservices. Each service communicates via an API, and the API becomes a contract between the services. Decomposing the application like this comes with some advantages:

  1. Each service can be built and deployed independently of other services.
  2. Each service can be written in the language best suited to its task.
  3. Each service can choose the persistence option that makes sense for it.
  4. A new hire can get productive on a single service much quicker than a developer learning an entire monolithic app.

Stateless services can be terminated and re-created without us losing any data. By accelerating our ability to deploy applications at scale, containers work very nicely to enable microservice architectures.

Multi-Container Deployments

Getting one container up and running, while not difficult, is also not particularly useful. We often want multiple containers running, both for resilience and for scalability. Think of each container instance as ephemeral. So you’ll likely need to integrate with other services to store persistent data in places like EBS volumes or in databases like DynamoDB.

Docker Compose

Docker Compose is a tool we can use to work with multiple containers locally. Docker Compose automates a lot of the local container workflows we have already seen. You can think of a Docker Compose file as a declarative way to run all the same commands you have done for single containers. This becomes very nice when your local environment has grown to multiple containers.

When you run the command docker-compose up, Docker Compose creates a Docker network and automatically adds the containers to it. From inside the containers, other services can be reached by container hostname, which resolves to the container’s IP address.

We also have dependencies between containers: one service may need to wait for another service to be available. A container can be started while the application inside it is not yet ready to accept connections, and a service may also become temporarily unavailable later on. We can add logic to the calling service to retry connections until one succeeds. Adding retry logic to your applications creates a much more resilient system when we work in a distributed environment.
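The retry idea can be sketched in a few lines of Python. This is only a minimal illustration, assuming a TCP dependency and exponential backoff; the function name wait_for_service, the host name "db", and the default delays are hypothetical choices, not something Docker Compose provides.

```python
import socket
import time


def wait_for_service(host, port, attempts=5, base_delay=0.5,
                     connect=socket.create_connection, sleep=time.sleep):
    """Try to open a TCP connection, backing off exponentially between attempts."""
    for attempt in range(attempts):
        try:
            return connect((host, port), timeout=2)
        except OSError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
```

A calling service could run something like wait_for_service("db", 5432) at startup before opening its real database connection. The connect and sleep parameters exist only so the behavior can be exercised without a network.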

Container Orchestration Platforms

Integration, permissions, automatic scaling, log collection, and so on: there is a lot of work involved in running containers. Luckily, container orchestration platforms help us automate the management and lifecycle of containers at scale. Besides running Docker Swarm on Amazon EC2 virtual machines, AWS offers two options with less overhead:

  1. Amazon Elastic Container Service (ECS)
  2. Amazon Elastic Kubernetes Service (EKS)

There are similarities in the jobs that these platforms perform, but the way they perform those jobs and the tooling available differ.

Amazon ECS

Control plane:
  • Provision software, any user or service-requested configurations.
  • Manage the lifecycle of resources.
  • Help the data plane do its work.

Data plane:
  • Provide the capacity to perform whatever work the control plane requests to be done.

You, as a client, interact with the control plane via an API (you can think of it as the ECS service itself). The data plane is a cluster where your containers are hosted. A cluster is a logical grouping of compute resources, and there are different launch types that you can use for this cluster: EC2 instances, or the serverless compute platform AWS Fargate.

You (client) -- API --> Control plane (ECS) --> Data plane (EC2 or Fargate)

To prepare your containerized application to run in ECS, you first create a task definition, which specifies various parameters, and is essentially a blueprint for deployment. Once you create your cluster and task definition, you can then run the task. A task itself is the instantiation of a task definition within a cluster.
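To make the blueprint idea concrete, here is a rough sketch of what a small task definition might contain, written as a Python dictionary. The field names follow the ECS task definition parameters as I understand them; the family name, image URL, port, and sizes are hypothetical placeholders, and a real deployment would likely need more fields (an execution role, logging configuration, and so on).

```python
# A hypothetical, minimal task definition expressed as a Python dict.
task_definition = {
    "family": "corp-directory-web",            # hypothetical name
    "networkMode": "awsvpc",
    "requiresCompatibilities": ["FARGATE"],
    "cpu": "256",                              # CPU units for the whole task
    "memory": "512",                           # MiB for the whole task
    "containerDefinitions": [
        {
            "name": "web",
            "image": "example.com/corp-directory/web:1.0",  # placeholder image
            "essential": True,
            "portMappings": [{"containerPort": 8080, "protocol": "tcp"}],
        }
    ],
}


def summarize(task_def):
    """Return one human-readable line per container in the blueprint."""
    lines = []
    for container in task_def["containerDefinitions"]:
        ports = [str(m["containerPort"]) for m in container.get("portMappings", [])]
        lines.append(f'{container["name"]} -> {container["image"]} (ports: {", ".join(ports)})')
    return lines
```

With the AWS SDK for Python, a dictionary shaped like this could be passed as keyword arguments to the ECS client's register_task_definition call; each task that is later run from it is one instantiation of this blueprint.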

The ECS scheduler is responsible for placing tasks on your cluster, essentially the when and where of running your tasks. For scheduling, you have several different options available. ECS allows you to run and maintain a specified number of instances of a task definition simultaneously, and this is called a service. Services can be run (say, behind a load balancer) and can be deleted, which stops all the running tasks associated with them.

The container agent runs on each compute node within an ECS cluster, say an EC2 instance. The agent:

  1. sends information about resource utilization to the control plane
  2. starts and stops tasks whenever it receives a request from the control plane

Scheduling and Task Placement

When considering hosting workloads on ECS, you will need to determine a few things depending on your use case and on demand:

  1. when to run the container
  2. where to place the container
  3. how you want to scale this container in or out

You specify which task definition to start, and the ECS scheduling engine provides the logic around how and when to start and stop containers. ECS provides different types of schedulers for you to choose from:

  • Service scheduler: specify how many copies of a task you want running across the cluster at all times.
  • Daemon scheduler: specify that a task runs at all times on every node in your cluster.
  • Cron-like scheduler: schedule a task to run at a particular time of day.
  • Your own scheduler.

The placement engine’s goal is to place your task on an instance that has an appropriate amount of free memory and CPU, and to run your task in a configuration that you choose. You can customize placement by using task placement constraints and task placement strategies.

  1. After the placement engine determines that it has enough space for the container, the first thing that the engine looks at is the task placement constraints. For example, you can specify affinity or distinct-instance constraints.
  2. Then, the engine uses algorithms called task placement strategies to determine how to place your containers.
    • Binpack: pack your containers as densely as possible across the instances in your cluster.
    • Spread: spread your tasks across instances for high availability.
    • Strategy chaining: binpack your instances while also maintaining high availability.
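The difference between binpack and spread is easier to see in a toy simulation. The sketch below is not how ECS is implemented; it assumes memory is the only resource, that binpack picks the feasible instance with the least free memory, and that spread picks the feasible instance currently running the fewest tasks. All names are made up for illustration.

```python
def place(task_mem, instances, strategy):
    """Pick an instance for one task and update its bookkeeping.

    instances maps an instance name to {"free": MiB of free memory, "tasks": count}.
    Returns the chosen instance name, or None if nothing has enough room.
    """
    candidates = {name: inst for name, inst in instances.items()
                  if inst["free"] >= task_mem}
    if not candidates:
        return None
    if strategy == "binpack":
        # Fill the fullest instance first (least free memory remaining).
        chosen = min(candidates, key=lambda n: candidates[n]["free"])
    elif strategy == "spread":
        # Even out the task count across instances.
        chosen = min(candidates, key=lambda n: candidates[n]["tasks"])
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    instances[chosen]["free"] -= task_mem
    instances[chosen]["tasks"] += 1
    return chosen


def fresh_cluster():
    """Two hypothetical instances with different amounts of free memory."""
    return {"a": {"free": 1024, "tasks": 0}, "b": {"free": 2048, "tasks": 0}}
```

Placing four 256 MiB tasks with "binpack" on a fresh cluster puts all of them on instance a, leaving b empty (and a candidate for scale-in), while "spread" alternates between a and b.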

Infrastructure, Scaling and Service Discovery

There are two factors to consider when looking at how to scale your containers:

  1. Provisioning and scaling of the underlying cluster.
    • VPC, subnets, some EC2 instances or Fargate nodes.
    • A way to scale like an Auto Scaling group.
    • You could use:
      • Launch types to set up and manage clusters based on EC2 or AWS Fargate.
      • ECS’s Capacity providers to provision and manage underlying infrastructure for you. You can focus on your application first, instead of having to get the infrastructure up and running first. ECS takes on the heavy lifting of managing capacity.
  2. Scaling of the containers themselves.
    • Service auto scaling gives you the ability to increase or decrease the desired count of tasks in your Amazon ECS service automatically.
    • You can use performance metrics to perform scaling actions automatically, by defining scaling strategies using either target tracking scaling, or step scaling.
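As a rough illustration of step scaling, the sketch below maps how far a metric is above an alarm threshold to a change in the desired task count. The step table, threshold, and function names are hypothetical, and the real service evaluates CloudWatch alarms rather than a single number; treat this as a model of the idea only.

```python
# Each step: (lower bound, upper bound, change in task count).
# Bounds are relative to the alarm threshold; None means "no upper limit".
STEPS = [
    (0, 10, 1),
    (10, 20, 2),
    (20, None, 4),
]


def step_adjustment(metric, threshold, steps=STEPS):
    """Return how many tasks to add for the given metric value."""
    breach = metric - threshold
    for lower, upper, change in steps:
        if breach >= lower and (upper is None or breach < upper):
            return change
    return 0  # metric is below the threshold: no scale-out


def new_desired_count(current, metric, threshold, min_tasks, max_tasks):
    """Apply the step adjustment and clamp to the service's task limits."""
    wanted = current + step_adjustment(metric, threshold)
    return max(min_tasks, min(max_tasks, wanted))
```

With a threshold of 60% CPU, a reading of 65% adds one task while a reading of 95% adds four, so larger breaches trigger larger responses. Target tracking, by contrast, only asks you for the target value and works out the adjustments itself.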

Each newer version of a service gets a new network address for every new container, so the new version must be registered and the old one deregistered. Doing this on your own is challenging, but service discovery can help. AWS Cloud Map is a cloud resource discovery service that natively integrates with ECS. With AWS Cloud Map, you can:

  1. define custom names for your application resources
  2. maintain the updated location of the dynamically changing resources
  3. create and host service discovery across compute services
  4. increase your application availability

AWS Copilot

AWS Copilot is a tool for automating a lot of the steps you would be taking to create an ECS environment and deploy applications. When you deploy with AWS Copilot, you are creating three things: an application, an environment, and a service.

  • Application: a logical grouping of resources. An application can have multiple services.
  • Environment: how you create multiple stages in your application. Under the application, we might have dev, testing, and production stages. For every environment, Copilot creates an ECS cluster and networking resources.
  • Service: the long-lived app you want hosted inside the ECS cluster. This might be a front-end web application or a back-end internal API service.

For example we can create:

  1. an application, named “CorpDirectory”
  2. a single environment, called “test”
  3. two services: a front-end UI and a back-end service that connects to, say, a database.

Using Copilot, we can quickly create them using the best practices of the ECS engineers themselves.

AWS Fargate

EC2 instances are great if you want granular control over the infrastructure that the containers run on. But if you don’t need that tight control, all this setup may increase overhead. AWS Fargate is a serverless compute engine and hosting option for container-based workloads. With Fargate, you can host your containers on top of a fully managed compute platform. This means:

  1. no provisioning infrastructure
  2. no setting up your cluster scaling
  3. no server management. 

AWS Fargate abstracts all of that away from you. You only pay when your container is running.

Network Mode

For Fargate, you need to be using the awsvpc networking mode. But there are also other modes you need to know.

Network mode: description
  • awsvpc: The task is allocated its own elastic network interface and a primary private IPv4 address. This gives the task the same networking properties as Amazon EC2 instances.
  • Bridge: Enables the task to utilize Docker’s built-in virtual network, which runs inside of each Amazon EC2 instance hosting the task.
  • Host: Allows the task to bypass Docker’s built-in virtual network and map container ports directly to the network interface of the Amazon EC2 instance hosting the task. You cannot run multiple instantiations of the same task on a single Amazon EC2 instance when port mappings are used.
  • None: The task has no external network connectivity.

Introduction to Kubernetes

Kubernetes is a portable, extensible, open-source platform for managing containerized workloads and services that facilitates both declarative configuration and automation. Kubernetes was accepted as a Cloud Native Computing Foundation project back in March 2016, and has since received graduated status.

A Kubernetes cluster runs on one or more nodes. Kubernetes helps with the deployment, maintenance, and scaling of your applications. We can interact with Kubernetes declaratively: you don’t tell Kubernetes every step that needs to be performed to deploy your containers. Instead, you only describe the desired state. Controllers observe the current state of the cluster and work to make the changes needed to bring the system to the desired state.
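The desired-state idea can be sketched as a tiny reconciliation loop. This is a simplified model, not Kubernetes code: state is reduced to a mapping from workload name to replica count, and the function and action names are invented for illustration.

```python
def reconcile(desired, current):
    """Compare desired and current state and return the actions needed.

    Both arguments map a workload name to a replica count.
    """
    actions = []
    for name, want in desired.items():
        have = current.get(name, 0)
        if have < want:
            actions.append(("start", name, want - have))
        elif have > want:
            actions.append(("stop", name, have - want))
    # Anything running that is no longer described should be removed.
    for name, have in current.items():
        if name not in desired and have > 0:
            actions.append(("stop", name, have))
    return actions


def apply(actions, current):
    """Pretend to carry out the actions by updating the current state."""
    for verb, name, count in actions:
        delta = count if verb == "start" else -count
        current[name] = current.get(name, 0) + delta
        if current[name] == 0:
            del current[name]
```

A controller would run this comparison over and over. Once the current state matches the desired state, reconcile returns an empty list and there is nothing left to do until something changes.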

The Kubernetes platform has been designed to be extensible, and it exposes many extension points that can be configured. The kubectl binary is how you will spend most of your time interacting with the Kubernetes API, and kubectl itself can be extended with plugins.

The API itself can be extended with a custom controller: you add a new endpoint to the Kubernetes API, and then you work with your new resource declaratively, just as you would with built-in Kubernetes resources. In fact, some of the core features of Kubernetes are implemented as custom controllers. For example, AWS Controllers for Kubernetes is an open-source project from AWS that has been implemented as a custom controller for AWS service resources. With AWS Controllers for Kubernetes installed, you can define and use AWS resources, like an S3 bucket, directly from Kubernetes.


A cluster can be one or many nodes. When creating a Kubernetes cluster in AWS, EC2 instances act as the nodes, and your containers run on worker nodes. The control plane hosts the Kubernetes API. You interact with the API and, in turn, the control plane deploys and manages containers on the worker nodes.

You (client) -- API --> Control plane (K8S) --> Worker Nodes (EC2 or Fargate)

In the control plane of your cluster, there is a cloud controller manager component, for cloud-specific control logic. This is where Kubernetes creates your cloud resources. The cloud controller manager has been designed to accept plugins from different cloud providers.

  • Namespace: isolates groups of resources in a cluster. If two applications never interact with each other, it can make sense to create each application’s resources in a different namespace. This provides some isolation within a single cluster and prevents naming clashes. Some Kubernetes services themselves run in a special namespace.
  • Deployment: a declarative template for Pods. The deployment and its associated controllers create the Pods for you and continue to monitor the cluster state. If one of the Pods fails, another is launched to get back to the desired state.
  • Pod: the smallest deployable object in Kubernetes; Pods are ephemeral. You can define a Pod with multiple tightly coupled containers, as in the sidecar pattern.

The files in your Pod’s containers are ephemeral: when a container goes away, its files go away with it. Volumes address this. A volume is storage you can mount inside the containers in your Pod, and a persistent volume keeps its data beyond the life of any single container or Pod.

Scaling and Service Discovery

Kubernetes is built to scale with your workloads. An application is run on multiple Pods across multiple nodes. We can scale both Pods and cluster nodes.

Horizontal and Vertical Pod Autoscaler

You can scale the application horizontally by increasing the replica count (number of Pods). Here the Horizontal Pod Autoscaler can scale a deployment resource based on a metric (CPU or memory). The Horizontal Pod Autoscaler will grow or shrink the number of Pods to try and match your target metrics.
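The core calculation behind the Horizontal Pod Autoscaler can be written down in a few lines. The sketch below follows the formula given in the Kubernetes documentation, desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), with a tolerance band around the target; the function name, parameter names, and replica limits are illustrative assumptions.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Estimate the replica count that would bring the metric back to target."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to target: do not scale
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))
```

For example, four Pods averaging 90% CPU against a 60% target would grow to six Pods, while the same four Pods at 30% would shrink to two.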

We also have a Vertical Pod Autoscaler. A Pod’s spec contains one or more container definitions, where you can also include resource requirements. You can set resource requests manually, or consider the Vertical Pod Autoscaler. The Vertical Pod Autoscaler observes usage for your containers, and uses this to set the requests value for your containers. A better-informed request setting will allow Kubernetes to better place your Pods within the cluster.

Kubernetes Cluster Autoscaler

The Kubernetes Cluster Autoscaler is part of the Kubernetes project and runs inside the cluster. It can automatically:

  1. add nodes when the cluster is failing to launch Pods because the scheduler cannot find resources
  2. remove nodes when under-utilization is detected.

Kubernetes Cluster Autoscaler is cloud-provider aware. In AWS, EC2 Auto Scaling groups are used by the autoscaler to grow and shrink your worker nodes.


Karpenter is an open-source project in the AWS GitHub account. Like the Kubernetes Cluster Autoscaler, Karpenter detects Pods that cannot be scheduled and launches nodes to host them. With Karpenter, you configure a provisioner that defines properties. Requirements are used to determine the type of node to launch.

Karpenter looks at the scheduling constraints on your Pod definitions and the requirements in the provisioner to launch an instance that matches your requirements. Karpenter schedules and binds Pods to a node when the node is launched. This gives you an improvement over the Kubernetes Cluster Autoscaler for node startup latency.

Service Discovery

Service Discovery in Kubernetes will help dependent applications to find your services. A service that is only accessed from inside the cluster will be created with a cluster IP. The cluster IP is a load-balanced IP address. Traffic to this IP address will be forwarded to the matching Pods for the service.

The cluster IP is proxied by the kube-proxy running on each of your Kubernetes nodes. kube-proxy contains rules that are updated as the Pods and services are created and removed. It’s important to note that we are relying on proxying to route our communications into Pods, not round-robin DNS, which has historically caused issues with clients ignoring time-to-live settings and caching DNS records.

Kubernetes offers two ways to allow a consuming service to discover a service’s cluster IP:

  • Environment variables: Pods in a Kubernetes cluster are created with environment variables that contain information about every active service in the cluster.
  • DNS records: if you are running a DNS service in your cluster, you can query for service information via DNS-based service discovery. Each service will have:
    1. an A record that contains the cluster IP, and
    2. an SRV record, which contains information like priority, weight, and port number.

Remember, though, that traffic to a cluster IP is still proxied to multiple Pods on multiple nodes via the kube-proxy running on your cluster nodes.
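As a small illustration of the environment-variable approach, the sketch below looks up a service's cluster IP and port from variables of the form NAME_SERVICE_HOST and NAME_SERVICE_PORT, where the service name is upper-cased and dashes become underscores. The helper name service_address and the example service "redis-primary" are assumptions for illustration.

```python
import os


def service_address(service_name, env=None):
    """Return (host, port) for a service from injected environment variables.

    Returns None when the variables are missing, for example because the
    service was created after this Pod started.
    """
    env = os.environ if env is None else env
    prefix = service_name.upper().replace("-", "_")
    host = env.get(f"{prefix}_SERVICE_HOST")
    port = env.get(f"{prefix}_SERVICE_PORT")
    if host is None or port is None:
        return None
    return host, int(port)
```

Inside a Pod, service_address("redis-primary") would read REDIS_PRIMARY_SERVICE_HOST and REDIS_PRIMARY_SERVICE_PORT. Because these variables are set when the Pod is created, a caller that gets None back may want to fall back to a DNS lookup.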

Amazon EKS

AWS has a service to host Kubernetes workloads called Amazon Elastic Kubernetes Service. Amazon EKS is certified Kubernetes conformant. EKS runs the Kubernetes control plane across multiple Availability Zones to eliminate a single point of failure, which makes your setup more reliable. This control plane is managed, meaning AWS deals with setting it all up and scaling it for you as you use it. You can use kubectl or eksctl to interact with the control plane.

One big benefit of running container-based workloads with Amazon EKS is the integration with other AWS services:

  • Identity and Access Management (IAM)
  • CloudWatch or CloudTrail
  • Route 53 for DNS and service discovery
  • Load Balancing

EKS uses the Kubernetes scheduler, which runs a series of filters to exclude ineligible nodes for Pod placement: volume, CPU, memory, disk space, available ports. Finally, the scheduler considers constraints that have been set to fine-tune Pod placement. You can set up scheduling constraints at the node level, as well as at the Pod level. At the end of this process, the scheduler calculates a score for each node that has not been filtered out, and the Pod is placed on the highest-scoring node.
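The filter-then-score flow can be modeled with a short sketch. This is a toy version, not the real scheduler: it filters only on CPU, memory, and one host port, and the scoring rule (prefer the node with the most capacity left after placement) is an assumption chosen for simplicity. All field names are made up.

```python
def schedule(pod, nodes):
    """Return the name of the best node for the pod, or None if none fits."""
    # Filtering: drop nodes that cannot run the pod at all.
    feasible = [
        node for node in nodes
        if node["free_cpu"] >= pod["cpu"]
        and node["free_mem"] >= pod["mem"]
        and pod["port"] not in node["used_ports"]
    ]
    if not feasible:
        return None

    # Scoring: a toy rule that favors the most headroom after placement,
    # measured in cores (millicores / 1000) plus GiB (MiB / 1024).
    def score(node):
        return ((node["free_cpu"] - pod["cpu"]) / 1000
                + (node["free_mem"] - pod["mem"]) / 1024)

    return max(feasible, key=score)["name"]


# Hypothetical cluster: CPU in millicores, memory in MiB.
NODES = [
    {"name": "a", "free_cpu": 250, "free_mem": 4096, "used_ports": set()},
    {"name": "b", "free_cpu": 1000, "free_mem": 1024, "used_ports": {8080}},
    {"name": "c", "free_cpu": 1000, "free_mem": 2048, "used_ports": set()},
    {"name": "d", "free_cpu": 2000, "free_mem": 2048, "used_ports": set()},
]
```

For a Pod requesting 500 millicores, 512 MiB, and host port 8080, node a is filtered out for CPU and node b for the port conflict; of the remaining nodes, d scores higher than c and is chosen.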

Container Related Services on AWS

ECS / EKS Anywhere

ECS Anywhere is a feature of ECS that enables you to run and manage container workloads on customer-managed infrastructure. ECS Anywhere provides a consistent tooling and API experience across your container-based workloads. Whether on-premises or in the cloud, you’ll have similar cluster management, workload scheduling, and monitoring.

EKS Anywhere is an installable software package for creating and operating on-premises Kubernetes clusters that are based on Amazon EKS Distro. EKS Anywhere helps to simplify the creation and operation of on-premises Kubernetes clusters while automating cluster management, so that you can reduce your support costs and avoid the maintenance of hosting multiple installs of open-source and third-party tools. You can manage everything in one place.

Container Monitoring

CloudWatch Container Insights is applicable to both ECS- and EKS-based workloads. It makes it easy to collect metrics like CPU, memory, disk, and network utilization, as well as log information, in one centralized location.

Beyond CloudWatch Container Insights, there is another managed monitoring service for EKS and self-managed Kubernetes clusters that is called Amazon Managed Service for Prometheus.

AWS Lambda

Lambda is a serverless compute service where you can run your code without provisioning or managing servers. With Lambda, you pay for what you use. You write some code and hand it to Lambda, which runs it for you. Lambda has built-in support for the Node.js, Python, Java, .NET, Go, and Ruby programming languages.

An unsupported language or a deployment package that is too large can both be handled by supplying Lambda with a container image that contains all the code and dependencies we want to run. The container image supplied to Lambda needs to implement the Lambda Runtime API. This is how your container retrieves invocations and parameters from the Lambda service and responds with results. The AWS Lambda Runtime Interface Emulator (RIE) can simulate the communication from the Lambda service, so you can test locally.
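To give a feel for what implementing the Runtime API involves, here is a stripped-down sketch of the polling loop using only the Python standard library. It assumes the documented endpoints under the 2018-06-01 API version and the AWS_LAMBDA_RUNTIME_API environment variable; error reporting, deadlines, and the init-error path are left out, and the handler is a hypothetical placeholder. In practice you would more likely use an AWS-provided runtime interface client.

```python
import json
import os
import urllib.request

API_VERSION = "2018-06-01"


def next_url(host):
    """URL the runtime polls to fetch the next invocation."""
    return f"http://{host}/{API_VERSION}/runtime/invocation/next"


def response_url(host, request_id):
    """URL the runtime posts the result of one invocation to."""
    return f"http://{host}/{API_VERSION}/runtime/invocation/{request_id}/response"


def handler(event):
    """Hypothetical business logic: echo the event back."""
    return {"echo": event}


def run_once(host):
    """Fetch one invocation, run the handler, and post the result."""
    with urllib.request.urlopen(next_url(host)) as resp:
        request_id = resp.headers["Lambda-Runtime-Aws-Request-Id"]
        event = json.loads(resp.read())
    body = json.dumps(handler(event)).encode()
    req = urllib.request.Request(response_url(host, request_id), data=body, method="POST")
    urllib.request.urlopen(req).close()


def serve_forever():
    """Entry point for the container: loop over invocations indefinitely."""
    host = os.environ["AWS_LAMBDA_RUNTIME_API"]
    while True:
        run_once(host)
```

The container's entry point would call serve_forever(). When testing locally with the Runtime Interface Emulator, the emulator plays the role of the Lambda service on the other side of these HTTP calls.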

AWS App Mesh

Microservices architecture needs some consideration of networking requirements. A service mesh is application-level networking, to make it easy for your services to talk to each other. A service mesh also builds in a lot of observability for your applications, and some high availability features. A service mesh is implemented with a sidecar proxy. If the network calls between services are routed via proxies, the service mesh can implement the monitoring, routing, discovery, and deployment features.

External traffic <-> Proxy as a container <-> Application code

AWS App Mesh is a managed service mesh from AWS that does all of this, and more. Your interaction with AWS App Mesh will happen in two places:

  1. The Envoy proxy, deployed as a sidecar alongside your service containers
    • Envoy is a cloud-native, open-source, high-performance service proxy.
    • Envoy was accepted into the Cloud Native Computing Foundation in 2017.
  2. The App Mesh control plane

When you upgrade your application to use AWS App Mesh, you add Envoy as a sidecar proxy in your ECS tasks or Kubernetes Pods. The control plane of App Mesh converts your requirements to configuration, and deploys the configuration to all of your service proxies. The control plane is:

  1. building and distributing the initial configuration for your proxy
  2. also monitoring your services for any dynamic state changes

App Mesh uses proxies to build observability into your service mesh. Observability is made possible with App Mesh logging, tracing, and metrics. App Mesh resources include:

  1. A mesh is simply the logical boundary for network traffic between your services. You describe your service mesh by creating resources inside a mesh. An egress filter allows or denies traffic to non-AWS resources outside the mesh.
  2. A virtual node is a pointer to a task group. You configure a listener for any inbound traffic that will be coming into the node.
  3. A virtual service is a representation of a real service provided in your mesh. The service can be a direct link to a virtual node, or connected to a virtual router.
  4. A virtual router contains routes that direct incoming traffic to nodes using rules, such as weights and matching on URL paths.

Virtual service <-> Virtual router <-> Virtual nodes <-> Amazon ECS / EKS
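Weighted routing in a virtual router can be pictured with a small sketch. This is a conceptual model only, not Envoy or App Mesh code: each route target is a (virtual node, weight) pair, and a number between 0 and 1 decides which target receives the request. The node names "web-v1" and "web-v2" are hypothetical.

```python
def pick_target(weighted_targets, roll):
    """Choose a virtual node according to its share of the total weight.

    weighted_targets is a list of (virtual_node, weight) pairs and roll is
    a float in [0, 1), for example from random.random().
    """
    total = sum(weight for _, weight in weighted_targets)
    threshold = roll * total
    running = 0
    for node, weight in weighted_targets:
        running += weight
        if threshold < running:
            return node
    return weighted_targets[-1][0]  # guard against rounding at the top edge


# Send roughly 90% of traffic to the current version and 10% to the new one.
ROUTE = [("web-v1", 90), ("web-v2", 10)]
```

Calling pick_target(ROUTE, random.random()) for each request would send about one request in ten to web-v2, which is the kind of gradual shift used for canary-style deployments.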

My Certificate

For more on Containerized Applications on AWS, please refer to the wonderful course here https://www.coursera.org/learn/containerized-applications-on-aws

I am Kesler Zhu, thank you for visiting my website. Check out more course reviews at https://KZHU.ai
