Deploy the AWS Node Termination Handler

We need Helm to deploy the AWS Node Termination Handler, see installing Helm for instructions.

In this section, we will prepare our cluster to handle Spot interruptions.

Demand for Spot Instances can vary significantly, and as a consequence the availability of Spot Instances will also vary depending on how many unused EC2 instances are available. It is always possible that your Spot Instance might be interrupted. In this case the Spot Instances are sent an interruption notice two minutes ahead to gracefully wrap up things. We will deploy a pod on each spot instance to detect and redeploy applications elsewhere in the cluster.

The first thing that we need to do is deploy the AWS Node Termination Handler on each Spot Instance. This will monitor the EC2 metadata service on the instance for an interruption notice. The termination handler consists of a ServiceAccount, ClusterRole, ClusterRoleBinding, and a DaemonSet.

The workflow can be summarized as:

  • Identify that a Spot Instance is being reclaimed.
  • Use the 2-minute notification window to gracefully prepare the node for termination.
  • <strong>Taint</strong> the node and cordon it off to prevent new pods from being placed.
  • <strong>Drain</strong> connections on the running pods.
  • Replace the pods on remaining nodes to maintain the desired capacity.

By default, the aws-node-termination-handler will run on all of your nodes (on-demand and spot). If your spot instances are labeled, you can configure aws-node-termination-handler to only run on your labeled spot nodes. If you're using the tag lifecycle=Ec2Spot, you can run the following to apply our spot-node-selector overlay.

helm repo add eks

helm upgrade --install aws-node-termination-handler \
             --namespace kube-system \
             --set nodeSelector.lifecycle=Ec2Spot \

Verify that the pods are only running on node with label lifecycle=Ec2Spot

kubectl -n kube-system get daemonsets 

Spot Daemonset