Your project can be on or off. Your project's priority can be changed. Your job can be changed. But the technology is always heading north!
Saturday, April 18, 2020
HPA based on an external metric in CloudWatch
Saturday, April 11, 2020
EKS HPA based on a basic metric such as CPU
1. Terraform code in service.tf:
resource "kubernetes_horizontal_pod_autoscaler" "co-ec-hpa" {
for_each = var.service_parameters
metadata {
name = each.value.desc
}
spec {
max_replicas = each.value.service_hpa_max
min_replicas = each.value.service_hpa_min
target_cpu_utilization_percentage = 60 // TODO
scale_target_ref {
api_version = "extensions/v1beta1"
kind = "Deployment"
name = "${each.value.name}-${var.eks_cluster_name}"
}
}
}
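After terraform apply, a quick sanity check that the HPA can see its metric (resource names come from your service_parameters):
kubectl get hpa
# TARGETS should show current/target CPU such as 12%/60%; <unknown>/60%
# usually means Metrics Server (step 2 below) is not installed yet.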
2. Install Metrics Server
DOWNLOAD_URL=$(curl -Ls "https://api.github.com/repos/kubernetes-sigs/metrics-server/releases/latest" | jq -r .tarball_url)
DOWNLOAD_VERSION=$(grep -o '[^/v]*$' <<< $DOWNLOAD_URL)
curl -Ls $DOWNLOAD_URL -o metrics-server-$DOWNLOAD_VERSION.tar.gz
mkdir metrics-server-$DOWNLOAD_VERSION
tar -xzf metrics-server-$DOWNLOAD_VERSION.tar.gz --directory metrics-server-$DOWNLOAD_VERSION --strip-components 1
kubectl apply -f metrics-server-$DOWNLOAD_VERSION/deploy/1.8+/
rm metrics-server-$DOWNLOAD_VERSION.tar.gz
rm -rf metrics-server-$DOWNLOAD_VERSION
3. Check the pod under the 'kube-system' namespace:
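For example (the pod name suffix will differ):
kubectl get pods --all-namespaces | grep metrics-server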
kube-system metrics-server-7fcf9cc98b-eeeee 1/1 Running 0 47h
(For HPA based on external metrics, please contact me)
EKS security
1. Limit access to the cluster API server
resource "aws_eks_cluster" "co-ec-eks-cluster" {
  name     = local.eks_cluster_name
  role_arn = aws_iam_role.co-ec-eks-cluster-iam-role.arn

  vpc_config {
    security_group_ids      = [aws_security_group.co-ec-eks-cluster-security-group.id]
    subnet_ids              = local.subnet_ids
    endpoint_private_access = true // allow access from within the EKS network
    // https://www.cloudflare.com/ips-v4 for the list of IPs from Cloudflare
    public_access_cidrs = toset(concat(data.cloudflare_ip_ranges.cloudflare.ipv4_cidr_blocks, local.workstation-external-cidr))
  }

  depends_on = [
    aws_iam_role_policy_attachment.co-ec-eks-cluster-AmazonEKSClusterPolicy,
    aws_iam_role_policy_attachment.co-ec-eks-cluster-AmazonEKSServicePolicy,
  ]
}
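The data source and local referenced in public_access_cidrs are declared elsewhere; a minimal sketch, assuming the official Cloudflare provider (the workstation CIDR is illustrative):
data "cloudflare_ip_ranges" "cloudflare" {}

locals {
  // Illustrative: your workstation's public IP in CIDR form.
  workstation-external-cidr = ["203.0.113.10/32"]
}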
2. Control access for the EKS cluster nodes (so the services running on them can reach the resources they need) through Terraform:
resource "aws_iam_role" "fs-ec-eks-node-iam-role" { name = "fs-ec-eks-node-iam-role-${local.vpc_id}" assume_role_policy = <<POLICY { "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Principal": { "Service": "ec2.amazonaws.com" }, "Action": "sts:AssumeRole" } ] } POLICY } resource "aws_iam_role_policy" "fs-ec-eks-node-auto-scale-policy" { name = "fs-ec-eks-node-auto-scale-policy" role = aws_iam_role.fs-ec-eks-node-iam-role.id policy = <<-EOF { "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "autoscaling:DescribeAutoScalingGroups", "autoscaling:DescribeAutoScalingInstances", "autoscaling:DescribeLaunchConfigurations", "autoscaling:DescribeTags", "autoscaling:SetDesiredCapacity", "autoscaling:TerminateInstanceInAutoScalingGroup" ], "Resource": "*" } ] } EOF } resource "aws_iam_role_policy" "fs-ec-eks-node-metrics-access-policy" { name = "fs-ec-eks-node-metrics-access-policy" role = aws_iam_role.fs-ec-eks-node-iam-role.id policy = <<-EOF { "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "cloudwatch:GetMetricData", "cloudwatch:GetMetricStatistics", "cloudwatch:ListMetrics" ], "Resource": "*" } ] } EOF } resource "aws_iam_role_policy_attachment" "fs-ec-eks-node-AmazonEKSWorkerNodePolicy" { policy_arn = "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy" role = aws_iam_role.fs-ec-eks-node-iam-role.name} resource "aws_iam_role_policy_attachment" "fs-ec-eks-node-AmazonEKS_CNI_Policy" { policy_arn = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy" role = aws_iam_role.fs-ec-eks-node-iam-role.name} resource "aws_iam_role_policy_attachment" "fs-ec-eks-node-AmazonEC2ContainerRegistryReadOnly" { policy_arn = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly" role = aws_iam_role.fs-ec-eks-node-iam-role.name} resource "aws_iam_role_policy_attachment" "fs-ec-eks-node-CloudWatchAgentServerPolicy" { policy_arn = "arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy" role = aws_iam_role.fs-ec-eks-node-iam-role.name} resource "aws_iam_role_policy_attachment" "fs-ec-eks-node-AmazonDynamoDBFullAccess" { policy_arn = "arn:aws:iam::aws:policy/AmazonDynamoDBFullAccess" role = aws_iam_role.fs-ec-eks-node-iam-role.name} # Using the new feature from reinvent:19 to provisioning node automatically without the need# for EC2 provisioning. EKS-optimized AMIs will be used automatically for each node.# Nodes launched as part of a managed node group are automatically tagged for auto-discovery# by k8s cluster autoscaler.# https://docs.aws.amazon.com/eks/latest/userguide/managed-node-groups.html# https://www.terraform.io/docs/providers/aws/r/eks_node_group.htmlresource "aws_eks_node_group" "fs-ec-eks-node-group" { cluster_name = aws_eks_cluster.fs-ec-eks-cluster.name node_group_name = "fs-ec-eks-node-group-${local.vpc_id}" node_role_arn = aws_iam_role.fs-ec-eks-node-iam-role.arn subnet_ids = local.subnet_ids instance_types = [var.instance_type] scaling_config { desired_size = 3 max_size = 8 // TODO min_size = 3 } depends_on = [ aws_iam_role_policy_attachment.fs-ec-eks-node-AmazonEKSWorkerNodePolicy, aws_iam_role_policy_attachment.fs-ec-eks-node-AmazonEKS_CNI_Policy, aws_iam_role_policy_attachment.fs-ec-eks-node-AmazonEC2ContainerRegistryReadOnly, aws_iam_role_policy_attachment.fs-ec-eks-node-CloudWatchAgentServerPolicy, aws_iam_role_policy_attachment.fs-ec-eks-node-AmazonDynamoDBFullAccess, ] }
3. Control access to MSK (Kafka):
resource "aws_security_group" "fs-ec-msk-cluster-security-group" {
name = "fs-ec-msk-cluster-security-group-${local.vpc_id}" description = "Cluster communication with worker nodes" vpc_id = local.vpc_id
egress {
from_port = 0 to_port = 0 protocol = "-1" cidr_blocks = ["0.0.0.0/0"]
}
tags = {
Name = "fs-ec-msk-cluster-${local.vpc_id}" }
}
# Allow access from every host in the same VPC. // TODO
resource "aws_security_group_rule" "fs-ec-msk-cluster-ingress-workstation-http" {
  cidr_blocks       = [local.vpc_cidr]
  description       = "Allow access to Kafka from same VPC"
  from_port         = 9092
  to_port           = 9092
  protocol          = "tcp"
  security_group_id = aws_security_group.fs-ec-msk-cluster-security-group.id
  type              = "ingress"
}

resource "aws_security_group_rule" "fs-ec-msk-cluster-ingress-workstation-https" {
  cidr_blocks       = [local.vpc_cidr]
  description       = "Allow access to Kafka from same VPC"
  from_port         = 9094
  to_port           = 9094
  protocol          = "tcp"
  security_group_id = aws_security_group.fs-ec-msk-cluster-security-group.id
  type              = "ingress"
}

resource "aws_security_group_rule" "fs-ec-msk-cluster-ingress-workstation-zookeeper" {
  cidr_blocks       = [local.vpc_cidr]
  description       = "Allow access to Zookeeper from same VPC"
  from_port         = 2181
  to_port           = 2181
  protocol          = "tcp"
  security_group_id = aws_security_group.fs-ec-msk-cluster-security-group.id
  type              = "ingress"
}
resource "aws_kms_key" "fs-ec-kms" {
description = "KMS key"}
resource "aws_msk_cluster" "fs-ec-msk-cluster" {
cluster_name = "fs-ec-msk-cluster-${local.vpc_id}" kafka_version = var.kafka_version number_of_broker_nodes = length(local.subnets_ids)
configuration_info {
arn = aws_msk_configuration.fs-ec-msk-configuration.arn revision = aws_msk_configuration.fs-ec-msk-configuration.latest_revision }
broker_node_group_info {
instance_type = var.broker_type ebs_volume_size = var.broker_ebs_size client_subnets = local.subnets_ids
security_groups = [aws_security_group.fs-ec-msk-cluster-security-group.id]
}
encryption_info {
encryption_at_rest_kms_key_arn = aws_kms_key.fs-ec-kms.arn encryption_in_transit {
client_broker = "TLS" // PLAINTEXT" in_cluster = true }
}
tags = {
Name = "fs-ec-msk-cluster-${local.vpc_id}" }
}
// It is not possible to destroy cluster configs, so a random suffix is used.
resource "random_id" "msk" {
  byte_length = 4
}

resource "aws_msk_configuration" "fs-ec-msk-configuration" {
  kafka_versions = [var.kafka_version]
  name           = "${var.msk_config_name_prefix}fs-ec-msk-configuration-${local.vpc_id}-${random_id.msk.hex}"

  server_properties = <<PROPERTIES
auto.create.topics.enable = true
delete.topic.enable = false
num.partitions = 96
PROPERTIES
}
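As a sanity check, the broker and ZooKeeper endpoints can be exported as Terraform outputs (the output names are illustrative; the attributes are standard on aws_msk_cluster):
output "msk_bootstrap_brokers_tls" {
  value = aws_msk_cluster.fs-ec-msk-cluster.bootstrap_brokers_tls
}

output "msk_zookeeper_connect" {
  value = aws_msk_cluster.fs-ec-msk-cluster.zookeeper_connect_string
}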
EKS cluster autoscaler
1. Enable CA in eks-worker-nodes.tf
# Using the new managed node groups feature from re:Invent 2019 to provision nodes
# automatically, without the need for EC2 provisioning. EKS-optimized AMIs are used automatically for each node.
# Nodes launched as part of a managed node group are automatically tagged for auto-discovery
# by k8s cluster autoscaler.
# https://docs.aws.amazon.com/eks/latest/userguide/managed-node-groups.html
# https://www.terraform.io/docs/providers/aws/r/eks_node_group.html
resource "aws_eks_node_group" "co-ec-eks-node-group" {
cluster_name = aws_eks_cluster.co-ec-eks-cluster.name
node_group_name = "co-ec-eks-node-group-${local.vpc_id}"
node_role_arn = aws_iam_role.co-ec-eks-node-iam-role.arn
subnet_ids = local.subnet_ids instance_types = [var.instance_type] scaling_config {
desired_size = 3
max_size = 8
min_size = 3 } depends_on = [
aws_iam_role_policy_attachment.co-ec-eks-node-AmazonEKSWorkerNodePolicy,
aws_iam_role_policy_attachment.co-ec-eks-node-AmazonEKS_CNI_Policy,
aws_iam_role_policy_attachment.co-ec-eks-node-AmazonEC2ContainerRegistryReadOnly,
aws_iam_role_policy_attachment.co-ec-eks-node-CloudWatchAgentServerPolicy,
aws_iam_role_policy_attachment.co-ec-eks-node-AmazonDynamoDBFullAccess,
]
}
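Since step 4 below replaces auto-discovery with an explicit --nodes flag, you will need the name of the auto scaling group backing this node group. One way to find it (a sketch; the query filters on the autoscaler tags mentioned in the comments above):
aws autoscaling describe-auto-scaling-groups \
  --query "AutoScalingGroups[?Tags[?contains(Key, 'cluster-autoscaler')]].AutoScalingGroupName"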
2. Add a new IAM policy to the EKS node IAM role
resource "aws_iam_role_policy" "co-ec-eks-node-auto-scale-policy" {
name = "co-ec-eks-node-auto-scale-policy"
role = aws_iam_role.co-ec-eks-node-iam-role.id
policy = <<-EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"autoscaling:DescribeAutoScalingGroups",
"autoscaling:DescribeAutoScalingInstances",
"autoscaling:DescribeLaunchConfigurations",
"autoscaling:DescribeTags",
"autoscaling:SetDesiredCapacity",
"autoscaling:TerminateInstanceInAutoScalingGroup"
],
"Resource": "*"
}
]
}
(You can also attach this policy to the node IAM role from the AWS Console.)
3. Download the sample manifest and rename it to cluster-autoscaler-asg.yaml:
https://aws.amazon.com/premiumsupport/knowledge-center/eks-cluster-autoscaler-setup/
4. Make the following changes to the YAML file (replace the auto-discovery flags with an explicit --nodes setting):
< - --expander=least-waste
< - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/<YOUR CLUSTER NAME>
---
> - --nodes={{MIN_NODE}}:{{MAX_NODE}}:{{K8S_NODE_ASG}}
One example is:
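For example, with illustrative values (the ASG name is hypothetical; use the one found with the describe-auto-scaling-groups query above):
export MIN_NODE=3
export MAX_NODE=8
export K8S_NODE_ASG=eks-co-ec-eks-node-group-asg
which renders the placeholder line as:
- --nodes=3:8:eks-co-ec-eks-node-group-asg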
5. Apply autoscaler deployment:
cat cluster-autoscaler-asg.yaml | sed "s/{{K8S_NODE_ASG}}/$K8S_NODE_ASG/;s/{{MIN_NODE}}/$MIN_NODE/;s/{{MAX_NODE}}/$MAX_NODE/" | kubectl apply -f -
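To confirm the autoscaler came up, tail its logs (assuming the sample manifest's deployment name, cluster-autoscaler, in the kube-system namespace):
kubectl -n kube-system logs -f deployment/cluster-autoscaler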
References
https://aws.amazon.com/premiumsupport/knowledge-center/eks-cluster-autoscaler-setup/