EKS AI LangChain - Part 2: Setting Up an EKS Cluster with Terraform

Todd Bernson

2024-06-25

Deploying AI applications on Kubernetes provides scalability and efficient resource management. In this article, I'll walk through setting up an Amazon EKS (Elastic Kubernetes Service) cluster using Terraform. This cluster will be the backbone for deploying AI LangChain applications.

Clone the repo here.

Prerequisites

Before we begin, ensure you have the following tools installed and configured:

  • Terraform is installed.
  • AWS CLI is installed and configured with the necessary permissions.
  • An OpenVPN client is installed to provide secure access through the VPN.

Step 1: Define Variables

First, create a terraform.tfvars file with the necessary variables:

company = ""

domain = ""

openvpn_instance_type = ""

region = ""
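
The Terraform code in the following steps also references variables that are not listed in terraform.tfvars above (var.environment, var.eks_cluster_version, var.rds_backup_retention, var.rds_engine_version, var.openvpn_sg) as well as a local.tags map. A minimal sketch of a matching variables.tf might look like the following. The types, descriptions, and tag keys are assumptions, not taken from the repo.

```hcl
# variables.tf (sketch; types, descriptions, and tag keys are assumptions)

variable "company" {
  description = "Company name (assumed to be used for naming and tagging)"
  type        = string
}

variable "domain" {
  description = "DNS domain for the environment"
  type        = string
}

variable "openvpn_instance_type" {
  description = "EC2 instance type for the OpenVPN server"
  type        = string
}

variable "region" {
  description = "AWS region to deploy into"
  type        = string
}

variable "environment" {
  description = "Environment name; also used as the EKS cluster name and database name"
  type        = string
}

variable "eks_cluster_version" {
  description = "Kubernetes version for the EKS cluster"
  type        = string
}

variable "rds_backup_retention" {
  description = "Number of days to retain Aurora backups"
  type        = number
}

variable "rds_engine_version" {
  description = "Aurora PostgreSQL engine version"
  type        = string
}

variable "openvpn_sg" {
  description = "ID of the security group attached to the OpenVPN instance"
  type        = string
}

locals {
  # Tags applied to the cluster and database; the keys here are illustrative.
  tags = {
    Company     = var.company
    Environment = var.environment
  }
}
```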

Step 2: Create VPC Configuration

In Part 1, I showed how to set up the landing zone, which includes the VPC and the VPN instance. The resources below reference that VPC through module.vpc.
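
The later steps read several values from a module named vpc: module.vpc.vpc_id, module.vpc.private_subnets, module.vpc.database_subnets, and module.vpc.database_subnet_group. For readers who skipped Part 1, a rough sketch of a VPC definition that exposes those outputs, using the community terraform-aws-modules/vpc/aws module, could look like this. The CIDR ranges, availability zone suffixes, and version constraint are placeholder assumptions, not the values from Part 1.

```hcl
# vpc.tf (sketch; CIDRs, AZ suffixes, and version are placeholder assumptions)
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0"

  name = var.environment
  cidr = "10.0.0.0/16"

  # Assumes the chosen region has zones with the suffixes a, b, and c.
  azs = ["${var.region}a", "${var.region}b", "${var.region}c"]

  # Private subnets host the EKS nodes; database subnets host Aurora.
  private_subnets  = ["10.0.0.0/20", "10.0.16.0/20", "10.0.32.0/20"]
  public_subnets   = ["10.0.48.0/24", "10.0.49.0/24", "10.0.50.0/24"]
  database_subnets = ["10.0.52.0/24", "10.0.53.0/24", "10.0.54.0/24"]

  # Creates the subnet group exposed as module.vpc.database_subnet_group.
  create_database_subnet_group = true

  # A single NAT gateway keeps cost down; use one per AZ for higher availability.
  enable_nat_gateway   = true
  single_nat_gateway   = true
  enable_dns_hostnames = true

  tags = {
    Environment = var.environment
  }
}
```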

Step 3: EKS Cluster Configuration

Create an eks.tf file to define the EKS cluster:

locals {
  assumed_role_arn = data.aws_caller_identity.current.arn
  account_id       = data.aws_caller_identity.current.account_id
  role_name        = regex("arn:aws:sts::\\d+:assumed-role/(.+?)/", local.assumed_role_arn)[0]
  iam_role_arn     = "arn:aws:iam::${local.account_id}:role/${local.role_name}"
}

data "aws_caller_identity" "current" {}

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "20.11.1"

  cluster_name                    = var.environment
  cluster_version                 = var.eks_cluster_version
  cluster_endpoint_private_access = true
  cluster_endpoint_public_access  = true

  cluster_ip_family = "ipv4"

  cluster_addons = {
    coredns = {
      most_recent = true
    }
    kube-proxy = {
      most_recent = true
    }
    vpc-cni = {
      most_recent    = true
      before_compute = true
      configuration_values = jsonencode({
        env = {
          ENABLE_PREFIX_DELEGATION = "true"
          WARM_PREFIX_TARGET       = "1"
        }
      })
    }
  }

  iam_role_additional_policies = {
    AmazonEC2ContainerRegistryReadOnly = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
  }

  enable_cluster_creator_admin_permissions = true

  cluster_tags = local.tags

  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.private_subnets

  eks_managed_node_group_defaults = {
    ami_type       = "AL2_x86_64"
    instance_types = ["t3.medium"]
  }

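  eks_managed_node_groups = {
    # Default node group that uses the launch template provided by the
    # EKS managed node group service instead of a custom one.
    default_node_group = {
      use_custom_launch_template = false

      disk_size = 50
    }

    # AL2023 node group that joins the cluster through nodeadm. Each
    # top-level key in this map defines a separate node group.
    al2023_nodeadm = {
      ami_type = "AL2023_x86_64_STANDARD"

      use_latest_ami_release_version = true

      # Extra NodeConfig applied before nodeadm runs (kubelet settings).
      cloudinit_pre_nodeadm = [
        {
          content_type = "application/node.eks.aws"
          content      = <<-EOT
            ---
            apiVersion: node.eks.aws/v1alpha1
            kind: NodeConfig
            spec:
              kubelet:
                config:
                  shutdownGracePeriod: 30s
                  featureGates:
                    DisableKubeletCloudCredentialProviders: true
          EOT
        }
      ]
    }
  }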
}
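
The locals block at the top of eks.tf turns the STS assumed-role ARN returned by aws_caller_identity into the ARN of the underlying IAM role. To see what the regex extracts, here is a small standalone sketch that uses a made-up ARN; the account ID, role name, and session name are hypothetical.

```hcl
locals {
  # Hypothetical ARN in the shape STS returns for an assumed role.
  example_assumed_role_arn = "arn:aws:sts::123456789012:assumed-role/AdminRole/my-session"

  # The single capture group holds the text between "assumed-role/" and the next "/".
  example_role_name = regex("arn:aws:sts::\\d+:assumed-role/(.+?)/", local.example_assumed_role_arn)[0]

  # Rebuild the IAM role ARN from the account ID and the extracted role name.
  example_iam_role_arn = "arn:aws:iam::123456789012:role/${local.example_role_name}"
}

output "example_iam_role_arn" {
  value = local.example_iam_role_arn # "arn:aws:iam::123456789012:role/AdminRole"
}
```

Note that regex raises an error when the pattern does not match, so the locals in eks.tf only work when Terraform runs under an assumed role, not as an IAM user.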

Step 4: RDS Configuration

For database needs, configure an Aurora PostgreSQL cluster in aurora.tf:

locals {
  auto_pause   = var.environment != "prod"
  max_capacity = var.environment == "prod" ? 64 : 16

  instances = var.environment == "prod" ? {
    one = {}
    two = {}
    } : {
    one = {}
  }
}

module "postgresql" {
  source  = "terraform-aws-modules/rds-aurora/aws"
  version = "~> 9.3.1"

  apply_immediately               = true
  backup_retention_period         = var.rds_backup_retention
  copy_tags_to_snapshot           = true
  create_monitoring_role          = true
  database_name                   = var.environment
  db_cluster_parameter_group_name = aws_rds_cluster_parameter_group.this.name
  db_subnet_group_name            = module.vpc.database_subnet_group
  deletion_protection             = true
  enable_http_endpoint            = true
  engine                          = "aurora-postgresql"
  engine_mode                     = "provisioned"
  engine_version                  = var.rds_engine_version
  master_username                 = "postgres"
  name                            = var.environment
  storage_encrypted               = true
  subnets                         = module.vpc.database_subnets
  tags                            = local.tags
  vpc_id                          = module.vpc.vpc_id

  security_group_rules = {
    eks_ingress = {
      source_security_group_id = module.eks.node_security_group_id
    }

    openvpn_ingress = {
      source_security_group_id = var.openvpn_sg
    }

    egress = {
      cidr_blocks = ["0.0.0.0/0"]
      description = "Egress to everything"
    }
  }

  serverlessv2_scaling_configuration = {
    auto_pause               = local.auto_pause
    max_capacity             = local.max_capacity
    min_capacity             = 2
    seconds_until_auto_pause = 3600
    timeout_action           = "ForceApplyCapacityChange"
  }

  instance_class = "db.serverless"
  instances      = local.instances
}

resource "aws_secretsmanager_secret" "rds_credentials" {
  name                    = "${var.environment}-aurora-serverless-credentials"
  description             = "${var.environment} aurora username and password"
  recovery_window_in_days = 7

  depends_on = [module.postgresql]
}

resource "aws_secretsmanager_secret_version" "rds_credentials" {
  secret_id = aws_secretsmanager_secret.rds_credentials.id
  secret_string = jsonencode(
    {
      username = module.postgresql.cluster_master_username
      password = module.postgresql.cluster_master_password
    }
  )

  depends_on = [module.postgresql]
}

resource "aws_rds_cluster_parameter_group" "this" {
  name        = var.environment
  family      = "aurora-postgresql15"
  description = "RDS default cluster parameter group"

  parameter {
    name  = "rds.force_ssl"
    value = "1"
  }
}
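
Applications running in the cluster will need the database endpoint and the location of the stored credentials. One way to surface them is with Terraform outputs, sketched below. The output names are assumptions; cluster_endpoint and cluster_port are outputs of the rds-aurora module.

```hcl
# outputs.tf (sketch; output names are illustrative)
output "aurora_endpoint" {
  description = "Writer endpoint of the Aurora cluster"
  value       = module.postgresql.cluster_endpoint
}

output "aurora_port" {
  description = "Port the Aurora cluster listens on"
  value       = module.postgresql.cluster_port
}

output "aurora_credentials_secret_arn" {
  description = "ARN of the Secrets Manager secret holding the database credentials"
  value       = aws_secretsmanager_secret.rds_credentials.arn
}
```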

Step 5: Initialize and Apply Terraform

Run the following commands to set up the infrastructure:

terraform init
terraform validate
terraform plan -out=plan.out
terraform apply plan.out
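
terraform init downloads the providers and modules that the configuration declares, so the project needs a provider definition. If the repo does not already supply one, a minimal sketch could look like this. The version constraints are assumptions chosen to be compatible with the module versions pinned above.

```hcl
# versions.tf (sketch; version constraints are assumptions)
terraform {
  required_version = ">= 1.3"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

# Uses the region value from terraform.tfvars.
provider "aws" {
  region = var.region
}
```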

Step 6: Accessing the VPN

Secure access to your infrastructure is crucial. Follow these steps to set up and access the VPN:

  1. Connect to the VPN at https://vpn.example.com
  2. Admin access is available at https://vpn.example.com/admin

Step 7: Configure kubectl for EKS

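To interact with your EKS cluster, configure kubectl. Replace <AWS_REGION> with your region; in this example the cluster is named dev, which is the value of var.environment: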

aws eks update-kubeconfig --region <AWS_REGION> --name dev

Added new context arn:aws:eks:::cluster/dev

Verify the configuration by listing the nodes:

kubectl get nodes
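
The cluster name passed to update-kubeconfig is whatever var.environment was set to. Rather than typing it by hand, it can be exposed as a Terraform output along with the API endpoint. The output names below are assumptions; cluster_name and cluster_endpoint are outputs of the EKS module.

```hcl
# outputs.tf (sketch; output names are illustrative)
output "eks_cluster_name" {
  description = "Name of the EKS cluster, for use with aws eks update-kubeconfig"
  value       = module.eks.cluster_name
}

output "eks_cluster_endpoint" {
  description = "Endpoint of the EKS API server"
  value       = module.eks.cluster_endpoint
}
```

After terraform apply, running terraform output -raw eks_cluster_name prints the value to pass to --name.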

By following these steps, you have set up an EKS cluster, an Aurora PostgreSQL database, and VPN access using Terraform. This setup is the foundation for deploying AI LangChain applications with efficient resource management and scalability.

Further Steps

Once the EKS cluster is running, you can deploy your AI LangChain applications. Stay tuned for the next articles in this series, which will include detailed deployment guides and best practices.

Todd Bernson

CTO