EKS AI Langchain - Part 4 Optimizing and Securing AI Deployments on EKS

Todd Bernson

2024-10-29

In previous articles, I set up a robust Amazon EKS cluster and deployed AI Langchain applications. This article will focus on optimizing and securing your deployments to ensure maximum performance and security.

Prerequisites

Please ensure your EKS cluster and AI Langchain applications are up and running, as detailed in previous articles. Additionally, you should have:

kubectl installed and configured.
Basic knowledge of Kubernetes security practices.

Step 1: Optimizing Resource Usage

Efficient resource usage is crucial for AI applications. Kubernetes provides several ways to optimize resource allocation.

Define Resource Requests and Limits

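Requests tell the scheduler how much CPU and memory to reserve for each container; limits cap what the container can actually use. Setting both ensures your applications get the resources they need without overcommitting the node.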

Example deployment.yaml snippet:

resources:
  requests:
    memory: "1Gi"
    cpu: "500m"
  limits:
    memory: "1Gi"
    cpu: "1"

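Memory is not compressible: a container that exceeds its memory limit is OOM-killed, so it is best not to oversubscribe memory. That is why the memory request and limit above are equal. CPU is compressible: a container that hits its CPU limit is throttled rather than killed, so oversubscribing CPU (a request lower than the limit) is common practice. Many microservices sit idle much of the time after initial startup.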
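
For context, here is a minimal sketch of where that resources block sits in a full Deployment. The deployment name and app label match the HPA and network policy manifests in this article; the container name, image, and container port are placeholders assumed for illustration, so substitute your own.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ai-langchain-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: ai-langchain
  template:
    metadata:
      labels:
        app: ai-langchain
    spec:
      containers:
      - name: ai-langchain                        # placeholder container name
        image: example.com/ai-langchain:latest    # placeholder image, not from the article
        ports:
        - containerPort: 80                       # assumed to match the port in the network policy
        resources:
          requests:
            memory: "1Gi"    # reserved by the scheduler
            cpu: "500m"      # half a core reserved
          limits:
            memory: "1Gi"    # equal to the request: no memory oversubscription
            cpu: "1"         # may burst to one full core, then is throttled
```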

Step 2: Implementing Autoscaling

Autoscaling ensures your application can handle varying loads efficiently.

Horizontal Pod Autoscaler (HPA)

An HPA automatically adjusts the number of pods based on CPU or memory utilization.

Example hpa.yaml:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ai-langchain-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ai-langchain-deployment
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50

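This scales the deployment between 3 and 10 replicas, targeting an average CPU utilization of 50% of each pod's CPU request, so the containers must have CPU requests set. Resource-based HPAs read from the Kubernetes metrics API, so make sure the Metrics Server is installed; on EKS it typically has to be installed separately.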
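
Since an HPA can also act on memory, here is a sketch of the same autoscaler with a second metric added. The 70% memory target is an assumed value for illustration, not a recommendation from this article. When several metrics are listed, the HPA computes a desired replica count for each and uses the largest.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ai-langchain-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ai-langchain-deployment
  minReplicas: 3
  maxReplicas: 10
  metrics:
  # For each metric:
  #   desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization)
  # e.g. 3 replicas averaging 90% CPU against a 50% target -> ceil(3 * 90 / 50) = 6
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 70   # assumed target, as a percentage of the memory request
```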

Step 3: Securing Your Deployment

Security is paramount in any deployment, especially for AI applications handling sensitive data.

Network Policies

Network policies control the communication between pods.

Example network-policy.yaml:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: ai-langchain-network-policy
spec:
  podSelector:
    matchLabels:
      app: ai-langchain
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: ai-langchain
    ports:
    - protocol: TCP
      port: 80
  egress:
  - to:
    - podSelector:
        matchLabels:
          app: ai-langchain
    ports:
    - protocol: TCP
      port: 80

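North-south traffic (into and out of the cluster) is what most people think of when securing Kubernetes. However, because every pod can talk to every other pod by default, east-west (pod-to-pod) security should not be overlooked. The policy above restricts pods labeled app: ai-langchain to talking only to each other on TCP port 80. Network policies are only enforced if the cluster's CNI plugin supports them, so confirm that policy enforcement is enabled on your EKS cluster.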
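
One practical consequence of restricting egress this tightly is that DNS lookups from the selected pods are blocked too. Because network policies are additive, a second policy can open just that path. The sketch below assumes CoreDNS runs in kube-system with the label k8s-app: kube-dns; both policy names are hypothetical. The second document is an optional namespace-wide default-deny for ingress, which closes the "everything can talk by default" gap for pods that no other policy selects.

```yaml
# Allow DNS lookups from the ai-langchain pods.
# Assumes CoreDNS runs in kube-system with the label k8s-app: kube-dns.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: ai-langchain-allow-dns      # hypothetical name
spec:
  podSelector:
    matchLabels:
      app: ai-langchain
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
      podSelector:
        matchLabels:
          k8s-app: kube-dns
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
---
# Deny all ingress to every pod in this namespace unless another policy allows it.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress        # hypothetical name
spec:
  podSelector: {}                   # empty selector matches all pods in the namespace
  policyTypes:
  - Ingress
```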

Using Secrets for Sensitive Data

Use Kubernetes Secrets to manage sensitive information like database credentials.

Example secret.yaml:

apiVersion: v1
kind: Secret
metadata:
  name: ai-langchain-secret
type: Opaque
data:
  db-username: VG9kZA==
  db-password: aXMgYXdlc29tZS4=

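Values under data must be base64-encoded. Base64 is an encoding, not encryption, so anyone who can read the Secret can decode it; restrict access to Secrets with RBAC. When encoding values by hand, make sure no trailing newline is included (for example, use echo -n rather than echo), or the newline becomes part of the credential. Use the secret in your deployment: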

env:
- name: DB_USERNAME
  valueFrom:
    secretKeyRef:
      name: ai-langchain-secret
      key: db-username
- name: DB_PASSWORD
  valueFrom:
    secretKeyRef:
      name: ai-langchain-secret
      key: db-password
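
If encoding values by hand feels error-prone, a Secret can also be written with the stringData field, which accepts plain text and lets the API server do the base64 encoding. This is a sketch of the same Secret in that form; the plain-text values are what the base64 strings in the example are meant to represent. Treat a manifest like this as sensitive and keep it out of version control.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: ai-langchain-secret
type: Opaque
stringData:                    # plain text; stored base64-encoded under data on write
  db-username: "Todd"
  db-password: "is awesome."
```

The secretKeyRef entries above work unchanged, since the keys and the Secret name are the same.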

By optimizing and securing your AI Langchain deployments on EKS, you ensure they run efficiently and securely. Implementing resource requests and limits, autoscaling, network policies, and Secrets can significantly enhance your application's performance, reliability, and security.

Todd Bernson

CTO
