Building a Churn Prediction Model with SageMaker Canvas: Infrastructure and Deployment with Terraform

Todd Bernson, CTO, explains how to deploy an AWS SageMaker Canvas model for customer churn prediction using Terraform. This guide covers setting up SageMaker with IAM roles, secure S3 access, and automation practices, emphasizing security, reproducibility, and scalability for effective infrastructure management.

Todd Bernson

2024-10-31

Predicting customer churn matters for customer retention and business growth in many industries, telecom among them. This project uses AWS SageMaker Canvas, a no-code interface for building churn prediction models, and provisions the supporting infrastructure with Terraform so that the setup is reproducible, consistent across environments, and less prone to manual deployment errors. This guide walks through the Terraform setup, security considerations, and the role of automation in deploying a SageMaker Canvas model for churn prediction.


Terraform Setup

To set up the necessary infrastructure, use the following Terraform configuration.
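The snippets in this guide reference var.environment and var.tags without showing their declarations. A minimal sketch of the supporting boilerplate might look like the following. The provider version constraint, the region variable, and all default values are assumptions for illustration, not part of the original project.

```hcl
# Assumed provider and version pins; adjust to match your own project.
terraform {
  required_version = ">= 1.5"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = var.region
}

# Hypothetical variable: the guide does not say which region was used.
variable "region" {
  description = "AWS region to deploy into"
  type        = string
  default     = "us-east-1"
}

# Referenced throughout the guide as var.environment.
variable "environment" {
  description = "Environment name used as a prefix for resource names"
  type        = string
}

# Referenced throughout the guide as var.tags.
variable "tags" {
  description = "Tags applied to all taggable resources"
  type        = map(string)
  default     = {}
}
```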

IAM Role for SageMaker Execution

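The trust policy below lets the SageMaker service (sagemaker.amazonaws.com) assume the execution role. The permissions themselves are attached in the sections that follow. Here's the configuration: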

data "aws_iam_policy_document" "sagemaker_execution_role" {
  statement {
    actions = ["sts:AssumeRole"]
    effect  = "Allow"
    principals {
      type        = "Service"
      identifiers = ["sagemaker.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "sagemaker_execution_role" {
  name               = "${var.environment}_sagemaker_execution_role"
  assume_role_policy = data.aws_iam_policy_document.sagemaker_execution_role.json
  tags               = var.tags
}

S3 Access Policy

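This policy lets the execution role list the S3 bucket and read, write, and delete objects in it. The bucket itself comes from a module named sagemaker_s3_bucket, which is defined elsewhere in the configuration and exposes an s3_bucket_arn output.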

data "aws_iam_policy_document" "sagemaker_s3_access" {
  statement {
    effect = "Allow"
    actions = [
      "s3:DeleteObject",
      "s3:GetObject",
      "s3:ListBucket",
      "s3:PutObject",
    ]
    resources = [
      module.sagemaker_s3_bucket.s3_bucket_arn,
      "${module.sagemaker_s3_bucket.s3_bucket_arn}/*"
    ]
  }
}

resource "aws_iam_role_policy" "s3_access" {
  name   = "${var.environment}-sagemaker-s3-access"
  role   = aws_iam_role.sagemaker_execution_role.id
  policy = data.aws_iam_policy_document.sagemaker_s3_access.json
}

Attaching Managed Policies to the Role

Attach the AWS managed policies for SageMaker Canvas, plus the broader AmazonSageMakerFullAccess policy, to the IAM role.

resource "aws_iam_role_policy_attachment" "canvas_ai_services" {
  role       = aws_iam_role.sagemaker_execution_role.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonSageMakerCanvasAIServicesAccess"
}

resource "aws_iam_role_policy_attachment" "canvas_full_access" {
  role       = aws_iam_role.sagemaker_execution_role.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonSageMakerCanvasFullAccess"
}

resource "aws_iam_role_policy_attachment" "sagemaker_full_access" {
  role       = aws_iam_role.sagemaker_execution_role.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonSageMakerFullAccess"
}

SageMaker Domain Setup

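Define the SageMaker domain, linking it to the IAM role created above and placing it in the private subnets of a VPC. The VPC comes from a module named vpc, which is defined elsewhere in the configuration.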

resource "aws_sagemaker_domain" "canvas" {
  domain_name = "${var.environment}-canvas-domain"
  auth_mode   = "IAM"
  vpc_id      = module.vpc.vpc_id
  subnet_ids  = module.vpc.private_subnets

  default_user_settings {
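    execution_role  = aws_iam_role.sagemaker_execution_role.arn
    security_groups = [aws_security_group.sagemaker_domain.id] # attach the domain security group defined in this guide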

    canvas_app_settings {
      time_series_forecasting_settings {
        status = "ENABLED"
      }
    }
  }

  default_space_settings {
    execution_role = aws_iam_role.sagemaker_execution_role.arn
  }

  tags = var.tags
}
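The domain references module.vpc, which this guide does not show. One way it might be declared, assuming the community terraform-aws-modules/vpc/aws module (whose vpc_id and private_subnets outputs match the references above), is sketched below. The VPC name, CIDR ranges, availability zones, and version constraint are hypothetical values chosen for illustration.

```hcl
# Sketch only: assumes the terraform-aws-modules VPC module is the source of module.vpc.
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0"

  name = "${var.environment}-canvas-vpc" # hypothetical name
  cidr = "10.0.0.0/16"                   # hypothetical address range

  # Hypothetical zones; pick zones that exist in your region.
  azs             = ["us-east-1a", "us-east-1b"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24"]
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24"]

  # A single NAT gateway gives the private subnets outbound internet access
  # at lower cost than one gateway per zone.
  enable_nat_gateway = true
  single_nat_gateway = true

  tags = var.tags
}
```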

SageMaker User Profile

Configure a user profile to interact with the SageMaker Canvas domain.

resource "aws_sagemaker_user_profile" "canvas_user" {
  domain_id         = aws_sagemaker_domain.canvas.id
  user_profile_name = "${var.environment}-canvas-user"

  user_settings {
    execution_role = aws_iam_role.sagemaker_execution_role.arn

    canvas_app_settings {
      time_series_forecasting_settings {
        status = "ENABLED"
      }
    }
  }

  tags = var.tags
}
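To make the domain and user profile easier to reference from scripts or other Terraform stacks, outputs along these lines could be added. This is a small sketch; the output names are hypothetical, while the id, url, and arn attributes are exported by the resources defined above.

```hcl
# Hypothetical output names; the attributes come from the resources in this guide.
output "sagemaker_domain_id" {
  description = "ID of the SageMaker domain hosting Canvas"
  value       = aws_sagemaker_domain.canvas.id
}

output "sagemaker_domain_url" {
  description = "URL of the SageMaker domain"
  value       = aws_sagemaker_domain.canvas.url
}

output "canvas_user_profile_arn" {
  description = "ARN of the Canvas user profile"
  value       = aws_sagemaker_user_profile.canvas_user.arn
}
```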

Security Group for SageMaker Domain

Define a security group for the SageMaker domain. The self-referencing ingress rule permits all traffic between resources that share this group, and the egress rule permits all outbound traffic.

resource "aws_security_group" "sagemaker_domain" {
  name_prefix = "${var.environment}_sagemaker_domain-"
  description = "Security group for SageMaker Domain"
  vpc_id      = module.vpc.vpc_id

  ingress {
    from_port = 0
    to_port   = 0
    protocol  = "-1"
    self      = true
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = merge(
    var.tags,
    {
      Name = "${var.environment}_sagemaker_domain_sg"
    }
  )
}

SageMaker-Specific Considerations

Permissions

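The IAM role and policies defined above give SageMaker Canvas access to S3 and the other AWS services it needs. Note that managed policies such as AmazonSageMakerFullAccess are broad. To adhere to the principle of least privilege, review what the role actually uses and replace broad grants with narrower policies where your workload allows.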
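As one illustration of tightening permissions, the S3 policy document could be split so that each action targets only the resource type it applies to and access is confined to a single prefix. This is a sketch under assumptions: the canvas/ prefix and the data source name are hypothetical, and whether Canvas can work with delete access removed depends on your workflow.

```hcl
# Sketch of a narrower alternative to the sagemaker_s3_access policy document.
data "aws_iam_policy_document" "sagemaker_s3_access_scoped" {
  # Listing applies to the bucket itself, restricted here to one prefix.
  statement {
    sid       = "ListCanvasPrefix"
    effect    = "Allow"
    actions   = ["s3:ListBucket"]
    resources = [module.sagemaker_s3_bucket.s3_bucket_arn]

    condition {
      test     = "StringLike"
      variable = "s3:prefix"
      values   = ["canvas/*"] # hypothetical prefix
    }
  }

  # Object reads and writes apply to object ARNs under the same prefix.
  statement {
    sid       = "ReadWriteCanvasObjects"
    effect    = "Allow"
    actions   = ["s3:GetObject", "s3:PutObject"]
    resources = ["${module.sagemaker_s3_bucket.s3_bucket_arn}/canvas/*"]
  }
}
```

To use it, point the policy argument of aws_iam_role_policy.s3_access at this document instead of the original one.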

Data Storage

Configure your S3 bucket to store training data securely, with access limited to the SageMaker execution role. Consider versioning and encryption for added security.
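The bucket module referenced earlier as module.sagemaker_s3_bucket is not shown in this guide. A minimal sketch with versioning, default encryption, and public access blocked, assuming the community terraform-aws-modules/s3-bucket/aws module (whose s3_bucket_arn output matches the reference in the access policy), might look like this. The bucket name and version constraint are hypothetical.

```hcl
# Sketch only: assumes the terraform-aws-modules S3 bucket module is the source
# of module.sagemaker_s3_bucket.
module "sagemaker_s3_bucket" {
  source  = "terraform-aws-modules/s3-bucket/aws"
  version = "~> 4.0"

  # Hypothetical name; S3 bucket names must be globally unique.
  bucket = "${var.environment}-sagemaker-canvas-data"

  # Keep prior object versions so overwritten training data can be recovered.
  versioning = {
    enabled = true
  }

  # Encrypt objects at rest with S3-managed keys by default.
  server_side_encryption_configuration = {
    rule = {
      apply_server_side_encryption_by_default = {
        sse_algorithm = "AES256"
      }
    }
  }

  # Block every form of public access to the bucket.
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true

  tags = var.tags
}
```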


Automation

Terraform allows for easy re-deployment and scaling. By defining infrastructure as code, you minimize human error and facilitate version control. Integrating Terraform with CI/CD pipelines further automates and streamlines the deployment process.
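Running Terraform from a CI/CD pipeline generally requires state that is shared between runs rather than kept on one machine. A minimal sketch of a remote S3 backend follows. The state bucket name, key, and region are hypothetical, and the bucket must already exist before terraform init is run. State locking is left out here because the recommended option depends on the Terraform version in use.

```hcl
# Sketch of a remote state backend; backend blocks cannot use variables,
# so the values are written literally.
terraform {
  backend "s3" {
    bucket  = "example-terraform-state"            # hypothetical state bucket
    key     = "sagemaker-canvas/terraform.tfstate" # hypothetical state path
    region  = "us-east-1"                          # hypothetical region
    encrypt = true                                 # encrypt the state file at rest
  }
}
```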


Key Takeaways

  • Consistency and Reproducibility: Terraform ensures infrastructure remains consistent across environments.
  • Security: IAM roles and policies are defined in code, where they can be reviewed and tightened toward least privilege.
  • Scalability: Using Terraform enables quick scaling of infrastructure as demands increase.

By leveraging SageMaker Canvas and Terraform, you achieve a scalable, secure, and efficient setup for churn prediction, empowering business stakeholders with actionable insights.
