AWS Lake Formation: Part 4 Fine-Grained Access Control with Lake Formation and IAM

This installment of the AWS Lake Formation series covers fine-grained access control, focusing on how Lake Formation works with AWS IAM to enforce detailed security policies. It walks through how to implement and manage these controls so that access to the data in your data lake is both precise and secure.

Clone the project repo here.

Understanding Fine-Grained Access Control in Lake Formation

AWS Lake Formation builds on the security features IAM provides, letting you define access controls on your data lake resources at the column, row, and cell levels. This granularity means users and services can access only the data their role requires, which strengthens both security and compliance.

Key Concepts:

  • Data Permissions: Control who can access data within tables and columns.
  • Table and Database Permissions: Manage access at the database or table level, specifying who can create, modify, or delete resources.
  • Row-Level Security: Apply filters to data queries, ensuring users see only the data they are authorized to view.
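To make the column-level case concrete, here is a minimal sketch of a grant that gives a role SELECT on every column of a table except one sensitive column. It assumes the analyst role, financial_data database, and sales_data table used in the examples in this article; the card_number column is hypothetical and only there for illustration.

```hcl
# Sketch only: column-level access by exclusion.
# Assumes aws_iam_role.analyst, aws_glue_catalog_database.financial_data and
# aws_glue_catalog_table.sales_data are defined elsewhere in the configuration.
resource "aws_lakeformation_permissions" "analyst_without_sensitive_columns" {
  principal   = aws_iam_role.analyst.arn
  permissions = ["SELECT"]

  table_with_columns {
    database_name = aws_glue_catalog_database.financial_data.name
    name          = aws_glue_catalog_table.sales_data.name

    # wildcard must be true when excluded_column_names is used.
    wildcard              = true
    excluded_column_names = ["card_number"] # hypothetical column name
  }
}
```

Excluding columns rather than listing them means newly added columns become visible by default, so listing allowed columns with column_names may be the safer choice for sensitive tables.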

Implementing Granular Security Policies with Lake Formation

Configuring granular security policies involves setting up permissions that align with your organization's data governance policies. Lake Formation provides a comprehensive set of tools to manage these permissions effectively.

Terraform Configuration for Lake Formation Permissions:

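# Table-wide read access.
# Note: Lake Formation grants are cumulative, so a principal that holds this
# table-wide grant is not limited by the filtered grant further down. In
# practice, give a principal one or the other.
resource "aws_lakeformation_permissions" "data_access" {
  principal                     = aws_iam_role.analyst.arn
  permissions                   = ["SELECT"]
  permissions_with_grant_option = []

  table {
    database_name = aws_glue_catalog_database.financial_data.name
    name          = aws_glue_catalog_table.sales_data.name
  }
}

# Data cells filter: restricts sales_data to two columns and to EU rows.
# The filter name "eu_sales" is illustrative.
resource "aws_lakeformation_data_cells_filter" "eu_sales" {
  table_data {
    database_name    = aws_glue_catalog_database.financial_data.name
    table_name       = aws_glue_catalog_table.sales_data.name
    table_catalog_id = data.aws_caller_identity.current.account_id
    name             = "eu_sales"
    column_names     = ["customer_id", "transaction_value"]

    row_filter {
      filter_expression = "customer_region = 'EU'"
    }
  }
}

# Grant SELECT through the filter instead of on the table itself.
resource "aws_lakeformation_permissions" "row_level_security" {
  principal                     = aws_iam_role.analyst.arn
  permissions                   = ["SELECT"]
  permissions_with_grant_option = []

  data_cells_filter {
    database_name    = aws_glue_catalog_database.financial_data.name
    table_name       = aws_glue_catalog_table.sales_data.name
    table_catalog_id = data.aws_caller_identity.current.account_id
    name             = "eu_sales"
  }

  depends_on = [aws_lakeformation_data_cells_filter.eu_sales]
}

In this example, the analyst role is granted SELECT through a data cells filter, which limits queries to specific columns of the sales_data table and applies a row filter that restricts visibility to a single region. The aws_lakeformation_permissions resource has no top-level column or row-filter arguments, so the filter is defined in its own aws_lakeformation_data_cells_filter resource and then referenced in the grant.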

Integrating IAM with Lake Formation

While Lake Formation provides the tools for fine-grained access control within the data lake, IAM is crucial in managing overall permissions and identities.

Best Practices for IAM Integration:

  • Role-Based Access Control (RBAC): Use IAM roles to manage access rights, and grant Lake Formation permissions to those roles rather than to individual users.
  • Least Privilege Principle: Assign the minimum necessary permissions to roles and individuals to reduce the risk of unauthorized data access.
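As a rough illustration of how these two practices fit together, the sketch below attaches a narrow IAM policy to an analyst role. It assumes a role named aws_iam_role.analyst exists elsewhere in the configuration; the policy and resource names are made up for this example. The IAM side only allows the role to ask Lake Formation for data access and to read catalog metadata, while which tables, columns, and rows it can read is left to Lake Formation grants.

```hcl
# Sketch only: least-privilege IAM policy for a role governed by Lake Formation.
# aws_iam_role.analyst is assumed to be defined elsewhere.
data "aws_iam_policy_document" "analyst_read_only" {
  statement {
    sid    = "LakeFormationDataAccess"
    effect = "Allow"

    # Lets integrated query engines request temporary credentials on the
    # role's behalf. This action does not support resource-level scoping.
    actions   = ["lakeformation:GetDataAccess"]
    resources = ["*"]
  }

  statement {
    sid    = "GlueCatalogRead"
    effect = "Allow"

    actions = [
      "glue:GetDatabase",
      "glue:GetDatabases",
      "glue:GetTable",
      "glue:GetTables",
      "glue:GetPartitions",
    ]
    resources = ["*"]
  }
}

resource "aws_iam_role_policy" "analyst_read_only" {
  name   = "analyst_lake_formation_read_only" # hypothetical name
  role   = aws_iam_role.analyst.id
  policy = data.aws_iam_policy_document.analyst_read_only.json
}
```

Note that the policy contains no S3 data permissions: for locations registered with Lake Formation, access to the underlying objects is brokered by Lake Formation rather than granted to the role directly.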

Terraform Configuration for IAM and Lake Formation Integration:

# Variables (var.region, var.environment, var.tags), random_string.this,
# data.aws_s3_bucket.bucket, data.aws_iam_role.terraform,
# aws_iam_role.glue_service_role and aws_glue_catalog_database.this are
# defined elsewhere in the project repo.

# Account and session lookups referenced below. The snippet uses these data
# sources without showing them, so the two declarations here are assumed.
data "aws_caller_identity" "current" {}

data "aws_iam_session_context" "current" {
  arn = data.aws_caller_identity.current.arn
}

locals {
  environment = "${var.environment}_${random_string.this.result}"
}

# Permissions policy: Glue catalog, job and crawler actions plus S3 access.
data "aws_iam_policy_document" "lakeformation_policy" {
  statement {
    effect = "Allow"
    actions = [
      "glue:CreateDatabase",
      "glue:GetDatabase",
      "glue:UpdateDatabase",
      "glue:DeleteDatabase",
      "glue:CreateTable",
      "glue:GetTable",
      "glue:UpdateTable",
      "glue:DeleteTable",
      "glue:BatchGetJobs",
      "glue:GetJob",
      "glue:StartJobRun",
      "glue:BatchStopJobRun",
      "glue:CreateCrawler",
      "glue:GetCrawler",
      "glue:UpdateCrawler",
      "glue:StartCrawler",
      "glue:StopCrawler",
    ]
    resources = [
      "arn:aws:glue:${var.region}:${data.aws_caller_identity.current.account_id}:catalog",
      "arn:aws:glue:${var.region}:${data.aws_caller_identity.current.account_id}:crawler/${local.environment}",
      "arn:aws:glue:${var.region}:${data.aws_caller_identity.current.account_id}:database/${local.environment}",
      "arn:aws:glue:${var.region}:${data.aws_caller_identity.current.account_id}:job/${local.environment}",
      "arn:aws:glue:${var.region}:${data.aws_caller_identity.current.account_id}:table/${local.environment}/*",
    ]
  }

  # Object-level access to the data lake bucket.
  statement {
    effect = "Allow"
    actions = [
      "s3:DeleteObject",
      "s3:GetObject",
      "s3:PutObject",
    ]
    resources = ["${data.aws_s3_bucket.bucket.arn}/*"]
  }

  # Bucket-level listing.
  statement {
    effect    = "Allow"
    actions   = ["s3:ListBucket"]
    resources = [data.aws_s3_bucket.bucket.arn]
  }
}

# Trust policy: lets the Lake Formation service assume the role.
data "aws_iam_policy_document" "lakeformation_role" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRole"]

    principals {
      type        = "Service"
      identifiers = ["lakeformation.amazonaws.com"]
    }
  }
}

resource "aws_iam_policy" "lakeformation_service_policy" {
  name        = "${local.environment}_policy"
  description = "Policy that allows sufficient permissions for the crawler"
  policy      = data.aws_iam_policy_document.lakeformation_policy.json
}

resource "aws_iam_role" "lakeformation_service_role" {
  name               = "${local.environment}_role"
  assume_role_policy = data.aws_iam_policy_document.lakeformation_role.json
  tags               = var.tags
}

resource "aws_iam_role_policy_attachment" "lakeformation_service_policy_attachment" {
  role       = aws_iam_role.lakeformation_service_role.name
  policy_arn = aws_iam_policy.lakeformation_service_policy.arn
}

# Make the identity running Terraform a data lake administrator.
resource "aws_lakeformation_data_lake_settings" "this" {
  admins = [data.aws_iam_session_context.current.issuer_arn]
}

# Full database and table permissions for the Terraform role.
resource "aws_lakeformation_permissions" "caller_catalog_database_permissions" {
  principal   = data.aws_iam_role.terraform.arn
  permissions = ["ALL"]

  database {
    name = aws_glue_catalog_database.this.name
  }
}

resource "aws_lakeformation_permissions" "caller_catalog_table_permissions" {
  principal   = data.aws_iam_role.terraform.arn
  permissions = ["ALL"]

  table {
    database_name = aws_glue_catalog_database.this.name
    wildcard      = true
  }
}

# Database, table and data location permissions for the Glue service role.
resource "aws_lakeformation_permissions" "glue_catalog_database_permissions" {
  principal = aws_iam_role.glue_service_role.arn
  permissions = [
    "ALTER",
    "CREATE_TABLE",
    "DROP",
  ]

  database {
    name = aws_glue_catalog_database.this.name
  }
}

resource "aws_lakeformation_permissions" "glue_catalog_table_permissions" {
  principal   = aws_iam_role.glue_service_role.arn
  permissions = ["ALL"]

  table {
    database_name = aws_glue_catalog_database.this.name
    wildcard      = true
  }
}

resource "aws_lakeformation_permissions" "s3_data_location_permissions" {
  principal   = aws_iam_role.glue_service_role.arn
  permissions = ["DATA_LOCATION_ACCESS"]

  data_location {
    arn = data.aws_s3_bucket.bucket.arn
  }
}

# Register the bucket with Lake Formation, using the service role for access.
resource "aws_lakeformation_resource" "this" {
  arn      = data.aws_s3_bucket.bucket.arn
  role_arn = aws_iam_role.lakeformation_service_role.arn
}

These are the permissions I actually used. They are too broad for a production environment.
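One way to tighten this for production, sketched here under assumptions, is to replace a wildcard ALL grant with a grant on a single named table that lists only the permissions the principal needs. The resource name and the "sales_data" table name below are hypothetical; aws_iam_role.glue_service_role and aws_glue_catalog_database.this are the same references used in the configuration above.

```hcl
# Sketch only: a scoped alternative to a wildcard "ALL" table grant.
resource "aws_lakeformation_permissions" "glue_sales_table_permissions" {
  principal = aws_iam_role.glue_service_role.arn

  # Only what an ETL job that reads, writes and updates the schema would need.
  permissions = [
    "SELECT",
    "INSERT",
    "ALTER",
    "DESCRIBE",
  ]

  table {
    database_name = aws_glue_catalog_database.this.name
    name          = "sales_data" # hypothetical table name
  }
}
```

The same idea applies on the IAM side: the S3 statements could be limited to the prefixes a job actually touches instead of the whole bucket.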

Implementing fine-grained access control with AWS Lake Formation and IAM lets you secure your data lake down to individual columns, rows, and cells. Lake Formation handles the detailed data permissions while IAM handles identities and roles, and together they help keep the data in your lake compliant with internal and regulatory standards. Managing these configurations as code with Terraform makes permission changes reviewable and repeatable, so your data governance can keep pace with organizational changes.
