AWS Lake Formation: Part 4 Fine-Grained Access Control with Lake Formation and IAM

Todd Bernson
2024-09-28

This series installment on AWS Lake Formation dives deep into fine-grained access control mechanisms, focusing on how Lake Formation integrates with AWS IAM to enforce detailed security policies.

Let's explore how to implement and manage these controls to ensure precise and secure data access within your data lake environment.
Clone the project repo here.
Understanding Fine-Grained Access Control in Lake Formation
AWS Lake Formation builds on the security features that IAM provides, letting you define access controls on your data lake resources at the column, row, and cell levels. This granularity means users and services can access only the data their role requires, which supports both security and compliance.
Key Concepts:
- Data Permissions: Control who can access data within tables and columns.
- Table and Database Permissions: Manage access at the database or table level, specifying who can create, modify, or delete resources.
- Row-Level Security: Apply filters to data queries, ensuring users see only the data they are authorized to view.
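As a minimal sketch of the column-level case, the grant below exposes only two columns of a table to one role. It assumes that aws_iam_role.analyst, the financial_data database, and the sales_data table are declared elsewhere in the configuration; the resource name analyst_columns is made up for illustration.

```hcl
# Assumption: the role, database, and table referenced here are declared
# elsewhere; "analyst_columns" is a hypothetical resource name.
resource "aws_lakeformation_permissions" "analyst_columns" {
  principal   = aws_iam_role.analyst.arn
  permissions = ["SELECT"]

  # Column-level grant: only the listed columns are visible to the principal.
  table_with_columns {
    database_name = aws_glue_catalog_database.financial_data.name
    name          = aws_glue_catalog_table.sales_data.name
    column_names  = ["customer_id", "transaction_value"]
  }
}
```

Row-level and cell-level restrictions are not part of this block; in the Terraform AWS provider they are expressed through a separate data cells filter resource.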
Implementing Granular Security Policies with Lake Formation
Configuring granular security policies involves setting up permissions that align with your organization's data governance policies. Lake Formation provides a comprehensive set of tools to manage these permissions effectively.
Terraform Configuration for Lake Formation Permissions:
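# Table-level grant: the analyst role can run SELECT against all of sales_data.
resource "aws_lakeformation_permissions" "data_access" {
  principal   = aws_iam_role.analyst.arn
  permissions = ["SELECT"]
  table {
    database_name = aws_glue_catalog_database.financial_data.name
    name          = aws_glue_catalog_table.sales_data.name
  }
}
# Column and row restrictions are defined in a data cells filter.
resource "aws_lakeformation_data_cells_filter" "eu_sales" {
  table_data {
    database_name    = aws_glue_catalog_database.financial_data.name
    name             = "eu_sales"
    table_catalog_id = data.aws_caller_identity.current.account_id
    table_name       = aws_glue_catalog_table.sales_data.name
    column_names     = ["customer_id", "transaction_value"]
    row_filter {
      filter_expression = "customer_region = 'EU'"
    }
  }
}
# Granting SELECT on the filter exposes only the filtered rows and columns.
resource "aws_lakeformation_permissions" "row_level_security" {
  principal   = aws_iam_role.analyst.arn
  permissions = ["SELECT"]
  data_cells_filter {
    database_name    = aws_glue_catalog_database.financial_data.name
    name             = "eu_sales"
    table_catalog_id = data.aws_caller_identity.current.account_id
    table_name       = aws_glue_catalog_table.sales_data.name
  }
  depends_on = [aws_lakeformation_data_cells_filter.eu_sales]
}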
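In this example, we set up permissions for an analyst role. The first resource grants SELECT on the sales_data table. The second limits SELECT to the customer_id and transaction_value columns and applies a row filter so that only rows for the EU region are visible. Lake Formation grants are additive, so a role that must be held to the filtered view should receive only the second grant.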
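The same row-filter idea can be repeated per region with for_each. The sketch below is illustrative rather than taken from the project repo: the roles analyst_eu and analyst_us, the local region_analysts, and the filter names are all hypothetical, and data.aws_caller_identity.current is assumed to be declared elsewhere. In the Terraform AWS provider, row filters are expressed through the aws_lakeformation_data_cells_filter resource.

```hcl
# Hypothetical mapping of region code to the IAM role allowed to see it.
locals {
  region_analysts = {
    EU = aws_iam_role.analyst_eu.arn
    US = aws_iam_role.analyst_us.arn
  }
}

# One data cells filter per region, each restricting rows and columns.
resource "aws_lakeformation_data_cells_filter" "by_region" {
  for_each = local.region_analysts

  table_data {
    database_name    = aws_glue_catalog_database.financial_data.name
    name             = "sales_${lower(each.key)}"
    table_catalog_id = data.aws_caller_identity.current.account_id
    table_name       = aws_glue_catalog_table.sales_data.name
    column_names     = ["customer_id", "transaction_value"]

    row_filter {
      filter_expression = "customer_region = '${each.key}'"
    }
  }
}

# Each role receives SELECT only through the filter for its own region.
resource "aws_lakeformation_permissions" "by_region" {
  for_each = local.region_analysts

  principal   = each.value
  permissions = ["SELECT"]

  data_cells_filter {
    database_name    = aws_glue_catalog_database.financial_data.name
    name             = "sales_${lower(each.key)}"
    table_catalog_id = data.aws_caller_identity.current.account_id
    table_name       = aws_glue_catalog_table.sales_data.name
  }

  # The filter must exist before it can be granted.
  depends_on = [aws_lakeformation_data_cells_filter.by_region]
}
```

Keying both resources on the same map keeps the filter and its grant in step: adding a region to region_analysts creates both.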
Integrating IAM with Lake Formation
While Lake Formation provides the tools for fine-grained access control within the data lake, IAM is crucial in managing overall permissions and identities.
Best Practices for IAM Integration:
- Role-Based Access Control (RBAC): Use IAM roles to manage access rights, and grant Lake Formation permissions to those roles rather than to individual users.
- Least Privilege Principle: Assign the minimum necessary permissions to roles and individuals to reduce the risk of unauthorized data access.
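To make these two practices concrete, here is one possible shape for the analyst role used in the earlier grants. It is a sketch under assumptions: the trust policy (the account root) is a placeholder to replace with your real identity source, the policy and role names are invented, and data.aws_caller_identity.current and local.environment are assumed to be declared elsewhere. The role gets no direct S3 permissions, because Lake Formation vends temporary credentials for registered locations through lakeformation:GetDataAccess.

```hcl
# Assumption: trusting the account root is a placeholder; narrow this to the
# users, groups, or federation provider that should assume the role.
data "aws_iam_policy_document" "analyst_trust" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRole"]
    principals {
      type        = "AWS"
      identifiers = ["arn:aws:iam::${data.aws_caller_identity.current.account_id}:root"]
    }
  }
}

# Minimal IAM side of the contract: read catalog metadata and request
# Lake Formation credentials. What data is visible is decided by the
# Lake Formation grants, not by this policy.
data "aws_iam_policy_document" "analyst_access" {
  statement {
    effect = "Allow"
    actions = [
      "lakeformation:GetDataAccess",
      "glue:GetDatabase",
      "glue:GetDatabases",
      "glue:GetTable",
      "glue:GetTables",
      "glue:GetPartitions",
    ]
    resources = ["*"]
  }
}

resource "aws_iam_role" "analyst" {
  name               = "${local.environment}_analyst"
  assume_role_policy = data.aws_iam_policy_document.analyst_trust.json
}

resource "aws_iam_role_policy" "analyst_access" {
  name   = "lake-formation-read"
  role   = aws_iam_role.analyst.id
  policy = data.aws_iam_policy_document.analyst_access.json
}
```

A query engine such as Athena needs its own permissions on top of this; they are left out to keep the sketch short.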
Terraform Configuration for IAM and Lake Formation Integration:
data "aws_iam_policy_document" "lakeformation_policy" {
  statement {
    actions = [
      "glue:CreateDatabase",
      "glue:GetDatabase",
      "glue:UpdateDatabase",
      "glue:DeleteDatabase",
      "glue:CreateTable",
      "glue:GetTable",
      "glue:UpdateTable",
      "glue:DeleteTable",
      "glue:BatchGetJobs",
      "glue:GetJob",
      "glue:StartJobRun",
      "glue:BatchStopJobRun",
      "glue:CreateCrawler",
      "glue:GetCrawler",
      "glue:UpdateCrawler",
      "glue:StartCrawler",
      "glue:StopCrawler"
    ]
    resources = [
      "arn:aws:glue:${var.region}:${data.aws_caller_identity.current.account_id}:catalog",
      "arn:aws:glue:${var.region}:${data.aws_caller_identity.current.account_id}:crawler/${local.environment}",
      "arn:aws:glue:${var.region}:${data.aws_caller_identity.current.account_id}:database/${local.environment}",
      "arn:aws:glue:${var.region}:${data.aws_caller_identity.current.account_id}:job/${local.environment}",
      "arn:aws:glue:${var.region}:${data.aws_caller_identity.current.account_id}:table/${local.environment}/*",
    ]
    effect = "Allow"
  }
  statement {
    effect = "Allow"
    actions = [
      "s3:DeleteObject",
      "s3:GetObject",
      "s3:PutObject",
    ]
    resources = [
      "${data.aws_s3_bucket.bucket.arn}/*"
    ]
  }
  statement {
    effect = "Allow"
    actions = [
      "s3:ListBucket"
    ]
    resources = [data.aws_s3_bucket.bucket.arn]
  }
}
data "aws_iam_policy_document" "lakeformation_role" {
  statement {
    actions = ["sts:AssumeRole"]
    effect  = "Allow"
    principals {
      identifiers = ["lakeformation.amazonaws.com"]
      type        = "Service"
    }
  }
}
locals {
  environment = "${var.environment}_${random_string.this.result}"
}
resource "aws_iam_policy" "lakeformation_service_policy" {
  name        = "${local.environment}_policy"
  description = "Policy that allows sufficient permissions for the crawler"
  policy      = data.aws_iam_policy_document.lakeformation_policy.json
}
resource "aws_iam_role" "lakeformation_service_role" {
  name               = "${local.environment}_role"
  assume_role_policy = data.aws_iam_policy_document.lakeformation_role.json
  tags               = var.tags
}
resource "aws_iam_role_policy_attachment" "lakeformation_service_policy_attachment" {
  role       = aws_iam_role.lakeformation_service_role.name
  policy_arn = aws_iam_policy.lakeformation_service_policy.arn
}
resource "aws_lakeformation_data_lake_settings" "this" {
  admins = [data.aws_iam_session_context.current.issuer_arn]
}
resource "aws_lakeformation_permissions" "caller_catalog_database_permissions" {
  principal   = data.aws_iam_role.terraform.arn
  permissions = ["ALL"]
  database {
    name = aws_glue_catalog_database.this.name
  }
}
resource "aws_lakeformation_permissions" "caller_catalog_table_permissions" {
  principal   = data.aws_iam_role.terraform.arn
  permissions = ["ALL"]
  table {
    database_name = aws_glue_catalog_database.this.name
    wildcard      = true
  }
}
resource "aws_lakeformation_permissions" "glue_catalog_database_permissions" {
  principal = aws_iam_role.glue_service_role.arn
  permissions = [
    "ALTER",
    "CREATE_TABLE",
    "DROP",
  ]
  database {
    name = aws_glue_catalog_database.this.name
  }
}
resource "aws_lakeformation_permissions" "glue_catalog_table_permissions" {
  principal   = aws_iam_role.glue_service_role.arn
  permissions = ["ALL"]
  table {
    database_name = aws_glue_catalog_database.this.name
    wildcard      = true
  }
}
resource "aws_lakeformation_permissions" "s3_data_location_permissions" {
  principal   = aws_iam_role.glue_service_role.arn
  permissions = ["DATA_LOCATION_ACCESS"]
  data_location {
    arn = data.aws_s3_bucket.bucket.arn
  }
}
resource "aws_lakeformation_resource" "this" {
  arn      = data.aws_s3_bucket.bucket.arn
  role_arn = aws_iam_role.lakeformation_service_role.arn
}
These are the permissions I actually used for this project. They are too broad for a production environment and should be scoped down before you reuse them.
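One way to start narrowing the S3 portion is sketched below. It is not from the project repo: the variable data_prefix and the policy document name lakeformation_policy_scoped are hypothetical, and it assumes the lake data sits under a single key prefix in the bucket already referenced as data.aws_s3_bucket.bucket. The sketch drops write and delete access and limits reads and listing to that prefix.

```hcl
# Hypothetical variable: the key prefix that holds the data lake files.
variable "data_prefix" {
  description = "S3 key prefix that holds the data lake files"
  type        = string
  default     = "sales"
}

data "aws_iam_policy_document" "lakeformation_policy_scoped" {
  # Read-only object access, limited to the data prefix.
  statement {
    sid       = "ReadDataUnderPrefix"
    effect    = "Allow"
    actions   = ["s3:GetObject"]
    resources = ["${data.aws_s3_bucket.bucket.arn}/${var.data_prefix}/*"]
  }

  # Listing is a bucket-level action, so the prefix is enforced by condition.
  statement {
    sid       = "ListOnlyThatPrefix"
    effect    = "Allow"
    actions   = ["s3:ListBucket"]
    resources = [data.aws_s3_bucket.bucket.arn]

    condition {
      test     = "StringLike"
      variable = "s3:prefix"
      values   = ["${var.data_prefix}/*"]
    }
  }
}
```

The same approach applies to the Glue statement and to the Lake Formation grants: list only the actions a role actually performs, and replace "ALL" with the specific permissions it needs.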
Implementing fine-grained access control with AWS Lake Formation and IAM provides robust security for your data lake. By leveraging Lake Formation's detailed access control features and IAM's comprehensive identity management capabilities, you can ensure that data within your data lake is secure and compliant with internal and regulatory standards. Using Terraform to manage these configurations as code enhances security and adds a layer of automation that keeps your data governance updated with organizational changes.