S3 Tables Setup
This guide covers the new approach for streaming data from Amazon Data Firehose to S3 Tables without requiring resource links, based on the AWS updates from May 2025.
Prerequisites
- AWS account with appropriate permissions
- AWS CLI installed and configured
- Familiarity with AWS CDK (if using Infrastructure as Code)
- S3 Tables integration with AWS analytics services enabled
Overview
The new approach eliminates the need for resource links by allowing Firehose to access S3 Tables directly through the s3tablescatalog catalog format.
Step 1: Enable S3 Tables Integration with AWS Analytics Services
1.1 Via AWS Console
- Navigate to Amazon S3 Console → Table buckets
- Click Enable integration if not already enabled
- This allows S3 Tables to be discovered by AWS analytics services
1.2 Via AWS CLI
aws s3tables put-table-bucket-policy \
  --table-bucket-arn "arn:aws:s3tables:REGION:ACCOUNT:bucket/BUCKET_NAME" \
  --resource-policy '{
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Principal": {"Service": "analytics.amazonaws.com"},
        "Action": "s3tables:*",
        "Resource": "*"
      }
    ]
  }' \
  --region REGION
Step 2: Create S3 Table Bucket and Table
2.1 Create Table Bucket
aws s3tables create-table-bucket \
--name "your-table-bucket-name" \
--region REGION
2.2 Create Namespace
aws s3tables create-namespace \
--table-bucket-arn "arn:aws:s3tables:REGION:ACCOUNT:bucket/BUCKET_NAME" \
--namespace "your-namespace" \
--region REGION
2.3 Create Table with Schema
# Create table definition file
cat > table-definition.json << 'EOF'
{
  "tableBucketARN": "arn:aws:s3tables:REGION:ACCOUNT:bucket/BUCKET_NAME",
  "namespace": "your-namespace",
  "name": "your-table-name",
  "format": "ICEBERG",
  "metadata": {
    "iceberg": {
      "schema": {
        "fields": [
          {"name": "id", "type": "int", "required": true},
          {"name": "timestamp", "type": "timestamp"},
          {"name": "data", "type": "string"}
        ]
      }
    }
  }
}
EOF
# Create the table
aws s3tables create-table --cli-input-json file://table-definition.json --region REGION
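If you prefer to build the `--cli-input-json` payload in code rather than with a heredoc, a minimal Python sketch is below. The region, account ID, and all names are example placeholders; substitute your own.

```python
import json

# Example placeholder values -- replace with your own region, account,
# bucket, namespace, and table names.
table_definition = {
    "tableBucketARN": "arn:aws:s3tables:us-east-1:123456789012:bucket/my-table-bucket",
    "namespace": "my_namespace",
    "name": "my_table",
    "format": "ICEBERG",
    "metadata": {
        "iceberg": {
            "schema": {
                "fields": [
                    {"name": "id", "type": "int", "required": True},
                    {"name": "timestamp", "type": "timestamp"},
                    {"name": "data", "type": "string"},
                ]
            }
        }
    },
}

# Write the file consumed by `aws s3tables create-table --cli-input-json`.
with open("table-definition.json", "w") as f:
    json.dump(table_definition, f, indent=2)
```

Serializing through `json.dump` guarantees the payload is valid JSON, which is easy to break when hand-editing a heredoc.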
Step 3: Create IAM Role for Firehose
3.1 Create Trust Policy
cat > firehose-trust-policy.json << 'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "firehose.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
3.2 Create IAM Role
aws iam create-role \
  --role-name "firehose-s3-tables-role" \
  --assume-role-policy-document file://firehose-trust-policy.json
Step 4: Create IAM Policies for S3 Tables Access
4.1 Create S3 Tables Access Policy
cat > s3-tables-policy.json << 'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "S3TableAccessViaGlueFederation",
      "Effect": "Allow",
      "Action": [
        "glue:GetTable",
        "glue:GetDatabase",
        "glue:UpdateTable"
      ],
      "Resource": [
        "arn:aws:glue:REGION:ACCOUNT:catalog/s3tablescatalog/TABLE_BUCKET_NAME",
        "arn:aws:glue:REGION:ACCOUNT:catalog/s3tablescatalog",
        "arn:aws:glue:REGION:ACCOUNT:catalog",
        "arn:aws:glue:REGION:ACCOUNT:database/s3tablescatalog/TABLE_BUCKET_NAME/NAMESPACE_NAME",
        "arn:aws:glue:REGION:ACCOUNT:table/s3tablescatalog/TABLE_BUCKET_NAME/NAMESPACE_NAME/TABLE_NAME"
      ]
    },
    {
      "Sid": "LakeFormationDataAccess",
      "Effect": "Allow",
      "Action": [
        "lakeformation:GetDataAccess"
      ],
      "Resource": "*"
    },
    {
      "Sid": "S3TablesDirectAccess",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3tables:REGION:ACCOUNT:bucket/TABLE_BUCKET_NAME",
        "arn:aws:s3tables:REGION:ACCOUNT:bucket/TABLE_BUCKET_NAME/*"
      ]
    },
    {
      "Sid": "S3BackupBucketAccess",
      "Effect": "Allow",
      "Action": [
        "s3:AbortMultipartUpload",
        "s3:GetBucketLocation",
        "s3:GetObject",
        "s3:ListBucket",
        "s3:ListBucketMultipartUploads",
        "s3:PutObject"
      ],
      "Resource": [
        "arn:aws:s3:::backup-bucket-name",
        "arn:aws:s3:::backup-bucket-name/*"
      ]
    }
  ]
}
EOF
# Attach policy to role (IAM is a global service, so no --region is needed)
aws iam put-role-policy \
  --role-name "firehose-s3-tables-role" \
  --policy-name "S3TablesFirehosePolicy" \
  --policy-document file://s3-tables-policy.json
Important: Replace the following placeholders:
- REGION: Your AWS region (e.g., ap-northeast-1)
- ACCOUNT: Your AWS account ID
- TABLE_BUCKET_NAME: Your S3 table bucket name
- NAMESPACE_NAME: Your S3 Tables namespace
- TABLE_NAME: Your S3 Tables table name
- backup-bucket-name: Your backup S3 bucket name
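Because these ARN formats are easy to mistype, the short Python sketch below derives every resource ARN used in the policy from the handful of placeholder values. All values shown are examples, not real identifiers.

```python
# Build the Glue and S3 Tables resource ARNs used by the Firehose policy.
# All values below are example placeholders -- substitute your own.
region, account = "us-east-1", "123456789012"
bucket, namespace, table = "my-table-bucket", "my_namespace", "my_table"

glue_prefix = f"arn:aws:glue:{region}:{account}"
glue_resources = [
    f"{glue_prefix}:catalog/s3tablescatalog/{bucket}",              # bucket-level catalog
    f"{glue_prefix}:catalog/s3tablescatalog",                       # the s3tablescatalog itself
    f"{glue_prefix}:catalog",                                       # account default catalog
    f"{glue_prefix}:database/s3tablescatalog/{bucket}/{namespace}", # namespace as database
    f"{glue_prefix}:table/s3tablescatalog/{bucket}/{namespace}/{table}",
]
s3tables_resources = [
    f"arn:aws:s3tables:{region}:{account}:bucket/{bucket}",
    f"arn:aws:s3tables:{region}:{account}:bucket/{bucket}/*",
]

for arn in glue_resources + s3tables_resources:
    print(arn)
```

Note that all five Glue ARNs are required: Firehose resolves the table through the catalog hierarchy, so the account catalog, the s3tablescatalog, and the bucket-level catalog must all be listed alongside the database and table ARNs.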
Step 5: Configure Lake Formation Permissions
5.1 Grant Permissions via AWS Console
- Navigate to AWS Lake Formation Console
- Go to Data permissions → Grant
- Choose Named Data Catalog resources
- Principal: Select your Firehose role (firehose-s3-tables-role)
- Catalog: Select ACCOUNT:s3tablescatalog/TABLE_BUCKET_NAME
- Database: Select your namespace
- Table: Select your table or "All tables"
- Permissions: Grant Super permissions
- Click Grant
5.2 Grant Permissions via AWS CLI
aws lakeformation grant-permissions \
  --principal DataLakePrincipalIdentifier="arn:aws:iam::ACCOUNT:role/firehose-s3-tables-role" \
  --resource '{
    "Table": {
      "CatalogId": "ACCOUNT:s3tablescatalog/TABLE_BUCKET_NAME",
      "DatabaseName": "NAMESPACE_NAME",
      "Name": "TABLE_NAME"
    }
  }' \
  --permissions "ALL" \
  --region REGION
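The same grant can be scripted with boto3's lakeformation client. The sketch below only constructs the request arguments; the call itself is commented out, and the account ID and names are example placeholders.

```python
# Build the arguments for lakeformation grant_permissions.
# Account ID, bucket, namespace, and table names are examples.
account = "123456789012"
grant_kwargs = {
    "Principal": {
        "DataLakePrincipalIdentifier": f"arn:aws:iam::{account}:role/firehose-s3-tables-role"
    },
    "Resource": {
        "Table": {
            "CatalogId": f"{account}:s3tablescatalog/my-table-bucket",
            "DatabaseName": "my_namespace",
            "Name": "my_table",
        }
    },
    "Permissions": ["ALL"],
}

# import boto3
# boto3.client("lakeformation", region_name="us-east-1").grant_permissions(**grant_kwargs)
```

Note the CatalogId here is not the bare account ID: it embeds the s3tablescatalog path, which is what lets Lake Formation target the S3 Tables bucket directly without a resource link.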
Step 6: Create Firehose Delivery Stream
6.1 Via AWS Console
- Navigate to Amazon Data Firehose Console
- Click Create delivery stream
- Source: Choose Direct PUT
- Destination: Choose Apache Iceberg Tables
- Stream name: Enter your stream name
- Destination settings:
  - Catalog: ACCOUNT:s3tablescatalog/TABLE_BUCKET_NAME
  - Database: Your namespace name
  - Table: Your table name
- IAM role: Select your Firehose role
- S3 backup: Configure backup bucket
- Click Create delivery stream
6.2 Via AWS CLI
cat > firehose-config.json << 'EOF'
{
  "DeliveryStreamName": "your-firehose-stream-name",
  "DeliveryStreamType": "DirectPut",
  "IcebergDestinationConfiguration": {
    "RoleARN": "arn:aws:iam::ACCOUNT:role/firehose-s3-tables-role",
    "CatalogConfiguration": {
      "CatalogARN": "arn:aws:glue:REGION:ACCOUNT:catalog/s3tablescatalog/TABLE_BUCKET_NAME"
    },
    "DestinationTableConfigurationList": [
      {
        "DestinationDatabaseName": "NAMESPACE_NAME",
        "DestinationTableName": "TABLE_NAME"
      }
    ],
    "BufferingHints": {
      "IntervalInSeconds": 60,
      "SizeInMBs": 64
    },
    "S3Configuration": {
      "RoleARN": "arn:aws:iam::ACCOUNT:role/firehose-s3-tables-role",
      "BucketARN": "arn:aws:s3:::backup-bucket-name"
    }
  }
}
EOF
aws firehose create-delivery-stream --cli-input-json file://firehose-config.json --region REGION
Step 7: CDK Implementation (Optional)
7.1 CDK Construct Example
import { ITableBucket } from '@aws-cdk/aws-s3tables-alpha';
import {
  aws_kinesisfirehose as firehose,
  aws_iam as iam,
  aws_s3 as s3,
  Stack,
} from 'aws-cdk-lib';
import { Construct } from 'constructs';

export class S3TablesFirehoseConstruct extends Construct {
  readonly deliveryStream: firehose.CfnDeliveryStream;
  readonly firehoseRole: iam.IRole;

  constructor(scope: Construct, id: string, props: {
    namespaceName: string;
    tableName: string;
    tableBucket: ITableBucket;
  }) {
    super(scope, id);
    const { region, account } = Stack.of(this);

    // Create backup bucket
    const backupBucket = new s3.Bucket(this, 'FirehoseBackupBucket');

    // Create Firehose role
    this.firehoseRole = new iam.Role(this, 'FirehoseRole', {
      assumedBy: new iam.ServicePrincipal('firehose.amazonaws.com'),
      inlinePolicies: {
        S3TablesAccess: new iam.PolicyDocument({
          statements: [
            new iam.PolicyStatement({
              sid: 'S3TableDirectCatalogAccess',
              actions: ['glue:GetDatabase', 'glue:GetTable', 'glue:UpdateTable'],
              resources: [
                `arn:aws:glue:${region}:${account}:catalog/s3tablescatalog/${props.tableBucket.tableBucketName}`,
                `arn:aws:glue:${region}:${account}:catalog/s3tablescatalog`,
                `arn:aws:glue:${region}:${account}:catalog`,
                `arn:aws:glue:${region}:${account}:database/s3tablescatalog/${props.tableBucket.tableBucketName}/${props.namespaceName}`,
                `arn:aws:glue:${region}:${account}:table/s3tablescatalog/${props.tableBucket.tableBucketName}/${props.namespaceName}/${props.tableName}`,
              ],
            }),
            new iam.PolicyStatement({
              sid: 'LakeFormationDataAccess',
              actions: ['lakeformation:GetDataAccess'],
              resources: ['*'],
            }),
            new iam.PolicyStatement({
              sid: 'S3TablesDirectBucketAccess',
              actions: ['s3:GetObject', 's3:PutObject', 's3:DeleteObject', 's3:ListBucket'],
              resources: [
                props.tableBucket.tableBucketArn,
                `${props.tableBucket.tableBucketArn}/*`,
              ],
            }),
          ],
        }),
      },
    });

    // Grant backup bucket permissions
    backupBucket.grantReadWrite(this.firehoseRole);

    // Create Firehose delivery stream
    this.deliveryStream = new firehose.CfnDeliveryStream(this, 'DeliveryStream', {
      deliveryStreamType: 'DirectPut',
      icebergDestinationConfiguration: {
        roleArn: this.firehoseRole.roleArn,
        catalogConfiguration: {
          catalogArn: `arn:aws:glue:${region}:${account}:catalog/s3tablescatalog/${props.tableBucket.tableBucketName}`,
        },
        destinationTableConfigurationList: [{
          destinationDatabaseName: props.namespaceName,
          destinationTableName: props.tableName,
        }],
        bufferingHints: {
          intervalInSeconds: 60,
          sizeInMBs: 64,
        },
        s3Configuration: {
          roleArn: this.firehoseRole.roleArn,
          bucketArn: backupBucket.bucketArn,
        },
      } as any,
    });
  }
}
Step 8: Test the Setup
8.1 Send Test Data
# Create test data
cat > test-data.json << 'EOF'
{"id": 1, "timestamp": "2025-07-18T10:00:00Z", "data": "test message"}
EOF
# Send data to Firehose (AWS CLI v2 needs --cli-binary-format to accept a raw JSON blob)
aws firehose put-record \
  --delivery-stream-name "your-firehose-stream-name" \
  --record '{"Data": "{\"id\": 1, \"timestamp\": \"2025-07-18T10:00:00Z\", \"data\": \"test message\"}"}' \
  --cli-binary-format raw-in-base64-out \
  --region REGION
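With boto3, the record's Data field is passed as raw bytes, so none of the CLI's base64 handling is needed. A minimal sketch follows; the stream name and region are examples, and the boto3 call is commented out so nothing is actually sent.

```python
import json

# Build a newline-delimited JSON payload, the shape Firehose expects
# when appending rows to an Iceberg table.
record = {"id": 1, "timestamp": "2025-07-18T10:00:00Z", "data": "test message"}
payload = (json.dumps(record) + "\n").encode("utf-8")

# import boto3
# firehose = boto3.client("firehose", region_name="us-east-1")  # example region
# firehose.put_record(
#     DeliveryStreamName="your-firehose-stream-name",           # example name
#     Record={"Data": payload},
# )
```

The trailing newline matters when batching: concatenated records without delimiters can be misparsed as a single malformed document.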
8.2 Verify Data in S3 Tables
# Query using Athena
aws athena start-query-execution \
  --query-string "SELECT * FROM \"s3tablescatalog/TABLE_BUCKET_NAME\".\"NAMESPACE_NAME\".\"TABLE_NAME\" LIMIT 10" \
  --result-configuration "OutputLocation=s3://athena-results-bucket/" \
  --region REGION
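The identifier quoting in that query is the part that usually trips people up: the catalog segment contains a slash, so each of the three parts must be individually double-quoted. A small Python helper makes the rule explicit (bucket, namespace, and table names are examples):

```python
def athena_table_ref(bucket: str, namespace: str, table: str) -> str:
    """Quote a three-part S3 Tables identifier for Athena.

    The catalog part contains '/', so each segment needs its own
    double quotes: "s3tablescatalog/bucket"."namespace"."table".
    """
    return f'"s3tablescatalog/{bucket}"."{namespace}"."{table}"'

# Example placeholder names -- substitute your own.
query = f"SELECT * FROM {athena_table_ref('my-table-bucket', 'my_namespace', 'my_table')} LIMIT 10"
print(query)
```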
Key Differences from Previous Approach
❌ Old Approach (With Resource Links)
- Required creating resource links in the default Glue catalog
- Complex setup with multiple catalogs
- Potential synchronization issues
✅ New Approach (Direct S3 Tables Access)
- No resource links required
- Direct access to S3 Tables through s3tablescatalog
- Simpler, more reliable setup
- Better performance
Troubleshooting
Common Issues
- Permission Errors
  - Verify that the IAM role has all required permissions
  - Check that Lake Formation permissions are granted
  - Ensure S3 Tables integration is enabled
- Table Not Found
  - Verify that the table exists in S3 Tables
  - Check that the namespace and table names match exactly
  - Confirm the catalog ARN format
- Access Denied
  - Check Lake Formation permissions
  - Verify that IAM policies use the correct ARN format
  - Ensure the S3 Tables bucket permissions are in place
Verification Commands
# Check S3 Tables resources
aws s3tables list-table-buckets --region REGION
aws s3tables list-namespaces --table-bucket-arn "arn:aws:s3tables:REGION:ACCOUNT:bucket/BUCKET_NAME" --region REGION
aws s3tables list-tables --table-bucket-arn "arn:aws:s3tables:REGION:ACCOUNT:bucket/BUCKET_NAME" --namespace "NAMESPACE_NAME" --region REGION
# Check IAM role
aws iam get-role --role-name "firehose-s3-tables-role"
aws iam list-role-policies --role-name "firehose-s3-tables-role"
# Check Lake Formation permissions
aws lakeformation list-permissions --principal DataLakePrincipalIdentifier="arn:aws:iam::ACCOUNT:role/firehose-s3-tables-role" --region REGION
Conclusion
This new approach significantly simplifies the setup of Firehose with S3 Tables by eliminating the need for resource links. The direct catalog access provides better performance and reliability while reducing the complexity of the overall architecture. The key success factors are:
- Proper IAM permissions with the new ARN format
- Correct Lake Formation permissions
- Using the s3tablescatalog catalog ARN format
- Direct namespace and table references