DynamoDB Table Setup (IaC)
A comprehensive reference for creating, deploying, and managing Amazon DynamoDB tables using Infrastructure as Code (IaC) with TypeScript (AWS CDK), CloudFormation, and AWS Glue for schema discovery.
Table of Contents
- DynamoDB Core Concepts
- Key Schema Design Fundamentals
- Capacity Modes & Billing
- Secondary Indexes (GSI & LSI)
- Creating a DynamoDB Table with AWS CDK (TypeScript)
- Creating a DynamoDB Table with CloudFormation (YAML)
- Console-to-Code: Converting Manual Configuration to IaC
- End-to-End Deployment Workflow
- AWS Glue: Schema Discovery for DynamoDB
- Production Best Practices
- Reference Links
DynamoDB Core Concepts
Amazon DynamoDB is a fully managed NoSQL key-value and document database designed for single-digit millisecond performance at any scale. Unlike relational databases, DynamoDB is schemaless — only the primary key attributes must be defined at table creation time. All other attributes can vary from item to item.
Data Model Hierarchy
| Concept | Description |
| --- | --- |
| Table | A collection of items. Similar to a "table" in relational databases. |
| Item | A single data record in a table. Similar to a "row." Each item is uniquely identified by its primary key. |
| Attribute | A fundamental data element within an item. Similar to a "column." Attributes can be scalar (string, number, binary), document (list, map), or set types. |
Supported Attribute Types
| Type Code | Type | Example |
| --- | --- | --- |
| S | String | "Hello" |
| N | Number | "42" or "3.14" |
| B | Binary | Base64-encoded binary data |
| BOOL | Boolean | true / false |
| NULL | Null | true |
| L | List | ["a", 1, true] |
| M | Map | {"name": "John", "age": 30} |
| SS | String Set | ["a", "b", "c"] |
| NS | Number Set | ["1", "2", "3"] |
| BS | Binary Set | Set of binary values |
Note: Only S, N, and B types can be used for primary key attributes and index key attributes.
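As a rough illustration of the type codes above, the sketch below models a subset of DynamoDB's typed JSON representation and converts plain values into it. The names `AttributeValue`, `Plain`, `toAttributeValue`, and `isValidKeyValue` are hypothetical helpers written for this example, not part of any AWS SDK, and the binary and set types are left out for brevity.

```typescript
// Subset of the typed JSON representation, using the type codes from the table above.
type AttributeValue =
  | { S: string }
  | { N: string } // numbers are carried as strings to preserve precision
  | { BOOL: boolean }
  | { NULL: true }
  | { L: AttributeValue[] }
  | { M: { [key: string]: AttributeValue } };

type Plain = string | number | boolean | null | Plain[] | { [key: string]: Plain };

// Hypothetical helper: convert a plain value into its typed representation.
function toAttributeValue(value: Plain): AttributeValue {
  if (value === null) return { NULL: true };
  if (typeof value === 'string') return { S: value };
  if (typeof value === 'number') return { N: String(value) };
  if (typeof value === 'boolean') return { BOOL: value };
  if (Array.isArray(value)) return { L: value.map((v) => toAttributeValue(v)) };
  const map: { [key: string]: AttributeValue } = {};
  for (const [key, nested] of Object.entries(value)) {
    map[key] = toAttributeValue(nested);
  }
  return { M: map };
}

// Only S, N, and B may be used for key attributes; this sketch covers S and N.
function isValidKeyValue(value: AttributeValue): boolean {
  return 'S' in value || 'N' in value;
}

const marshalled = toAttributeValue({ name: 'John', age: 30 });
// marshalled is { M: { name: { S: 'John' }, age: { N: '30' } } }
```

In practice the AWS SDK ships its own marshalling utilities; the point here is only to show how each plain value maps to a type code.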
Key Schema Design Fundamentals
Every DynamoDB table requires a primary key, defined at creation time. There are two types:
Simple Primary Key (Partition Key Only)
A single attribute that uniquely identifies each item. DynamoDB uses the partition key's value as input to an internal hash function to determine the physical partition where the item is stored.
┌─────────────────────────────────┐
│ Table: Users                    │
│ Partition Key: user_id (S)      │
├─────────────────────────────────┤
│ user_id = "u-001" → Partition A │
│ user_id = "u-002" → Partition B │
│ user_id = "u-003" → Partition A │
└─────────────────────────────────┘
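The routing step can be sketched roughly as follows. DynamoDB's real hash function and partition layout are internal and not exposed, so the FNV-1a hash, the `partitionFor` helper, and the fixed partition count here are stand-ins chosen only to show that the same key always maps to the same partition.

```typescript
// Stand-in hash (32-bit FNV-1a over UTF-16 code units); NOT DynamoDB's actual function.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Hypothetical helper: map a partition key value to one of `partitionCount` partitions.
function partitionFor(partitionKey: string, partitionCount: number): number {
  return fnv1a(partitionKey) % partitionCount;
}

// The same key value is always routed to the same partition.
const firstLookup = partitionFor('u-001', 4);
const secondLookup = partitionFor('u-001', 4);
const sameEveryTime = firstLookup === secondLookup; // true
```

Because placement depends only on the key value, a key that receives a disproportionate share of traffic concentrates load on one partition, which is why key cardinality matters.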
Composite Primary Key (Partition Key + Sort Key)
Two attributes together form the primary key. Multiple items can share the same partition key, but each item within a partition must have a unique sort key. Items with the same partition key are stored together, sorted by the sort key value.
┌──────────────────────────────────────────────────┐
│ Table: Orders                                    │
│ Partition Key: customer_id (S)                   │
│ Sort Key: order_date (S)                         │
├──────────────────────────────────────────────────┤
│ customer_id = "c-100", order_date = "2025-01-15" │
│ customer_id = "c-100", order_date = "2025-03-22" │ ← Same partition
│ customer_id = "c-200", order_date = "2025-02-10" │ ← Different partition
└──────────────────────────────────────────────────┘
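A small in-memory model can make the composite-key behavior concrete. This is only a sketch: `OrdersModel`, `put`, and `queryBetween` are hypothetical names, and the `total` attribute is invented for the example. It shows two properties described above: items sharing a partition key are kept in sort-key order, and a write with an existing key pair replaces the earlier item.

```typescript
interface Order {
  customer_id: string; // partition key
  order_date: string; // sort key (ISO date, so string order matches date order)
  total: number; // illustrative non-key attribute
}

class OrdersModel {
  private collections = new Map<string, Order[]>();

  // A write with an existing (customer_id, order_date) pair replaces that item.
  put(item: Order): void {
    const existing = this.collections.get(item.customer_id) ?? [];
    const others = existing.filter((o) => o.order_date !== item.order_date);
    others.push(item);
    others.sort((a, b) => (a.order_date < b.order_date ? -1 : 1));
    this.collections.set(item.customer_id, others);
  }

  // Range query within a single partition key, inclusive on both ends.
  queryBetween(customerId: string, from: string, to: string): Order[] {
    const collection = this.collections.get(customerId) ?? [];
    return collection.filter((o) => o.order_date >= from && o.order_date <= to);
  }
}

const orders = new OrdersModel();
orders.put({ customer_id: 'c-100', order_date: '2025-03-22', total: 40 });
orders.put({ customer_id: 'c-100', order_date: '2025-01-15', total: 25 });
orders.put({ customer_id: 'c-200', order_date: '2025-02-10', total: 10 });
const firstQuarter = orders.queryBetween('c-100', '2025-01-01', '2025-03-31');
// firstQuarter holds the two c-100 orders, oldest first
```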
Key Design Tips
- High-cardinality partition keys distribute data more evenly across partitions, avoiding hot partitions.
- Use composite sort keys (e.g., STATUS#2025-01-15) to enable range queries and hierarchical data models.
- Design your keys based on access patterns first, not entity relationships.
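The composite sort key tip can be sketched as below. The helpers `buildSortKey` and `queryByPrefix` are hypothetical and run in memory; `queryByPrefix` only imitates what a `begins_with` key condition on the sort key would match.

```typescript
// Build a composite sort key such as "SHIPPED#2025-01-15".
function buildSortKey(status: string, isoDate: string): string {
  return `${status}#${isoDate}`;
}

// Imitates a begins_with condition on the sort key, returning matches in key order.
function queryByPrefix(sortKeys: string[], prefix: string): string[] {
  return sortKeys.filter((sk) => sk.startsWith(prefix)).sort();
}

const exampleSortKeys = [
  buildSortKey('SHIPPED', '2025-02-03'),
  buildSortKey('PENDING', '2025-01-20'),
  buildSortKey('SHIPPED', '2025-01-15'),
];

const allShipped = queryByPrefix(exampleSortKeys, 'SHIPPED#');
// allShipped is ['SHIPPED#2025-01-15', 'SHIPPED#2025-02-03']
const shippedInJanuary = queryByPrefix(exampleSortKeys, 'SHIPPED#2025-01');
// shippedInJanuary is ['SHIPPED#2025-01-15']
```

Putting the broader component (status) first and the narrower one (date) second is what lets one prefix serve both "all shipped orders" and "shipped orders in a given month".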
Capacity Modes & Billing
DynamoDB offers two capacity modes. You choose a mode at table creation and can switch later: a table can move from provisioned to on-demand up to four times in a rolling 24-hour window, and from on-demand to provisioned at any time.
On-Demand Mode (PAY_PER_REQUEST)
- No capacity planning required; DynamoDB scales automatically.
- You pay per read/write request.
- Ideal for unpredictable workloads, new tables, or spiky traffic.
- Requests can still be throttled if traffic exceeds the per-table throughput quotas or grows to more than double the previous peak within 30 minutes.
Provisioned Mode (PROVISIONED)
- You specify Read Capacity Units (RCUs) and Write Capacity Units (WCUs).
- 1 RCU = one strongly consistent read per second (or two eventually consistent reads per second) for items up to 4 KB.
- 1 WCU = one write per second for items up to 1 KB.
- Typically paired with Auto Scaling to adjust capacity based on utilization.
- More cost-effective for predictable, steady workloads.
| Factor | On-Demand | Provisioned |
| --- | --- | --- |
| Cost model | Per-request pricing | Hourly rate for provisioned capacity |
| Scaling | Automatic | Manual or via Auto Scaling |
| Best for | Unpredictable traffic | Steady, predictable traffic |
| Capacity planning | None required | Must estimate RCU/WCU |
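To estimate RCU/WCU for provisioned mode, the unit definitions above can be turned into a small calculator. This is a sketch with hypothetical function names; it rounds item sizes up to the next 4 KB (reads) or 1 KB (writes) and assumes an eventually consistent read costs half of a strongly consistent one. Transactional requests are not modeled.

```typescript
const READ_UNIT_BYTES = 4 * 1024;
const WRITE_UNIT_BYTES = 1024;

// RCUs needed for a steady read rate at a given item size.
function readCapacityUnits(
  itemSizeBytes: number,
  readsPerSecond: number,
  stronglyConsistent: boolean,
): number {
  const unitsPerRead = Math.ceil(itemSizeBytes / READ_UNIT_BYTES);
  const costPerRead = stronglyConsistent ? unitsPerRead : unitsPerRead / 2;
  return Math.ceil(costPerRead * readsPerSecond);
}

// WCUs needed for a steady write rate at a given item size.
function writeCapacityUnits(itemSizeBytes: number, writesPerSecond: number): number {
  return Math.ceil(itemSizeBytes / WRITE_UNIT_BYTES) * writesPerSecond;
}

// 6 KB items read 100 times per second
const strongReads = readCapacityUnits(6 * 1024, 100, true); // 200
const eventualReads = readCapacityUnits(6 * 1024, 100, false); // 100
// 2.5 KB items written 50 times per second
const writes = writeCapacityUnits(2560, 50); // 150
```

Note how a 6 KB item costs two read units, not 1.5: sizes are rounded up per request, so item size near a boundary has an outsized effect on cost.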
Secondary Indexes (GSI & LSI)
Secondary indexes let you query data using attributes other than the table's primary key.
Global Secondary Index (GSI)
- Can have a different partition key and sort key from the base table.
- Can be added or removed after table creation.
- Has its own provisioned throughput (in provisioned mode).
- Supports eventually consistent reads only.
- Default quota of 20 GSIs per table.
Local Secondary Index (LSI)
- Shares the same partition key as the base table but has a different sort key.
- Must be created at table creation time; cannot be added later.
- Supports both strongly consistent and eventually consistent reads.
- Maximum of 5 LSIs per table.
- A table with LSIs is limited to 10 GB per item collection (all items sharing one partition key value, including their index entries).
Projection Types
When creating an index, you choose which attributes to project (copy) into the index:
| Projection Type | Description |
| --- | --- |
| KEYS_ONLY | Only the table and index keys are projected. Cheapest storage. |
| INCLUDE | Keys plus specific non-key attributes you specify. |
| ALL | All attributes are projected. Most flexible but highest storage cost. |
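The effect of each projection type can be sketched with a small function that decides which attributes of an item would be copied into an index. `Item`, `Projection`, and `projectItem` are hypothetical names for this illustration, and the sample product is invented; `keyAttributes` is assumed to hold both the table keys and the index keys.

```typescript
type Item = { [attribute: string]: string | number | boolean };

type Projection =
  | { type: 'KEYS_ONLY' }
  | { type: 'INCLUDE'; nonKeyAttributes: string[] }
  | { type: 'ALL' };

// Returns the attributes that would be copied into the index for this item.
function projectItem(item: Item, keyAttributes: string[], projection: Projection): Item {
  if (projection.type === 'ALL') return { ...item };
  const wanted = new Set<string>(keyAttributes);
  if (projection.type === 'INCLUDE') {
    projection.nonKeyAttributes.forEach((attribute) => wanted.add(attribute));
  }
  const projected: Item = {};
  for (const [attribute, value] of Object.entries(item)) {
    if (wanted.has(attribute)) projected[attribute] = value;
  }
  return projected;
}

const product: Item = {
  product_id: 'p-1',
  created_at: '2025-01-15',
  category: 'books',
  price: 12.5,
  title: 'Example',
  description: 'A long description',
};
const indexKeys = ['product_id', 'created_at', 'category', 'price'];

const keysOnly = projectItem(product, indexKeys, { type: 'KEYS_ONLY' });
// keysOnly has 4 attributes: the table keys and the index keys
const withTitle = projectItem(product, indexKeys, { type: 'INCLUDE', nonKeyAttributes: ['title'] });
// withTitle has 5 attributes
```

A query against an index can only return projected attributes, so the trade-off is between storage cost and having to fetch missing attributes from the base table.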
Creating a DynamoDB Table with AWS CDK (TypeScript)
AWS CDK (Cloud Development Kit) lets you define cloud infrastructure using familiar programming languages. The aws-cdk-lib/aws-dynamodb module provides high-level constructs for DynamoDB.
Prerequisites
# Verify Node.js is installed (v18+ recommended)
node -v
# Install AWS CDK globally
npm install -g aws-cdk
# Verify CDK version
cdk --version
# Verify AWS credentials are configured
aws sts get-caller-identity
Project Setup
# Create a new CDK project
mkdir dynamodb-cdk && cd dynamodb-cdk
cdk init app --language=typescript
# Install dependencies (aws-cdk-lib includes DynamoDB constructs)
npm install aws-cdk-lib constructs
Example 1: Basic Table with On-Demand Billing
// lib/dynamodb-stack.ts
import * as cdk from 'aws-cdk-lib';
import * as dynamodb from 'aws-cdk-lib/aws-dynamodb';
import { Construct } from 'constructs';
export class DynamoDbStack extends cdk.Stack {
public readonly usersTable: dynamodb.Table;
constructor(scope: Construct, id: string, props?: cdk.StackProps) {
super(scope, id, props);
this.usersTable = new dynamodb.Table(this, 'UsersTable', {
tableName: 'Users',
partitionKey: {
name: 'user_id',
type: dynamodb.AttributeType.STRING,
},
billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,
removalPolicy: cdk.RemovalPolicy.DESTROY, // Use RETAIN for production
});
// Output the table name
new cdk.CfnOutput(this, 'TableName', {
value: this.usersTable.tableName,
});
}
}
Example 2: Composite Key with Sort Key
const ordersTable = new dynamodb.Table(this, 'OrdersTable', {
tableName: 'Orders',
partitionKey: {
name: 'customer_id',
type: dynamodb.AttributeType.STRING,
},
sortKey: {
name: 'order_date',
type: dynamodb.AttributeType.STRING,
},
billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,
removalPolicy: cdk.RemovalPolicy.RETAIN,
pointInTimeRecovery: true, // Enable PITR backups
});
Example 3: Table with Global Secondary Indexes
const productsTable = new dynamodb.Table(this, 'ProductsTable', {
tableName: 'Products',
partitionKey: {
name: 'product_id',
type: dynamodb.AttributeType.STRING,
},
sortKey: {
name: 'created_at',
type: dynamodb.AttributeType.STRING,
},
billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,
removalPolicy: cdk.RemovalPolicy.DESTROY,
});
// GSI 1: Query products by category and price
productsTable.addGlobalSecondaryIndex({
indexName: 'CategoryPriceIndex',
partitionKey: {
name: 'category',
type: dynamodb.AttributeType.STRING,
},
sortKey: {
name: 'price',
type: dynamodb.AttributeType.NUMBER,
},
projectionType: dynamodb.ProjectionType.ALL,
});
// GSI 2: Query products by seller (keys only for lightweight lookups)
productsTable.addGlobalSecondaryIndex({
indexName: 'SellerIndex',
partitionKey: {
name: 'seller_id',
type: dynamodb.AttributeType.STRING,
},
projectionType: dynamodb.ProjectionType.KEYS_ONLY,
});
Example 4: Table with Local Secondary Index
const messagesTable = new dynamodb.Table(this, 'MessagesTable', {
tableName: 'Messages',
partitionKey: {
name: 'conversation_id',
type: dynamodb.AttributeType.STRING,
},
sortKey: {
name: 'timestamp',
type: dynamodb.AttributeType.NUMBER,
},
billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,
removalPolicy: cdk.RemovalPolicy.DESTROY,
});
// LSI: Query messages by sender within a conversation
// NOTE: LSIs must be defined at table creation time
messagesTable.addLocalSecondaryIndex({
indexName: 'SenderIndex',
sortKey: {
name: 'sender_id',
type: dynamodb.AttributeType.STRING,
},
projectionType: dynamodb.ProjectionType.ALL,
});
Example 5: Provisioned Capacity with Auto Scaling
const sessionsTable = new dynamodb.Table(this, 'SessionsTable', {
tableName: 'Sessions',
partitionKey: {
name: 'session_id',
type: dynamodb.AttributeType.STRING,
},
billingMode: dynamodb.BillingMode.PROVISIONED,
readCapacity: 100,
writeCapacity: 50,
removalPolicy: cdk.RemovalPolicy.RETAIN,
});
// Auto-scale read capacity between 100 and 5000 RCU
const readScaling = sessionsTable.autoScaleReadCapacity({
minCapacity: 100,
maxCapacity: 5000,
});
readScaling.scaleOnUtilization({
targetUtilizationPercent: 70,
scaleInCooldown: cdk.Duration.minutes(1),
scaleOutCooldown: cdk.Duration.minutes(1),
});
// Auto-scale write capacity between 50 and 2000 WCU
const writeScaling = sessionsTable.autoScaleWriteCapacity({
minCapacity: 50,
maxCapacity: 2000,
});
writeScaling.scaleOnUtilization({
targetUtilizationPercent: 70,
});
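To reason about what `targetUtilizationPercent` means, a simplified model of target tracking is shown below. This is an approximation written for illustration, not the actual Application Auto Scaling algorithm (which works through CloudWatch alarms and cooldown periods); `desiredCapacity` is a hypothetical helper.

```typescript
// Simplified model: pick the capacity at which current consumption would sit
// at the target utilization, then clamp to the configured bounds.
function desiredCapacity(
  consumedUnits: number,
  targetUtilizationPercent: number,
  minCapacity: number,
  maxCapacity: number,
): number {
  const unclamped = Math.ceil((consumedUnits * 100) / targetUtilizationPercent);
  return Math.min(maxCapacity, Math.max(minCapacity, unclamped));
}

// With the read settings above (target 70%, min 100, max 5000):
const steady = desiredCapacity(350, 70, 100, 5000); // 500
const idle = desiredCapacity(10, 70, 100, 5000); // 100 (clamped to the minimum)
const surge = desiredCapacity(7000, 70, 100, 5000); // 5000 (clamped to the maximum)
```

A lower target leaves more headroom for bursts at a higher steady-state cost; a higher target is cheaper but reacts later to sudden growth.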
Example 6: DynamoDB Streams & TTL
const eventsTable = new dynamodb.Table(this, 'EventsTable', {
tableName: 'Events',
partitionKey: {
name: 'event_id',
type: dynamodb.AttributeType.STRING,
},
sortKey: {
name: 'timestamp',
type: dynamodb.AttributeType.NUMBER,
},
billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,
stream: dynamodb.StreamViewType.NEW_AND_OLD_IMAGES,
timeToLiveAttribute: 'ttl', // Items with expired TTL are auto-deleted
removalPolicy: cdk.RemovalPolicy.DESTROY,
});
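The value stored in the `ttl` attribute must be a Number holding a Unix epoch timestamp in seconds. A minimal sketch of computing and checking such a value follows; `ttlEpochSeconds` and `isExpired` are hypothetical helpers. Keep in mind that DynamoDB removes expired items in the background rather than at the exact expiry time, so readers may still see expired items for a while.

```typescript
// Expiry timestamp in epoch seconds, suitable for a Number attribute named 'ttl'.
function ttlEpochSeconds(now: Date, lifetimeSeconds: number): number {
  return Math.floor(now.getTime() / 1000) + lifetimeSeconds;
}

// True once the current time has passed the stored TTL value.
// Useful for filtering out items that are expired but not yet deleted.
function isExpired(ttl: number, now: Date): boolean {
  return ttl < Math.floor(now.getTime() / 1000);
}

const createdAt = new Date(0); // epoch start, used here for a predictable example
const expiresAt = ttlEpochSeconds(createdAt, 3600); // 3600
const stillValid = !isExpired(expiresAt, new Date(1800 * 1000)); // true
const expired = isExpired(expiresAt, new Date(3601 * 1000)); // true
```

A common mistake is storing milliseconds (for example, the raw `Date.now()` value), which yields a timestamp far in the future and prevents items from ever expiring.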
Example 7: Encryption with Customer-Managed KMS Key
import * as kms from 'aws-cdk-lib/aws-kms';
const encryptionKey = new kms.Key(this, 'DynamoDbKey', {
description: 'KMS key for DynamoDB encryption',
enableKeyRotation: true,
});
const sensitiveTable = new dynamodb.Table(this, 'SensitiveDataTable', {
tableName: 'SensitiveData',
partitionKey: {
name: 'record_id',
type: dynamodb.AttributeType.STRING,
},
billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,
encryption: dynamodb.TableEncryption.CUSTOMER_MANAGED,
encryptionKey: encryptionKey,
pointInTimeRecovery: true,
removalPolicy: cdk.RemovalPolicy.RETAIN,
contributorInsightsEnabled: true, // Enable CloudWatch contributor insights
});
// Add tags for cost tracking
cdk.Tags.of(sensitiveTable).add('Environment', 'production');
cdk.Tags.of(sensitiveTable).add('Team', 'backend');
Example 8: Global Tables (Multi-Region Replication)
For global tables, use the TableV2 construct, which is the recommended approach for new projects:
import * as cdk from 'aws-cdk-lib';
import * as dynamodb from 'aws-cdk-lib/aws-dynamodb';
const app = new cdk.App();
// Stack must have a defined region for global tables
const stack = new cdk.Stack(app, 'GlobalTableStack', {
  env: { region: 'us-west-2' },
});
const globalTable = new dynamodb.TableV2(stack, 'GlobalTable', {
tableName: 'GlobalUsers',
partitionKey: {
name: 'pk',
type: dynamodb.AttributeType.STRING,
},
sortKey: {
name: 'sk',
type: dynamodb.AttributeType.STRING,
},
billing: dynamodb.Billing.onDemand(),
replicas: [
{ region: 'us-east-1' },
{ region: 'eu-west-1' },
],
});
Granting IAM Permissions
CDK provides convenient grant methods to manage access to your tables:
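import * as dynamodb from 'aws-cdk-lib/aws-dynamodb';
import * as lambda from 'aws-cdk-lib/aws-lambda';
// Both are assumed to be defined elsewhere in the stack
declare const usersTable: dynamodb.Table;
declare const myFunction: lambda.Function;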
// Grant the Lambda function read/write access
usersTable.grantReadWriteData(myFunction);
// Or more granular permissions
usersTable.grantReadData(myFunction); // Read-only
usersTable.grantWriteData(myFunction); // Write-only
usersTable.grant(myFunction, 'dynamodb:Query'); // Specific actions
App Entry Point
// bin/app.ts
import * as cdk from 'aws-cdk-lib';
import { DynamoDbStack } from '../lib/dynamodb-stack';
const app = new cdk.App();
new DynamoDbStack(app, 'DynamoDbStack', {
env: {
account: process.env.CDK_DEFAULT_ACCOUNT,
region: process.env.CDK_DEFAULT_REGION,
},
});
Creating a DynamoDB Table with CloudFormation (YAML)
If you prefer declarative YAML/JSON templates over CDK, you can use AWS CloudFormation directly.
Example 1: Simple Table
AWSTemplateFormatVersion: '2010-09-09'
Description: DynamoDB Table with On-Demand Billing
Resources:
  UsersTable:
    Type: AWS::DynamoDB::Table
    DeletionPolicy: Retain
    UpdateReplacePolicy: Retain
    Properties:
      TableName: Users
      BillingMode: PAY_PER_REQUEST
      AttributeDefinitions:
        - AttributeName: user_id
          AttributeType: S
      KeySchema:
        - AttributeName: user_id
          KeyType: HASH
      PointInTimeRecoverySpecification:
        PointInTimeRecoveryEnabled: true
Outputs:
  TableName:
    Value: !Ref UsersTable
  TableArn:
    Value: !GetAtt UsersTable.Arn
Example 2: Table with GSI and Provisioned Throughput
AWSTemplateFormatVersion: '2010-09-09'
Description: DynamoDB Table with GSI and Provisioned Throughput
Resources:
  OrdersTable:
    Type: AWS::DynamoDB::Table
    DeletionPolicy: Retain
    Properties:
      TableName: Orders
      AttributeDefinitions:
        - AttributeName: customer_id
          AttributeType: S
        - AttributeName: order_date
          AttributeType: S
        - AttributeName: status
          AttributeType: S
      KeySchema:
        - AttributeName: customer_id
          KeyType: HASH
        - AttributeName: order_date
          KeyType: RANGE
      GlobalSecondaryIndexes:
        - IndexName: StatusDateIndex
          KeySchema:
            - AttributeName: status
              KeyType: HASH
            - AttributeName: order_date
              KeyType: RANGE
          Projection:
            ProjectionType: ALL
          ProvisionedThroughput:
            ReadCapacityUnits: 5
            WriteCapacityUnits: 5
      ProvisionedThroughput:
        ReadCapacityUnits: 5
        WriteCapacityUnits: 5
      StreamSpecification:
        StreamViewType: NEW_AND_OLD_IMAGES
      TimeToLiveSpecification:
        AttributeName: ttl
        Enabled: true
Deploying a CloudFormation Stack
# Create stack
aws cloudformation create-stack \
--stack-name my-dynamodb-stack \
--template-body file://template.yaml
# Update stack
aws cloudformation update-stack \
--stack-name my-dynamodb-stack \
--template-body file://template.yaml
# Delete stack
aws cloudformation delete-stack \
--stack-name my-dynamodb-stack
Important: When your template includes multiple DynamoDB tables with secondary indexes, you must declare DependsOn relationships so they are created sequentially. DynamoDB limits the number of tables with secondary indexes that can be in the CREATING state simultaneously.
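One way to picture the sequential-creation requirement is a helper that chains table resources so each one depends on the previous one. The sketch below builds a plain object shaped like a template's Resources section; `chainDependsOn` and the logical IDs are hypothetical, and the table Properties are omitted for brevity. In a CDK app, a comparable ordering can be expressed with `tableB.node.addDependency(tableA)`.

```typescript
interface TableResource {
  Type: 'AWS::DynamoDB::Table';
  DependsOn?: string;
}

// Chain the given logical IDs so each table is created after the one before it.
function chainDependsOn(logicalIds: string[]): { [logicalId: string]: TableResource } {
  const resources: { [logicalId: string]: TableResource } = {};
  let previous: string | undefined;
  for (const id of logicalIds) {
    resources[id] =
      previous === undefined
        ? { Type: 'AWS::DynamoDB::Table' }
        : { Type: 'AWS::DynamoDB::Table', DependsOn: previous };
    previous = id;
  }
  return resources;
}

const chained = chainDependsOn(['OrdersTable', 'ProductsTable', 'MessagesTable']);
// ProductsTable depends on OrdersTable; MessagesTable depends on ProductsTable
```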
Console-to-Code: Converting Manual Configuration to IaC
AWS provides a feature called Console-to-Code (powered by Amazon Q Developer) that captures your manual DynamoDB console actions and converts them into reusable infrastructure code.
How It Works
- Prototype in the console: use the DynamoDB console to create and configure tables with your desired settings (partition keys, sort keys, throughput, indexes, etc.).
- Record actions: Console-to-Code records these configuration actions as you perform them.
- Generate code: the tool uses generative AI to transform your console actions into code in your preferred format.
- Customize & deploy: copy or download the generated code and adapt it for production.
Supported Output Formats
- AWS CDK in TypeScript, Python, and Java
- CloudFormation in YAML or JSON
Getting Started with Console-to-Code
- Sign in to the AWS Management Console.
- Open the DynamoDB console at https://console.aws.amazon.com/dynamodbv2/.
- Begin creating or modifying DynamoDB resources through the console.
- Use the Console-to-Code panel to generate code for your actions.
- Copy or download the generated code.
This feature is available in all commercial AWS Regions. For detailed instructions, see the Amazon Q Developer User Guide on Console-to-Code.
Reference
- AWS Documentation: Generate infrastructure code for DynamoDB using Console-to-Code
- Amazon Q Developer: Automating AWS services with Console-to-Code
End-to-End Deployment Workflow
This section walks through deploying a DynamoDB table from scratch using AWS CDK with TypeScript.
Step 1: Bootstrap Your AWS Environment
CDK requires a one-time bootstrap per account/region to provision resources CDK needs to deploy (S3 bucket for assets, IAM roles, etc.):
cdk bootstrap aws://ACCOUNT_ID/REGION
# Example:
cdk bootstrap aws://123456789012/us-east-1
Step 2: Initialize the CDK Project
mkdir my-dynamodb-app && cd my-dynamodb-app
cdk init app --language=typescript
Step 3: Define Your Stack
Edit lib/my-dynamodb-app-stack.ts:
import * as cdk from 'aws-cdk-lib';
import * as dynamodb from 'aws-cdk-lib/aws-dynamodb';
import { Construct } from 'constructs';
export class MyDynamodbAppStack extends cdk.Stack {
constructor(scope: Construct, id: string, props?: cdk.StackProps) {
super(scope, id, props);
// Create the DynamoDB table
const table = new dynamodb.Table(this, 'MyAppTable', {
tableName: 'MyAppData',
partitionKey: {
name: 'pk',
type: dynamodb.AttributeType.STRING,
},
sortKey: {
name: 'sk',
type: dynamodb.AttributeType.STRING,
},
billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,
removalPolicy: cdk.RemovalPolicy.DESTROY,
pointInTimeRecovery: true,
timeToLiveAttribute: 'ttl',
});
// Add a GSI for querying by type and date
table.addGlobalSecondaryIndex({
indexName: 'GSI1',
partitionKey: {
name: 'GSI1PK',
type: dynamodb.AttributeType.STRING,
},
sortKey: {
name: 'GSI1SK',
type: dynamodb.AttributeType.STRING,
},
projectionType: dynamodb.ProjectionType.ALL,
});
// Outputs
new cdk.CfnOutput(this, 'TableNameOutput', {
value: table.tableName,
exportName: 'MyAppTableName',
});
new cdk.CfnOutput(this, 'TableArnOutput', {
value: table.tableArn,
exportName: 'MyAppTableArn',
});
}
}
Step 4: Synthesize the CloudFormation Template
# Preview the generated CloudFormation template
cdk synth
This prints the synthesized CloudFormation template to stdout as YAML and writes the template file (JSON) to cdk.out/.
Step 5: Diff Against Existing Infrastructure
# See what changes will be made
cdk diff
Step 6: Deploy
# Deploy the stack
cdk deploy
# Deploy with a specific AWS profile
cdk deploy --profile my-profile
# Deploy without confirmation prompts
cdk deploy --require-approval never
Step 7: Verify
# Verify the table exists
aws dynamodb describe-table --table-name MyAppData
# Get the table name from CloudFormation outputs
aws cloudformation describe-stacks \
--stack-name MyDynamodbAppStack \
--query 'Stacks[0].Outputs'
Step 8: Clean Up
# Destroy the stack (tables whose removalPolicy is RETAIN are left in place and must be deleted separately)
cdk destroy
Deployment Flow Diagram
┌──────────────┐     ┌────────────────┐     ┌──────────────┐
│  TypeScript  │     │ CloudFormation │     │     AWS      │
│   CDK Code   │────▶│    Template    │────▶│  Resources   │
│ (lib/*.ts)   │     │   (cdk.out/)   │     │  (DynamoDB)  │
└──────────────┘     └────────────────┘     └──────────────┘
    cdk synth            cdk deploy            Live table
AWS Glue: Schema Discovery for DynamoDB
AWS Glue can automatically discover and catalog the schema of your DynamoDB tables. This is useful for analytics, querying DynamoDB data with Athena, or documenting your table structures.
What Is a Glue Crawler?
An AWS Glue crawler connects to a data store (such as DynamoDB), scans items, infers the schema (column names, data types), and writes the metadata to the AWS Glue Data Catalog. The Data Catalog tables can then be used by services like Amazon Athena, Redshift Spectrum, and Glue ETL jobs.
How Glue Crawls DynamoDB
When a Glue crawler runs against a DynamoDB table, it performs a Scan operation and reads the first 1 MB of data (data sampling) to infer the schema. If your table has highly variable schemas across items, you may want to disable data sampling and let the crawler scan the entire table for more accurate results.
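The trade-off between sampling and a full scan can be illustrated with a toy schema-inference routine. This is not Glue's classifier: `inferSchema` is a hypothetical helper, it samples by item count rather than by data volume, and it records JavaScript `typeof` results instead of Glue data types. It only shows why attributes that appear late in a table with variable item shapes can be missed by a sample.

```typescript
type SampleItem = { [attribute: string]: string | number | boolean };

// Collect every attribute name and the value types observed for it,
// looking only at the first `sampleSize` items.
function inferSchema(
  items: SampleItem[],
  sampleSize: number = items.length,
): Map<string, Set<string>> {
  const schema = new Map<string, Set<string>>();
  for (const item of items.slice(0, sampleSize)) {
    for (const [attribute, value] of Object.entries(item)) {
      const observed = schema.get(attribute) ?? new Set<string>();
      observed.add(typeof value);
      schema.set(attribute, observed);
    }
  }
  return schema;
}

const catalogItems: SampleItem[] = [
  { product_id: 'p-1', category: 'books', price: 12.5 },
  { product_id: 'p-2', category: 'toys', price: 8 },
  { product_id: 'p-3', category: 'games', in_stock: true },
];

const sampled = inferSchema(catalogItems, 2);
const full = inferSchema(catalogItems);
const sampledSeesInStock = sampled.has('in_stock'); // false
const fullSeesInStock = full.has('in_stock'); // true
```

A full scan finds every attribute but reads the whole table, which consumes more read capacity and takes longer on large tables.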
Setting Up a Glue Crawler for DynamoDB with CDK
Here is an example that creates a DynamoDB table, a Glue database, an IAM role for the crawler, and a Glue crawler that discovers the table's schema:
// lib/glue-dynamodb-stack.ts
import * as cdk from 'aws-cdk-lib';
import * as dynamodb from 'aws-cdk-lib/aws-dynamodb';
import * as glue from 'aws-cdk-lib/aws-glue';
import * as iam from 'aws-cdk-lib/aws-iam';
import { Construct } from 'constructs';
export class GlueDynamoDbStack extends cdk.Stack {
constructor(scope: Construct, id: string, props?: cdk.StackProps) {
super(scope, id, props);
// ── 1. Create the DynamoDB Table ──
const productTable = new dynamodb.Table(this, 'ProductCatalog', {
tableName: 'ProductCatalog',
partitionKey: {
name: 'product_id',
type: dynamodb.AttributeType.STRING,
},
sortKey: {
name: 'category',
type: dynamodb.AttributeType.STRING,
},
billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,
removalPolicy: cdk.RemovalPolicy.DESTROY,
});
// ── 2. Create the Glue Database ──
const glueDatabase = new glue.CfnDatabase(this, 'GlueDatabase', {
catalogId: this.account,
databaseInput: {
name: 'dynamodb_catalog',
description: 'Glue catalog for DynamoDB table schemas',
},
});
// ── 3. Create IAM Role for the Crawler ──
const crawlerRole = new iam.Role(this, 'GlueCrawlerRole', {
assumedBy: new iam.ServicePrincipal('glue.amazonaws.com'),
managedPolicies: [
iam.ManagedPolicy.fromAwsManagedPolicyName(
'service-role/AWSGlueServiceRole'
),
],
});
// Grant the crawler read access to the DynamoDB table
productTable.grantReadData(crawlerRole);
// ── 4. Create the Glue Crawler ──
const crawler = new glue.CfnCrawler(this, 'DynamoDbCrawler', {
name: 'product-catalog-crawler',
role: crawlerRole.roleArn,
databaseName: 'dynamodb_catalog',
targets: {
dynamoDbTargets: [
{
path: productTable.tableName,
},
],
},
schemaChangePolicy: {
updateBehavior: 'UPDATE_IN_DATABASE',
deleteBehavior: 'LOG',
},
schedule: {
// Run daily at 2 AM UTC
scheduleExpression: 'cron(0 2 * * ? *)',
},
});
crawler.addDependency(glueDatabase);
// ── 5. Outputs ──
new cdk.CfnOutput(this, 'CrawlerName', {
value: crawler.name!,
});
new cdk.CfnOutput(this, 'GlueDatabaseName', {
value: 'dynamodb_catalog',
});
}
}
Running the Crawler Manually
After deploying the stack, run the crawler to discover the schema:
# Start the crawler
aws glue start-crawler --name product-catalog-crawler
# Check the crawler status
aws glue get-crawler --name product-catalog-crawler \
--query 'Crawler.State'
# Once completed, view the discovered table schema
aws glue get-table \
--database-name dynamodb_catalog \
--name productcatalog
Viewing the Schema in Athena
Once the crawler has created the catalog table, you can query the schema metadata:
-- In Athena, select the 'dynamodb_catalog' database
-- The crawler creates a table matching your DynamoDB table name
-- View the table metadata
SHOW CREATE TABLE productcatalog;
-- Query the data (requires the Athena DynamoDB connector)
SELECT * FROM productcatalog LIMIT 10;
Glue Crawler Schema Output Example
After the crawler runs, the Glue Data Catalog table will contain schema information like:
| Column Name | Data Type | Comment |
| --- | --- | --- |
| product_id | string | Partition key |
| category | string | Sort key |
| name | string | Inferred from data |
| price | double | Inferred from data |
| in_stock | boolean | Inferred from data |
| tags | array<string> | Inferred from data |
| metadata | struct<...> | Inferred from nested map |
Glue Crawler Configuration Options
| Setting | Options | Description |
| --- | --- | --- |
| UpdateBehavior | UPDATE_IN_DATABASE, LOG | What happens when schema changes are detected |
| DeleteBehavior | DELETE_FROM_DATABASE, LOG, DEPRECATE_IN_DATABASE | What happens when a table is no longer detected |
| RecrawlPolicy | CRAWL_EVERYTHING, CRAWL_NEW_FOLDERS_ONLY | Controls what data is re-examined on each run |
Production Best Practices
Table Design
- Use the RETAIN removal policy for production tables; never use DESTROY on tables with real data.
- Enable Point-in-Time Recovery (PITR) for continuous backups.
- Enable deletion protection to prevent accidental table deletion.
- Use TTL to automatically clean up expired data and reduce storage costs.
- Enable DynamoDB Streams if you need change data capture for event-driven architectures.
Capacity & Performance
- Start with on-demand billing for new tables until you understand traffic patterns.
- Switch to provisioned capacity with auto scaling once traffic is predictable, to reduce costs.
- Monitor ConsumedReadCapacityUnits and ConsumedWriteCapacityUnits CloudWatch metrics.
- Enable Contributor Insights to identify hot partition keys.
Security
- Use customer-managed KMS keys for encryption on sensitive tables.
- Follow the principle of least privilege with IAM: use CDK grant methods like grantReadData() instead of broad policies.
- Enable CloudTrail logging for DynamoDB API calls.
IaC Best Practices
- Never hardcode table names in application code; use CloudFormation outputs or SSM parameters.
- Use stack outputs and exports to share table names/ARNs across stacks.
- Tag all resources for cost allocation and governance.
- Use separate stacks for stateful resources (DynamoDB, S3) and stateless resources (Lambda, API Gateway) so you can update them independently.
- Declare DependsOn when creating multiple tables with indexes in the same stack to avoid LimitExceededException.
CDK-Specific Tips
- Use the TableV2 construct for new tables requiring global table features or multi-account replication.
- The default removal policy for Table is RETAIN; this is intentional to protect your data. Explicitly set DESTROY only for dev/test tables.
- Local Secondary Indexes can only be defined when the table is first created; they cannot be added after the initial deployment.
- When switching billing modes, note that DynamoDB limits how often a table can move from provisioned to on-demand (up to four times in a rolling 24-hour window); switching from on-demand to provisioned is allowed at any time.
Reference Links
AWS Official Documentation
| Resource | URL |
| --- | --- |
| DynamoDB Developer Guide | https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/ |
| Console-to-Code for DynamoDB | https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/console-to-code.html |
| CloudFormation AWS::DynamoDB::Table | https://docs.aws.amazon.com/AWSCloudFormation/latest/TemplateReference/aws-resource-dynamodb-table.html |
| CloudFormation DynamoDB Snippets | https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/quickref-dynamodb.html |
| CDK aws_dynamodb Module Reference | https://docs.aws.amazon.com/cdk/api/v2/docs/aws-cdk-lib.aws_dynamodb-readme.html |
| CDK Table Construct API | https://docs.aws.amazon.com/cdk/api/v2/docs/aws-cdk-lib.aws_dynamodb.Table.html |
| AWS Glue Crawlers | https://docs.aws.amazon.com/glue/latest/dg/add-crawler.html |
| DynamoDB Best Practices | https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/best-practices.html |
CDK Examples & Tutorials
| Resource | URL |
| --- | --- |
| AWS CDK Examples (GitHub) | https://github.com/aws-samples/aws-cdk-examples |
| CDK Workshop | https://cdkworkshop.com |
Last updated: April 2026. Always verify against the latest AWS documentation for the most current API changes and feature additions.