AWS S3 table buckets
This guide walks you through configuring AWS S3 table buckets as storage for your data in Embucket. S3 table buckets provide benefits over standard S3-based volumes, including automatic table maintenance and AWS-managed optimization features.
What you’ll learn
Follow this guide to:
- Create an AWS S3 table bucket using the AWS command-line tool
- Configure an Embucket volume to use S3 table buckets
- Create tables and load data using familiar SQL commands
- Verify table creation and query data through the AWS Console
What this guide covers
This guide covers basic setup and usage of S3 table buckets with Embucket. Advanced table maintenance features, cross-region replication, and enterprise security configurations fall outside the scope of this guide.
Prerequisites
Before you begin, verify you have:
- AWS command-line tool installed and configured with appropriate permissions
- Embucket instance running locally or in your environment
- Valid AWS credentials with S3 Tables service permissions
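The AWS-side checks in this list can be scripted before you start. The following is a minimal sketch, not an official Embucket tool: the function name `check_prereqs` is hypothetical, `us-east-2` is assumed as the default region because the rest of this guide uses it, and a successful `list-table-buckets` call is only a rough signal that S3 Tables permissions are in place. It does not check the Embucket instance.

```shell
# Hypothetical helper: verifies the AWS-side prerequisites for this guide.
# Usage: check_prereqs [region]
check_prereqs() {
  region="${1:-us-east-2}"

  # The AWS command-line tool must be available.
  if ! command -v aws >/dev/null 2>&1; then
    echo "aws command-line tool not found" >&2
    return 1
  fi

  # Credentials must resolve to a valid identity.
  if ! aws sts get-caller-identity >/dev/null 2>&1; then
    echo "AWS credentials are missing or invalid" >&2
    return 1
  fi

  # The identity must be allowed to call the S3 Tables service.
  if ! aws s3tables list-table-buckets --region "$region" >/dev/null 2>&1; then
    echo "no access to S3 Tables in $region" >&2
    return 1
  fi

  echo "prerequisites OK"
}
```

Run `check_prereqs us-east-2` in the shell you plan to use for the rest of the guide; a non-zero exit status and a message on standard error indicate which check failed.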
Understanding AWS S3 table buckets
AWS designed S3 table buckets specifically for tabular data storage. They provide:
- Automatic optimization: Built-in compaction and file organization
- Schema enforcement: Native support for table schemas and metadata
- Query performance: Optimized for analytical workloads
- AWS integration: Seamless integration with Athena, Glue, and other AWS services
Create an S3 table bucket
- Create the table bucket

  Use the AWS command-line tool to create your S3 table bucket:

      aws s3tables create-table-bucket --name my-table-bucket --region us-east-2

  The command returns the bucket ARN:

      {"arn": "arn:aws:s3tables:us-east-2:123456789012:bucket/my-table-bucket"}

- Record the bucket information

  Save the following information for the next step:
  - Bucket name: my-table-bucket
  - Region: us-east-2
  - ARN: The full ARN returned by the command
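Rather than copying the ARN by hand, you can capture it in a shell variable. This sketch assumes the response has the shape shown above (a single `arn` key) and uses the AWS command-line tool's global `--query` and `--output` options to print only that value. The function names `create_table_bucket`, `arn_region`, and `arn_bucket_name` are hypothetical.

```shell
# Hypothetical helper: creates a table bucket and prints only its ARN.
# Usage: create_table_bucket <name> <region>
create_table_bucket() {
  aws s3tables create-table-bucket \
    --name "$1" \
    --region "$2" \
    --query arn \
    --output text
}

# Prints the region, which is the 4th colon-separated field of the ARN.
arn_region() {
  printf '%s\n' "$1" | cut -d: -f4
}

# Prints the bucket name, which is the text after "bucket/" in the ARN.
arn_bucket_name() {
  printf '%s\n' "${1##*:bucket/}"
}
```

For example, `TABLE_BUCKET_ARN="$(create_table_bucket my-table-bucket us-east-2)"` stores the ARN for the volume configuration step, and `arn_region "$TABLE_BUCKET_ARN"` recovers the region from it.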
Configure Embucket volume
Create a volume in Embucket that connects to your S3 table bucket. Use either the API or SQL interface.
Use HTTPie or curl to create the volume via API:

    http POST localhost:3000/v1/metastore/volumes \
      ident=demo \
      type=s3-tables \
      database=demo \
      credentials:='{
        "credential_type": "access_key",
        "aws-access-key-id": "YOUR_ACCESS_KEY",
        "aws-secret-access-key": "YOUR_SECRET_KEY"
      }' \
      arn=arn:aws:s3tables:us-east-2:123456789012:bucket/my-table-bucket
Required parameters:
- ident: Volume identifier
- type: Volume type (s3-tables)
- database: Database name that maps to this volume
- credentials: AWS access credentials
- arn: Full S3 table bucket ARN
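The HTTPie call above translates directly to curl: HTTPie's `key=value` items become JSON string fields, and `credentials:=` becomes a nested JSON object. The following is a minimal sketch of that translation. The endpoint and field names come from the HTTPie example; the helper names `volume_payload` and `create_volume` and the `EMBUCKET_URL` variable are hypothetical, and reading the keys from `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` is an assumption about how you store credentials.

```shell
# Base URL taken from the HTTPie example; adjust it for your environment.
EMBUCKET_URL="${EMBUCKET_URL:-http://localhost:3000}"

# Prints the JSON body equivalent to the HTTPie request items.
# Usage: volume_payload <ident> <database> <arn> <access-key> <secret-key>
# Values are inserted as-is, so they must not contain quotes or backslashes.
volume_payload() {
  printf '{"ident":"%s","type":"s3-tables","database":"%s","credentials":{"credential_type":"access_key","aws-access-key-id":"%s","aws-secret-access-key":"%s"},"arn":"%s"}\n' \
    "$1" "$2" "$4" "$5" "$3"
}

# Hypothetical helper: posts the volume definition with curl.
# Usage: create_volume <ident> <database> <arn>
create_volume() {
  volume_payload "$1" "$2" "$3" "$AWS_ACCESS_KEY_ID" "$AWS_SECRET_ACCESS_KEY" |
    curl --fail --silent --show-error \
      --request POST \
      --header 'Content-Type: application/json' \
      --data @- \
      "$EMBUCKET_URL/v1/metastore/volumes"
}
```

Calling `create_volume demo demo arn:aws:s3tables:us-east-2:123456789012:bucket/my-table-bucket` sends the same body as the HTTPie command. Passing the payload on standard input with `--data @-` keeps the secret key out of the process argument list.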
Use the SQL interface to create the volume:
    CREATE EXTERNAL VOLUME IF NOT EXISTS demo
    STORAGE_LOCATIONS = ((
      NAME = 'demo'
      STORAGE_PROVIDER = 's3-tables'
      ARN = 'arn:aws:s3tables:us-east-2:123456789012:bucket/my-table-bucket'
    ));
Create and query tables
Create tables, load data, and run queries using the Snowflake command-line tool or any Snowflake-compatible tool. See the Snowflake command-line tool guide for information on how to connect to Embucket.
- Connect to Embucket

  Start a Snowflake command-line tool session:

      snow sql -c local

- Create a schema

  Set up the schema structure:

      CREATE SCHEMA demo.public;

  Output:

      +-------+
      | count |
      |-------|
      | 0     |
      +-------+

- Create a table with data

  Create and populate a table in one command:

      CREATE TABLE demo.public.users (
        id INT,
        name VARCHAR(100),
        email VARCHAR(100)
      ) AS VALUES
        (1, 'John Doe', 'john.doe@example.com'),
        (2, 'Jane Doe', 'jane.doe@example.com');

  Output:

      +-------+
      | count |
      |-------|
      | 2     |
      +-------+

- Query the table

  Verify you can read the data:

      SELECT * FROM demo.public.users;

  Output:

      +----+----------+----------------------+
      | id | name     | email                |
      |----|----------|----------------------|
      | 1  | John Doe | john.doe@example.com |
      | 2  | Jane Doe | jane.doe@example.com |
      +----+----------+----------------------+
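The same statements can also run non-interactively, for example as a smoke test after setting up a new volume. This sketch assumes the `local` connection used above and the Snowflake command-line tool's `-q` option for passing a query inline; `run_sql` and `smoke_test` are hypothetical names, and the SQL is copied from the steps above.

```shell
# Hypothetical helper: runs one SQL statement against the "local" connection.
# Usage: run_sql "<statement>"
run_sql() {
  snow sql -c local -q "$1"
}

# Replays the walkthrough and stops at the first failing statement.
smoke_test() {
  run_sql "CREATE SCHEMA demo.public;" &&
    run_sql "CREATE TABLE demo.public.users (id INT, name VARCHAR(100), email VARCHAR(100)) AS VALUES (1, 'John Doe', 'john.doe@example.com'), (2, 'Jane Doe', 'jane.doe@example.com');" &&
    run_sql "SELECT * FROM demo.public.users;"
}
```

Because the statements are chained with `&&`, `smoke_test` returns a non-zero status as soon as one of them fails, which makes it usable as a gate in a setup script.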
Verify in AWS console
Verify table creation and query your data directly through AWS services:
- Open AWS Console

  Navigate to the S3 Tables service in the AWS Console.

- Locate your table bucket

  Find the my-table-bucket you created earlier.

- Browse tables

  Inside the table bucket, you see:
  - Database: demo
  - Table: users

- Query with Athena

  Select the users table and choose “Query table with Athena.” The SQL editor opens with your table ready for queries.
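The table's presence can also be checked from the terminal instead of the console. The following sketch uses `aws s3tables list-tables` and lists tables across all namespaces in the bucket, so it makes no assumption about which namespace Embucket writes to. The helper names `list_table_names` and `table_exists` are hypothetical, and the `tables[].name` query assumes each entry in the response carries a `name` field.

```shell
# Hypothetical helper: prints the names of all tables in a table bucket,
# separated by tabs (the AWS command-line tool's text output format).
# Usage: list_table_names <table-bucket-arn> <region>
list_table_names() {
  aws s3tables list-tables \
    --table-bucket-arn "$1" \
    --region "$2" \
    --query 'tables[].name' \
    --output text
}

# Succeeds when a table with exactly the given name exists in the bucket.
# Usage: table_exists <table-bucket-arn> <region> <table-name>
table_exists() {
  list_table_names "$1" "$2" | tr '\t' '\n' | grep -qxF -- "$3"
}
```

For the bucket in this guide, `table_exists arn:aws:s3tables:us-east-2:123456789012:bucket/my-table-bucket us-east-2 users` should succeed once the `users` table has been created.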
Next steps
Now that you have S3 table buckets working with Embucket, consider:
- Performance optimization: Configure table partitioning for large datasets
- Security: Set up IAM roles for fine-grained access control
- Monitoring: Enable AWS CloudTrail for audit logging
- Integration: Connect BI tools and data pipelines to your Embucket instance
For more advanced configuration options, see the volumes documentation.