
AWS S3 table buckets

This guide walks you through configuring AWS S3 table buckets as storage for your data in Embucket. S3 table buckets provide extra benefits over standard S3-based volumes, including automatic table maintenance and AWS-managed optimization features.

Follow this guide to:

  • Create an AWS S3 table bucket using the AWS command-line tool
  • Configure an Embucket volume to use S3 table buckets
  • Create tables and load data using familiar SQL commands
  • Verify table creation and query data through the AWS Console

This guide covers basic setup and usage of S3 table buckets with Embucket. Advanced table maintenance features, cross-region replication, and enterprise security configurations fall outside the scope of this guide.

Before you begin, verify you have:

  • AWS command-line tool installed and configured with appropriate permissions
  • Embucket instance running locally or in your environment
  • Valid AWS credentials with S3 Tables service permissions

AWS designed S3 table buckets specifically for tabular data storage. They provide:

  • Automatic optimization: Built-in compaction and file organization
  • Schema enforcement: Native support for table schemas and metadata
  • Query performance: Optimized for analytical workloads
  • AWS integration: Seamless integration with Athena, Glue, and other AWS services

  1. Create the table bucket

    Use the AWS command-line tool to create your S3 table bucket:

    Terminal window
    aws s3tables create-table-bucket --name my-table-bucket --region us-east-2

    The command returns the bucket ARN:

    {
    "arn": "arn:aws:s3tables:us-east-2:123456789012:bucket/my-table-bucket"
    }
  2. Record the bucket information

    Save the following information for the next step:

    • Bucket name: my-table-bucket
    • Region: us-east-2
    • ARN: The full ARN returned by the command
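
    The ARN itself encodes the region, account ID, and bucket name, so you can recover them later if needed. A minimal sketch using standard shell tools (the ARN value is the example from the step above; substitute your own):

    ```shell
    # Example ARN returned by create-table-bucket (substitute your own).
    ARN="arn:aws:s3tables:us-east-2:123456789012:bucket/my-table-bucket"

    # ARN fields are colon-separated:
    # arn:aws:s3tables:<region>:<account-id>:bucket/<bucket-name>
    REGION=$(echo "$ARN" | cut -d: -f4)
    ACCOUNT=$(echo "$ARN" | cut -d: -f5)
    BUCKET=$(echo "$ARN" | cut -d/ -f2)

    echo "region=$REGION account=$ACCOUNT bucket=$BUCKET"
    ```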

Create a volume in Embucket that connects to your S3 table bucket. Use either the API or SQL interface.

Use HTTPie or curl to create the volume via API:

Terminal window
http POST localhost:3000/v1/metastore/volumes \
ident=demo \
type=s3-tables \
database=demo \
credentials:='{
"credential_type": "access_key",
"aws-access-key-id": "YOUR_ACCESS_KEY",
"aws-secret-access-key": "YOUR_SECRET_KEY"
}' \
arn=arn:aws:s3tables:us-east-2:123456789012:bucket/my-table-bucket

Required parameters:

  • ident: Volume identifier
  • type: Volume type; set to s3-tables for table buckets
  • database: Database name that maps to this volume
  • credentials: AWS access credentials
  • arn: Full S3 table bucket ARN
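
If you prefer curl over HTTPie, the same request can be sent as a single JSON body. A sketch, assuming the same local endpoint and the same placeholder credentials as above:

```shell
# JSON body mirroring the HTTPie call above (replace the placeholder keys).
BODY='{
  "ident": "demo",
  "type": "s3-tables",
  "database": "demo",
  "credentials": {
    "credential_type": "access_key",
    "aws-access-key-id": "YOUR_ACCESS_KEY",
    "aws-secret-access-key": "YOUR_SECRET_KEY"
  },
  "arn": "arn:aws:s3tables:us-east-2:123456789012:bucket/my-table-bucket"
}'

# Optional sanity check that the body is well-formed JSON before sending.
echo "$BODY" | python3 -m json.tool > /dev/null && echo "body OK"

# POST to the Embucket metastore API.
curl -X POST http://localhost:3000/v1/metastore/volumes \
  -H 'Content-Type: application/json' \
  -d "$BODY" || echo "request failed; is Embucket running on localhost:3000?"
```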

Create tables, load data, and run queries using the Snowflake command-line tool or any Snowflake-compatible tool. See the Snowflake command-line tool guide for information on how to connect to Embucket.

  1. Connect to Embucket

    Start a Snowflake command-line tool session:

    Terminal window
    snow sql -c local
  2. Create a schema

    Set up the schema structure:

    CREATE SCHEMA demo.public;

    Output:

    +-------+
    | count |
    |-------|
    |     0 |
    +-------+
  3. Create a table with data

    Create and populate a table in one command:

    CREATE TABLE demo.public.users (
    id INT,
    name VARCHAR(100),
    email VARCHAR(100)
    ) AS VALUES
    (1, 'John Doe', 'john.doe@example.com'),
    (2, 'Jane Doe', 'jane.doe@example.com');

    Output:

    +-------+
    | count |
    |-------|
    |     2 |
    +-------+
  4. Query the table

    Verify you can read the data:

    SELECT * FROM demo.public.users;

    Output:

    +----+----------+----------------------+
    | id | name     | email                |
    |----|----------|----------------------|
    |  1 | John Doe | john.doe@example.com |
    |  2 | Jane Doe | jane.doe@example.com |
    +----+----------+----------------------+
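
    Standard DML should also work through the same session. For example, to append a row and re-check the row count (a sketch; the third row here is illustrative and assumes INSERT is supported by your Embucket version):

    ```sql
    -- Append one illustrative row to the table created above.
    INSERT INTO demo.public.users VALUES
    (3, 'Alice Roe', 'alice.roe@example.com');

    -- Confirm the new total.
    SELECT COUNT(*) FROM demo.public.users;
    ```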

Verify table creation and query your data directly through AWS services:

  1. Open the AWS Console

    Navigate to the S3 Tables service in the AWS Console.

  2. Locate your table bucket

    Find the my-table-bucket you created earlier.

  3. Browse tables

    Inside the table bucket, you see:

    • Database: demo
    • Table: users
  4. Query with Athena

    Select the users table and choose “Query table with Athena.” The SQL editor opens with your table ready for queries.

S3 table bucket query interface

Now that you have S3 table buckets working with Embucket, consider:

  • Performance optimization: Configure table partitioning for large datasets
  • Security: Set up IAM roles for fine-grained access control
  • Monitoring: Enable AWS CloudTrail for audit logging
  • Integration: Connect BI tools and data pipelines to your Embucket instance

For more advanced configuration options, see the volumes documentation.