Use Snowflake Iceberg Tables #
Note: This feature is only available in Arcion self-hosted CLI.
From version 23.01.05.3, Arcion supports Snowflake Iceberg tables as a target for both snapshot-based and realtime replication. To use Snowflake Iceberg tables as a target, follow these instructions.
Prerequisites #
- Create an Amazon S3 bucket if it doesn’t exist.
- Create an external volume in Snowflake for your AWS S3 bucket using the CREATE EXTERNAL VOLUME command:

  CREATE EXTERNAL VOLUME <volume_name>
    STORAGE_LOCATIONS = (
      (
        NAME = '<volume_name>'
        STORAGE_PROVIDER = 'S3'
        STORAGE_AWS_ROLE_ARN = '<iam_role>'
        STORAGE_BASE_URL = 's3://<bucket>[/<path>/]'
      )
    );

  Replace the following:
  - <volume_name>: the name of the new external volume
  - <iam_role>: the Amazon Resource Name (ARN) of the IAM role
  - <path>: an optional path that provides granular control over objects in the bucket
For more information on granting Snowflake access to your Amazon S3 bucket, see Accessing Amazon S3 Using External Volumes.
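If you create external volumes for several environments, it can help to fill in the template above programmatically. The following is a minimal sketch, not part of Arcion or Snowflake tooling: the helper name external_volume_sql and every value passed to it (volume name, role ARN, bucket, path) are hypothetical placeholders. It assumes the inputs are trusted, since it does no SQL escaping.

```python
def external_volume_sql(volume_name, iam_role_arn, bucket, path=""):
    """Fill in the CREATE EXTERNAL VOLUME template shown on this page."""
    # The path is optional. When present, normalise it to '<path>/' so the
    # base URL matches the 's3://<bucket>[/<path>/]' form.
    prefix = path.strip("/")
    base_url = f"s3://{bucket}/{prefix}/" if prefix else f"s3://{bucket}"
    return (
        f"CREATE EXTERNAL VOLUME {volume_name}\n"
        "  STORAGE_LOCATIONS = (\n"
        "    (\n"
        f"      NAME = '{volume_name}'\n"
        "      STORAGE_PROVIDER = 'S3'\n"
        f"      STORAGE_AWS_ROLE_ARN = '{iam_role_arn}'\n"
        f"      STORAGE_BASE_URL = '{base_url}'\n"
        "    )\n"
        "  );"
    )


# Hypothetical values, for illustration only.
sql = external_volume_sql(
    "iceberg_vol",
    "arn:aws:iam::123456789012:role/snowflake-iceberg",
    "my-replication-bucket",
    "arcion/",
)
print(sql)
```

You can then paste the printed statement into a Snowflake worksheet or run it through whichever SQL client you already use.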
Specify Iceberg as table type in Applier configuration file #
In your Applier configuration file, you need to set the table-type property to ICEBERG under the per-table-config configuration. For example, look at the following sample Applier configuration:
snapshot:
  threads: 8
  batch-size-rows: 600_000
  txn-size-rows: 600_000

  per-table-config:
  - catalog: "CATALOG"
    schema: "SCHEMA"
    tables:
      TABLE_NAME:
        table-type: ICEBERG

  bulk-load:
    enable: true
    type: FILE
    save-file-on-error: true
Attention: In realtime replication, Replicant first creates the destination tables with a one-time data snapshot to transfer all existing data from the source. In this “snapshot phase”, Replicant needs to know beforehand whether or not you’re using Iceberg tables. For this reason, you must always use the snapshot section of the Applier configuration file to specify your per-table-config parameters, including the value of table-type. For more information about how different Replicant modes work, see Running Replicant.
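Because the table type must be declared under the snapshot section, a quick check before starting Replicant can catch a misplaced per-table-config. The sketch below is illustrative only and is not an Arcion utility. The function name iceberg_tables_in_snapshot is hypothetical, and the dictionary is assumed to be what a YAML parser would produce from the sample configuration above (trimmed to the relevant keys).

```python
def iceberg_tables_in_snapshot(config):
    """Return (catalog, schema, table) for each ICEBERG table under snapshot."""
    found = []
    # Only the snapshot section is inspected, mirroring the rule above.
    snapshot = config.get("snapshot") or {}
    for entry in snapshot.get("per-table-config") or []:
        for table, opts in (entry.get("tables") or {}).items():
            if (opts or {}).get("table-type") == "ICEBERG":
                found.append((entry.get("catalog"), entry.get("schema"), table))
    return found


# Assumed parsed form of the sample Applier configuration.
applier_config = {
    "snapshot": {
        "threads": 8,
        "per-table-config": [
            {
                "catalog": "CATALOG",
                "schema": "SCHEMA",
                "tables": {"TABLE_NAME": {"table-type": "ICEBERG"}},
            }
        ],
    }
}

print(iceberg_tables_in_snapshot(applier_config))
# prints [('CATALOG', 'SCHEMA', 'TABLE_NAME')]
```

An empty result for a configuration that is meant to target Iceberg tables suggests that table-type was set outside the snapshot section or omitted.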