Destination AlloyDB #
This page describes how to load data in real time into Google’s AlloyDB, a fully managed PostgreSQL-compatible database service.
The following steps refer to the extracted Arcion self-hosted CLI download as the $REPLICANT_HOME
directory.
Required permissions #
Privilege to create tables #
Make sure that the user being used for replication has the privilege to create tables on the target catalogs or schemas where you want to replicate the tables to. Use the following command to grant the privileges:
GRANT CREATE ON DATABASE <catalog_name> TO <replication_user>;
Replace the following:
catalog_name
: the catalog name in the targetreplication_user
: the user being used for replication
Replicant metadata #
In order to store Arcion’s replication metadata, you must ensure one of the following:
-
Point to an external metadata database. For more information, see Metadata configuration.
-
Grant the
CREATEDB
privilege to the user being used for replication. This allows the user to create theio
database. The user must also possess the privilege to create tables in theio
database. Replicant uses thisio
database to maintain internal checkpoint and metadata.The following command assigns the
CREATEDB
privilege to a useralex
:ALTER USER alex CREATEDB;
If the user does not have
CREATEDB
privilege, then follow these two steps:- Create a database manually with the name
io
:CREATE DATABASE io;
- Grant all privileges for the
io
database to that user:GRANT ALL ON DATABASE io TO alex;
- Create a database manually with the name
I. Set up connection configuration #
Specify your AlloyDB connection details to Replicant with a connection configuration file. You can find a sample connection configuration file alloydb.yaml
in the $REPLICANT_HOME/conf/conn
directory.
Specify the connection details in the following manner:
type: ALLOYDB
host: ALLOYDB_IP
port: PORT_NUMBER
database: 'DATABASE_NAME'
username: 'USERNAME'
password: 'PASSWORD'
max-connections: 30
max-retries: 10
retry-wait-duration-ms: 1000
socket-timeout-s: 60
Replace the following:
ALLOYDB_IP
: the IP address of the AlloyDB instancePORT_NUMBER
: the port numberDATABASE_NAME
: the AlloyDB database nameUSERNAME
: the username of theDATABASE_NAME
userPASSWORD
: the password associated withUSERNAME
Feel free to change the following parameter values as you need:
max-connections
: the maximum number of connections Replicant opens in AlloyDB.max-retries
: number of times Replicant retries a failed operation.retry-wait-duration-ms
: duration in milliseconds Replicant waits between each retry of a failed operation.socket-timeout-s
: the timeout value in seconds specifying socket read operations. A value of0
disables socket reads.
The following demonstrates a sample connnection configuration:
type: ALLOYDB
host: 12.34.456.78
port: 5444
database: 'tpch'
username: 'replicate'
password: 'Replicate#123'
max-connections: 30
max-retries: 10
retry-wait-duration-ms: 1000
socket-timeout-s: 60
II. Configure mapper file (optional) #
If you want to define data mapping from your source to AlloyDB, specify the mapping rules in the mapper file. For more information on how to define the mapping rules and run Replicant CLI with the mapper file, see Mapper configuration.
For example, the following sample applies to a PostgreSQL-to-AlloyDB pipeline:
rules:
[tpch, public]:
source:
- [tpch, public]
III. Configure metadata (optional) #
To ensure fault tolerance and resilience in replication, Arcion Replicant needs to maintain a number of metadata tables. Replicant uses a metadata configuration file to handle metadata-related operations. For more information, see Metadata configuration.
The following shows a sample metadata configuration:
type: ALLOYDB
connection:
host: localhost
port: 5435
database: 'tpch'
username: 'replicant'
password: 'Replicant#123'
max-connections: 30
catalog: 'io'
schema: 'replicate'
IV. Set up Applier configuration #
To configure replication mode according to your requirements, specify your configuration in the Applier configuration file. You can find a sample Applier configuration file alloydb.yaml
in the $REPLICANT_HOME/conf/dst
directory.
Configure snapshot
mode
#
For operating in snapshot
mode, specify your configuration under the snapshot
section of the configuration file. For example:
snapshot:
threads: 16
batch-size-rows: 5_000
txn-size-rows: 1_000_000
skip-tables-on-failures : false
bulk-load:
enable: true
type: FILE
_traceDBTasks: true
use-quoted-identifiers: true
use-upsert-based-recovery: false
For more information about the Applier parameters for snapshot
mode, see Snapshot mode.
Configure realtime
mode
#
For operating in realtime
mode, specify your configuration under the realtime
section of the conifiguration file. For example:
realtime:
threads: 8
txn-size-rows: 10000
batch-size-rows: 1000
skip-tables-on-failures : false
replay-replace-as-upsert: false
use-quoted-identifiers: true
# Transactional mode config
# realtime:
# threads: 1
# batch-size-rows: 1000
# replay-consistency: GLOBAL
# txn-group-count: 100
# _oper-queue-size-rows: 20000
# skip-upto-cursors: #last failed cursor
For more information about the configuration parameters for realtime
mode, see Realtime mode.