Destination PostgreSQL #
The extracted replicant-cli will be referred to as the $REPLICANT_HOME directory in the following steps.
Required permissions #
-
Make sure that the user being used for replication, has the
CREATE TABLEprivilege on the target catalogs or schemas where you want to replicate the tables to. Use the following command to grant the privileges:GRANT CREATE ON DATABASE <catalog_name> TO <replication_user>;Replace the following:
catalog_name: the catalog name in the targetreplication_user: the user being used for replication
-
In order to store Arcion’s replication metadata, you must ensure one of the following:
- Point to an external metadata database. For more information, see Metadata configuration.
- Grant the
CREATEDBprivilege to the user being used for replication. This allows the user to create theiodatabase. The user must also possess the privilege to create tables in theiodatabase. Replicant uses thisiodatabase to maintain internal checkpoint and metadata.
The following command assigns the
CREATEDBprivilege to a useralex:ALTER USER alex CREATEDB;If the user does not have
CREATEDBprivilege, then follow these two steps:- Create a database manually with the name
io:CREATE DATABASE io; - Grant all privileges for the
iodatabase to that user:GRANT ALL ON DATABASE io TO alex;
I. Set up connection configuration #
To connect to your PostgreSQL target database, you have these two options:
You can specify your connection details to Replicant with a YAML connection configuration file. You can find a sample connection configuration file cloudsql_postgresql.yaml in the $REPLICANT_HOME/conf/conn directory.
type: POSTGRESQL
host: HOSTNAME
port: PORT_NUMBER
database: 'DATABASE_NAME'
username: 'USERNAME'
password: 'PASSWORD'
max-connections: 30
socket-timeout-s: 60
max-retries: 10
retry-wait-duration-ms: 1000
Replace the following:
HOSTNAME: the hostname of the target PostgreSQL instancePORT_NUMBER: the port numberDATABASE_NAME: the database nameUSERNAME: the username of the user that connects to the PostgreSQL serverPASSWORD: the password associated withUSERNAME
Feel free to change the following parameter values as you need:
max-connections: the maximum number of connections Replicant opens in Cloud SQL instance.max-retries: number of times Replicant retries a failed operation.retry-wait-duration-ms: duration in milliseconds Replicant waits between each retry of a failed operation.socket-timeout-s: the timeout value in seconds specifying socket read operations. A value of0disables socket reads. This parameter is only supported for Arcion self-hosted CLI versions 22.02.12.16 and newer.
Important: Make sure thatmax_connectionsin PostgreSQL exceeds themax-connectionsparameter in the preceding connection configuration file.
II. Configure mapper file (optional) #
If you want to define data mapping from source to your target PostgreSQL, specify the mapping rules in the mapper file. The following is a sample mapper configuration for a MySQL-to-PostgreSQL pipeline:
rules:
[tpch, public]:
source:
- [tpch]
For more information on how to define the mapping rules and run Replicant CLI with the mapper file, see Mapper configuration.
III. Set up Applier configuration #
To configure replication mode according to your requirements, specify your configuration in the Applier configuration file. You can find a sample Applier configuration file postgresql.yaml in the $REPLICANT_HOME/conf/dst directory. For example:
You can configure PostgreSQL for operating in either snapshot or realtime modes.
Configure snapshot mode
#
For operating in snapshot mode, specify your configuration under the snapshot section of the conifiguration file. For example:
snapshot:
threads: 16
batch-size-rows: 5_000
txn-size-rows: 1_000_000
skip-tables-on-failures: false
map-bit-to-boolean: false
bulk-load:
enable: true
type: FILE # FILE or PIPE
_traceDBTasks: true
use-quoted-identifiers: true
Additional snapshot parameters
#
map-bit-to-boolean- Tells Replicant whether to map
bit(1)andvarbit(1)data types from source tobooleanon target.If
true, Replicant mapsbit(1)/varbit(1)data types from source tobooleanon target PostgreSQL. Iffalse, Replicant mapsbit(1)/varbit(1)data types from source tobit(1)/varbit(1)on target PostgreSQL.Default:
false.
For more information about the Applier parameters for snapshot mode, see Snapshot mode.
Configure realtime mode
#
If you want to operate in real time, use the realtime section to specify your configuration. For example:
realtime:
threads: 8
txn-size-rows: 10000
batch-size-rows: 1000
skip-tables-on-failures : false
use-quoted-identifiers: true
For more information about the configuration parameters for realtime mode, see Realtime mode.