Destination PostgreSQL #
The extracted replicant-cli
will be referred to as the $REPLICANT_HOME
directory in the following steps.
Required permissions #
-
Make sure that the user being used for replication, has the
CREATE TABLE
privilege on the target catalogs or schemas where you want to replicate the tables to. Use the following command to grant the privileges:GRANT CREATE ON DATABASE <catalog_name> TO <replication_user>;
Replace the following:
catalog_name
: the catalog name in the targetreplication_user
: the user being used for replication
-
In order to store Arcion’s replication metadata, you must ensure one of the following:
- Point to an external metadata database. For more information, see Metadata configuration.
- Grant the
CREATEDB
privilege to the user being used for replication. This allows the user to create theio
database. The user must also possess the privilege to create tables in theio
database. Replicant uses thisio
database to maintain internal checkpoint and metadata.
The following command assigns the
CREATEDB
privilege to a useralex
:ALTER USER alex CREATEDB;
If the user does not have
CREATEDB
privilege, then follow these two steps:- Create a database manually with the name
io
:CREATE DATABASE io;
- Grant all privileges for the
io
database to that user:GRANT ALL ON DATABASE io TO alex;
I. Set up connection configuration #
To connect to your PostgreSQL target database, you have these two options:
You can specify your connection details to Replicant with a YAML connection configuration file. You can find a sample connection configuration file cloudsql_postgresql.yaml
in the $REPLICANT_HOME/conf/conn
directory.
type: POSTGRESQL
host: HOSTNAME
port: PORT_NUMBER
database: 'DATABASE_NAME'
username: 'USERNAME'
password: 'PASSWORD'
max-connections: 30
socket-timeout-s: 60
max-retries: 10
retry-wait-duration-ms: 1000
Replace the following:
HOSTNAME
: the hostname of the target PostgreSQL instancePORT_NUMBER
: the port numberDATABASE_NAME
: the database nameUSERNAME
: the username of the user that connects to the PostgreSQL serverPASSWORD
: the password associated withUSERNAME
Feel free to change the following parameter values as you need:
max-connections
: the maximum number of connections Replicant opens in Cloud SQL instance.max-retries
: number of times Replicant retries a failed operation.retry-wait-duration-ms
: duration in milliseconds Replicant waits between each retry of a failed operation.socket-timeout-s
: the timeout value in seconds specifying socket read operations. A value of0
disables socket reads. This parameter is only supported for Arcion self-hosted CLI versions 22.02.12.16 and newer.
Important: Make sure thatmax_connections
in PostgreSQL exceeds themax-connections
parameter in the preceding connection configuration file.
II. Configure mapper file (optional) #
If you want to define data mapping from source to your target PostgreSQL, specify the mapping rules in the mapper file. The following is a sample mapper configuration for a MySQL-to-PostgreSQL pipeline:
rules:
[tpch, public]:
source:
- [tpch]
For more information on how to define the mapping rules and run Replicant CLI with the mapper file, see Mapper configuration.
III. Set up Applier configuration #
To configure replication mode according to your requirements, specify your configuration in the Applier configuration file. You can find a sample Applier configuration file postgresql.yaml
in the $REPLICANT_HOME/conf/dst
directory. For example:
You can configure PostgreSQL for operating in either snapshot or realtime modes.
Configure snapshot
mode
#
For operating in snapshot mode, specify your configuration under the snapshot
section of the conifiguration file. For example:
snapshot:
threads: 16
batch-size-rows: 5_000
txn-size-rows: 1_000_000
skip-tables-on-failures: false
map-bit-to-boolean: false
bulk-load:
enable: true
type: FILE # FILE or PIPE
_traceDBTasks: true
use-quoted-identifiers: true
Additional snapshot
parameters
#
map-bit-to-boolean
- Tells Replicant whether to map
bit(1)
andvarbit(1)
data types from source toboolean
on target.If
true
, Replicant mapsbit(1)
/varbit(1)
data types from source toboolean
on target PostgreSQL. Iffalse
, Replicant mapsbit(1)
/varbit(1)
data types from source tobit(1)
/varbit(1)
on target PostgreSQL.Default:
false
.
For more information about the Applier parameters for snapshot
mode, see Snapshot mode.
Configure realtime
mode
#
If you want to operate in real time, use the realtime
section to specify your configuration. For example:
realtime:
threads: 8
txn-size-rows: 10000
batch-size-rows: 1000
skip-tables-on-failures : false
use-quoted-identifiers: true
For more information about the configuration parameters for realtime
mode, see Realtime mode.