Source IBM Db2 with Native LUW #
You may want to use the db2ReadLog API to read log records from the Db2 database logs, or query the Log Manager for current log state information. This page describes how to use the db2ReadLog API in Arcion when using Db2 as source.
I. Check permissions #
The user must possess the following permissions:
Read access on all the databases, schemas, and tables that the user wants to replicate.
Read access to following system tables and views:
Execute permissions on the following system procedures:
Users need these permissions only once at the start of a fresh replication.
II. Create the heartbeat table #
For CDC replication, you must create the heartbeat table on the source database with the following DDL:
CREATE TABLE "tpch"."replicate_io_cdc_heartbeat"("timestamp" BIGINT NOT NULL, CONSTRAINT "cdc_heartbeat_id_default_default" PRIMARY KEY("timestamp"))
III. Enable CDC-based Rreplication #
If you want CDC-based replication from the source Db2 server, follow these steps:
On system running source Db2 server: #
For each table that you want to replicate, run the following command:
ALTER TABLE <TABLE> DATA CAPTURE CHANGES
Check if the database is recoverable by running the following command:
db2 get db cfg for <DATABASE> show detail | grep -i "logarch"
LOGARCHMETH2is set, the database is already recoverable.
Skip the next step if the database is already recoverable.
Update the Db2 logging method by running the following command:
db2 update db cfg for <DATABASE> using LOGARCHMETH1 LOGRETAIN
The database enters a backup pending state after updating the logging method. To recover from this state, you need to backup the database with the following:
db2 backup db <DATABASE> to <DESTINATION>
Make sure to restart the database using
db2startcommands respectively before running the
On system running Arcion Replicant #
Skip steps 2-6 if Replicant runs from the same system as the source Db2 database.
JAVA_OPTSenvironment variable with:
Install Db2 Data Server Client prerequisites by running the following commands:
sudo dpkg --add-architecture i386 sudo apt install libaio1 libstdc++6:i386 libpam0g:i386 sudo apt install binutils
You can install Db2 Data server Client as a root user or a non-root user. Non-root installations come with certain limitations. For more information, see Installing DB2 database servers as a non-root user
Follow these steps to install Db2 Data Server Client:
Download latest version of Db2 Data Server Client from IBM.
Extract and start the installer by running
Select Custom installation in the Configuration window.
Select the Base application development tools checkbox in the Select Features window.
In the Instance Owner window, specify user information for the Db2 instance owner. You can either create a new user, or use an existing user. In both cases, note down the user configuration and make sure to run Replicant as the same user.
Leave the remaining installation options as default and complete the installation.
Catalog the source Db2 Server node by running the following:
db2 catalog tcpip node <NODE_NAME> remote <REMOTE> server <PORT>
Catalog the source Db2 database:
db2 catalog database <DATABASE> at node <NODE_NAME>
Finally, test the connection with:
db2 connect to <DATABASE> user <USER>
IV. Configure Replicant connection #
Specify your Db2 connection details to Replicant with a connection configuration file. You can find a sample connection configuration file
The following shows a sample configuration:
type: DB2 database: tpch host: localhost port: 50002 node: db2inst1 username: replicant password: "Replicant#123" max-connections: 30 max-retries: 10 retry-wait-duration-ms: 1000
nodeproperty. It represents the name of the Db2 node you are connecting to. It can go anywhere in the root of the file. The default node name is the Db2 user’s name.
Set the value of
READ_LOGin the connection configuration file. This instructs Replicant to use the native db2ReadLog as the CDC log reader:
cdc-log-config: cdc-log-storage: READ_LOG
V. Set up Extractor Configuration #
$REPLICANT_HOME, navigate to the sample Extractor configuration file:
The Extractor configuration file contains three parts:
- Parameters related to snapshot mode.
- Parameters related to delta snapshot mode.
- Parameters related to realtime mode.
To know about different Replicant modes, see Running Replicant.
Parameters related to snapshot mode #
For snapshot mode, you can make use of the following sample:
snapshot: threads: 16 fetch-size-rows: 5_000 _traceDBTasks: true #fetch-schemas-from-system-tables: true per-table-config: - catalog: tpch schema: db2user tables: lineitem: row-identifier-key: [l_orderkey, l_linenumber]
Parameters related to delta snapshot mode #
For delta delta snapshot mode, you can make use of the following sample:
delta-snapshot: #threads: 32 #fetch-size-rows: 10_000 #min-job-size-rows: 1_000_000 max-jobs-per-chunk: 32 _max-delete-jobs-per-chunk: 32 delta-snapshot-key: last_update_time delta-snapshot-interval: 10 delta-snapshot-delete-interval: 10 _traceDBTasks: true replicate-deletes: false per-table-config: - catalog: tpch schema: db2user tables: # testTable # split-key: split-key-column # split-hints: # row-count-estimate: 100000 # split-key-min-value: 1 # split-key-max-value: 60_000 # delta-snapshot-key: delta-snapshot-key-column # row-identifier-key: [col1, col2] # update-key: [col1, col2] partsupp: split-key: partkey
Parameters related to realtime mode #
For realtime mode, you can make use of the following sample:
realtime: #threads: 1 #fetch-size-rows: 10000 _traceDBTasks: true #fetch-interval-s: 0 replicate-empty-string-as-null: true # idempotent-replay: false heartbeat: enable: true catalog: tpch schema: db2user #table-name: replicate_io_cdc_heartbeat #column-name: timestamp interval-ms: 10000
For a detailed explanation of configuration parameters in the Extractor file, read Extractor Reference.