Source IBM Db2 with Native LUW #
You may want to use the db2ReadLog API to read log records from the Db2 database logs, or query the Log Manager for current log state information. This page describes how to use the db2ReadLog API in Arcion when using Db2 as source.
Obtain the Db2 JDBC driver #
Arcion Replicant requires Db2 JDBC driver as a dependency. To make sure Replicant possesses the necessary dependency files, follow these steps:
- Download the Db2 JDBC driver JAR file.
- After downloading the JAR file, put it inside the
lib/
directory of your Replicant self-hosted download folder.
I. Check permissions #
The user must possess the following permissions:
-
Read access on all the databases, schemas, and tables that the user wants to replicate.
-
Read access to following system tables and views:
a.
SYSIBM.SYSTABLES
b.
SYSIBM.SQLTABLETYPES
c.
SYSIBM.SYSCOLUMNS
d.
SYSIBM.SYSTABCONST
e.
SYSIBM.SQLCOLUMNS
f.
SYSCAT.COLUMNS
(required forfetch-schemas
mode). -
Execute permissions on the following system procedures:
a.
SYSIBM.SQLTABLES
b.
SYSIBM.SQLCOLUMNS
c.
SYSIBM.SQLPRIMARYKEYS
d.
SYSIBM.SQLSTATISTICS
Users need these permissions only once at the start of a fresh replication.
II. Create the heartbeat table #
For CDC replication, you must create the heartbeat table on the source database with the following DDL:
CREATE TABLE "tpch"."replicate_io_cdc_heartbeat"("timestamp" BIGINT NOT NULL,
CONSTRAINT "cdc_heartbeat_id_default_default" PRIMARY KEY("timestamp"))
III. Enable CDC-based Rreplication #
If you want CDC-based replication from the source Db2 server, follow these steps:
On system running source Db2 server: #
-
For each table that you want to replicate, run the following command:
ALTER TABLE <TABLE> DATA CAPTURE CHANGES
-
Check if the database is recoverable by running the following command:
db2 get db cfg for <DATABASE> show detail | grep -i "logarch"
If either
LOGARCHMETH1
orLOGARCHMETH2
is set, the database is already recoverable.Skip the next step if the database is already recoverable.
-
Update the Db2 logging method by running the following command:
db2 update db cfg for <DATABASE> using LOGARCHMETH1 LOGRETAIN
The database enters a backup pending state after updating the logging method. To recover from this state, you need to backup the database with the following:
db2 backup db <DATABASE> to <DESTINATION>
Make sure to restart the database using
db2stop
anddb2start
commands respectively before running thebackup
command.
On system running Arcion Replicant #
Skip steps 2-6 if Replicant runs from the same system as the source Db2 database.
-
Configure the
JAVA_OPTS
environment variable with:export JAVA_OPTS=-Djava.library.path=lib
-
Install Db2 Data Server Client prerequisites by running the following commands:
sudo dpkg --add-architecture i386 sudo apt install libaio1 libstdc++6:i386 libpam0g:i386 sudo apt install binutils
-
You can install Db2 Data server Client as a root user or a non-root user. Non-root installations come with certain limitations. For more information, see Installing DB2 database servers as a non-root user
Follow these steps to install Db2 Data Server Client:
-
Download latest version of Db2 Data Server Client from IBM.
-
Extract and start the installer by running
db2setup
. -
Select Custom installation in the Configuration window.
-
Select the Base application development tools checkbox in the Select Features window.
-
In the Instance Owner window, specify user information for the Db2 instance owner. You can either create a new user, or use an existing user. In both cases, note down the user configuration and make sure to run Replicant as the same user.
-
Leave the remaining installation options as default and complete the installation.
-
-
Catalog the source Db2 Server node by running the following:
db2 catalog tcpip node <NODE_NAME> remote <REMOTE> server <PORT>
-
Catalog the source Db2 database:
db2 catalog database <DATABASE> at node <NODE_NAME>
-
Finally, test the connection with:
db2 connect to <DATABASE> user <USER>
IV. Configure Replicant connection #
-
Specify your Db2 connection details to Replicant with a connection configuration file. You can find a sample connection configuration file
db2_src.yaml
in the$REPLICANT_HOME/conf/conn
directory.The following shows a sample configuration:
type: DB2 database: tpch host: localhost port: 50002 node: db2inst1 username: replicant password: "Replicant#123" max-connections: 30 max-retries: 10 retry-wait-duration-ms: 1000
Notice the
node
property. It represents the name of the Db2 node you are connecting to. It can go anywhere in the root of the file. The default node name is the Db2 user’s name. -
Set the value of
cdc-log-storage
toREAD_LOG
in the connection configuration file. This instructs Replicant to use the native db2ReadLog as the CDC log reader:cdc-log-config: cdc-log-storage: READ_LOG
V. Set up Extractor Configuration #
-
From
$REPLICANT_HOME
, navigate to the sample Extractor configuration file:vi conf/src/db2.yaml
-
The Extractor configuration file contains three parts:
- Parameters related to snapshot mode.
- Parameters related to delta snapshot mode.
- Parameters related to realtime mode.
To know about different Replicant modes, see Running Replicant.
Parameters related to snapshot mode #
For snapshot mode, you can make use of the following sample:
snapshot: threads: 16 fetch-size-rows: 5_000 _traceDBTasks: true #fetch-schemas-from-system-tables: true per-table-config: - catalog: tpch schema: db2user tables: lineitem: row-identifier-key: [l_orderkey, l_linenumber]
Parameters related to delta snapshot mode #
For delta delta snapshot mode, you can make use of the following sample:
delta-snapshot: #threads: 32 #fetch-size-rows: 10_000 #min-job-size-rows: 1_000_000 max-jobs-per-chunk: 32 _max-delete-jobs-per-chunk: 32 delta-snapshot-key: last_update_time delta-snapshot-interval: 10 delta-snapshot-delete-interval: 10 _traceDBTasks: true replicate-deletes: false per-table-config: - catalog: tpch schema: db2user tables: # testTable # split-key: split-key-column # split-hints: # row-count-estimate: 100000 # split-key-min-value: 1 # split-key-max-value: 60_000 # delta-snapshot-key: delta-snapshot-key-column # row-identifier-key: [col1, col2] # update-key: [col1, col2] partsupp: split-key: partkey
Parameters related to realtime mode #
For realtime mode, you can make use of the following sample:
realtime: #threads: 1 #fetch-size-rows: 10000 _traceDBTasks: true #fetch-interval-s: 0 replicate-empty-string-as-null: true # idempotent-replay: false heartbeat: enable: true catalog: tpch schema: db2user #table-name: replicate_io_cdc_heartbeat #column-name: timestamp interval-ms: 10000
Support for DDL replication #
Replicant supports DDL replication for real-time Db2 LUW source. For more information, contact us.
For a detailed explanation of configuration parameters in the Extractor file, see Extractor Reference.