Source IBM Db2 on Native i Series AS/400 #
This page describes how to set up CDC-based replication from Source Db2 server on the IBM i AS/400 platform.
Obtain the Db2 JDBC driver #
Arcion Replicant requires Db2 JDBC driver as a dependency. To make sure Replicant can access the necessary dependency files, follow these steps:
- Download the Db2 JDBC driver JAR file.
- After downloading the JAR file, put it inside the
lib/
directory of your Replicant self-hosted download folder.
I. Check Permissions #
-
The user should have read access on all the databases, schemas, and tables to be replicated.
-
The user should have read access to following system tables and views:
a.
QSYS2.SYSTABLES
b.
SYSIBM.SQLTABLETYPES
c.
QSYS2.SYSCOLUMNS
d.
QSYS2.SYSCSTCO
e.
SYSIBM.SQLCOLUMNS
f.
SYSIBM.COLUMNS
(required for forfetch-schemas
mode mode) -
The user should have execute permissions on following system procedures:
a.
SYSIBM.SQLTABLES
b.
SYSIBM.SQLCOLUMNS
c.
SYSIBM.SQLPRIMARYKEYS
d.
SYSIBM.SQLSTATISTICS
Users need these permissions only once at the start of a fresh replication.
II. Create the Heartbeat Table #
For CDC replication, you must create the heartbeat table on the Source database with the following DDL:
CREATE TABLE "tpch"."replicate_io_cdc_heartbeat"("timestamp" BIGINT NOT NULL,
CONSTRAINT "cdc_heartbeat_id_default_default" PRIMARY KEY("timestamp"))
III. Enable CDC Replication #
For enabling CDC-based replication, follow the steps described below.
Configure Journal #
-
Configure one journal for each library being replicated.
-
You must configure journals with the following format options for CDC replication:
a. Minimize entry data - data areas: NO
b. Minimize entry data - database files: NO
c. Minimize fixed portion of entries: NO
d. Ensure that journals have commitment control enabled. If it’s not, run the following command on the database server:
STRCMTCTL LCKLVL(*CHG) CMTSCOPE(*JOB) TEXT('Replication Group')
Configure Replicant #
You need to configure your Source Db2 connection configuration file for Replicant. You can store your connection credentials in a secrets management service and tell Replicant to retrieve the credentials. For more information, see Secrets management.
Otherwise, you can put your credentials like usernames and passwords in plain form like the sample below:
type: DB2
database: database
host: localhost
port: 446
username: replicant
password: "Replicant123"
transaction-store-location: "~/temp/"
transaction-store-cache-limit: 100000
platform: AS400
max-connections: 30
max-retries: 10
retry-wait-duration-ms: 1000
cdc-log-config:
cdc-log-storage: READ_LOG
journals:
- journal-library: TESTSCH
journal-name: QSQJRN
journals
: Represents the list of Db2 journals that Replicant will read CDC logs from.journal-library
: The library (schema) that the journal is in.
IV. Set up Extractor Configuration #
-
From
$REPLICANT_HOME
, navigate to the Extractor configuration file:vi conf/src/db2.yaml
-
The Extractor configuration file has three parts:
- Parameters related to snapshot mode.
- Parameters related to delta snapshot mode.
- Parameters related to realtime mode.
Parameters related to snapshot mode #
For snapshot mode, you can make use of the following sample:
snapshot: threads: 16 fetch-size-rows: 5_000 _traceDBTasks: true #fetch-schemas-from-system-tables: true per-table-config: - catalog: tpch schema: db2user tables: lineitem: row-identifier-key: [l_orderkey, l_linenumber]
Parameters related to delta snapshot mode #
For delta delta snapshot mode, you can make use of the following sample:
delta-snapshot: #threads: 32 #fetch-size-rows: 10_000 #min-job-size-rows: 1_000_000 max-jobs-per-chunk: 32 _max-delete-jobs-per-chunk: 32 delta-snapshot-key: last_update_time delta-snapshot-interval: 10 delta-snapshot-delete-interval: 10 _traceDBTasks: true replicate-deletes: false per-table-config: - catalog: tpch schema: db2user tables: # testTable # split-key: split-key-column # split-hints: # row-count-estimate: 100000 # split-key-min-value: 1 # split-key-max-value: 60_000 # delta-snapshot-key: delta-snapshot-key-column # row-identifier-key: [col1, col2] # update-key: [col1, col2] partsupp: split-key: partkey
Parameters related to realtime mode #
For realtime mode, you can make use of the following sample:
realtime: #threads: 1 #fetch-size-rows: 10000 _traceDBTasks: true #fetch-interval-s: 0 replicate-empty-string-as-null: true # start-position: # commit-time: '2020-08-24 08:16:38.019002' # idempotent-replay: false heartbeat: enable: true catalog: tpch schema: db2user #table-name: replicate_io_cdc_heartbeat #column-name: timestamp interval-ms: 10000
In the sample above, notice the following details:
-
The parameter
commit-time
specifies the timestamp in UTC understart-position
, which indicates the starting log position for realtime replication. To get a timestamp in UTC, you can execute the following query:SELECT CURRENT TIMESTAMP - CURRENT TIMEZONE AS UTC_TIMESTAMP FROM SYSIBM.SYSDUMMY1
-
If you’ve set
message-format
toDELIMITED
, setreplicate-empty-string-as-null
totrue
.
For a detailed explanation of configuration parameters in the Extractor file, read Extractor Reference.