Oracle

Source Oracle #

The first five steps (I-V) are to prepare the Oracle Database for replication and must be executed in an Oracle client. The remaining steps (VI-VIII) are to configure Replicant.

I. Obtain the JDBC Driver for Oracle #

Replicant requires the JDBC driver for Oracle as a dependency. To obtain the appropriate driver, follow the steps below:

II. Set up Oracle User #

  1. Create a new user for Replicant with the following command:

    CREATE USER <USERNAME> IDENTIFIED BY <PASSWORD>
    DEFAULT TABLESPACE <user-defined-tablesace>
    QUOTA unlimited on <user-defined-tablespace>
    TEMPORARY TABLESPACE TEMP;
    
  2. Provide the create session permission:

    GRANT CREATE SESSION TO <USERNAME>;
    
  3. Grant the select permission for all the tables that are part of the replication:

    GRANT SELECT ON <TABLENAME> TO <USERNAME>;
    

    OR

    The select permission can be granted on all of the tables

    GRANT SELECT ANY TABLE TO <USERNAME>;
    

III. Set up Change Data Capture (CDC) #

  1. Set the destination for the log archive:

    ALTER SYSTEM SET log_archive_dest = '$PATH_TO_REDO_LOG_FILES' scope=spfile  
    

    To use log-based CDC, the Oracle database must be in ARCHIVELOG mode. To check what mode the database is in, use the ARCHIVE LOG LIST command. To set the database in ARCHIVELOG mode, use the following commands:

    SHUTDOWN IMMEDIATE
    STARTUP MOUNT
    ALTER DATABASE ARCHIVELOG
    ALTER DATABASE OPEN
    
  2. Once the database is in ARCHIVELOG mode, grant the EXECUTE_CATALOG_ROLE role to use the DBMS_LOGMNR package:

    GRANT EXECUTE_CATALOG_ROLE TO <USERNAME>
    
  3. Provide the following permissions to allow Replicant to access the START_LOGMNR procedure For Oracle 11g:

    GRANT SELECT ANY TRANSACTION TO <USERNAME>;
    

    For Oracle 12C and beyond:

    GRANT LOGMINING TO <USERNAME>;
    
  4. Provide the following permission to allow Replicant to access v_$logmnr_contents:

    GRANT SELECT ON v_$logmnr_contents TO <USERNAME>;
    GRANT SELECT ON gv$archived_log to $USERNAME;
    

    For Oracle 19C and beyond, Replicant requires additional access to v_$logfile:

    GRANT SELECT ON v_$logfile TO <USERNAME>;
    

Enabling logs #

You have to enable either primary key or all column logging at either the database level or the table level.

If you use table level logging, you must enable it for the CDC heartbeat table as well.

Database level Supplemental logging #

  • Enable Force Logging:

    ALTER DATABASE FORCE LOGGING
    
  • Enable PRIMARY KEY logging:

    ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (PRIMARY KEY) COLUMNS
    

    OR

  • Enable ALL Column logging:

    ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS
    

Table level supplemental logging #

  • PRIMARY KEY logging

    ALTER <TABLE_NAME> ADD SUPPLEMENTAL LOG DATA (PRIMARY KEY) COLUMNS
    

    OR

  • ALL Column logging

    ALTER <TABLE_NAME> ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS
    
  • If table level logging is enabled, then force logging should not be enabled. We need force logging for database level logging:

    ALTER DATABASE FORCE LOGGING
    ALTER SYSTEM SWITCH LOGFILE
    
  • The following additional permissions are required:

    GRANT SELECT ON gv_$instance TO $USERNAME;
    GRANT SELECT ON gv_$PDBS TO $USERNAME;
    GRANT SELECT ON gv_$log TO $USERNAME;
    GRANT SELECT ON gv_$database_incarnation to $USERNAME;
    
Only enable either PRIMARY KEY logging or ALL column logging, not both.

IV. Set up Global Permissions #

Quick Definitions

  • One Time Access: The granted permission is only required for the initial data load (snapshot) and for reinit operations (snapshots of newly added tables). The permission can be revoked after the necessary operation is complete.
  • Continuous Access: The granted permission is required throughout the replication process.
  1. Provide the following privilege for one time access:

    GRANT SELECT ON DBA_SEGMENTS TO <USERNAME>;
    
  2. Provide the following continuous access permissions; these are necessary during snapshot as well as continious real-time replication:

    GRANT SELECT ON gv_$database TO <USERNAME>;
    GRANT SELECT ON gv_$transaction TO <USERNAME>;
    GRANT SELECT ON gv_$session TO <USERNAME>;--Not required for replicant release 20.8.13.7 and above
    
  3. Grant the following continuous access permission for the tables involved in Replication:

    GRANT FLASHBACK ON <TABLE_NAME> TO <USERNAME>;
    

    OR

    Enable Flashback for all tables

    GRANT FLASHBACK ANY TABLE TO <USERNAME>;
    
  4. Provide access to the below system views for schema migration:

    GRANT SELECT ON ALL_TABLES TO <USERNAME>;
    GRANT SELECT ON ALL_VIEWS TO <USERNAME>;
    GRANT SELECT ON ALL_CONSTRAINTS TO <USERNAME>;
    GRANT SELECT ON ALL_CONS_COLUMNS TO <USERNAME>;
    GRANT SELECT ON ALL_PART_TABLES TO <USERNAME>;
    GRANT SELECT ON ALL_PART_KEY_COLUMNS TO <USERNAME>;
    GRANT SELECT ON ALL_TAB_COLUMNS TO <USERNAME>;
    GRANT SELECT ON SYS.ALL_INDEXES TO <USERNAME>;
    GRANT SELECT ON SYS.ALL_IND_COLUMNS TO <USERNAME>;
    GRANT SELECT ON SYS.ALL_IND_EXPRESSIONS TO <USERNAME>;
    

V. Grant Pluggable Database (PDB) Permissions #

  1. Ensure that you are connected as a common user with privileges granted on both CDB$ROOT and PDB

  2. Provide following additional permissions:

    GRANT SET CONTAINER TO $USERNAME CONTAINER=ALL;
    GRANT SELECT ON DBA_PDBS to <USERNAME> CONTAINER=ALL;
    
  3. Open pluggable database:

    ALTER PLUGGABLE DATABASE $PDB_NAME OPEN READ WRITE FORCE;
    



The proceeding steps are to set up Replicant. The extracted replicant-cli will be referred to as the $REPLICANT_HOME directory in the proceeding steps.

VI. Set up Connection Configuration #

  1. From $REPLICANT_HOME, navigate to the connection configuration file:

    vi conf/conn/oracle.yaml
    
  2. Make the necessary changes as follows:

    type: ORACLE
    
    host: localhost #Replace localhost with your oracle host name
    port: 1521 #Replace the default port number 1521 if needed
    service-name: IO #Replace IO with the service name of your Oracle Listener
    
    username: 'REPLICANT' #Replace REPLICANT with your username to connect to Oracle
    password: 'Replicant#123' #Replace Replicant#123 with the your user's password
    
    max-connections: 30 #Maximum number of connections replicant can open in Oracle
    

VII. Set up Filter Configuration #

  1. From $REPLICANT_HOME, navigate to the filter configuration file:

    vi filter/oracle_filter.yaml
    
  2. In accordance to your replication needs, specify the schema(s)/table(s) which is to be replicated. Use the format explained below:

    allow:
    
      #In this example, data of object type Table in the schema REPLICANT will be replicated
      schema: "REPLICANT"
      types: [TABLE]
    
      #From the shemca REPLCIANT, only the Orders, Customers, and Returns tables will be replicated.
      #Note: Unless specified, all tables in the schema will be replicated
    
      allow:
        Orders:
           #Within Orders, only the US and AUS columns will be replicated
           allow: ["US, AUS"]
    
        Lineitem: #All columns in the table Lineitem  will be replicated without any predicates
    
        Customers:  
           #Within Customers, only the product and service columns will be replicated as long as they meet the condition C_CUSTKEY < 5000
           allow: ["product", "service"]
           conditions: "C_CUSTKEY < 5000"
    

    Below is the filter file template you must follow:

    allow:
      schema: <your_schema_name>
      types: <your_object_type>
    
    
      allow:
        your_table_name_1:
          allow: ["your_column_name"]
          conditions: "your_condition"  
    
        your_table_name_2:
    
        your_table_name_3:
          allow: ["your_column_name"]
          conditions: "your_condition"  
    

For a detailed explanation of configuration parameters in the filter file, read: Filter Reference

VIII. Set up Extractor Configuration #

For real-time replication, you must create a heartbeat table in the source Oracle.

  1. Create a heartbeat table in the schema you are going to replicate with the following DDL:

    CREATE TABLE "<schema>"."replicate_io_cdc_heartbeat"(
      "timestamp" NUMBER NOT NULL,
      PRIMARY KEY("timestamp"));
    
  2. Grant INSERT, UPDATE, and DELETE privileges to the user configured for replication

  3. From $REPLICANT_HOME, navigate to the extractor configuration file:

    vi conf/src/oracle.yaml
    
  4. The configuration file has two parts:

    • Parameters related to snapshot mode.
    • Parameters related to realtime mode.

    The following parameters are specific to Oracle as source:

    • fetch-create-sql: This option can be used to apply exact create SQL on source to target.

      This parameter is available only for Oracle->Oracle pipeline.

    • fetch-create-sql-no-constraints: This option is used to split create table SQL and Primary/Unique/Foreign Key constraints as different SQLs. So as part of schema migration we create tables without constraints and after the snapshot is complete the constraints are applied. Application of constraints post snapshot is configured by applier config init-constraint-post-snapshot.

      This parameter is available only for Oracle->Oracle pipeline.

    • serialize-fetch-createSql: This option is used to fetch create SQL in serialized manner after fetching table schema.

    • serialize-fetch-create-sql-no-constraints: This option is used to fetch SQL of Primary/Unique/Foreign Key constraints in serialized manner.

      This parameter is available only for Oracle->Oracle pipeline.

    • native-load: With this parameter set, Replicant uses the Oracle Data Pump Export (expdp) utility to load table data instead of JDBC. This enables Replicant to efficiently handle large-scale data. For more information, see Oracle Native Export.

    The following configuration parameters are available under native-load:

    • enable: true or false. Whether to enable native-load.
    • stage-type: The type of staging area. Allowed values are SHARED_FS and ASM. The staging area can either be a shared directory or Oracle ASM.
    • directory: The Oracle directory object corresponding to the stage-type. For more information, see Create directory object in Source and Target Oracle.
    • path: Full path to the NFS (Network File System) representing the directory shared between Replicant and Oracle.

    The following is a sample configuration for snapshot mode:

    snapshot:
      threads: 16
      fetch-size-rows: 10_000
      verify-row-count: false
      _fetch-exact-row-count: true
      _traceDBTasks: true
    #  inf-number-behavior: EXCEPTION   # EXCEPTION, ROUND, CONVERT_TO_NULL
    #  flashback-query: true
    #  parallel-query: true
    #  fetch-user-roles: true
    
    #   native-load:
    #     enable: false
    #     stage-type: SHARED_FS
    #     directory: SHARED_STAGE
    #     shared-path: FULL_PATH_TO_NFS #full path to the NFS shared directory 
    
    
    #   per-table-config:
    #   - schema: tpch
    #     tables:
    #       lineitem1:
    #         row-identifier-key: [ORDERKEY, LINENUMBER]
    #         extraction-priority: 1 #Higher value is higher priority. Both positive and negative values are allowed. Default priority is 0 if unspecified.
    #       lineitem1:
    #         row-identifier-key: [ORDERKEY, LINENUMBER]
    #        products:
    #          per-partition-config:
    #          - partition-name: SYS_P461
    #            row-count: 0
    #          - partition-name: SYS_P462
    #            row-count: 0
    #          - partition-name: SYS_P463
    #            row-count: 1
    #          - partition-name: SYS_P464
    #            row-count: 3
    #        part:
    #          row-count: 2000
    
    • Supplying split-key in the per-table-config section is not required (and not supported) for Oracle source.
    • We strongly recommend that you specify row-identifier-key in per-table-config section for tables not having PK/UK constraints defined on the source Oracle system.

    If you want to operate in realtime mode, you can use the realtime section to specify your configuration. The following Oracle specific parameters are available:

    • block-ddl-transaction[v20.09.14.3]: This option blocks fetching logs for DDL operation from Oracle.

    • use-current-scn[v20.09.14.8]: In start-postion section, this option allows using current scn value for to start reading realtime operations.

    • start-scn[v20.09.14.3]: In start-postion section this option allows using user specified scn value for to start reading realtime operations.

    • inter-source-latency-s[v20.10.07.16]: In the start-position section this config option in seconds represents the lag between primary and standby Oracle in case the source-failover feature is enabled.

    • log-miner-dict-file[v21.09.17.6]: If specified, this file will be used as the dictionary for log mining instead of using the online dictionary. The file must be accessible by Oracle.

    • oldest-active-txn-window-hr[v22.07.19.3]: Specifies the time period in hours up to which Replicant should fetch the oldest transaction SCN.

      Default: By default, this parameter is set to 24.

    The following is a sample configuration for realtime mode:

    realtime:
      threads: 4
      _traceDBTasks: true
      #fetch-size-rows: 0
      heartbeat:
        enable: true
        schema: "tpch"
        interval-ms: 10000
        table-name: replicate_io_cdc_heartbeat
    
      #start-position:
        #start-scn: 2362927
    

For a detailed explanation of configuration parameters in the Extractor file, see Extractor Reference.

IX. Use Native Oracle Log Reader (Beta) #

Note: This feature is currently in beta.

It’s possible to configure Replicant so that it can read and make use of Oracle redo log files.

Modify Oracle Connection Configuration File #

Add the following two parameters in the Oracle connection configuration file:

log-reader: REDOLOG
transaction-store-location: PATH_TO_TRANSACTION_STORAGE

Replace PATH_TO_TRANSACTION_STORAGE with the location of Oracle transaction storage.

Grant Necessary Permissions #

Replicant user should have the following permissions granted for them in order to use the native Oracle log reader.

GRANT SELECT ON gv_$instance TO USERNAME;
GRANT SELECT ON v_$log TO USERNAME;
GRANT SELECT ON v_$logfile TO USERNAME;
GRANT SELECT ON v_$archived_log to USERNAME;

Replace USERNAME with your Oracle username.

Oracle ASM for Logs #

Replicant also supports using Oracle Automatic Storage Management (ASM) for logs. To use ASM, follow the steps below:

  1. Make sure that the following permission is granted:

    GRANT SELECT ON gv_$asm_client TO USERNAME
    

    Replace USERNAME with your ASM username.

  2. In your Oracle connection configuration file, create a new section asm-connection. This section will have the necessary ASM connection configuration. Below is a sample connection configuration file with ASM connection details specified as well:

    type: ORACLE
    host: localhost
    port: 53545
    service-name: IO
    username: 'REPLICANT_USERNAME'
    password: 'REPLICANT_PASSWORD'
    
    asm-connection:
      host: oracle-asm
      port: 1521
      service-name: +ASM
      username: 'ASM_USERNAME'
      password: 'ASM_PASSWORD'
      max-connections: 10
    

    Replace the following:

    • REPLICANT_USERNAME: your Replicant username.
    • REPLICANT_PASSWORD: the password associated with your Replicant username.
    • ASM_USERNAME: the username to connect to the ASM instance.
    • ASM_PASSWORD: the password associated with ASM_USERNAME.
  3. To use the file system directly, Replicant must have access to the redo log files for reading. If Replicant’s path(s) to redo log files is different from the database’s path, you must include the path to the redo log files explicitly in the Source connection configuration file. For example:

    log-path: /home/replicant-user/shared/redo/online
    archive-log-path: /home/replicant-user/shared/redo/archive
    

Oracle Native Export #

For Oracle as both Source and Target systems, Replicant uses Oracle’s native Data Pump Export (expdp) utility to export table data. To set up Replicant and Source Oracle to use this feature, follow the instructions below:

Set up expdp in Replicant host machine #

  • Download the Oracle Instant Client Tools Package ZIP and extract the files.
  • Copy the expdp file to the /usr/bin directory.
  • Download the Oracle Instant Client Basic Package ZIP and extract the files in a directory.
  • Copy the path to the directory where you extracted the ZIP archive.
  • Set the ORACLE_HOME and LD_LIBRARY_PATH environment variables in your ~/.bashrc file:
    export ORACLE_HOME=instantClientBasicPath
    export LD_LIBRARY_PATH="$ORACLE_HOME":$LD_LIBRARY_PATH
    export PATH="$ORACLE_HOME:$PATH"
    

Create directory object in Source and Target Oracle #

Replicant uses the external directory feature of Oracle for efficient loading of data into Target Oracle. So you need to create a directory shared between Replicant host and Oracle host (both Source and Target) with READ and WRITE access. To do so, follow the steps below:

  • Launch Oracle SQL Plus from the terminal.
  • From the SQL Plus prompt, execute the following SQL commands:
    create directory SHARED_STAGE as '/shared-volume';
    grant read,write on directory SHARED_STAGE to SYSTEM;
    

Modify the Replicant Extractor configuration file #

In Replicant’s Extractor configuration file of Source Oracle, add a new native-load section under snapshot. This section holds the necessary parameters for Replicant to start using Oracle’s native Export utility. For more information, see Parameters related to snapshot mode.