Source MongoDB
The extracted replicant-cli will be referred to as the $REPLICANT_HOME directory.
I. Set up Connection Configuration
1. From $REPLICANT_HOME, navigate to the connection configuration file:
    vi conf/conn/mongodb.yaml
2. Make the necessary changes as follows:
    type: MONGODB

    url: "mongodb://localhost:27019/?w=majority" #Enter Mongo's connection URL

    max-connections: 30 #Specify the maximum number of connections Replicant can open in MongoDB

    #If you have multiple replica sets for replication, specify all of them here using the format shown for mongors1. A sample second replica set (mongors2) is included.
    replica-sets:
      mongors1: #Replace "mongors1" with your replica set name
        url: "mongodb://localhost:27017/?w=majority&replicaSet=mongors1" #Enter the URL for the given replica set, including sockets for all nodes
      mongors2: #Replace "mongors2" with your second replica set name
        url: "mongodb://localhost:27027/?w=majority&replicaSet=mongors2" #Enter the URL for the given replica set, including sockets for all nodes
- Additional connection configurations, such as the number of connections or the read/write concern, can be included in the url string in accordance with the MongoDB syntax.
- The replica-sets are monitored for oplog entries to carry out real-time replication. The url of each MongoDB replica set should list the host:port of the nodes belonging to that replica set, and it must contain the option replicaSet=<replicaSet_name> to denote it as a replica set.
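The replicaSet requirement described above can be checked before starting a replication job. The following is a minimal sketch using only the Python standard library. The function name check_replica_set_url and the host names in the usage example are hypothetical and are not part of Replicant.

```python
from urllib.parse import urlsplit, parse_qs


def check_replica_set_url(name, url):
    """Return the host:port entries of `url` if it declares replica set `name`.

    Raises ValueError when the scheme is not mongodb://, when no host is
    listed, or when the replicaSet option is missing or does not match.
    """
    parts = urlsplit(url)
    if parts.scheme != "mongodb":
        raise ValueError(f"{name}: expected a mongodb:// URL, got {url!r}")

    # Drop optional "user:password@" credentials, then split the host list.
    host_list = parts.netloc.rpartition("@")[2]
    hosts = [h for h in host_list.split(",") if h]
    if not hosts:
        raise ValueError(f"{name}: no host:port found in {url!r}")

    # The query string carries options such as w=majority and replicaSet=...
    options = parse_qs(parts.query)
    declared = options.get("replicaSet", [None])[0]
    if declared != name:
        raise ValueError(
            f"{name}: URL must contain replicaSet={name}, found {declared!r}"
        )
    return hosts


# Hypothetical three-node replica set; replace the hosts with your own.
hosts = check_replica_set_url(
    "mongors1",
    "mongodb://host1:27017,host2:27017,host3:27017/?w=majority&replicaSet=mongors1",
)
print(hosts)
```

Passing the top-level url from the sample configuration (which has no replicaSet option) to this function raises ValueError, which is the kind of mistake the check is meant to catch.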
II. Set up Filter Configuration
1. From $REPLICANT_HOME, navigate to the filter configuration file:
    vi filter/mongodb_filter.yaml
2. According to your replication needs, specify the data to be replicated. Use the format of the following example:
    allow:
      #In this example, data of object type Table in the schema tpch will be replicated
      schema: "tpch"
      types: [TABLE]

      #From schema tpch, only the lineitem, ng_test, and usertable tables will be replicated.
      #Note: Unless specified, all tables in the catalog will be replicated
      allow:
        lineitem: #Within lineitem, only the item_one and item_two columns will be replicated
          allow: ["item_one", "item_two"]

        ng_test: #Within ng_test, only the test_one and test_two columns will be replicated as long as they meet the condition {$and: [{c1: {$gt : 1}}, {c1: {$lt : 9}}]}
          allow: ["test_one", "test_two"]
          conditions: "{$and: [{c1: {$gt : 1}}, {c1: {$lt : 9}}]}"

        usertable: #All columns in the table usertable will be replicated without any predicates
The following is a template of the format you must follow:
    allow:
      schema: <your_schema_name>
      types: <your_object_type>

      allow:
        <your_table_name>:
          allow: ["your_column_name"]
          conditions: "your_condition"

        <your_table_name>:
          allow: ["your_column_name"]
          conditions: "your_condition"

        <your_table_name>:
          allow: ["your_column_name"]
          conditions: "your_condition"
For a detailed explanation of configuration parameters in the filter file, read: Filter Reference
3. Using the format shown in step 2, specify the databases, collections, or documents that will be part of real-time replication under the global-filter section.
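The conditions predicate in the ng_test example uses MongoDB query operators: it selects documents whose c1 value is greater than 1 and less than 9. The sketch below illustrates that semantics in plain Python on a few hypothetical sample documents. The matches function is an illustration written for this guide; it supports only $and, $gt, $lt, and plain equality, and it is not how Replicant or MongoDB evaluates filters internally.

```python
def matches(document, condition):
    """Evaluate a small subset of MongoDB query operators against a dict."""
    for key, expected in condition.items():
        if key == "$and":
            # Every sub-condition in the list must hold.
            if not all(matches(document, sub) for sub in expected):
                return False
        elif isinstance(expected, dict):
            # Field-level operators such as {"$gt": 1}.
            value = document.get(key)
            for op, operand in expected.items():
                if op == "$gt":
                    ok = value is not None and value > operand
                elif op == "$lt":
                    ok = value is not None and value < operand
                else:
                    raise ValueError(f"unsupported operator: {op}")
                if not ok:
                    return False
        elif document.get(key) != expected:
            # Plain equality on a field.
            return False
    return True


# The predicate from the ng_test example, written as a Python dict.
condition = {"$and": [{"c1": {"$gt": 1}}, {"c1": {"$lt": 9}}]}

# Hypothetical sample documents.
docs = [{"c1": 1}, {"c1": 5}, {"c1": 9}, {"c1": 8}]
selected = [d for d in docs if matches(d, condition)]
print(selected)  # [{'c1': 5}, {'c1': 8}]
```

Both bounds are exclusive, so the documents with c1 equal to 1 or 9 are filtered out.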
III. Set up Extractor Configuration
For real-time replication, you must create a heartbeat table in the source MongoDB.
1. Create a heartbeat table in the schema you are going to replicate with the following DDL:
    CREATE TABLE "<user_database>"."<schema>"."replicate_io_cdc_heartbeat"(
      "timestamp" BIGINT NOT NULL,
      PRIMARY KEY("timestamp"));
2. Grant INSERT, UPDATE, and DELETE privileges to the user configured for replication.
3. From $REPLICANT_HOME, navigate to the extractor configuration file:
    vi conf/src/mongodb.yaml
4. Under the Realtime section, make the necessary changes as follows:
    realtime:
      heartbeat:
        enable: true
        schema-name: "Replicant" #Replace Replicant with the name of the schema your heartbeat table is in
        table-name [20.09.14.3]: replicate_io_cdc_heartbeat #Replace replicate_io_cdc_heartbeat with your heartbeat table's name if applicable
        column-name [20.10.07.9]: timestamp #Replace timestamp with your heartbeat table's column name if applicable
For a detailed explanation of configuration parameters in the extractor file, read Extractor Reference.