Source SingleStore
The extracted replicant-cli directory will be referred to as $REPLICANT_HOME in the following steps.
I. Set up Connection Configuration
- From $REPLICANT_HOME, navigate to the sample connection configuration file:

    vi conf/conn/singlestore.yaml

- For connecting to the SingleStore server, you can choose between two methods for an authenticated connection:
Connect with username and password
To connect to SingleStore using basic username and password authentication, you have the following two options:
You can specify your credentials in plain form in the connection configuration file like the following sample:

    type: SINGLESTORE

    host: HOSTNAME
    port: PORT_NUMBER

    username: 'USERNAME'
    password: 'PASSWORD'

    max-connections: 30
    max-retries: 10
    retry-wait-duration-ms: 1000

Replace the following:
- HOSTNAME: hostname of the SingleStore server
- PORT_NUMBER: port number of the SingleStore server
- USERNAME: the SingleStore username
- PASSWORD: the password associated with USERNAME
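As an illustration, a filled-in version of the sample might look like the following. Every value here (the hostname, the replicant_user account, and the password) is a made-up placeholder, and port 3306 is only the port SingleStore commonly listens on for SQL clients, so confirm the actual port for your deployment. The three connection-tuning values are copied unchanged from the sample.

```yaml
# Illustrative values only; substitute your own.
type: SINGLESTORE

host: singlestore.example.internal   # hypothetical hostname
port: 3306                           # assumed port; check your deployment

username: 'replicant_user'           # hypothetical account
password: 'example-password'         # placeholder, never commit real secrets

# Unchanged from the sample configuration
max-connections: 30
max-retries: 10
retry-wait-duration-ms: 1000
```

Because this form keeps the password in plain text, it is best limited to test environments where that is acceptable.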
You can store your connection credentials in a secrets management service and tell Replicant to retrieve the credentials. For more information, see Secrets management.
Connect using SSL
To connect to SingleStore using SSL, follow these steps:
- Configure the server-side requirements by following the instructions in Server Configuration for Secure Client and Intra-Cluster Connections.
- Specify the SSL parameters to Replicant in the ssl section of the connection configuration file in the following format:

    ssl:
      enable: true
      root-cert: "PATH_TO_CA_CERTIFICATE_FILE"
      hostname-verification: {true|false}
      trust-store:
        path: PATH_TO_CA_TRUSTSTORE
        password: TRUSTSTORE_PASSWORD
      key-store:
        path: PATH_TO_KEYSTORE
        password: KEYSTORE_PASSWORD
      ssl-key-password: KEYSTORE_CERT_PASSWORD

In the preceding configuration:
- root-cert holds the full path to your SSL CA certificate file, for example "/home/alex/workspace/ca-cert.pem". Keep in mind that the trust-store configuration overrides root-cert.
- hostname-verification enables hostname verification against the server identity according to the specification in the server’s certificate. Defaults to true.
- trust-store holds the SSL CA certificate that the client uses to authenticate the server. This configuration overrides root-cert. Replace PATH_TO_CA_TRUSTSTORE and TRUSTSTORE_PASSWORD with the path to the TrustStore and the TrustStore password respectively.
- The server uses key-store to authenticate the client. Replace PATH_TO_KEYSTORE and KEYSTORE_PASSWORD with the path to the KeyStore and the KeyStore password respectively.
- As an optional parameter, ssl-key-password holds the password of the certificate inside the KeyStore.
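A filled-in ssl section could look like the sketch below. The file paths, file names, file extensions, and passwords are hypothetical placeholders. The sketch leaves out root-cert on the assumption that it is unnecessary when trust-store is supplied, since trust-store overrides it. The text does not state that outright, so verify it against your setup.

```yaml
# Illustrative values only; paths and passwords are placeholders.
ssl:
  enable: true
  hostname-verification: true        # the documented default
  # root-cert omitted here: trust-store overrides it (assumed safe to drop)
  trust-store:
    path: "/home/alex/workspace/truststore.jks"   # hypothetical path
    password: "example-truststore-password"
  key-store:
    path: "/home/alex/workspace/keystore.jks"     # hypothetical path
    password: "example-keystore-password"
  ssl-key-password: "example-cert-password"       # optional parameter
```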
II. Set up Extractor Configuration
- From $REPLICANT_HOME, navigate to the Extractor configuration file:

    vi conf/src/singlestore.yaml

a. For snapshot mode, make the necessary changes as follows in the snapshot section of the configuration file:

    snapshot:
      #threads: 32
      #fetch-size-rows: 10_000
      #min-job-size-rows: 1_000_000
      max-jobs-per-chunk: 32
      #verify-row-count: false
      _traceDBTasks: true

      per-table-config:
      - catalog: tpch
        tables:
          # testTable
          #   split-key: split-key-column
          part:
            split-key: partkey
          partsupp:
            split-key: partkey
          supplier:
          orders:
            split-key: orderkey
          lineitem:
            row-identifier-key: [l_orderkey, l_linenumber]
            split-key: l_orderkey
            # split-hints:
            #   row-count-estimate: 15000
            #   split-key-min-value: 1
            #   split-key-max-value: 60_000

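To show how the commented-out split-hints options fit in, here is a sketch of a single-table entry with the hints enabled. The numbers are the sample values from the commented lines, not measurements from a real table. The description of what each hint does is taken from the inline comments in the delta-snapshot sample in this document, and applying the same meaning in snapshot mode is an assumption.

```yaml
# Sketch only: hint values are sample numbers, not real table statistics.
snapshot:
  max-jobs-per-chunk: 32

  per-table-config:
  - catalog: tpch
    tables:
      lineitem:
        row-identifier-key: [l_orderkey, l_linenumber]
        split-key: l_orderkey
        split-hints:
          row-count-estimate: 15000      # estimated row count
          split-key-min-value: 1         # lower bound of the split key
          split-key-max-value: 60_000    # upper bound of the split key
```

Supplying hints only makes sense when you already know the table's approximate size and key range. Stale values may lead to unevenly sized jobs.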
b. For delta snapshot mode, if a SingleStore table does not already have a delta-snapshot-key column, you can add one with the following ALTER (and UPDATE) statements, run once per table:

    ALTER TABLE tpch.LINEITEM_CS ADD COLUMN replicate_io_delta_snapshot_key TIMESTAMP NOT NULL DEFAULT current_timestamp ON UPDATE current_timestamp;
    UPDATE tpch.LINEITEM_CS SET replicate_io_delta_snapshot_key = current_timestamp;

Then make the necessary changes as follows in the delta-snapshot section of the configuration file:

    delta-snapshot:
      #threads: 32
      #fetch-size-rows: 10_000
      #min-job-size-rows: 1_000_000
      max-jobs-per-chunk: 32
      _max-delete-jobs-per-chunk: 32

      delta-snapshot-key: last_update_time
      delta-snapshot-interval: 10
      delta-snapshot-delete-interval: 10
      _traceDBTasks: true
      replicate-deletes: false

      per-table-config:
      - schema: tpch
        tables:
          # testTable
          #   split-key: split-key-column  # Any numeric/timestamp column with sufficiently large number of distincts
          #   split-hints:
          #     row-count-estimate: 100000  # Estimated row count, if supplied replicant will leverage
          #     split-key-min-value: 1  # Lower bound of split key value
          #     split-key-max-value: 60_000  # Upper bound of split key value, if supplied replicant will leverage and avoid querying source database for the same
          #   delta-snapshot-key: delta-snapshot-key-column  # A monotonic increasing numeric/timestamp column which gets new value on each INSERT/UPDATE
          #   row-identifier-key: [col1, col2]  # A set of columns which uniquely identify a row
          #   update-key: [col1, col2]  # A set of columns which replicant should use to perform deletes/updates during incremental replication
          part:
            split-key: partkey
          partsupp:
            split-key: partkey
          supplier:
          orders:
            split-key: orderkey
          parts_view:
            update-key: [partkey]
            delta-snapshot-key: last_update_time
            split-key: last_update_time
          partsupp_macro:
            update-key: [partkey]
            delta-snapshot-key: last_update_time
            split-key: last_update_time

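The sample uses last_update_time as the delta-snapshot-key, while the ALTER statement in step b adds a column named replicate_io_delta_snapshot_key. Assuming you added that column to tpch.LINEITEM_CS, a per-table entry that points at it might look like the sketch below. The table entry and the row-identifier-key columns are illustrative assumptions (the columns are borrowed from the lineitem entry in the snapshot sample), not values confirmed for your schema.

```yaml
# Sketch only: table entry and key columns are assumptions for illustration.
delta-snapshot:
  delta-snapshot-interval: 10
  replicate-deletes: false

  per-table-config:
  - schema: tpch
    tables:
      LINEITEM_CS:
        # Column created by the ALTER TABLE statement in step b
        delta-snapshot-key: replicate_io_delta_snapshot_key
        # Assumed to uniquely identify a row in this table
        row-identifier-key: [l_orderkey, l_linenumber]
```

A per-table delta-snapshot-key like this one is useful when only some tables carry the added column and the rest rely on the top-level key.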
For a detailed explanation of configuration parameters in the extractor file, read Extractor Reference.