Destination Microsoft SQL Server #
The extracted replicant-cli
will be referred to as the $REPLICANT_HOME
directory in the following steps.
I. Set up connection configuration #
Specify the connection details of your SQL Server instance to Replicant with a connection configuration file. You can find a sample connection configuration file sqlserver.yaml
in $REPLICANT_HOME/conf/conn
directory. The following configuration parameters are available:
type
#
The connection type representing the database. In this case, it’s SQLServer
.
host
#
The hostname of your SQL Server system.
port
#
The port number to connect to the host
.
username
#
The username credential to access the SQL Server system.
password
#
The password associated with username
.
max-connections
#
The maximum number of connections Replicant uses to load data into the SQL Server system.
For Arcion self-hosted only #
You can store your connection credentials in a secrets management service and tell Replicant to retrieve the credentials. For more information, see Secrets management.
The following is a sample connection configuration:
type: SQLSERVER
host: 192.168.0.166
port: 1433
username: 'replicate'
password: 'Replicate#123'
max-connections: 30
II. Configure mapper file (optional) #
If you want to define data mapping from source to your target SQL Server, specify the mapping rules in the mapper file. For more information on how to define the mapping rules and run Replicant CLI with the mapper file, see Mapper Configuration.
III. Set up Applier configuration #
To configure replication mode according to your requirements, specify your configuration in the Applier configuration file. You can find a sample Applier configuration file sqlserver.yaml
in the $REPLICANT_HOME/conf/dst
directory.
Arcion Replicant supports the following Replicant modes for SQL Server as target:
snapshot
full
realtime
delta-snapshot
For more information about different Replicant modes, see Running Replicant.
You can configure replication modes by specifying the parameters under their respective sections in the configuration file. See the following sections for more information.
Configure snapshot
mode
#
The following is a sample configuration for operating in snapshot
mode:
snapshot:
threads: 32
batch-size-rows: 10_000
txn-size-rows: 1_000_000
identity-column-insert: false
Additional snapshot
parameters
#
identity-insert-column
-
true
orfalse
.Default:
false
.Controls how Replicant handles identity columns in the following manner:
true
. Replicant copies the values of identity columns from source to target.false
. Replicant doesn’t copy the values of identity columns from source to target. Instead, Replicant generates the values of identity columns in the target.
You can use this parameter both globally and in
per-table-config
s. For example, the following sample disablesidentity-insert-column
globally and enablesidentity-insert-column
for a tabletableName
:snapshot: threads: 32 batch-size-rows: 10_000 txn-size-rows: 1_000_000 identity-column-insert: false per-table-config: - catalog: catalogName schema: schemaName tables: tableName: identity-column-insert: true
Use bulk loading #
If you want to use bulk loading in snapshot mode, use the bulk-load
section to specify your configuration. For example:
bulk-load:
enable: true
type: FILE
bulk-file-location: \\192.168.0.199\Replicant
Important:
If you enable bulk loading, make sure to specify the
bulk-file-location
parameter. This is a Samba URL. Using this URL, SQL Server (running on Windows on some network) can access the files Replicant (running on the same network) generates and ingests them. If you disable bulk loading by settingenable
tofalse
, giving any value tobulk-file-location
will throw parsing error.The Samba URL must be a valid path so that the Windows machine can access the directory shared by the machine Replicant is running on.
For information on how to set up the shared directory with Samba, see Set up shared directory for bulk loading with Samba.
Replicant only supports
FILE
type bulk loading.
For more information about the configuration parameters for snapshot mode, see Snapshot Mode.
Configure realtime
mode
#
The following is a sample configuration for operating in realtime
mode:
realtime:
threads: 8
txn-size-rows: 10000
batch-size-rows: 1000
identity-column-insert: false
Additional realtime
parameters
#
identity-insert-column
-
true
orfalse
.Default:
true
.Controls how Replicant handles identity columns in the following manner:
true
. Replicant copies the values of identity columns from source to target.false
. Replicant doesn’t copy the values of identity columns from source to target. Instead, Replicant generates the values of identity columns in the target.
You can use this parameter both globally and in
per-table-config
s. For example, the following sample enablesidentity-insert-column
globally and disablesidentity-insert-column
for a tabletableName
:realtime: threads: 8 txn-size-rows: 10000 batch-size-rows: 1000 identity-column-insert: true per-table-config: - catalog: catalogName schema: schemaName tables: tableName: identity-column-insert: false
Important: Make sure to set
identity-insert-column
totrue
when you use identity column as primary key inrealtime
mode.
For more information about the configuration parameters for realtime
mode, see Realtime Mode.
Delta-snapshot modes #
Arcion Replicant supports the following delta-snapshot
modes:
UPDATE_INSERT
- This mode works in the following manner:
- Data is inserted into a temporary table.
- Non-matching data is inserted from the temporary table into the original table.
- Matching data is updated from the temporary table to the original table.
DELETE_INSERT
- This mode works in the following manner:
- Data is inserted into a temporary table.
- Matching data is deleted from the original table.
- Data is copied from the temporary table to the original table.
SINGLE_DELETE_INSERT
-
This mode works in same way as
DELETE_INSERT
except that it works in parallel for multiple Applier threads. -
Install Samba with the following command:
samba sudo apt install samba
-
Back up your original Samba configuration file:
cp /etc/samba/smb.conf /etc/samba/smb.conf.orig
-
Open
/etc/samba/smb.conf
and replace its content with the following content:[Replicant] path = PATH_TO_SHARED_DIR force user = USERNAME force group = GROUPNAME create mask = 0664 force create mode = 0664 directory mask = 0775 force directory mode = 0775 public = yes writable = yes
Replace the following:
PATH_TO_SHARED_DIR
: the path to the shared directory on the network—for example,/home/replicant/arcion_work/replicant-core/data/{replication_id}/tmp
_USERNAME
: the effective username for a user accessingpath
GROUPNAME
: the effective group for a user accessingpath
Other machine on the same network or VM sees the
path
with theReplicant
section name (specified in square brackets). Make sure thatforce user
andforce group
both have permissions to read files from the shared directory. -
Finally, restart the systemd service:
sudo systemtcl restart smbd.service
Enable recovery #
To enable recovery, you must run Replicant CLI with the --replace
option. For more information, see Running Replicant.
Set up shared directory for bulk loading with Samba #
We use Samba to share a directory between the host running Replicant and the Windows VM running SQL Server. When using bulk loading, this shared directory allows the Windows VM access to the data files Replicant generates. Follow these steps for sharing directories across network without the hassle of passwords:
Now you can run Replicant with bulk loading enabled.
Caution: Whenever you change the replication ID, you need to make changes to smb.cnf
and then restart the systemd service manually.