In the previous article, we demonstrated how easy it is to synchronize the data between different databases.
The proposed approach has some disadvantages
Writer 1 may put too much pressure on the target database
"for every incoming record runs the following SQL
select count(*) as rec_count from target table where PRIMARY_KEY=?
if rec_count is zero new record is inserted into target table"
Here is an alternative approach
Reader
Reads new records from the source table using the following SQL
Select
TBL_SOURCE.PRIMARY_KEY,
TBL_SOURCE.DATA,
TBL_SOURCE.UPDATE_FLAG
From
TBL_SOURCE
Where
TBL_SOURCE.UPDATE_FLAG = 'N'
Validator
Checks if record exists in the database and it does discard it
Is In list validation function pools list of primary key values and builds the tree in the memory, this is a very fast and efficient way to validate data
Transformer 1
Writer 1
Adds new record
Once loading is completed the following SQL is executed to mark processed records in the target table.
update TBL_TARGET
SET UPDATE_FLAG='Y'
WHERE UPDATE_FLAG='N'
Using this approach allow to easily identify newly inserted records in case of failure
Sorter
The sorter is used to delay passing records to writer 2 until writer 1 is completed
Transformer 2
Writer 2
Updates records in the source table
update TBL_SOURCE
SET UPDATE_FLAG = 'Y'
where PRIMARY_KEY=?
Download Free Trial | Learn More About Our ETL Software |