Data Synchronization part 2

In previous article we demonstrated how easy it is to synchronize the data between different databases.

The proposed approach has some disadvantages

Writer 1 may put too much pressure on the target database

"for every incoming record runs the following SQL
select count(*) as rec_count from target table where PRIMARY_KEY=?
if rec_count is zero a new record is inserted into target table"

Here is an alternative approach

Reader

Reads new records from the source table using the following SQL

Select
 TBL_SOURCE.PRIMARY_KEY,
 TBL_SOURCE.DATA,
 TBL_SOURCE.UPDATE_FLAG
From
 TBL_SOURCE
Where
 TBL_SOURCE.UPDATE_FLAG = 'N'

Validator

Checks if the record exists in the database and it does discard it

Is In list validation function pools list of primary key values and builds tree in the memory, this is very fast and efficient way to validate data

Transformer 1

Writer 1

Adds new record

Once loading is completed the following SQL is executed to mark processed records in the target table.

 update TBL_TARGET
 SET UPDATE_FLAG='Y'
 WHERE UPDATE_FLAG='N'

Using this approach allow to easily identify newly inserted records in case of failure

Sorter

Sorter is used to delay passing records to writer 2 until writer 1 is completed

Transformer 2

Writer 2

Updates records in source table

 update TBL_SOURCE
 SET    UPDATE_FLAG = 'Y'
 where  PRIMARY_KEY=?

For more technologies supported by our ETL Software see Advanced ETL Processor Versions and Visual Importer ETL Versions

Confused? Ask question on our ETL forum

  • knowledgebase/data_synchronization_part_2.txt
  • Last modified: 13/04/2019 17:24
  • by admin