- Posts: 38
- Thank you received: 0
best practice for Transformations
9 years 4 weeks ago #11198
by thatisfun
best practice for Transformations was created by thatisfun
All the sample for transformations are simple. like there are only several columns and simply rules need to be applied.
What is the best practice for large table and hundreds business need to reply. such like following cases:
Reader table: 60 columns
writer table : 80 columns
there are about 200 rules need to apply.
What is the best practice for this case? put every transformation rules in one will be huge.
What is the common best practice for this case?
What is the best practice for large table and hundreds business need to reply. such like following cases:
Reader table: 60 columns
writer table : 80 columns
there are about 200 rules need to apply.
What is the best practice for this case? put every transformation rules in one will be huge.
- using filter separate different case and put into different transformations?
- change Database schema ?
What is the common best practice for this case?
Please Log in or Create an account to join the conversation.
9 years 4 weeks ago #11200
by admin
Mike
ETL Architect
Replied by admin on topic best practice for Transformations
Thank you for posting such an interesting question.
The best practice would be to use divide and conquer strategy
Use several transformations instead of one
Use sub transformations
Use tabs inside transformations.
Use library of objects
Use version control
Another very important thing to understand that mapping is position based.
That means if you modify table structure by inserting field in the middle all your mapping will be shifted
This issue is very easy to address you can add fields to the end instead or
just insert another transformation and use auto-map function
Peter
The best practice would be to use divide and conquer strategy
Use several transformations instead of one
Use sub transformations
Use tabs inside transformations.
Use library of objects
Use version control
Another very important thing to understand that mapping is position based.
That means if you modify table structure by inserting field in the middle all your mapping will be shifted
This issue is very easy to address you can add fields to the end instead or
just insert another transformation and use auto-map function
Peter
Mike
ETL Architect
The following user(s) said Thank You: ckelsoe
Please Log in or Create an account to join the conversation.
9 years 2 days ago #11468
by thatisfun
Replied by thatisfun on topic best practice for Transformations
Thanks a lot for your suggestions. we still have following questions.
1. reader loading procedure. let's say we load everything & process it. and write into another table. how do we update the table we read from and make sure we didn't read old record which processed already.
2. for Transformation design. what is the best practice. if the destination table schema change . how do we update the transformation node? looks like we have to redo it again. refresh we lost all the link. ideally it should only remove the link which changed.
1. reader loading procedure. let's say we load everything & process it. and write into another table. how do we update the table we read from and make sure we didn't read old record which processed already.
2. for Transformation design. what is the best practice. if the destination table schema change . how do we update the transformation node? looks like we have to redo it again. refresh we lost all the link. ideally it should only remove the link which changed.
Please Log in or Create an account to join the conversation.
9 years 2 days ago - 9 years 2 days ago #11469
by admin
Mike
ETL Architect
Replied by admin on topic best practice for Transformations
1. reader loading procedure. let's say we load everything & process it. and write into another table. how do we update the table we read from and make sure we didn't read old record which processed already.
You can use sql as source
select * from table where updated_flag='N'
and two data writers
one will load data into target table
Anoter one will update updated_flag to 'Y'
2. for Transformation design. what is the best practice. if the destination table schema change . how do we update the transformation node? looks like we have to redo it again. refresh we lost all the link. ideally it should only remove the link which changed.
This question was already answered above
YOU DO NOT HAVE TO REDO EVERYTHING
Mapping is position based.
That means if you modify table structure by inserting field in the middle all your mapping will be shifted
This issue is very easy to address you can add fields to the end instead or
just insert another transformation and use auto-map function
Please post screeshots and follow the forum rules next time
Mike
You can use sql as source
select * from table where updated_flag='N'
and two data writers
one will load data into target table
Anoter one will update updated_flag to 'Y'
2. for Transformation design. what is the best practice. if the destination table schema change . how do we update the transformation node? looks like we have to redo it again. refresh we lost all the link. ideally it should only remove the link which changed.
This question was already answered above
YOU DO NOT HAVE TO REDO EVERYTHING
Mapping is position based.
That means if you modify table structure by inserting field in the middle all your mapping will be shifted
This issue is very easy to address you can add fields to the end instead or
just insert another transformation and use auto-map function
Please post screeshots and follow the forum rules next time
Mike
Mike
ETL Architect
Last edit: 9 years 2 days ago by admin.
Please Log in or Create an account to join the conversation.
9 years 2 days ago #11470
by admin
Mike
ETL Architect
Replied by admin on topic best practice for Transformations
This is an old article but still useful
www.etl-tools.com/articles/add-new-and-u...xisting-records.html
www.etl-tools.com/articles/add-new-and-u...xisting-records.html
Mike
ETL Architect
Please Log in or Create an account to join the conversation.
9 years 2 days ago #11473
by thatisfun
Replied by thatisfun on topic best practice for Transformations
Thanks for your quick reply. however your sample is for one table. read and write to the same table. when it happens between two tables. how does ETL synchronize those processes?
I believe insert and update is linear operations. however the suggestion is parallel.
I believe insert and update is linear operations. however the suggestion is parallel.
Please Log in or Create an account to join the conversation.