XML

  • Great news Advanced ETL Processor support XSLT transformations now

    A lot of XML files have very complex structure and XSLT allows to transform the data into more readable format:

    <table>
    <record>
    <field1>value1<field1>
    <field2>value2<field1>
    <record>
    <record>
    <field1>value1<field1>
    <field2>value2<field1>
    <record>
    <table>

    XML can be transformed before loading the data or if the user wants to process/validate XML which is stored in the database field two new data transformation functions can be used

    XML Transfomation Functions

    XML Editor

    We would like to thank our customers for providing useful feedback

    Direct link, no registration required.
  • Loading data from XML can be a very complex task, but the complexity of this task depends on people who design XML at the first place. In this article, we will provide you with some examples of loading data from XML files and transforming it. We will also talk about things to avoid and how to make the life of developers easier.

    What is XML anyway?

    Extensible Markup Language (XML) is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. It is defined in the XML 1.0 Specification produced by the W3C, and several other related specifications, all gratis open standards.
    The design goals of XML emphasize simplicity, generality, and usability over the Internet. It is a textual data format with strong support via Unicode for the languages of the world. Although the design of XML focuses on documents, it is widely used for the representation of arbitrary data structures, for example in web services.

    Source: Wikipedia

    XML is incredibly flexible, but it has some disadvantages as well.

    For example:

    <CustomerOrderMessage>
    <OrderNumber>1</OrderNumber>
    </CustomerOrderMessage>

    The XML above has only one byte of information the rest of it is metadata. Using too much metadata requires more processor power and increase network traffic(Which is great news for hardware vendors, but bad news for the people who have to pay for it)

    The flexibility of XML can lead to unnecessary complexity and it can make it hard for developers to understand, therefore lead to mistakes and development time and cost increases.

    In some of the cases, it can be necessary to convert XML into a simplified format so it can be loaded into the database.
    This XML can be loaded by most of the ETL tools.

    Note:CustomerTable is a "Table tag" and CustomerRecord  is a "Record Tag"

    <CustomerTable>
    <CustomerRecord>
    <CustomerID>1</CustomerID>
    <CustomerName>Peter Jones</CustomerName>
    </CustomerRecord>
    <CustomerRecord>
    <CustomerID>2</CustomerID>
    <CustomerName>Bill Watson</CustomerName>
    </CustomerRecord>
    </CustomerTable>

    This XML may need to be transformed in the format above so it can be laoded into the database later:

    <CustomerTable>
    <CustomerRecord CustomerID="1"CustomerName="Peter Jones"/>
    <CustomerRecord CustomerID="2" CustomerName="Bill Watson"/>
    </CustomerTable>

    And as a Conclusion here is some basic XML design tips:

    • Use XML when it is necessary
    • Too much metadata is a bad thing
    • Keep tags short
    • Keep it simple and clean
    • Check ETL documentation and design XML in such a way so it can be loaded without conversion
    Direct link, no registration required.
  • Here is an XML example

    XML With Attributes

    To load it all we need to do is to set data source type to XML and select appropriate XML tags for "Table Tag"and "Record Tag"

    Data reader settings :
    XML with Attributes Datareader

    Note: "Ignore Tags" is checked

    There is no data in the grid because we have to transform XML into more readable format first, using XSLT.
    XML With Attributes Data

    About XSLT:

    XSLT (Extensible Stylesheet Language Transformations) is a language for transforming XML documents into other XML documents, or other objects such as HTML for web pages, plain text or into XSL Formatting Objects which can then be converted to PDF, PostScript, and PNG.

    Typically, input documents are XML files, but anything from which the processor can build an XQuery and XPath Data Model can be used, for example, relational database tables, or geographical information systems.

    Source:Wikipedia

    XSLT:

    XSLT Transformation

    Note: To get to this dialogue: open data reader properties, click XML file, check "Transform XML" and click Magnifying glass button

    Here is the result of the XSLT transformation:
    XSLT transformation Result

    Note: 

    More information about XSLT can be found here: 
    http://www.w3schools.com/xsl/xsl_transformation.asp

    Direct link, no registration required.
  • Generate XML data using any data source

    Several New benefits of using our ETL Tools

    Support for XML

    In the latest version we've introduced 8 new transformation functions:

    • In-Place Replace
    • Encode XML Element
    • Decode XML Element
    • Encode XML Attribute
    • Decode XML Attribute
    • XSLT Transformation
    • Encode XML String
    • Decode XML String

    That makes it possible to perform following ETL Transformations:

    • Database table to XML File
    • The XML file to a database table
    • XML file to Flat file
    • Flat file to XML file
    What is XML

    XML (Extensible Markup Language) is a set of rules for encoding documents electronically. It is defined in the XML 1.0 Specification produced by the W3C and several other related specifications; all are fee-free open standards

    XML+ETL Challenges
    • XML data can be stored in the file or as a field in the database.
    • It can be created form any table or file or it can be loaded from any database or file

     Advanced ETL Processor Screenshot

    Generating XML Data

     

    Generating XML Data

     

    Direct link, no registration required.
  • We are constantly working on improving our ETL Software

    A file system as a data source

    In the latest version we've introduced file system as data-source so it is possible now:

    • Load images into a database
    • Process Separate XML files
    • Process HL7 messages from the folder
    • Process EDI Messages
    • Load context of a directory into the database
    • Load users' photos
    A file system as a data-target

    Perform following ETL Transformations

    • Export data from blob fields
    • Save every record as a separate XML file
    • Export email messages
    • Generate EDI messages

    Advanced ETL Processor Screenshot

    Loading Files into the Database
    Direct link, no registration required.
  • Here is a simple XML example

    Simple XML Example

    To load it all we need to do, is to set the data source type to XML and select appropriate XML tags for "Table Tag"and "Record Tag"

    Data reader settings :
    Simple XML Data Reader Settings

    Note: "Ignore Tags" is checked

    Here is our data in the grid:
    Simple XML Data

    More complex XML example:
  • Great news both Advanced ETL Processor and Visual Importer ETL support now XSLT transformations.

    A lot of XML files have very complex structure and XSLT allows to transform the data into more readable format:

    <table>
    <record>
    <field1>value1<field1>
    <field2>value2<field1>
    <record>
    <record>
    <field1>value1<field1>
    <field2>value2<field1>
    <record>
    <table>

    XML can be transformed before loading the data or if the user wants to process/validate XML which is stored in the database field two new data transformation functions can be used

    XML Transfomation Functions

    XML Data Source Options

    We would like to thank our customers for providing useful feedback

    Direct link, no registration required.
  • This article provides examples of complex and simple data. We recommend using  Advanced ETL Processor for working with complex data structures and Visual Importer ETL with simple; If in doubt please contact us and we will help you to choose.

    Simple data example: Excel file.

    Data has a stable structure and there is only one value in the cell. Most database tables and delimited files can be considered simple data. except when a single field has multiple values or holds XML data.

    Simple excel fileSimple XML file:

    XML tags are in the same order, the number of tags is always the same and there are no attributes, This XML does not require any additional transformation and can be easily loaded into the database.

    Simple XML

    Complex XML message:

    The structure is not stable, the number of tags is different for a different customer record, plus it uses attributes. This XML may need some additional transformations.

    Complex XML

    Complex Text file:

    Data is split into different rows, some empty rows should be ignored, some data must be taken from the header

    Complex TEXT

    SQL insert script:

    Data can be easily extracted or generated using  Advanced ETL Processor

    Insert Statement

    Direct link, no registration required.
  • XML is a very flexible format but this flexibility brings problems, for example, one of the tags can be optional or tags order can be changed.

    XSLT provides an easy way of transforming XML data into the standard format. XSLT can be applied during loading the data or in the latest version of Advanced ETL Processor and Visual Importer ETL using package action

    Note: XSLT stands for EXtensible Stylesheet Language, and is a style sheet language for XML documents. XSLT stands for XSL Transformations.

    Package Example

    XSLT Package Action

     

    XSLT Package Action1

     

    Direct link, no registration required.
 

This site uses cookies. By continuing to browse the site, you are agreeing to our use of cookies