Table of Contents

Search

  1. Preface
  2. Transformations
  3. Source transformation
  4. Target transformation
  5. Aggregator transformation
  6. Cleanse transformation
  7. Data Masking transformation
  8. Deduplicate transformation
  9. Expression transformation
  10. Filter transformation
  11. Hierarchy Builder transformation
  12. Hierarchy Parser transformation
  13. Hierarchy Processor transformation
  14. Input transformation
  15. Java transformation
  16. Java transformation API reference
  17. Joiner transformation
  18. Labeler transformation
  19. Lookup transformation
  20. Mapplet transformation
  21. Normalizer transformation
  22. Output transformation
  23. Parse transformation
  24. Python transformation
  25. Rank transformation
  26. Router transformation
  27. Rule Specification transformation
  28. Sequence Generator transformation
  29. Sorter transformation
  30. SQL transformation
  31. Structure Parser transformation
  32. Transaction Control transformation
  33. Union transformation
  34. Velocity transformation
  35. Verifier transformation
  36. Web Services transformation

Transformations

Transformations

Relational to hierarchical example

Relational to hierarchical example

You need to create a purchase order file in hierarchical format using customer sales data from two purchase order tables and the customer address table.
Use the Hierarchy Processor transformation to create purchase orders in hierarchical format.
You will perform the following high-level tasks to create and configure the target file:
  • Filter by order number and ship-to address to build the shipping address struct.
  • Order by item number and group by part number to build the items array of structs.
  • Aggregate the item price and quantity.
  • Aggregate the total price for each purchase order.
  • Join the data sources to build output data for the hierarchical purchase orders.
  • Filter same-day shipping information to build the same-day items array of structs.
The POHeader table contains basic information about the orders placed by customers:
OrderNumber
Comment
OrderDate
ConfirmDate
1
AppD for POD4
2020-10-01 00:00:00.0
2020-10-02 00:00:00.0
2
GoJS for IICS
2020-10-12 00:00:00.0
2020-10-12 00:00:00.0
The Address table contains customer address information for each order:
OrderNumber
AddressType
Name
Street
City
State
Country
Zip
1
ShipTo
Tom
2100 Seaport Blvd
Redwood City
CA
USA
94063
1
BillTo
Tom
2100 Seaport Blvd
Redwood City
CA
USA
94063
2
ShipTo
Bill
1630 S Delaware St
San Mateo
CA
USA
94402
2
BillTo
Bill
PO Box 313
San Mateo
CA
USA
94402
The PODetail table contains details about the customer purchase orders:
OrderNumber
ItemNum
ProductName
Quantity
Price
Comment
ShipDate
PartNum
1
1
AppD Agent for JVM
60
500
JVM agents
2020-10-15 00:00:00.0
1
1
3
ELB agents
10
200
ELB agents
2020-10-15 00:00:00.0
3
1
2
MySQL agents
2
120
MySQL agents
2020-10-16 00:00:00.0
2
1
4
MySQL agents
2
120
MySQL agents
2020-10-01 00:00:00.0
2
1
5
MySQL agents
2
120
MySQL agents
2020-10-01 00:00:00.0
2
2
1
GOJS OEM Edition
2
20000
GOJS Dev
2020-10-19 00:00:00.0
101
2
2
GOJS Professional Service
5
5000
GOJS Dev
2020-10-19 00:00:00.0
102
To use the Hierarchy Processor transformation to create purchase orders in hierarchical format, perform the following tasks:
Design the mapping.
Use the Mapping Designer to perform the following steps:
  1. Add a Hierarchy Processor transformation and change the output data format to Hierarchical.
  2. Add the POHeader, PODetail, and Address tables as source objects.
  3. Connect the source objects to the Hierarchy Processor transformation in the data flow.
  4. In the Hierarchy Processor transformation, add the PurchaseOrder output group and connect the target object in the data flow.
The following image shows the mapping:
The mapping contains three Source transformations that are connected to a Hierarchy Processor transformation that is connected to one Target transformation.
Build the output group and create a struct.
Use the following steps to create the output group with the basic purchase order data and add the ship-to address.
  1. Add all the incoming fields from POHeader to the PurchaseOrder output group.
  2. Add a new output field with the following properties:
    Property
    Value
    Child Of
    PurchaseOrder
    Name
    shipToAddress
    Type
    struct
    Struct Name
    address_struct
  3. Add all the incoming fields from Address to the shipToAddress struct in the output group:
    The list of incoming fields on the left includes the Address table. The output group on the right contains the empty shipToAddress struct.
  4. Delete the following fields that you do not need in the output group:
    • PurchaseOrder.shipToAddress.OrderNumber
    • PurchaseOrder.shipToAddress.AddressType
  5. Add a filter condition for the PurchaseOrder.shipToAddress struct:
    :fld.{Address.OrderNumber}=:fld.{PurchaseOrder.OrderNumber} AND :fld.{Address.AddressType}='ShipTo'
    .
Create an array of structs.
Use the following steps to add the purchase order details in the items array of structs. Configure the data processing strategies to sort by item number, group by part number, and aggregate the incoming quantity and price.
  1. Add a new output field with the following properties:
    Property
    Value
    Child Of
    PurchaseOrder
    Name
    Items_arr
    Type
    array
    Array Element Type
    struct
    Element Struct Name
    item_str
  2. Add all the incoming fields from PODetail to the Items_arr array in the output group.
  3. Delete the following field that you do not need in the output group: PurchaseOrder.Items_arr.OrderNumber.
  4. Add a filter condition for the PurchaseOrder.Items_arr array:
    :fld.{PODetail.OrderNumber}=:fld.{PurchaseOrder.OrderNumber}
    .
  5. Configure a group by field for the PurchaseOrder.Items_arr array: PODetail.PartNum.
  6. Configure an order by field in ascending order for the PurchaseOrder.Items_arr array: PODetail.ItemNum.
  7. Update the field expression for PODetail.Quantity in the PurchaseOrder.Items_arr array:
    SUM(:fld.{PODetail.Quantity})
    to aggregate quantity.
  8. Update the field expression for PODetail.Price in the PurchaseOrder.Items_arr array:
    SUM(:fld.{PODetail.Price})
    to aggregate price.
The following image shows the data configuration icons and expressions for the Items_arr array in the output group.
The Items_arr output group on the right displays active icons for data sources, filter, group by, and order by. The Quantity and Price fields show the aggregate expressions.
Aggregate the output data.
Use the following steps to aggregate all the items in a particular purchase order, providing the total price.
  1. Add a new output field with the following properties:
    Property
    Value
    Child Of
    PurchaseOrder
    Name
    TotalPrice
    Type
    bigint
    Aggregate Options: This field will aggregate values in an output field array
    Enabled
    Output Field
    Items_arr
    The following image shows the aggregate options for the TotalPrice output field:
    Aggregate Options indicates that the Items_arr array output values will be used to aggregate into TotalPrice.
  2. Configure a field expression for PurchaseOrder.TotalPrice:
    SUM(:fld.{PurchaseOrder.Items_arr.item_str.Quantity}*:fld.{PurchaseOrder.Items_arr.item_str.Price})
    to aggregate the total price.
Create an array of structs and join data sources.
Use the following steps to add and configure the same-day items array of structs. Using a filter, a join, and a field expression, you output only the items that were ordered and shipped on the same date.
  1. Add a new output field with the following properties:
    Property
    Value
    Child Of
    PurchaseOrder
    Name
    SameDayItems
    Type
    array
    Array Element Type
    struct
    Element Struct Name
    sameday_str
  2. Add all the incoming fields from PODetail to the SameDayItems array in the output group.
  3. Delete the following field that you do not need in the output group: PurchaseOrder.SameDayItems.OrderNumber.
  4. Add POHeader as a data source for PurchaseOrder.SameDayItems array.
  5. Add a join condition for the PurchaseOrder.SameDayItems array with the following properties:
    Property
    Value
    Left Group
    POHeader
    Join Type
    Inner
    Right Group
    PODetail
    Join Condition
    :fld.{POHeader.OrderDate}=:fld.{PODetail.ShipDate} AND :fld.{POHeader.OrderNumber}=:fld.{PODetail.OrderNumber}
    The following image shows the data sources and join condition for SameDayItems:
    The Data Configuration dialog shows the selected data sources: POHeader and PODetail. The join condition shows Left Group: POHeader, Join Type: Inner, Right Group: PODetail.
  6. Add a filter condition for the PurchaseOrder.SameDayItems array:
    :fld.{PODetail.OrderNumber}=:fld.{PurchaseOrder.OrderNumber}
    .
Run the mapping.
  1. Create a mapping task.
  2. Run the mapping task.
  3. Review your output.
The following JSON shows the PurchaseOrder target output after you run the mapping:
{ "OrderNumber": "1", "Comment": "AppD for POD4", "OrderDate": "2018-10-01 00:00:00.0", "ConfirmDate": "2018-10-02 00:00:00.0", "address_struct": { "Name": "Tom", "Street": "2100 Seaport blvd", "City": "Redwood City", "State": "CA", "Country": "USA", "Zip": "94063" }, "Items_arr": [{ "itemNum": "1", "ProductName": "AppD Agent for JVM", "Quantity": 60, "price": 500, "comment": "JVM agents", "shipDate": "2018-10-15 00:00:00.0", "PartNum": "1" }, { "itemNum": "2", "ProductName": "MySQL agents", "Quantity": 6, "price": 360, "comment": "MySQL agents", "shipDate": "2018-10-15 00:00:00.0", "PartNum": "2" }, { "itemNum": "3", "ProductName": "ELB agents", "Quantity": 10, "price": 200, "comment": "ELB agents", "shipDate": "2018-10-16 00:00:00.0", "PartNum": "3" }], "TotalPrice": 34160 } { "OrderNumber": "2", "Comment": "GoJS for IICS", "OrderDate": "2018-10-12 00:00:00.0", "ConfirmDate": "2018-10-12 00:00:00.0", "address_struct": { "Name": "Bill", "Street": "23rd Ave", "City": "San Mateo", "State": "CA", "Country": "USA", "Zip": "94401" }, "Items_arr": [{ "itemNum": "1", "ProductName": "GOJS OEM Edition", "Quantity": 2, "price": 20000, "comment": "GOJS Dev", "shipDate": "2018-10-19 00:00:00.0", "PartNum": "101" }, { "itemNum": "2", "ProductName": "GOJS Prefessional Service", "Quantity": 5, "price": 5000, "comment": "GOJS Dev", "shipDate": "2018-10-19 00:00:00.0", "PartNum": "102" }], "TotalPrice": 65000 }