Developer Transformation Guide

Converting to Struct Data Example

Your organization needs to convert a large volume of customer data in a flat file to struct data and write it to an Avro file. The input file contains customer details such as name, age, and phone numbers. If the customer name is null in the input file, you do not want to add customer details to the output file.
You can develop a mapping with a Java transformation to define the transformation functionality. In the Hadoop environment, run the mapping on the Spark engine to transform the data and write the struct data to an Avro file.
Create a mapping and configure the following transformations:
  • Read transformation that reads customer information from a flat file source
  • Java transformation as an active transformation that converts flat data to struct data and removes inconsistent data
  • Write transformation that writes the struct data to an Avro file
The following image shows the mapping with a Read transformation, a Java transformation, and a Write transformation.
The mapping m_JavaTx_StructConversion contains a Read transformation that represents the flat file source Customer_Flat. The mapping contains a Java transformation that converts flat data to struct data and a Write transformation that represents the Avro target Customer_Avro.
On the type definition library tab of the mapping editor, create a complex data type definition Customer. The complex data type definition represents the schema of the struct data. Rename the type definition library to CustomerInfo. Add the following elements to the complex data type definition:
  • name of type string
  • age of type integer
  • phones of type array with string elements
The following image shows the complex data type definition in the type definition library:
The type definition library CustomerInfo contains the complex data type definition Customer with the elements name, age, and phones.
In the Java transformation, add a struct output port and set the type configuration of the port to reference the complex data type definition that you created. The Java transformation generates a class Customer with getters and setters to read and set the member fields. The class contains the following member fields:
  • _name
  • _age
  • _phones
The following image shows the class created for the struct port in the Full Code tab of the Java view:
The Full Code tab of the Java view on the Java transformation Properties tab shows the outer class CustomerInfo and the inner class Customer with getters and setters for the member fields. The member fields _name, _age, and _phones are of a Java data type.
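The shape of the generated class described above can be approximated as follows. This is a sketch only, not the actual generated code: the outer class name CustomerInfo, the inner class name Customer, and the member fields _name, _age, and _phones come from this example, while the Java types (String, Integer, List<String>) and the accessor names are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Outer class named after the type definition library.
class CustomerInfo {

    // Inner class named after the complex data type definition.
    static class Customer {
        // Field names follow the guide; the Java types are assumed.
        private String _name;
        private Integer _age;
        private List<String> _phones;

        // Accessor names are hypothetical; check the Full Code tab for the real ones.
        public String getName() { return _name; }
        public void setName(String name) { _name = name; }

        public Integer getAge() { return _age; }
        public void setAge(Integer age) { _age = age; }

        public List<String> getPhones() { return _phones; }
        public void setPhones(List<String> phones) { _phones = phones; }
    }
}

class CustomerInfoDemo {
    public static void main(String[] args) {
        // Populate one struct value the way transformation code might.
        CustomerInfo.Customer cust = new CustomerInfo.Customer();
        cust.setName("Ada");
        cust.setAge(36);
        cust.setPhones(Arrays.asList("555-0100", "555-0101"));
        System.out.println(cust.getName() + " has " + cust.getPhones().size() + " phone numbers");
    }
}
```

The nesting is what gives the struct port its qualified Java type name, CustomerInfo.Customer.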
The Java data type for the struct port uses the name of the type definition library and complex data type definition. The following image shows the Java data type name CustomerInfo.Customer for the cust field in the generated code:
The Full Code tab of the Java view shows the code that the Java transformation generates. The code shows the cust field that is created for the struct port. The field is of the Java data type CustomerInfo.Customer.
In the Java view of the Java transformation, import any third-party, built-in, or custom Java packages that the transformation requires. Write and compile the Java code to convert the flat data into struct data and to remove the customer row if the customer name is null.
The following image shows the code in the On Input tab:
The On Input tab of the Java view shows the code that defines the transformation functionality.
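Because the On Input code appears only as an image, the following stand-alone sketch illustrates the per-row logic that this example describes: drop the row when the customer name is null, otherwise copy the flat fields into a struct. It is not the code from the On Input tab and it does not use the Java transformation API. The FlatRow class, its field names (including the two phone columns), and the convert and convertAll methods are hypothetical, and the nested Customer class is a simplified stand-in for the generated CustomerInfo.Customer class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class StructConversionSketch {

    // Simplified stand-in for the generated CustomerInfo.Customer class (types assumed).
    static class Customer {
        String name;
        Integer age;
        List<String> phones;
    }

    // Hypothetical flat input row: the real input port names are not given in the guide.
    static class FlatRow {
        final String name;
        final Integer age;
        final String phone1;
        final String phone2;

        FlatRow(String name, Integer age, String phone1, String phone2) {
            this.name = name;
            this.age = age;
            this.phone1 = phone1;
            this.phone2 = phone2;
        }
    }

    // Per-row logic: returns null to drop the row, otherwise the populated struct.
    static Customer convert(FlatRow row) {
        if (row.name == null) {
            return null; // a row with a null customer name is not written to the output
        }
        Customer cust = new Customer();
        cust.name = row.name;
        cust.age = row.age;
        List<String> phones = new ArrayList<>();
        if (row.phone1 != null) phones.add(row.phone1);
        if (row.phone2 != null) phones.add(row.phone2);
        cust.phones = phones;
        return cust;
    }

    // Applies the per-row logic to a batch and keeps only the rows that survive.
    static List<Customer> convertAll(List<FlatRow> rows) {
        List<Customer> out = new ArrayList<>();
        for (FlatRow row : rows) {
            Customer cust = convert(row);
            if (cust != null) {
                out.add(cust);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<FlatRow> rows = Arrays.asList(
            new FlatRow("Ada", 36, "555-0100", "555-0101"),
            new FlatRow(null, 41, "555-0102", null));
        List<Customer> out = convertAll(rows);
        System.out.println(out.size() + " of " + rows.size() + " rows kept");
    }
}
```

In the actual transformation, the code assigns the struct to the output port field cust rather than returning it. Because the transformation is active, it can emit fewer rows than it receives, which is what allows rows with a null name to be skipped.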
Validate the mapping and run the mapping on the Spark engine to write the transformed data to the Avro file output.
