Effective in version 10.2.1, changes in Spark monitoring relate to the following areas:
Event changes
Updates in the Summary Statistics view
Event Changes
Effective in version 10.2.1, only monitoring information is checked in the Spark events in the session log.
Previously, all the Spark events were relayed as is from the Spark application to the Spark executor. When relaying the events took a long time, performance issues occurred.
For more information, see the Informatica Big Data Management 10.2.1 User Guide.
Summary Statistics View
Effective in version 10.2.1, you can view the statistics for Spark execution based on the run stages. For instance, Spark Run Stages shows the statistics of the Spark application run stages. Stage_0 shows the statistics for the run stage with ID=0 in the Spark application. Rows and Average Rows/Sec show the number of rows written out of the stage and the corresponding throughput. Bytes and Average Bytes/Sec show the number of bytes broadcast in the stage and the corresponding throughput.
Previously, you could view only the Source and Target rows and the average number of rows processed per second for the Spark run.
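As a rough illustration of how the per-stage totals and throughput columns might relate, the following Python sketch assumes that each average is simply the total divided by the elapsed time of the stage. The guide does not state the formula, so this is an assumption. The StageStats class, its field names, and the sample numbers are hypothetical and are not part of any Informatica or Spark API.

```python
from dataclasses import dataclass


@dataclass
class StageStats:
    """Hypothetical record for one Spark run stage (illustrative names only)."""
    stage_id: int
    rows: int            # rows written out of the stage
    num_bytes: int       # bytes broadcast in the stage
    elapsed_sec: float   # assumed wall-clock duration of the stage

    @property
    def label(self) -> str:
        # Mirrors the Stage_<ID> naming shown in the Summary Statistics view.
        return f"Stage_{self.stage_id}"

    @property
    def avg_rows_per_sec(self) -> float:
        # Assumption: throughput = total / elapsed time; guard against zero.
        return self.rows / self.elapsed_sec if self.elapsed_sec > 0 else 0.0

    @property
    def avg_bytes_per_sec(self) -> float:
        return self.num_bytes / self.elapsed_sec if self.elapsed_sec > 0 else 0.0


# Made-up sample values for the run stage with ID=0.
stage = StageStats(stage_id=0, rows=120_000, num_bytes=4_800_000, elapsed_sec=60.0)
print(stage.label, stage.avg_rows_per_sec, stage.avg_bytes_per_sec)
```

Under these assumptions, a stage that writes 120,000 rows in 60 seconds would report 2,000 rows per second. The actual values in the view come from the monitoring information in the Spark events.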