Spark Engine Monitoring

You can monitor statistics and view log events for a Spark engine mapping job in the Monitor tab of the Administrator tool. You can also monitor mapping jobs for the Spark engine in the YARN web user interface.
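Outside the web pages, the same YARN information can usually be read from the ResourceManager REST API. The following is a minimal sketch, not part of the Informatica tooling: it builds a Cluster Applications query for Spark applications and extracts a few fields from the JSON response. The host name `rm.example.com`, the port 8088, the application name, and the sample response are assumptions for illustration. A secured cluster (for example, one that uses Kerberos) needs authentication that is not shown here.

```python
import json
from urllib.parse import urlencode


def yarn_apps_url(rm_base_url, states=("RUNNING",), app_types=("SPARK",)):
    """Build a ResourceManager URL that lists applications by state and type."""
    query = urlencode({
        "states": ",".join(states),
        "applicationTypes": ",".join(app_types),
    })
    return f"{rm_base_url.rstrip('/')}/ws/v1/cluster/apps?{query}"


def summarize_apps(response_text):
    """Return id, name, state, final status, and progress for each application."""
    payload = json.loads(response_text)
    # The "apps" value is null when no application matches the query.
    apps = (payload.get("apps") or {}).get("app", [])
    fields = ("id", "name", "state", "finalStatus", "progress")
    return [{field: app.get(field) for field in fields} for app in apps]


# Hypothetical response body, trimmed to the fields used above.
SAMPLE_RESPONSE = json.dumps({
    "apps": {
        "app": [
            {
                "id": "application_1544700000000_0001",
                "name": "example_spark_mapping",  # assumed name, for illustration
                "state": "RUNNING",
                "finalStatus": "UNDEFINED",
                "progress": 42.0,
            }
        ]
    }
})

url = yarn_apps_url("http://rm.example.com:8088")  # assumed host and port
rows = summarize_apps(SAMPLE_RESPONSE)
for row in rows:
    print(row["id"], row["name"], row["state"], row["progress"])
```

To run this against a real cluster, fetch `url` with an HTTP client such as `urllib.request.urlopen` and pass the decoded body to `summarize_apps`.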
The following image shows the Monitor tab in the Administrator tool:
The Monitor tab is selected in the Administrator tool. The Execution Statistics view is selected, and the navigator shows Ad Hoc Jobs selected on the left. A list of jobs appears in the contents panel.
The Monitor tab has the following views:

Summary Statistics

Use the Summary Statistics view to see graphical summaries of object states and distribution across the Data Integration Services. You can also view graphs of the memory and CPU that the Data Integration Services used to run the objects.

Execution Statistics

Use the Execution Statistics view to monitor properties, run-time statistics, and run-time reports. In the Navigator, you can expand a Data Integration Service to monitor Ad Hoc Jobs, or expand an application to monitor deployed mapping jobs or workflows.
When you select Ad Hoc Jobs, deployed mapping jobs, or workflows from an application in the Navigator of the Execution Statistics view, a list of jobs appears in the contents panel. The contents panel displays jobs that are in the queued, running, completed, failed, aborted, and cancelled states. The Data Integration Service submits jobs in the queued state to the cluster when enough resources are available.
The contents panel groups related jobs based on the job type. You can expand a job type to view the related jobs under it.
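The grouping and the job states described above can be pictured with a small data model. This is an illustrative sketch only: the record layout, the job names, and the job type labels `Mapping` and `Workflow` are assumptions, not the Administrator tool's internal representation. Only the six state names come from this section.

```python
from collections import defaultdict

# Job states that the contents panel displays, as listed in this section.
JOB_STATES = ("queued", "running", "completed", "failed", "aborted", "cancelled")


def group_by_job_type(jobs):
    """Group job records under their job type, as the contents panel does."""
    groups = defaultdict(list)
    for job in jobs:
        if job["state"] not in JOB_STATES:
            raise ValueError(f"unknown job state: {job['state']!r}")
        groups[job["job_type"]].append(job)
    return dict(groups)


def state_counts(jobs):
    """Count jobs per state, including states with no jobs."""
    counts = {state: 0 for state in JOB_STATES}
    for job in jobs:
        counts[job["state"]] += 1
    return counts


# Hypothetical job records; names and job types are made up for illustration.
SAMPLE_JOBS = [
    {"name": "m_orders_spark", "job_type": "Mapping", "state": "running"},
    {"name": "m_customers_spark", "job_type": "Mapping", "state": "queued"},
    {"name": "wf_nightly_load", "job_type": "Workflow", "state": "completed"},
]

grouped = group_by_job_type(SAMPLE_JOBS)
counts = state_counts(SAMPLE_JOBS)
for job_type, members in grouped.items():
    print(job_type, [job["name"] for job in members])
```

In this model, a job in the `queued` state is one that the Data Integration Service has not yet submitted to the cluster because resources are not available.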
Access the following views in the Execution Statistics view:
Properties
The Properties view shows the general properties of the selected job, such as name, job type, user who started the job, and start time of the job.
Spark Execution Plan
When you view the Spark execution plan for a mapping, the Data Integration Service translates the mapping to a Scala program and an optional set of commands. The execution plan shows the commands and the Scala program code.
Summary Statistics
The Summary Statistics view appears in the details panel when you select a mapping job in the contents panel. The Summary Statistics view displays the following throughput statistics for the job:
  • Source. The name of the mapping source file.
  • Target name. The name of the target file.
  • Rows. The number of rows read for source and target.
The following image shows the Summary Statistics view in the details panel for a mapping run on the Spark engine:
The Monitor tab is selected in the Administrator tool. Summary Statistics is selected in the details panel below. This panel displays row counts for source and target rows.
Detailed Statistics
The Detailed Statistics view appears in the details panel when you select a mapping job in the contents panel. The Detailed Statistics view displays a graph of the row count for the job run.
The following image shows the Detailed Statistics view in the details panel for a mapping run on the Spark engine:
The Monitor tab is selected in the Administrator tool. Detailed Statistics is selected in the details panel below, which displays a graph showing the number of rows processed in thousands during a two-minute period.


Updated December 13, 2018