Troubleshooting Blaze Monitoring

When I run a mapping on the Blaze engine and try to view the grid task log, the Blaze Job Monitor does not fetch the full log.
The grid task log might be too large. The Blaze Job Monitor can only fetch up to 2 MB of an aggregated log. The first line of the log reports this information and provides the location of the full log on HDFS. Follow the link to HDFS and search for "aggregated logs for grid mapping." The link to the full log contains the grid task number.
The Blaze Job Monitor will not start.
Check the Hadoop environment logs to locate the issue. If you do not find an issue, stop the Grid Manager with the infacmd stopBlazeService command and run the mapping again.
The Monitoring URL does not appear in the Properties view of the Administrator tool.
Locate the URL in the YARN log.
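One way to search the YARN log for the URL is to filter the aggregated log output for anything that looks like an HTTP address. The following is a minimal sketch, not an Informatica utility. It assumes that the Monitoring URL appears in the log as a plain http or https address, and the helper name extract_urls is hypothetical.

```shell
# Hypothetical helper: reads log text on stdin and prints each distinct
# http or https URL found in it, one per line.
# Assumption: the Monitoring URL is written to the log as a plain URL.
extract_urls() {
  grep -Eo 'https?://[^[:space:]"<>]+' | sort -u
}

# Example usage (requires the yarn CLI; the application ID is a placeholder):
#   yarn logs -applicationId <application ID> | extract_urls
```

The output can include URLs that are unrelated to monitoring, so review the list rather than taking the first match.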
When Blaze processes stop unexpectedly, Blaze does not save logs in the expected location.
When Blaze stops unexpectedly, you can access Blaze service logs through the YARN monitor. Use one of these methods:
  • The Grid Manager log lists all Blaze job container IDs and identifies the host on which Blaze ran. Edit the Grid Manager log URL, substituting the container ID and the host name of the Blaze host.
  • Run the following command:
    yarn logs -applicationId <Blaze Grid Manager Application ID>
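The command-line method can be wrapped in a small script that saves the aggregated logs to a file for later review. The following is a sketch under stated assumptions: the helper names is_yarn_app_id and save_app_logs are hypothetical, the yarn CLI must be on the PATH, and log aggregation must be enabled on the cluster.

```shell
# Hypothetical helper: checks that an ID has the usual YARN application ID
# shape, application_<cluster timestamp>_<sequence number>.
is_yarn_app_id() {
  printf '%s\n' "$1" | grep -Eq '^application_[0-9]+_[0-9]+$'
}

# Hypothetical helper: writes the aggregated logs for one application to a file.
# The second argument is optional and defaults to <application ID>.log.
save_app_logs() {
  app_id=$1
  out_file=${2:-"$app_id.log"}
  if ! is_yarn_app_id "$app_id"; then
    echo "not a YARN application ID: $app_id" >&2
    return 1
  fi
  yarn logs -applicationId "$app_id" > "$out_file"
}

# Example usage (the application ID is a placeholder):
#   save_app_logs <Blaze Grid Manager Application ID> blaze_grid_manager.log
```

Validating the ID first avoids creating an empty output file when the argument is mistyped.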
A Blaze Job Monitor that has been running for several days loses its connection to the Application Timeline Server on the Hortonworks cluster.
The Blaze engine requires a running Application Timeline Server on the cluster. When the Blaze engine starts a mapping run, the Blaze Job Monitor checks the state of the Application Timeline Server. If the Application Timeline Server is not running, the Grid Manager starts it. When the connection to the Application Timeline Server is lost, the Blaze engine attempts to reconnect. If the Application Timeline Server stops during a Blaze mapping run, you can restart it by restarting the Grid Manager.
When the Application Timeline Server is configured to run on the cluster by default, the cluster administrator must manually restart it on the cluster.
When a mapping takes more than 24 hours to execute, the mapping fails.
When mappings run on the Blaze engine for more than 24 hours, some mappings might fail because the Orchestrator service has a default sunset time of 24 hours. After 24 hours, the Orchestrator shuts down, which causes the Blaze Grid Manager to shut down.
To increase the sunset time beyond 24 hours, configure the following property in the advanced properties of the Hadoop connection:
infagrid.orchestrator.svc.sunset.time=[HOURS]
You can also disable sunset by setting the property to 0 or a negative value. If you disable sunset, the Blaze Grid Manager never shuts down.
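As an illustration, the property line for a chosen number of hours can be generated with a short script and then pasted into the advanced properties. This is only a sketch: the helper name sunset_property is hypothetical, and the integer check reflects an assumption that the property expects a whole number of hours.

```shell
# Hypothetical helper: prints the advanced-property line for a sunset time.
# Assumption: the value is a whole number of hours. As described above,
# 0 or a negative value disables sunset.
sunset_property() {
  hours=$1
  if ! printf '%s\n' "$hours" | grep -Eq '^-?[0-9]+$'; then
    echo "hours must be an integer: $hours" >&2
    return 1
  fi
  printf 'infagrid.orchestrator.svc.sunset.time=%s\n' "$hours"
}

# Example usage:
#   sunset_property 48   # prints infagrid.orchestrator.svc.sunset.time=48
#   sunset_property 0    # prints infagrid.orchestrator.svc.sunset.time=0
```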


Updated January 20, 2020