When I run a mapping on the Blaze engine and try to view the grid task log, the Blaze Job Monitor does not fetch the full log.
The grid task log might be too large. The Blaze Job Monitor can only fetch up to 2 MB of an aggregated log. The first line of the log reports this information and provides the location of the full log on HDFS. Follow the link to HDFS and search for "aggregated logs for grid mapping." The link to the full log contains the grid task number.
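If you prefer the command line, you can also read the full log directly from HDFS. This is only a sketch; substitute the HDFS location that the first line of the truncated log reports for your grid task:
hdfs dfs -ls <HDFS location from the first line of the log>
hdfs dfs -cat <HDFS location from the first line of the log>/* | grep -n "aggregated logs for grid mapping"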
The Blaze Job Monitor will not start.
Check the Hadoop environment logs to locate the issue. If you do not find an issue, stop the Grid Manager with the infacmd stopBlazeService command and run the mapping again.
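For example, a minimal sketch of the stop command, assuming stopBlazeService runs through the dis plugin with the standard infacmd connection options. The names are placeholders, and your version might require additional options such as the Hadoop connection name:
infacmd.sh dis stopBlazeService -dn <domain name> -sn <Data Integration Service name> -un <user name> -pd <password>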
The Monitoring URL does not appear in the Properties view of the Administrator tool.
Locate the URL in the YARN log.
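For example, if log aggregation is enabled, you can pull the application log with the yarn logs command and search for the URL. The search string is an assumption; adjust it to match your log output:
yarn logs -applicationId <application ID> | grep -i "monitoring url"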
When Blaze processes stop unexpectedly, Blaze does not save logs in the expected location.
When Blaze stops unexpectedly, you can access the Blaze service logs through the YARN monitor. The Grid Manager log contains the container IDs of all Blaze jobs and identifies the host on which each Blaze process ran. To view the log for a particular process, alter the Grid Manager log URL with the container ID and host name of the Blaze host.
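As an alternative to editing the URL, you can fetch the log for a specific container with the yarn logs command, using the application and container IDs from the Grid Manager log. Older Hadoop versions might also require the -nodeAddress option:
yarn logs -applicationId <application ID> -containerId <container ID>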
A Blaze Job Monitor that has been running for several days loses its connection to the Application Timeline Server on the Hortonworks cluster.
The Blaze engine requires a running Application Timeline Server on the cluster. When the Blaze engine starts a mapping run, the Blaze Job Monitor checks the state of the Application Timeline Server, and the Grid Manager starts it if it is not running. If the connection to the Application Timeline Server is lost, the Blaze engine attempts to reconnect. If the Application Timeline Server stops during a Blaze mapping run, you can restart it by restarting the Grid Manager.
If the Application Timeline Server is configured to run on the cluster by default, however, restarting the Grid Manager does not restart it; the cluster administrator must restart it manually on the cluster.
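On a Hortonworks cluster the Application Timeline Server is typically managed through Ambari, so restarting it from the Ambari YARN service page is the usual approach. As a command-line sketch for a Hadoop 2.x layout, the administrator can also start the daemon directly on the timeline server host:
yarn-daemon.sh start timelineserver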
When a mapping takes more than 24 hours to execute, the mapping fails.
Mappings that run on the Blaze engine for more than 24 hours might fail because the Orchestrator service has a default sunset time of 24 hours. After 24 hours, the Orchestrator shuts down, which causes the Blaze Grid Manager to shut down.
To increase the sunset time beyond 24 hours, configure the following property in the Hadoop connection advanced properties:
infagrid.orchestrator.svc.sunset.time=[HOURS]
You can also disable sunset by setting the property to 0 or a negative value. If you disable sunset, the Blaze Grid Manager never shuts down.
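For example, to allow mappings to run for up to 48 hours, set the following value. The number of hours is illustrative:
infagrid.orchestrator.svc.sunset.time=48
To disable sunset entirely:
infagrid.orchestrator.svc.sunset.time=0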