The nodes in a high availability pair share the same node name. The HDFS target service instance that runs on one node becomes active and acquires a lease to write to the HDFS file. If the active service fails, the standby instance becomes active. The newly active service has to wait several minutes for the HDFS NameNode to close the file and recover the lease before it can write to the target file. Data that arrives during this wait can be lost.
To enable the instance on the secondary node to write data to the HDFS file system, use the
variable in the
component of the HDFS URI.
The following example shows a URI that uses the
is the variable that EDS replaces with the system time in milliseconds.
The name of the file varies with system time. The primary service creates a file whose name is the current system time in milliseconds, and starts writing to the file. If the active service fails, the standby service does not have to wait for the HDFS NameNode to revoke the lease on that file. The newly active service creates its own file, named with the current system time at that moment, and starts writing to it.
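
The file-naming behavior described above can be illustrated with a minimal sketch. This is not EDS code. The placeholder token `<TS>`, the host, the port, and the path are all hypothetical stand-ins, and the function only mimics the idea of replacing a variable in the URI with the system time in milliseconds.

```python
import time

# Hypothetical placeholder token; it stands in for the EDS variable,
# whose actual name is not assumed here.
PLACEHOLDER = "<TS>"


def resolve_uri(template, now_ms=None):
    """Replace the placeholder with the system time in milliseconds."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return template.replace(PLACEHOLDER, str(now_ms))


# Hypothetical NameNode host, port, and file path.
template = "hdfs://namenode.example.com:8020/data/events-<TS>.log"

# The primary service resolves the URI when it becomes active.
primary_file = resolve_uri(template, now_ms=1700000000000)

# After a failover, the standby resolves the same template at a later time,
# so it gets a different file and never needs the primary's lease.
standby_file = resolve_uri(template, now_ms=1700000004500)

print(primary_file)
print(standby_file)
```

Because each service writes to a file that it created itself, the lease on the failed service's file never blocks the new writer. The trade-off is that the output is spread across several files, one for each period in which a service was active.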