Before you install and set up Elasticsearch clusters, prepare the environment and determine whether you want to configure high availability.
Tasks for All Environments
Perform the following tasks to prepare the installation environment:
Ensure that each machine satisfies the hardware requirements for the supported version of Elasticsearch. For information about the hardware requirements, see the Elasticsearch documentation.
Ensure that each machine satisfies the software requirements for the supported version of Elasticsearch, such as supported operating systems and Java version. For information about the software requirements, see the Elasticsearch Support Matrix.
Complete the important system configuration, such as the settings for swapping, file descriptors, and virtual memory. For information about these settings, see the Elasticsearch documentation.
Tasks for UNIX Environments
In a UNIX environment, perform the following tasks:
To avoid data loss caused by an insufficient number of file descriptors, set the number of file descriptors to 65536 or higher.
Configure the system to prevent memory swapping. For example, you can configure the Java Virtual Machine (JVM) to lock the heap in memory through mlockall.
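The two UNIX tasks above can be verified before installation with a short script. The following is a minimal sketch, not part of the product: it assumes Python 3 is available on the machine and that you run it as the user account that will run Elasticsearch. The function names are illustrative. It reads the soft limits of the current process through the standard resource module and compares the file descriptor limit with the 65536 minimum. Treating an unlimited locked-memory limit as the target is an assumption made for simplicity. A limit smaller than the JVM heap would stop mlockall from locking the whole heap.

```python
import resource

REQUIRED_FILE_DESCRIPTORS = 65536  # minimum stated in the task above


def limit_is_sufficient(limit, required):
    """Return True if a resource limit is unlimited or at least `required`."""
    return limit == resource.RLIM_INFINITY or limit >= required


def check_limits():
    """Read the soft limits of the current process and evaluate them."""
    nofile_soft, _ = resource.getrlimit(resource.RLIMIT_NOFILE)
    memlock_soft, _ = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    return {
        "file_descriptors_ok": limit_is_sufficient(
            nofile_soft, REQUIRED_FILE_DESCRIPTORS
        ),
        # Assumption: "unlimited" is the simplest value to check for. A limit
        # smaller than the JVM heap would keep the heap from being locked.
        "memlock_unlimited": memlock_soft == resource.RLIM_INFINITY,
    }


if __name__ == "__main__":
    for name, ok in check_limits().items():
        print(f"{name}: {'OK' if ok else 'needs attention'}")
```

Limits are applied per user and per session, so a result obtained in an interactive shell might differ from the limits that a service manager applies to the Elasticsearch process.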
High Availability Requirements
If you have a large amount of data to index and search, the best practice is to implement a highly available Elasticsearch cluster. A highly available cluster has multiple nodes and distributes the workload among them. If one node fails in a production environment, the cluster redistributes the workload to the remaining nodes.
As a pre-installation task, decide whether you want to implement a highly available Elasticsearch cluster. If so, configure the Elasticsearch cluster as usual, but ensure that you satisfy the following additional requirements:
The Elasticsearch cluster has three or more nodes.
You can set up a small cluster to start and scale it as necessary. Analyze the workload and make sure that you have enough capacity to handle a node failure.
Each node is configured on a separate, dedicated machine.
At least three of the nodes are master nodes to ensure stability and performance. Note that Elasticsearch recommends an odd number of master nodes.
If the cluster has only three nodes, configure all the nodes as master nodes.
If the cluster has more than three nodes, configure three nodes as master nodes and configure the rest of the nodes as data nodes.
Based on the Elasticsearch cluster size, decide on the number of replicas. When you use the Provisioning tool to configure the Elasticsearch index, you can specify the number of replicas to use.
For each node, set the following additional properties in the elasticsearch.yml configuration file:
discovery.zen.minimum_master_nodes
discovery.zen.ping.unicast.hosts
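The node-role rules and the two discovery properties above can be worked through in a small planning script. The following is a sketch under stated assumptions. The quorum formula (number of master nodes / 2) + 1 is the value that the Elasticsearch documentation commonly recommends for discovery.zen.minimum_master_nodes. Listing only the master nodes in discovery.zen.ping.unicast.hosts is an assumption. The host names and function names are hypothetical. Verify the property values against the Elasticsearch documentation for your version before you use them.

```python
def minimum_master_nodes(master_count):
    """Quorum of master nodes: floor(master_count / 2) + 1."""
    if master_count < 1:
        raise ValueError("at least one master node is required")
    return master_count // 2 + 1


def plan_roles(hosts, master_count=3):
    """Use the first `master_count` hosts as master nodes, the rest as data nodes."""
    if len(hosts) < 3:
        raise ValueError("a highly available cluster needs three or more nodes")
    # In a three-node cluster every node is a master node, so data_only is empty.
    return {"master": hosts[:master_count], "data_only": hosts[master_count:]}


def discovery_settings(hosts, master_count=3):
    """Render the two discovery properties as lines for elasticsearch.yml."""
    masters = plan_roles(hosts, master_count)["master"]
    quorum = minimum_master_nodes(len(masters))
    # Assumption: only the master nodes are listed as unicast hosts.
    host_list = ", ".join(f'"{host}"' for host in masters)
    return (
        f"discovery.zen.minimum_master_nodes: {quorum}\n"
        f"discovery.zen.ping.unicast.hosts: [{host_list}]"
    )


if __name__ == "__main__":
    # Hypothetical host names for a five-node cluster.
    nodes = [f"es-node{i}.example.com" for i in range(1, 6)]
    print(discovery_settings(nodes))
```

With three master nodes the quorum is 2, so the cluster can still elect a master after one master node fails, which is the reason for requiring at least three.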
For more information about highly available clusters, including hardware requirements, system configurations, and property values, see the Elasticsearch documentation.