Setup Elasticsearch Cluster on CentOS / Ubuntu With Ansible

Elasticsearch is a powerful open-source, RESTful, distributed real-time search and analytics engine that provides full-text search capabilities. Elasticsearch is built on Apache Lucene and is freely available under the Apache 2 license. In this article, we will install an Elasticsearch cluster on CentOS 8/7 & Ubuntu 20.04/18.04 using the Ansible automation tool.

This tutorial will help Linux users install and configure a highly available multi-node Elasticsearch cluster on CentOS 8 / CentOS 7 & Ubuntu 20.04/18.04 Linux systems. Some key uses of Elasticsearch are log analytics, search engines, full-text search, business analytics, and security intelligence, among many others.

In this setup, we will install an Elasticsearch 7.x cluster using an Ansible role. The role we are using comes from the official Elastic project, and it gives you plenty of configuration flexibility.

Elasticsearch Node Types

There are two common types of Elasticsearch nodes:

  • Master nodes: Responsible for cluster-wide operations, such as managing indices and allocating data shards to data nodes.
  • Data nodes: They hold the actual shards of indexed data and handle all CRUD, search, and aggregation operations. They consume more CPU, memory, and I/O.

Setup Requirements

Before you begin, you'll need at least three CentOS 8/7 or Ubuntu 20.04/18.04 servers installed and updated. A user with sudo privileges or root access is required for the actions to be done. My setup is based on the following node structure.

Server Name     Specs               Server role
elk-master01    16gb ram, 8vcpus    Master
elk-master02    16gb ram, 8vcpus    Master
elk-master03    16gb ram, 8vcpus    Master
elk-data01      32gb ram, 16vcpus   Data
elk-data02      32gb ram, 16vcpus   Data
elk-data03      32gb ram, 16vcpus   Data


  • For small environments, you can use the same node for both data and master roles.
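For such a combined node, the relevant role flags would simply both be enabled. A hypothetical es_config fragment (variable names as used by the elastic.elasticsearch role later in this guide):

```yaml
# Hypothetical fragment for a combined master/data node
es_config:
  node.master: true
  node.data: true
```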

Storage Considerations

For data nodes, it is recommended to configure storage properly with scalability in mind. In my lab, each data node has a 500GB disk mounted under /data. This was configured with the commands below.

WARNING: Don't copy and run these commands blindly; they are just a reference point, and device names will differ on your systems.

sudo parted -s -a optimal -- /dev/sdb mklabel gpt
sudo parted -s -a optimal -- /dev/sdb mkpart primary 0% 100%
sudo parted -s -- /dev/sdb align-check optimal 1
sudo pvcreate /dev/sdb1
sudo vgcreate vg0 /dev/sdb1
sudo lvcreate -n lv01 -l+100%FREE vg0
sudo mkfs.xfs /dev/mapper/vg0-lv01
echo "/dev/mapper/vg0-lv01 /data xfs defaults 0 0" | sudo tee -a /etc/fstab
sudo mount -a
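Before relying on that mount across reboots, it is worth sanity-checking the fstab record. The sketch below is reference only: it takes the entry from above and just verifies it has the six whitespace-separated fields mount expects, so it is safe to run anywhere.

```shell
# Reference-only sanity check: an fstab record needs exactly 6 fields
# (device, mountpoint, fstype, options, dump, pass).
entry="/dev/mapper/vg0-lv01 /data xfs defaults 0 0"
set -- $entry
[ "$#" -eq 6 ] && [ "$2" = "/data" ] && echo "fstab entry looks OK"
```

On a live system, `findmnt /data` after `sudo mount -a` confirms the filesystem is actually mounted.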

Step 1: Install Ansible on Workstation

We will be using Ansible to set up the Elasticsearch cluster on CentOS 8/7 & Ubuntu. Ensure Ansible is installed on your workstation for ease of administration.

On Fedora:

sudo dnf install ansible

On CentOS:

sudo yum -y install epel-release
sudo yum install ansible

RHEL 7 / RHEL 8:

### RHEL 8 ###
sudo subscription-manager repos --enable ansible-2.9-for-rhel-8-x86_64-rpms
sudo yum install ansible

### RHEL 7 ###
sudo subscription-manager repos --enable rhel-7-server-ansible-2.9-rpms
sudo yum install ansible


On Ubuntu:

sudo apt update
sudo apt install software-properties-common
sudo apt-add-repository --yes --update ppa:ansible/ansible
sudo apt install ansible

For any other distribution, refer to official Ansible installation guide.

Confirm installation of Ansible in your machine by querying the version.

$ ansible --version
ansible 2.9.6
  config file = /etc/ansible/ansible.cfg
  configured module search path = ['/var/home/jkmutai/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python3.7/site-packages/ansible
  executable location = /usr/bin/ansible
  python version = 3.7.6 (default, Jan 30 2020, 09:44:41) [GCC 9.2.1 20190827 (Red Hat 9.2.1-1)]

Step 2: Import the Elasticsearch Ansible Role

After installing Ansible, you can import the Elasticsearch Ansible role to your local system from Ansible Galaxy.

$ ansible-galaxy install elastic.elasticsearch,v7.13.3
Starting galaxy role install process
- downloading role 'elasticsearch', owned by elastic
- downloading role from
- extracting elastic.elasticsearch to /Users/jkmutai/.ansible/roles/elastic.elasticsearch
- elastic.elasticsearch (v7.13.3) was installed successfully

Where v7.13.3 is the release version of the Elasticsearch role to download. You can check the releases page for a match with the Elasticsearch version you want to install.

The role will be added to the ~/.ansible/roles directory.

$ ls -lh ~/.ansible/roles
total 4.0K
drwx------. 15 jkmutai jkmutai 4.0K May  1 16:28 elastic.elasticsearch
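If you prefer pinning the role per project rather than installing it globally, the same version can be declared in a requirements file (hypothetical filename requirements.yml) and installed with ansible-galaxy install -r requirements.yml:

```yaml
# requirements.yml — pin the Elasticsearch role for reproducible installs
- name: elastic.elasticsearch
  version: v7.13.3
```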

Configure your ssh with Elasticsearch cluster hosts.

$ vim ~/.ssh/config

This is how my additional configuration looks – update it to fit your environment.

# Elasticsearch master nodes
Host elk-master01
  User root
Host elk-master02
  User root
Host elk-master03
  User root

# Elasticsearch worker nodes
Host elk-data01
  User root
Host elk-data02
  User root
Host elk-data03
  User root

Ensure you’ve copied ssh keys to all machines.

### Master nodes ###
for host in elk-master0{1..3}; do ssh-copy-id $host; done

### Data nodes ###
for host in elk-data0{1..3}; do ssh-copy-id $host; done
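If the loops look unfamiliar, Bash brace expansion is doing the work of generating the hostnames; a quick sketch, with echo standing in for ssh-copy-id so it is safe to run anywhere:

```shell
# Brace expansion generates elk-master01..elk-master03 and elk-data01..elk-data03;
# echo stands in for ssh-copy-id in this sketch.
for host in elk-master0{1..3} elk-data0{1..3}; do
  echo "$host"
done
```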

Confirm you can ssh without password authentication.

$ ssh elk-master01
Warning: Permanently added '' (ECDSA) to the list of known hosts.
[[email protected] ~]# 

If your private ssh key has a passphrase, add the key to ssh-agent to avoid being prompted for it on each machine.

$ eval `ssh-agent -s` && ssh-add
Enter passphrase for /var/home/jkmutai/.ssh/id_rsa: 
Identity added: /var/home/jkmutai/.ssh/id_rsa (/var/home/jkmutai/.ssh/id_rsa)

Step 3: Create Elasticsearch Playbook & Run it

Now that all the prerequisites are configured, let's create a playbook file for the deployment.

$ vim elk.yml

Mine has the contents below.

- hosts: elk-master-nodes
  roles:
    - role: elastic.elasticsearch
  vars:
    es_enable_xpack: false
    es_data_dirs:
      - "/data/elasticsearch/data"
    es_log_dir: "/data/elasticsearch/logs"
    es_java_install: true
    es_heap_size: "1g"
    es_config:
      cluster.name: "elk-cluster"
      cluster.initial_master_nodes: "elk-master01,elk-master02,elk-master03"
      discovery.seed_hosts: "elk-master01,elk-master02,elk-master03"
      http.port: 9200
      node.data: false
      node.master: true
      bootstrap.memory_lock: false
      network.host: '0.0.0.0'
    es_plugins:
      - plugin: ingest-attachment

- hosts: elk-data-nodes
  roles:
    - role: elastic.elasticsearch
  vars:
    es_enable_xpack: false
    es_data_dirs:
      - "/data/elasticsearch/data"
    es_log_dir: "/data/elasticsearch/logs"
    es_java_install: true
    es_config:
      cluster.name: "elk-cluster"
      cluster.initial_master_nodes: "elk-master01,elk-master02,elk-master03"
      discovery.seed_hosts: "elk-master01,elk-master02,elk-master03"
      http.port: 9200
      node.data: true
      node.master: false
      bootstrap.memory_lock: false
      network.host: '0.0.0.0'
    es_plugins:
      - plugin: ingest-attachment

Key notes:

  • Master nodes have node.master set to true and node.data set to false.
  • Data nodes have node.data set to true and node.master set to false.
  • The es_enable_xpack variable is set to false to install the open-source edition of Elasticsearch.
  • cluster.initial_master_nodes and discovery.seed_hosts point to the master nodes.
  • /data/elasticsearch/data is where the Elasticsearch data shards will be stored – recommended to be on a separate partition from the OS installation for performance and scalability reasons.
  • /data/elasticsearch/logs is where Elasticsearch logs will be stored.
  • The directories will be created automatically by the Ansible tasks. You only need to ensure /data is a mount point of the desired data store for Elasticsearch.

For more customization options, check the project's GitHub documentation.

Create inventory file

Create a new inventory file.

$ vim hosts
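The inventory must define the two host groups referenced in the playbook. A minimal version, assuming the hostnames used throughout this guide, would be:

```ini
[elk-master-nodes]
elk-master01
elk-master02
elk-master03

[elk-data-nodes]
elk-data01
elk-data02
elk-data03
```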


When all is set run the Playbook.

$ ansible-playbook -i hosts elk.yml

The execution should start. Just be patient as this could take some minutes.

PLAY [elk-master-nodes] ********************************************************************************************************************************

TASK [Gathering Facts] *********************************************************************************************************************************
ok: [elk-master02]
ok: [elk-master01]
ok: [elk-master03]

TASK [elastic.elasticsearch : set_fact] ****************************************************************************************************************
ok: [elk-master02]
ok: [elk-master01]
ok: [elk-master03]

TASK [elastic.elasticsearch : os-specific vars] ********************************************************************************************************
ok: [elk-master01]
ok: [elk-master02]
ok: [elk-master03]

A successful ansible execution will have output similar to below.

PLAY RECAP *********************************************************************************************************************************************
elk-data01                 : ok=38   changed=10   unreachable=0    failed=0    skipped=119  rescued=0    ignored=0   
elk-data02                 : ok=38   changed=10   unreachable=0    failed=0    skipped=118  rescued=0    ignored=0   
elk-data03                 : ok=38   changed=10   unreachable=0    failed=0    skipped=118  rescued=0    ignored=0   
elk-master01               : ok=38   changed=10   unreachable=0    failed=0    skipped=119  rescued=0    ignored=0   
elk-master02               : ok=38   changed=10   unreachable=0    failed=0    skipped=118  rescued=0    ignored=0   
elk-master03               : ok=38   changed=10   unreachable=0    failed=0    skipped=118  rescued=0    ignored=0   

See below screenshot.


Step 4: Confirm Elasticsearch Cluster installation on Ubuntu / CentOS

Login to one of the master nodes.

$ ssh elk-master01

Check cluster health status.

$ curl http://localhost:9200/_cluster/health?pretty

{
  "cluster_name" : "elk-cluster",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 6,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 0,
  "active_shards" : 0,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}
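For scripted health checks, the status field can be pulled out of that JSON. A minimal sed-based sketch, parsing a sample line that mirrors the output above rather than querying a live cluster:

```shell
# Extract the "status" value from a pretty-printed health response line.
# The sample line mirrors the curl output shown above; no cluster is queried.
line='  "status" : "green",'
status=$(printf '%s\n' "$line" | sed -n 's/.*"status" : "\([a-z]*\)".*/\1/p')
echo "cluster status: $status"
```

On a live node you would feed the sed expression from `curl -s http://localhost:9200/_cluster/health?pretty` instead of the sample variable.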

Check master nodes.

$ curl -XGET 'http://localhost:9200/_cat/master'
G9X__pPXScqACWO6YzGx3Q elk-master01

View Data nodes:

$ curl -XGET 'http://localhost:9200/_cat/nodes'
 7 47 1 0.02 0.03 0.02 di - elk-data03
10 34 1 0.00 0.02 0.02 im * elk-master01
13 33 1 0.00 0.01 0.02 im - elk-master03
14 33 1 0.00 0.01 0.02 im - elk-master02
 7 47 1 0.00 0.03 0.03 di - elk-data02
 6 47 1 0.00 0.02 0.02 di - elk-data01

As confirmed, you now have a clean Elasticsearch cluster running on CentOS 8/7 & Ubuntu 20.04/18.04 Linux systems.
