Tuesday, October 18, 2016

Install and Setup Consul Service discovery

Consul Installation and Setup

Introduction:

We are going to assume that the Consul cluster has 3 servers and 3 MySQL clients that are part of a Percona XtraDB cluster. The MySQL nodes do not have to be part of a cluster, and if they are, it can be any database cluster, including MariaDB Galera Cluster or MySQL Cluster; it really does not matter.

Architecture

Consul servers:

consul-s1: 192.168.2.100
consul-s2: 192.168.2.101
consul-s3: 192.168.2.102

Consul clients:

db-pxc-1: 192.168.2.205
db-pxc-2: 192.168.2.206
db-pxc-3: 192.168.2.207

Step 1. Install unzip:
  $sudo apt-get update
  $sudo apt-get install unzip

Step 2. Install Consul on all cluster nodes
  $cd /usr/local/bin
  $sudo wget https://releases.hashicorp.com/consul/0.6.4/consul_0.6.4_linux_amd64.zip
  $sudo unzip consul_0.6.4_linux_amd64.zip
  $sudo rm consul_0.6.4_linux_amd64.zip
#Note - you can download the latest consul from https://www.consul.io/downloads

Step 3. Add consul user
  $sudo adduser consul
#Note - when prompted for a password, just use consul as the password

Step 4. Create some useful directories
  $sudo mkdir -p /etc/consul.d/scripts
  $sudo mkdir /var/consul
  $sudo chown consul:consul -R /var/consul

Step 5. Set up the config.json file

On the Servers, create a config file /etc/consul.d/config.json and add the following config:

{
    "bootstrap_expect": 3,
    "client_addr": "0.0.0.0",
    "datacenter": "chicago",
    "data_dir": "/var/consul",
    "domain": "consul",
    "dns_config": {
        "enable_truncate": true,
        "only_passing": true
    },
    "enable_syslog": true,
    "encrypt": "fMW9OOeILH2yaiTZ3UKrvQ==",
    "leave_on_terminate": true,
    "log_level": "INFO",
    "rejoin_after_leave": true,
    "server": true,
    "start_join": [
        "192.168.2.100",
        "192.168.2.101",
        "192.168.2.102"
    ],
    "ui_dir": "/usr/local/share/consul/ui"
}

#Note - for encrypt, you can get the value by running the command below
   $consul keygen

Use the same generated key as the encrypt value in this config file on all other consul nodes (both servers and clients). Also change the name of the datacenter and the IP addresses under "start_join" to match your environment.
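
As a quick sanity check before copying the key to every node, you can confirm that it decodes to the expected length. This is only a sketch: it assumes the 16-byte key that consul keygen produces in the 0.6.x releases used in this post (newer releases may use a different length), and key_length is a helper name made up for illustration.

```shell
#!/bin/bash

# Hypothetical helper: count the bytes a base64-encoded gossip key decodes to.
key_length() {
    printf '%s' "$1" | base64 -d | wc -c | tr -d ' '
}

# The example key from the config above; replace it with your own.
key="fMW9OOeILH2yaiTZ3UKrvQ=="

# Assumption: 16 bytes is the expected length for consul 0.6.x.
if [ "$(key_length "$key")" -eq 16 ]; then
    echo "key length ok"
else
    echo "unexpected key length"
fi
```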

On the clients, create a config file /etc/consul.d/config.json and add the following config. It is the same as the server config, with a few modifications:

{
    "client_addr": "0.0.0.0",
    "datacenter": "chicago",
    "data_dir": "/var/consul",
    "enable_syslog": true,
    "encrypt": "fMW9OOeILH2yaiTZ3UKrvQ==",
    "leave_on_terminate": true,
    "log_level": "INFO",
    "rejoin_after_leave": true,
    "server": false,
    "start_join": [
        "192.168.2.100",
        "192.168.2.101",
        "192.168.2.102"
    ]
}

Step 6. Download the Web UI.
You can perform this step on one of the consul servers, or on all three consul servers. Here, I will just perform it on one of the servers (for example consul-s1, 192.168.2.100).

Create the UI directory and download the UI using the following commands:

  $sudo mkdir -p /usr/local/share/consul/ui
  $cd /usr/local/share/consul/ui
  $sudo wget https://releases.hashicorp.com/consul/0.6.0/consul_0.6.0_web_ui.zip
  $sudo unzip consul_0.6.0_web_ui.zip
  $sudo rm consul_0.6.0_web_ui.zip
  $sudo chown consul:consul -R /usr/local/share/consul/

#Note - make sure to download the latest version of the web ui from the same download page where we downloaded consul

Step 7. Other config files for the Consul clients

On all client nodes (e.g., db-pxc-1, db-pxc-2, and db-pxc-3), create the following config files:

The first config file defines the mysql service, and it looks the same on all client nodes

$sudo vim /etc/consul.d/services.json

{
    "services": [
        {
            "name": "db-pxc",
            "tags": [
                "master"
            ],
            "port": 3306,
            "checks": [
                {
                    "script": "/etc/consul.d/scripts/mysql.pl",
                    "interval": "5s"
                }
            ]
        }
    ]
}
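
Once a node registers this service, it can be looked up through consul's DNS interface, where names follow the pattern [tag.]<service>.service.<domain>. The sketch below only builds that name. consul_dns_name is a hypothetical helper, and the default domain of consul is taken from the server config above.

```shell
#!/bin/bash

# Hypothetical helper: build the DNS name consul serves for a service.
# Usage: consul_dns_name <service> [tag] [domain]
consul_dns_name() {
    local service="$1"
    local tag="$2"
    local domain="${3:-consul}"
    if [ -n "$tag" ]; then
        echo "${tag}.${service}.service.${domain}"
    else
        echo "${service}.service.${domain}"
    fi
}

# Name for the db-pxc service filtered by the master tag
consul_dns_name db-pxc master
```

You could then query the agent's DNS port (8600 by default) with that name, for example: dig @127.0.0.1 -p 8600 master.db-pxc.service.consul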

The second file is a perl script that I wrote to perform health checks on mysqld

$sudo vim /etc/consul.d/scripts/mysql.pl

#!/usr/bin/perl

use 5.010;
use strict;
use warnings;

use Carp;
use DBI;
use IO::File;

my $now = localtime();
print "$now\n";

# Line 1 of the file is the username, line 2 is the password
my $fh = IO::File->new('/Proj/authentication.txt') or croak $!;
my $username = $fh->getline;
my $password = $fh->getline;
$fh->close;

chomp $password;
chomp $username;

# services.json calls this script without arguments, so default to localhost
my $ip = $ARGV[0] // 'localhost';

my $dbh = DBI->connect("dbi:mysql:database=;host=${ip}", $username, $password, {mysql_connect_timeout => 1});
if (!defined $dbh) {
    print "Cannot connect to mysql.";
    exit 2;    # any exit code other than 0 or 1 marks the check as critical
}

if ( $dbh->do("SELECT 1") ) {
    print "Mysql is running.";
}
else {
    # "SELECT 1" failed, so report the check as critical
    print "Select 1 Error.";
    $dbh->disconnect;
    exit 2;
}

$dbh->disconnect;

Make the /etc/consul.d/scripts/mysql.pl file executable by running the following command:

$sudo chmod +x /etc/consul.d/scripts/mysql.pl

Then create a file /Proj/authentication.txt to store the username and password. For example, if your mysql username is consul and your password is secret, create the file as follows:

  $sudo mkdir /Proj
  $sudo vim /Proj/authentication.txt

Once the file is open, add your mysql username on the first line and your password on the second line (the file must be readable by the consul user, because the agent runs the check as that user):

  consul
  secret


Step 8. On all consul nodes (both servers and clients), create config file /etc/init/consul.conf with the following config:

    description "Consul client process"

    start on (local-filesystems and net-device-up IFACE=eth0)
    stop on runlevel [!12345]

    respawn

    setuid consul
    setgid consul

    exec consul agent -config-dir /etc/consul.d/

Step 9. Start the consul nodes one by one, starting with the consul servers

    $sudo start consul

Step 10. Verify that all consul nodes are up and running

    $consul members

If everything went fine, this command should list all the server and client nodes as "alive"
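
If you would rather not eyeball the list, the sketch below counts the nodes that are not "alive". It assumes that the status is the third column of the consul members output (Node, Address, Status, ...), which is how the 0.6.x releases print it. count_not_alive is a made-up helper name and the sample lines are illustrative, not real output from this cluster.

```shell
#!/bin/bash

# Hypothetical helper: read `consul members` output on stdin and print
# how many nodes have a status other than "alive" (header line skipped).
count_not_alive() {
    awk 'NR > 1 && $3 != "alive" { n++ } END { print n + 0 }'
}

# Illustrative sample input; on a real node you would run:
#   consul members | count_not_alive
count_not_alive <<'EOF'
Node       Address             Status  Type    Build  Protocol  DC
consul-s1  192.168.2.100:8301  alive   server  0.6.4  2         chicago
db-pxc-1   192.168.2.205:8301  failed  client  0.6.4  2         chicago
EOF
```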

Redis health check scripts

If we wanted to add services and checks for a redis cluster with the architecture shown below,
we have to do all the steps we have demonstrated above and just update the config files
appropriately. Otherwise the file structure and the installation of software is the same:

Master Instances
redis-1-1 192.168.1.150
redis-2-1 192.168.1.151
redis-3-1 192.168.1.152

Slave Instances
redis-1-2 192.168.1.154    #slave of redis-1-1
redis-2-2 192.168.1.155    #slave of redis-2-1
redis-3-2 192.168.1.156    #slave of redis-3-1

On all the redis client nodes, create a JSON file /etc/consul.d/config.json with the following config:

{
    "advertise_addr": "192.168.1.150",
    "client_addr": "0.0.0.0",
    "datacenter": "chicago",
    "data_dir": "/var/consul",
    "enable_syslog": true,
    "encrypt": "fMW9OOeILH2yaiTZ3UKrvQ==",
    "leave_on_terminate": true,
    "log_level": "INFO",
    "rejoin_after_leave": true,
    "server": false,
    "start_join": [
        "192.168.2.100",
        "192.168.2.101",
        "192.168.2.102"
    ]
}

#Note - change the IP in advertise_addr to match the local IP of each redis instance. The encrypt value should be the same on all nodes that are part of this consul cluster. Again, the start_join IPs are those of the consul servers.

The /etc/consul.d/services.json file will look like this on all redis instances:

{
    "services": [
        {
            "name": "redis",
            "port": 6379,
            "checks": [
                {
                    "script": "/etc/consul.d/scripts/redis.sh",
                    "interval": "5s"
                }
            ],
            "enableTagOverride": true
        }
    ]
}

Create a new redis bash script /etc/consul.d/scripts/redis.sh with the following content:

#!/bin/bash

# Print the timestamp first so that $? below still holds nc's exit status
echo `date`
nc -z 127.0.0.1 6379
if [ "$?" -ne 0 ]; then
    echo "Cannot connect to redis."
    exit 2
else
    C=$(redis-cli -p 6379 -h 127.0.0.1 info | grep "cluster_enabled:1");
    if [[ "$C" != *"cluster_enabled:1"* ]]; then
        echo "Redis is not cluster enabled."
        exit 2
    else
        M=$(redis-cli -p 6379 -h 127.0.0.1 info | grep role);
        L=$(redis-cli -p 6379 -h 127.0.0.1 info | grep "loading_start_time");
        if [[ "$M" != *"role:master"* ]]; then
            curl -s --upload-file /etc/consul.d/scripts/slave.json http://127.0.0.1:8500/v1/agent/service/register
            echo "Redis is in slave mode."
            exit 0
        elif [[ -n "$L" ]]; then
            echo "Redis doesn't join into cluster yet."
            exit 1
        else
            echo "Redis is in master mode."
            curl -s --upload-file /etc/consul.d/scripts/master.json http://127.0.0.1:8500/v1/agent/service/register
            exit 0
        fi
    fi
fi
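
Consul decides the state of a script check from the script's exit code: 0 is passing, 1 is warning, and any other code is critical. The small sketch below just spells out that mapping so the exit calls in redis.sh are easier to read. check_status is a made-up name and is not part of consul.

```shell
#!/bin/bash

# Hypothetical helper: translate a check script's exit code into the
# state consul assigns to a script check.
check_status() {
    case "$1" in
        0) echo "passing" ;;
        1) echo "warning" ;;
        *) echo "critical" ;;
    esac
}

# The three exit codes used in redis.sh
for code in 0 1 2; do
    echo "exit $code -> $(check_status "$code")"
done
```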


Create a json file /etc/consul.d/scripts/master.json with the following content:

{
    "name": "redis",
    "tags": [
        "master"
    ],
    "port": 6379,
    "checks": [
        {
            "script": "/etc/consul.d/scripts/redis.sh",
            "interval": "5s"
        }
    ],
    "enableTagOverride": true
}

Note - on all the Redis master instances, make sure that tags has the value "master" as above.
For all the Redis slave instances, set the tags value to "slave" as shown in the config below. The master.json file above and the slave.json file below are used by redis.sh to keep consul aware of role changes between master and slave nodes when a failover takes place, so that consul answers DNS queries for the Redis service with master instances only. This is important if you want all your Redis slave instances to be read-only.

Create a json file /etc/consul.d/scripts/slave.json with the following content:

{
    "name": "redis-general",
    "tags": [
        "slave"
    ],
    "port": 6379,
    "checks": [
        {
            "script": "/etc/consul.d/scripts/redis.sh",
            "interval": "5s"
        }
    ],
    "enableTagOverride": true
}

Lastly, create the start script /etc/init/consul.conf and add the following config:

description "Consul client process"

start on (local-filesystems and net-device-up IFACE=eth0)
stop on runlevel [!12345]

respawn

setuid consul
setgid consul

exec consul agent -config-dir /etc/consul.d/

Then you can start the consul agent on the redis instances the same way we started it on the mysql instances:

   $sudo start consul
 
That is all there is to it.

I would recommend reading the consul documentation to get a better understanding of consul and how to make the most of it. I will try to create a video that demonstrates how consul works.


-----------------------------------------------

Using consul to check whether replication is working on a master/slave MySQL replication setup

To check whether replication is working, we look at the Slave_IO_Running and Slave_SQL_Running status values. If both are Yes, replication is fine and the check passes; otherwise, if either is No, the check should fail so that consul stops serving this node.
Below is the bash script I wrote to handle this:
#!/bin/bash

# Print the timestamp first so that $? below still holds nc's exit status
echo `date`
nc -z 127.0.0.1 3306
if [ "$?" -ne 0 ]; then
    echo "Cannot connect to mysql."
    exit 2
else
    IO_running=$(mysql -uusername -ppassword -e "show slave status\G" | grep "Slave_IO_Running: Yes");
    SQL_running=$(mysql -uusername -ppassword -e "show slave status\G" | grep "Slave_SQL_Running: Yes");
    if [[ "$IO_running" != *"Slave_IO_Running: Yes"* ]] || [[ "$SQL_running" != *"Slave_SQL_Running: Yes"* ]]; then
        echo "Slave IO is not running and/or Slave SQL is not running"
        exit 2
    else
        echo "Replication is ok"
        exit 0
    fi
fi
Then, in the consul services.json, call this script (saved here as /etc/consul.d/scripts/mysql_repl.sh) to monitor replication every 5 seconds, as seen in the config below:
{
    "services": [
        {
            "name": "db-gc-test",
            "tags": [
                "reader",
                "slave"
            ],
            "port": 3306,
            "checks": [
                {
                    "script": "/etc/consul.d/scripts/mysql.pl",
                    "interval": "5s"
                },
                {
                    "script": "/etc/consul.d/scripts/mysql_repl.sh",
                    "interval": "5s"
                }
            ]
        }
    ]
}

The reason for writing a separate script to check the replication status is to avoid complicating the critical script that checks the status of mysql. The goal is to keep consul easy to use and maintain. And if we decide that we do not need this feature, we can simply stop calling the script from services.json. Simple and clean.
*Note - For the record, the username and password should not be passed on the command line as in the code above. We will use the same technique of reading the username and password from a secure file, just like we do in the other consul service check scripts. The code above is for demonstration purposes only.
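
Here is a minimal sketch of how the replication test itself could be factored out so that mysql is called only once and the credentials stay out of the script. replication_ok is a made-up helper name, the sample status text is illustrative, and the option file path in the comment is hypothetical.

```shell
#!/bin/bash

# Hypothetical helper: read "show slave status\G" output on stdin and
# succeed only if both replication threads report Yes.
replication_ok() {
    local status
    status=$(cat)
    printf '%s\n' "$status" | grep -q "Slave_IO_Running: Yes" &&
        printf '%s\n' "$status" | grep -q "Slave_SQL_Running: Yes"
}

# On a real slave you would feed it from mysql, reading the credentials
# from an option file (hypothetical path) instead of the command line:
#   mysql --defaults-extra-file=/etc/consul.d/scripts/mysql.cnf \
#       -e "show slave status\G" | replication_ok

# Illustrative sample where the SQL thread has stopped
sample='Slave_IO_Running: Yes
Slave_SQL_Running: No'

if printf '%s\n' "$sample" | replication_ok; then
    echo "Replication is ok"
else
    echo "Slave IO is not running and/or Slave SQL is not running"
fi
```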

Screenshots of the Consul Web UI

