MySQL Failover/Replication Topology Management With Orchestrator

A post from dbpandit

What is Orchestrator?

Orchestrator is an open-source project authored by Mr. Shlomi Noach, available on GitHub: https://github.com/outbrain/orchestrator.

Orchestrator's major functionalities include MySQL/MariaDB master/intermediate-master failover within seconds (10-20 secs, depending on your requirements) and easy management of the replication topology (changing the replication architecture via drag-and-drop or the Orchestrator CLI).

Orchestrator allows:

  • Detection and investigation of replication clusters
  • Safe topology refactoring: moving slaves around the topology
  • Sleek topology visualization
  • Replication problems visualization
  • Topology changes via intuitive drag & drop
  • Maintenance mode declaration and enforcement
  • Auditing of operations & more.

If you'd like to learn more and have 20 minutes to spare, I would recommend the following video: https://archive.fosdem.org/2016/schedule/event/orchestrator/

Overall, Orchestrator and MHA are pretty much on par in terms of their failover mechanisms; however, Orchestrator offers a lot of other features, including a fantastic GUI that provides better control and a centralized view of your MySQL infrastructure.

Hmm, maybe I've written enough about Orchestrator already? Let me show you what its GUI looks like:

What can I control for my DB instances from Orchestrator's GUI? Check this out:

Implementation of Orchestrator

Here, I would suggest running multiple instances of the Orchestrator application, all communicating with Orchestrator's own MySQL backend (master-slave) for data sourcing. Orchestrator supports running multiple application instances for better HA; only one application instance acts as the master at a time.

Download the deb packages : https://github.com/outbrain/orchestrator/releases

After downloading the packages, install them with dpkg :

dpkg -i orchestrator_1.5.7_amd64.deb  
dpkg -i orchestrator-cli_1.5.7_amd64.deb  

Important Step : Configuration file : orchestrator.conf.json
Here is the list of variables which you might consider tuning :

Variable Name : MySQLTopologyUser  
Suggested Value : Orchestrator's user for connecting to the client (topology) MySQL hosts.

Variable Name : MySQLTopologyPassword  
Suggested Value : Orchestrator's password for connecting to the client (topology) MySQL hosts.

Variable Name : MySQLOrchestratorHost  
Suggested Value : Orchestrator's own backend master DB host.

Variable Name : MySQLOrchestratorUser  
Suggested Value : Orchestrator's own MySQL DB user.

Variable Name : MySQLOrchestratorPassword  
Suggested Value : Orchestrator's own backend master DB password.

Variable Name : AuthenticationMethod  
Suggested Value : multi  
Why? : For additional security of Orchestrator's Admin Panel

Variable Name : HTTPAuthUser  
Suggested Value : User to protect Orchestrator's GUI and API access.

Variable Name : HTTPAuthPassword  
Suggested Value : Password to protect Orchestrator's GUI and API access.
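To make this concrete, here is a minimal sketch of how this first group of settings might sit in orchestrator.conf.json. The backend hostname and all user names/passwords below are placeholders, not values from a real deployment (the topology credentials simply match the GRANT statement shown later in this post); remember that the MySQLTopology* settings are for the monitored clusters while the MySQLOrchestrator* settings are for Orchestrator's own backend database:

"MySQLTopologyUser": "orchestrator",
"MySQLTopologyPassword": "orch_topology_password",
"MySQLOrchestratorHost": "orch-backend-master.example.com",
"MySQLOrchestratorUser": "orch_backend_user",
"MySQLOrchestratorPassword": "orch_backend_password",
"AuthenticationMethod": "multi",
"HTTPAuthUser": "orch_admin",
"HTTPAuthPassword": "orch_admin_password",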

Variable Name : DetectClusterAliasQuery  
Suggested Value : "select schema_name from information_schema.SCHEMATA where schema_name not in ('information_schema','mysql','performance_schema', 'meta') limit 1"  
Why? : This query returns the production database name, which is used as the cluster alias. When you have multiple production MySQL clusters, this really helps to tell the clusters apart using their database names.

Variable Name : PseudoGTIDPattern  
Suggested Value : "drop view if exists .*?`_pseudo_gtid_hint__"  
Why? : Orchestrator uses this pattern to recognize Pseudo GTID hints in the binary logs. Later in this post you'll find an event (create_pseudo_gtid_event) for generating and pushing Pseudo GTID hints into the binary logs, in case you are not already using GTID.
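Since the configuration file is JSON, each of these two settings goes in as a single double-quoted string on one line. For reference, here are the two suggested values from above as they would appear in orchestrator.conf.json (nothing new here, just the same values shown in place):

"DetectClusterAliasQuery": "select schema_name from information_schema.SCHEMATA where schema_name not in ('information_schema','mysql','performance_schema', 'meta') limit 1",
"PseudoGTIDPattern": "drop view if exists .*?`_pseudo_gtid_hint__",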

Variable Name : FailureDetectionPeriodBlockMinutes  
Suggested Value : 5  
Default Value : 60  
Why update? : The number of minutes for which Orchestrator waits before performing another failure detection.

Variable Name : RecoveryPeriodBlockSeconds  
Suggested Value : 5  
Default : 3600  
Why Changed? : In order to perform another recovery, we have to either acknowledge the previous recovery by going to Audit → Recovery history, or set this variable to a very small time frame. During testing, I observed that when a MySQL node goes down, Orchestrator detects it with two different states:  
* UnreachableMaster [Orchestrator doesn't do automatic recovery, but offers the option of a manual failover]  
* DeadMaster [Automatic recovery is done by Orchestrator]  
These two states occur within a very small fraction of time of each other and cannot be acknowledged manually, hence the small value for this variable: it avoids having to acknowledge the previous recovery before the next recovery process can start.

Variable Name : RecoveryIgnoreHostnameFilters  
Suggested Value : "host matching pattern"  
Why? : Recovery analysis will completely ignore hosts matching the given pattern, as these can be dedicated slaves or dev machines present in the replication topology.

Variable Name : RecoverMasterClusterFilters  
Suggested Value : "db0", as our test cluster name starts with that string. Note that this is different from the cluster alias: the alias is just for labeling your cluster on the dashboard, whereas RecoverMasterClusterFilters is matched as a regex against your master's hostname.
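Putting the failure-detection and recovery settings above together, a sketch of that part of orchestrator.conf.json could look like the excerpt below. To the best of my knowledge the two *Filters settings take a JSON array of regex patterns rather than a plain string, and the "dev.*" and "db0.*" patterns here are purely illustrative:

"FailureDetectionPeriodBlockMinutes": 5,
"RecoveryPeriodBlockSeconds": 5,
"RecoveryIgnoreHostnameFilters": ["dev.*"],
"RecoverMasterClusterFilters": ["db0.*"],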

Variable Name : OnFailureDetectionProcesses  
Suggested Value : "echo `date` '1 - Detected {failureType} on {failureCluster}. Affected replicas: {countSlaves} - failureType={failureType}, failureDescription={failureDescription}, failedHost={failedHost}, failureCluster={failureCluster}, failureClusterAlias={failureClusterAlias}, failureClusterDomain={failureClusterDomain}, failedPort={failedPort}, successorHost={successorHost}, successorPort={successorPort}, successorAlias={successorAlias}, countSlaves={countSlaves}, {slaveHosts}, isDowntimed={isDowntimed}, isSuccessful={isSuccessful}, lostSlaves={lostSlaves}' >> /tmp/recovery_`($date date +'%Y_%m_%d')`.log"  
Why? : Printing all these details will help you determine what can be done at this stage.  
Suggestion : We can also run any external script on this event.

Variable Name : PreFailoverProcesses  
Suggested Value : "echo `date` '2 - Will recover from {failureType} on {failureCluster} - failureType={failureType}, failureDescription={failureDescription}, failedHost={failedHost}, failureCluster={failureCluster}, failureClusterAlias={failureClusterAlias}, failureClusterDomain={failureClusterDomain}, failedPort={failedPort}, successorHost={successorHost}, successorPort={successorPort}, successorAlias={successorAlias}, countSlaves={countSlaves}, {slaveHosts}, isDowntimed={isDowntimed}, isSuccessful={isSuccessful}, lostSlaves={lostSlaves}' >> /tmp/recovery_`($date date +'%Y_%m_%d')`.log","bash /tmp/prefailover.sh {failedHost} {failureCluster} >> /tmp/orch.log"  
Why? : Printing all these details will help you determine what can be done at this stage.  
What can be done here?  
Here, you can completely block traffic on the failed host, just to avoid any more writes, with a custom script like the one below:  
#!/bin/bash
## Prefailover Orchestrator Script
PROGNAME=$(basename $0)  
f_host=$1      # failed host ({failedHost} passed by the hook)
f_cluster=$2   # failure cluster ({failureCluster} passed by the hook)
f_slaves=$3    # replica list (e.g. {slaveHosts}), if the hook passes it

function error_exit  
{
        if [ $? -ne 0 ]; then
            echo "${PROGNAME}: Step $1:- Failed at `date +"%Y_%m_%d"`" | mail -s "${PROGNAME} failed for Cluster Name : $f_cluster" -a "From: MySQL Orchestrator <orchestrator@abc.com>" harsh.pandit@abc.com
            exit 1
        fi
}

# Below function block mysql connections from mentioned subnet (Application Subnet or Host).
function ip_rule_update  
{
        ssh root@$f_host "iptables -A INPUT -i eth0 -p tcp ! -s 10.20.16.0/22 --dport 3306 -m state --state NEW,ESTABLISHED -j DROP"
        error_exit "IPTable Update"

}

if [[ "$f_slaves" == *"db01"* || "$f_slaves" == *"db02"* ]]; then  
        ip_rule_update
else  
        orchestrator -c begin-downtime -i $f_host --duration=10m --reason="No Master left for Promotion" -config=/usr/local/orchestrator/orchestrator.conf.json
        echo "Host $f_host Downtimed" | mail -s "No Candidate Master Left for Promotion,Host $f_host Downtimed for 10m" -a "From: MySQL Orchestrator <orchestrator@abc.com>" harsh.pandit@abc.com
fi  

Variable Name : PostFailoverProcesses  
Suggested Value : "echo `date` '3 - (for all types) Recovered from {failureType} on {failureCluster}. Failed: {failedHost}:{failedPort}; Successor: {successorHost}:{successorPort} - failureType={failureType}, failureDescription={failureDescription}, failedHost={failedHost}, failureCluster={failureCluster}, failureClusterAlias={failureClusterAlias}, failureClusterDomain={failureClusterDomain}, failedPort={failedPort}, successorHost={successorHost}, successorPort={successorPort}, successorAlias={successorAlias}, countSlaves={countSlaves}, {slaveHosts}, isDowntimed={isDowntimed}, isSuccessful={isSuccessful}, lostSlaves={lostSlaves}' >> /tmp/recovery_`($date date +'%Y_%m_%d')`.log", "bash /tmp/postfailover.sh {failedHost} {failureCluster} {successorHost} >> /tmp/orch.log"  
Here, postfailover.sh takes care of pointing your application servers at the newly promoted master, referenced here as {successorHost}. You can write it according to your applications' point of contact for the DB server; that can be an internal DNS record, a VIP, or anything else.
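For readability, here is the same three-hook setup with the long echo commands trimmed down. Each *Processes setting is a JSON array of shell commands executed in order, and tokens such as {failureType}, {failedHost} and {successorHost} are substituted by Orchestrator before the commands run; the /tmp log path and script names below are simply the ones used above, shortened for illustration:

"OnFailureDetectionProcesses": [
  "echo `date` 'Detected {failureType} on {failureCluster} ({failedHost}:{failedPort})' >> /tmp/recovery.log"
],
"PreFailoverProcesses": [
  "echo `date` 'Will recover from {failureType} on {failureCluster}' >> /tmp/recovery.log",
  "bash /tmp/prefailover.sh {failedHost} {failureCluster} >> /tmp/orch.log"
],
"PostFailoverProcesses": [
  "echo `date` 'Recovered from {failureType} on {failureCluster}. Successor: {successorHost}:{successorPort}' >> /tmp/recovery.log",
  "bash /tmp/postfailover.sh {failedHost} {failureCluster} {successorHost} >> /tmp/orch.log"
],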

Variable Name : RecoveryPollSeconds  
Suggested Value : 5  
Default Value : 10  
Why change? : How often Orchestrator re-checks for crash scenarios. You can change it as per your requirements; the default can also be a good option here.

Variable Name : PromotionIgnoreHostnameFilters  
Suggested Value : "dbs"  
Why? : I keep my master/candidate-master hosts with hostnames like *dbm and dedicated slaves with *dbs. In the case of automatic failover, I never want these slave nodes promoted to master, which is why I ignore them for promotion.  
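And the last two recovery-related knobs as they might appear in the same file; again I am assuming PromotionIgnoreHostnameFilters takes an array of patterns, and "dbs" is just the slave naming convention described above:

"RecoveryPollSeconds": 5,
"PromotionIgnoreHostnameFilters": ["dbs"],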
  • Create the following user with the listed privileges on each client MySQL node so that Orchestrator can communicate with all nodes available in the topology :
GRANT SELECT, SUPER, PROCESS, REPLICATION SLAVE, RELOAD ON *.* TO 'orchestrator'@'%' IDENTIFIED BY 'orch_topology_password';  
  • Orchestrator does automatic discovery once you provide the IP of any node in the cluster, recursively drilling up the master chain (if any) and down the slaves' chain (if any) to detect the entire topology:
orchestrator -c discover -i mysql_client_ip:3306 cli  
  • Recovery of lost slaves/intermediate masters isn't possible without implementing GTID/Pseudo GTID. I use binary log file + position based replication, so implementing Pseudo GTID is necessary. Here, we use a separate database named meta :
create database if not exists meta;  
  • The following event runs every 5 seconds and injects a Pseudo GTID hint (a uniquely named "drop view if exists" statement) into the binary logs :
create database if not exists meta;  
use meta;

drop event if exists create_pseudo_gtid_event;  
delimiter $$  
create event if not exists  
  create_pseudo_gtid_event
  on schedule every 5 second starts current_timestamp
  on completion preserve
  enable
  do
    begin
      set @pseudo_gtid_hint := uuid();
      set @_create_statement := concat('drop ', 'view if exists `meta`.`_pseudo_gtid_', 'hint__', @pseudo_gtid_hint, '`');
      PREPARE st FROM @_create_statement;
      EXECUTE st;
      DEALLOCATE PREPARE st;
    end
$$

delimiter ;  
  • The event scheduler must be enabled on the existing master node of every MySQL cluster monitored with Orchestrator. After failover, the post-failover script does the same on the promoted master node. Use the following SQL command :
set global event_scheduler = 1;  

Ref Links :
Manual : https://github.com/outbrain/orchestrator/wiki/Orchestrator-Manual#
Requirements : https://github.com/outbrain/orchestrator/wiki/Orchestrator-Manual#requirements
Download : https://github.com/outbrain/orchestrator/wiki/Orchestrator-Manual#download
Pseudo GTID : https://github.com/outbrain/orchestrator/wiki/Orchestrator-Manual#pseudo-gtid
Web Interface : https://github.com/outbrain/orchestrator/wiki/Orchestrator-Manual#using-the-web-interface
Recovery Process : https://github.com/outbrain/orchestrator/wiki/Orchestrator-Manual#topology-recovery