If disaster strikes a master db, the last resort is to fail over from that master to a slave. This should be an absolute last resort because of the complexity involved.
The same procedure can also be used to switch masters around a big database change, such as adding an index to a table, in order to minimize the downtime from such a change.
The first variable that needs to be noted is “log-slave-updates”.
When log-slave-updates=1, the slave writes ALL of the queries it receives through replication to its own binlog, in addition to the queries executed on it directly. In this case, you will need to be more careful. If promoting this slave to a master, you need to make 100% sure that no more updates are going to come through to it as a slave. Resetting the slave status and removing the slave configuration is probably the easiest way to ensure this. If you want to daisy-chain a slave off of another slave server, this parameter must be set so all replicated queries from the original master can trickle down the chain.
When log-slave-updates=0, only queries executed directly on this server (not those applied through replication) are logged in the server’s binlog. This is most convenient when you need to quickly promote this slave to master and then point another slave at it, because the only updates in the binlog are those that were actually executed on this server itself.
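As a rough sketch of that reset, assuming MySQL 5.5.16 or later (where RESET SLAVE ALL is available) and assuming the slave has already applied everything it received from the old master, the statements on the slave being promoted might look like this:

```sql
-- Run on the slave that is being promoted, after it has applied all of its relay logs.
STOP SLAVE;

-- Forget the old master entirely (host, user, log coordinates).
-- Plain RESET SLAVE keeps the connection settings in memory; the ALL variant clears them.
RESET SLAVE ALL;

-- Should now return an empty set, meaning this server is no longer configured as a slave.
SHOW SLAVE STATUS\G
```

If any replication settings for the old master also live in my.cnf, they would need to be removed there as well.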
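To see which of the two cases applies to a given slave, something like the following can be run on it. This is only a quick check, and it assumes a version that still uses the log_slave_updates variable name:

```sql
-- Binary logging must be ON before this server can act as a master at all.
SHOW VARIABLES LIKE 'log_bin';

-- ON  = replicated queries are also written to this server's binlog.
-- OFF = only queries executed directly on this server are written.
SHOW VARIABLES LIKE 'log_slave_updates';
```

Neither setting can be changed on a running server, so switching between the two cases means editing my.cnf and restarting the instance.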
Now, to actually promoting new masters…
In this scenario, we will pretend that the master db is corrupt and we now need to promote a new one.
First, check all slaves to see which one is the most caught up. If they are all caught up, then determine which one could have additional slaves attached to it the quickest (e.g. one running with log-slave-updates=1 as mentioned above). If a slave already has another slave running off of it, this is perfect.
Make sure that the slave you’re about to promote is no longer read-only (it should have been read-only up to this point, so you will need to change it). Also make this change in the my.cnf file so the setting persists the next time this instance is restarted.
Next, I usually shut down the old master MySQL instance for safety.
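One way to compare how caught up the slaves are is to run the following on each of them and compare the output side by side. The field names are the ones printed by SHOW SLAVE STATUS; which slave "wins" is still a judgment call:

```sql
-- Run on every slave (the \G terminator is specific to the mysql command-line client).
SHOW SLAVE STATUS\G

-- Fields worth comparing between slaves:
--   Master_Log_File / Read_Master_Log_Pos       : how far this slave has READ from the master's binlog
--   Relay_Master_Log_File / Exec_Master_Log_Pos : how far this slave has EXECUTED
--   Seconds_Behind_Master                       : rough lag estimate (NULL if the SQL thread is stopped)
-- The slave with the highest executed file + position is the most caught up.
```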
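A minimal sketch of the read-only change on the slave being promoted, assuming it was running with read_only enabled:

```sql
-- Check the current setting first.
SHOW VARIABLES LIKE 'read_only';

-- Allow normal clients to write to this server from now on.
SET GLOBAL read_only = OFF;

-- SET GLOBAL does not survive a restart, so the read_only line
-- in the [mysqld] section of my.cnf has to be changed to match.
```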
For sanity, I issue a “FLUSH LOGS” statement on the new master. This will roll the server to a new, fresh binlog which will be easy to start new slaves from. Take note of this new binlog that was just rolled to.
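On the new master, rolling the binlog and noting where it ended up could look like this:

```sql
-- Close the current binlog and open a fresh one.
FLUSH LOGS;

-- Note the File and Position columns; the slaves will be pointed at these.
SHOW MASTER STATUS;
```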
Now change all hosts files to set the new MASTER-IP record.
If a new slave is already up and running from this db, then also change the hosts entry for SLAVE.
Assuming that all other slaves were caught up and at the same position as this one, reset those slaves’ configuration now, pointing them to the newly promoted master and the binlog that you just started. If a slave isn’t caught up, give it some time to catch up and get to the correct position before issuing these commands. If this isn’t followed exactly, you will run into duplicate key issues and/or missing and inconsistent data.
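As a sketch of what re-pointing one of the remaining slaves might look like: the binlog file name and position below are made-up placeholders standing in for whatever SHOW MASTER STATUS reported on the new master, and MASTER-IP is assumed to resolve through the hosts file entry changed above. The replication user is also assumed to exist on the new master with the same credentials.

```sql
-- Run on each remaining slave, only once it is fully caught up.
STOP SLAVE;

CHANGE MASTER TO
  MASTER_HOST = 'MASTER-IP',                -- hosts entry that now points at the new master
  MASTER_LOG_FILE = 'db02-binlog.000001',   -- placeholder: the binlog noted after FLUSH LOGS
  MASTER_LOG_POS = 4;                       -- placeholder: 4 is the first event position in a binlog file

START SLAVE;

-- Slave_IO_Running and Slave_SQL_Running should both say Yes.
SHOW SLAVE STATUS\G
```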
If you just want to switch a slave’s master (e.g. the master didn’t die), follow these steps. In this example I used the DB02:3307 and DB02:3306 instances.
I switched DB02:3307’s master to be DB02:3306. The process for doing this without needing to rebuild the entire slave is very sensitive, but only requires a few steps. Assume that both DB02:3306 and DB02:3307 are currently replicating off of a server called DB01. After the switch, if DB01 completely dies, these slaves won’t have to be rebuilt and DB02:3306 will become the master. For DB02:3307 to keep receiving DB01’s changes through DB02:3306 in the meantime, DB02:3306 must be running with log-slave-updates=1, as described above.
First, I set the DB02:3306 and DB02:3307 slaves to stop at the same specific binlog + pos:
mysql-DB02:3306> START SLAVE UNTIL MASTER_LOG_FILE = 'db01-binlog.000722', MASTER_LOG_POS=430000000;
mysql-DB02:3307> START SLAVE UNTIL MASTER_LOG_FILE = 'db01-binlog.000722', MASTER_LOG_POS=430000000;
Then, once both slaves stop, I know that they are at the exact same location. This is crucial.
The next step isn’t necessary, but I do it for ease and safety. On DB02:3306 I issue a “FLUSH LOGS;” statement which will close and reopen all log files. For the binlogs (which we are most interested in) it will start a new binlog, so we can start from the very beginning of the newest binlog on DB02:3306. Alternatively, you could just do a “SHOW MASTER STATUS;” to get the current binlog information for that host.
Once I’ve found and noted the new master binlog file name and position, I can start the DB02:3306 slave back up.
Next, I go over to DB02:3307 and reset its slave configuration, this time with DB02:3306 as the MASTER_HOST and the MASTER_LOG_FILE + MASTER_LOG_POS that were noted previously.
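To confirm that both slaves really did stop in the same place, I would check each of them with something along these lines (the expected values assume the UNTIL position above falls on an event boundary in DB01's binlog):

```sql
-- Run on both DB02:3306 and DB02:3307.
SHOW SLAVE STATUS\G

-- What to look for on each:
--   Slave_SQL_Running     : No   (the UNTIL condition stops the SQL thread)
--   Relay_Master_Log_File : db01-binlog.000722
--   Exec_Master_Log_Pos   : 430000000
-- If the two slaves report different values, do not continue with the switch.
```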
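Put together, the last few steps might look roughly like this. The binlog file name and position for DB02:3306 are placeholders for whatever SHOW MASTER STATUS actually returns there, and the existing replication user is assumed to be valid on DB02:3306 too:

```sql
-- On DB02:3306 (the new master for DB02:3307) --
FLUSH LOGS;
SHOW MASTER STATUS;        -- note File and Position
START SLAVE;               -- resume replicating from DB01

-- On DB02:3307 --
STOP SLAVE;                -- the I/O thread is still running after the UNTIL stop
CHANGE MASTER TO
  MASTER_HOST = 'DB02',
  MASTER_PORT = 3306,
  MASTER_LOG_FILE = 'db02-binlog.000001',   -- placeholder: File noted on DB02:3306
  MASTER_LOG_POS = 4;                       -- placeholder: Position noted on DB02:3306 (4 = start of file)
START SLAVE;
SHOW SLAVE STATUS\G        -- both Slave_IO_Running and Slave_SQL_Running should be Yes
```

Noting the position on DB02:3306 before starting its slave again is what keeps the two instances lined up: everything DB02:3306 applies from DB01 after that point lands in the new binlog that DB02:3307 then reads.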