What To Do When DMan Replication Status is Red

The master server writes each incoming command to a binlog file. The slave server reads that file and processes its entries in order, as a queue. DMan monitors this process for several conditions that indicate replication has stopped or is falling behind.

Slave is behind master N seconds

When the slave server starts to fall behind, the DMan status reports how many seconds behind the master it is. The cause may be network latency or I/O problems on the slave or the master. Monitor the delay between the slave and the master and confirm that it is shrinking rather than growing. If the delay continues to grow, you will need to investigate the underlying I/O problem.
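As a rough illustration of "getting smaller and not larger", the sketch below classifies a handful of delay readings noted down by hand from the DMan status. The function name `lag_trend` and the result labels are hypothetical and are not part of DMan.

```python
def lag_trend(samples):
    """Classify 'seconds behind master' readings, listed oldest first.

    Hypothetical helper for illustration only; DMan does not provide it.
    """
    if len(samples) < 2:
        return "unknown"          # one reading cannot show a trend
    first, last = samples[0], samples[-1]
    if last == 0:
        return "caught up"        # slave has reached the master
    if last < first:
        return "recovering"       # delay is shrinking
    if last > first:
        return "falling behind"   # delay is growing, investigate
    return "steady"               # delay is unchanged


# Readings taken a few minutes apart
print(lag_trend([120, 90, 45, 10]))
print(lag_trend([30, 60, 150]))
```

Only the first and last readings are compared, so a brief spike in the middle of the series does not change the result.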

Error - Unable to process N

Sometimes the slave encounters a command in the master binlog file that it cannot process. When this happens the slave stops processing the queue, and the longer it is left in that state the further behind it falls. Examine the line that the slave could not process, take corrective steps, and then run the Resynchronize Your Replication Servers procedure below. Before doing the full restore, though, you can try skipping the next 1 or 2 SQL statements to see whether that resolves the problem.

Skip a bad SQL Statement

It’s possible to skip one or more incompatible SQL statements in the replication log to get replication running again. This can be an alternative to doing a full resynchronization. The following notes are a summary of: https://www.howtoforge.com/how-to-repair-mysql-replication

First confirm that there is an SQL statement error.

SHOW SLAVE STATUS;
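One way to read the status output is sketched below. It assumes the status row has already been fetched into a dictionary keyed by column name, and it looks at the Last_SQL_Error and Last_IO_Error columns that MySQL reports in SHOW SLAVE STATUS. The function name and return labels are hypothetical.

```python
def classify_replication_error(status):
    """Classify one SHOW SLAVE STATUS row given as {column name: value}.

    Hypothetical helper for illustration; fetching the row is up to you.
    """
    sql_error = (status.get("Last_SQL_Error") or "").strip()
    io_error = (status.get("Last_IO_Error") or "").strip()
    if io_error:
        # Reported first: skipping statements does not fix an I/O error
        return "io_error"
    if sql_error:
        # A statement failed on the slave: candidate for skipping
        return "sql_error"
    return "no_error"


row = {"Last_SQL_Error": "Duplicate entry '42' for key 'PRIMARY'",
       "Last_IO_Error": ""}
print(classify_replication_error(row))
```

Reporting the I/O error first when both fields are set is a design choice in this sketch, made because the skip technique cannot help in that case.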

If there is an SQL statement error, you may be able to skip it using the following technique. If it is an I/O error instead (for example, a non-empty Last_IO_Error field), this technique won’t help you.

STOP SLAVE;
SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1; -- skips the next 1 statement(s) 
START SLAVE;
SHOW SLAVE STATUS;

You can skip more than one statement by setting SQL_SLAVE_SKIP_COUNTER to a larger value.
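If you script this step, a small helper can produce the same four statements for any skip count. This is a minimal sketch; the name `skip_statements` is hypothetical, and the statements it returns still have to be run on the slave by whatever client you use.

```python
def skip_statements(count=1):
    """Return the SQL statements that skip `count` replicated statements.

    Hypothetical helper: it only builds the text, it does not run anything.
    """
    # Reject anything that is not a positive whole number (bool included)
    if isinstance(count, bool) or not isinstance(count, int) or count < 1:
        raise ValueError("count must be a positive integer")
    return [
        "STOP SLAVE;",
        f"SET GLOBAL SQL_SLAVE_SKIP_COUNTER = {count};",
        "START SLAVE;",
        "SHOW SLAVE STATUS;",
    ]


for statement in skip_statements(2):
    print(statement)
```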

If the slave restarts successfully, your replication problem is resolved, but you may need to consult the application developers to find out whether that statement is likely to recur. Application changes may be necessary to avoid the troublesome statements.

Resynchronize Your Replication Servers

If replication has been blocked by a command that the slave server cannot process, the slave may be so far out of sync that the easiest fix is to resynchronize it from a fresh copy of the master's data. To do this, follow the original instructions for backing up the master and restoring to the slave. They are on the page Running A Backup Server for Bannister Lake Products with DMan Backup, in the section that starts with: Making a Copy of the Primary Database For Import On the Backup Server.




