Replication and Master or Slave Shutdowns
It is safe to shut down a master server and restart it later. When a slave loses its connection to the
master, the slave tries to reconnect immediately and retries periodically if that fails. The default is to
retry every 60 seconds. This may be changed with the MASTER_CONNECT_RETRY option of the
CHANGE MASTER TO statement. A slave can also cope with network connectivity outages. However,
the slave notices the network outage only after receiving no data from the master for
slave_net_timeout seconds. If your outages are short, you may want to decrease
slave_net_timeout. See Section 5.1.4, “Server System Variables”.
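As a minimal sketch of the two adjustments described above, the following statements, run on the slave, shorten both the network timeout and the reconnect interval. The values 60 and 30 are illustrative assumptions, not recommendations; choose values that match the outages you actually observe.

```sql
-- Run on the slave. The numeric values are illustrative assumptions.
STOP SLAVE;

-- Treat the connection as lost after 60 seconds without data from the master.
SET GLOBAL slave_net_timeout = 60;

-- Retry a failed connection to the master every 30 seconds.
CHANGE MASTER TO MASTER_CONNECT_RETRY = 30;

START SLAVE;
```

A value set with SET GLOBAL does not survive a server restart. To make the timeout permanent, also set slave_net_timeout in the slave my.cnf file.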
An unclean shutdown (for example, a crash) on the master side can result in the master binary log
having a final position less than the most recent position read by the slave, due to the master binary log
file not being flushed. This can leave the slave unable to replicate when the master comes back
up. Setting sync_binlog=1 in the master my.cnf file helps to minimize this problem because
it causes the master to synchronize its binary log to disk after every write to the log.
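The setting can also be applied to a running master without a restart, as in the sketch below. This assumes an account with the privilege to set global variables; the my.cnf entry described above is still needed for the value to persist across restarts.

```sql
-- Run on the master.
-- Synchronize the binary log to disk after every write to the log.
SET GLOBAL sync_binlog = 1;

-- Confirm the value now in effect.
SHOW VARIABLES LIKE 'sync_binlog';
```

Synchronizing after every write costs additional disk I/O, so expect some loss of write throughput in exchange for the added safety.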
Shutting down a slave cleanly is safe because the slave keeps track of where it left off. However, be careful
that the slave does not have temporary tables open; see Section 16.4.1.22, “Replication and
Temporary Tables”. Unclean shutdowns might produce problems, especially if the disk cache was not
flushed to disk before the problem occurred:
• For transactions, the slave commits and then updates relay-log.info. If a crash occurs between
these two operations, relay log processing will have proceeded further than the information file
indicates and the slave will re-execute the events from the last transaction in the relay log after it has
been restarted.
• A similar problem can occur if the slave updates relay-log.info but the server host
crashes before the write has been flushed to disk. To minimize the chance of this occurring,
set sync_relay_log_info=1 in the slave my.cnf file. The default value of
sync_relay_log_info is 0, which does not force writes to disk; the server
relies on the operating system to flush the file from time to time.
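The slave-side precautions above can be sketched as follows. The first statement applies sync_relay_log_info to a running slave; the rest show one possible way to confirm that no replicated temporary tables are open before a planned shutdown. The sequence is an assumption about a reasonable procedure, not a required one.

```sql
-- Run on the slave.
-- Force relay-log.info to disk after every update to the file.
SET GLOBAL sync_relay_log_info = 1;

-- Before a planned shutdown, stop applying events and check for
-- temporary tables that the slave SQL thread still has open.
STOP SLAVE SQL_THREAD;
SHOW STATUS LIKE 'Slave_open_temp_tables';

-- If the reported value is not 0, let replication continue for a while
-- and repeat the check:
-- START SLAVE SQL_THREAD;
```

Once Slave_open_temp_tables reports 0, the slave can be shut down. As with the other variables, add sync_relay_log_info=1 to the slave my.cnf file so that the setting persists across restarts.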
The fault tolerance of your system for these types of problems is greatly increased if you have a good
uninterruptible power supply.