This blog post follows on from two others, which I’d recommend reading before this one.
In a 2+1 architecture we have the same setup as before (2 data nodes), plus one additional node. Node 11 and Node 12 are our “Active” data nodes, as before. However, we add another node, Node 13, which is the “Spare” or “Reserved” node. We will also depend on Redundancy 2 here.
With a ‘+1’ architecture, if a node fails, it is automatically replaced by the Spare node. The slave replica of the failed node’s data segment is pulled from the failed node’s neighbour and restored on the newly activated Spare node.
For example, say Node 12 fails. Its master data is gone, obliterated. The database is down! No problem. Node 13 is pushed into Node 12’s place. The slave copy of segment 2 is pulled from Node 11 (great that we’ve got an exact copy, huh!).
The database restarts and the segment is then restored onto Node 13 as the Master segment, leaving the end user none the wiser (apart from a hopefully brief outage).
The process of recovery in a ‘+1’ architecture is as follows:
1 – Database discovers failure
2 – Database shuts down
3 – The failed node is automatically replaced with the reserved node
4 – Database starts
5 – Data is restored from the slave segment on the neighbour node onto the Spare node as the Master segment
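To make the sequence a bit more concrete, here is a minimal Python sketch that walks a toy 2+1 cluster through the five steps above. It is purely illustrative and not how the database implements failover internally: the `Node` class and the `build_cluster` and `fail_over` functions are names I have made up for this example, and the segment numbering (segment 1 mastered on Node 11, segment 2 on Node 12) is taken from the Node 12 example earlier in the post.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical model of a cluster node (not a real database API)."""
    name: str
    role: str  # "active", "reserve" or "failed"
    master: set = field(default_factory=set)  # segments this node serves
    slave: set = field(default_factory=set)   # replicas of a neighbour's segments


def build_cluster():
    """2+1 layout with Redundancy 2: each active node mirrors its neighbour."""
    return {
        "Node 11": Node("Node 11", "active", master={1}, slave={2}),
        "Node 12": Node("Node 12", "active", master={2}, slave={1}),
        "Node 13": Node("Node 13", "reserve"),
    }


def fail_over(nodes, failed_name):
    """Walk through the five recovery steps and return them as a list."""
    steps = []

    # Steps 1 and 2: the failure is discovered and the database shuts down.
    failed = nodes[failed_name]
    lost = set(failed.master)  # segments whose master copy is gone
    failed.role = "failed"
    failed.master.clear()
    failed.slave.clear()
    steps.append(f"1 - failure of {failed_name} discovered")
    steps.append("2 - database shut down")

    # Step 3: the reserve node takes the failed node's place.
    # (Raises StopIteration if no reserve node is left.)
    spare = next(n for n in nodes.values() if n.role == "reserve")
    spare.role = "active"
    steps.append(f"3 - {failed_name} replaced with {spare.name}")

    # Step 4: the database starts again.
    steps.append("4 - database started")

    # Step 5: the neighbour holding the slave copy supplies the data,
    # which is restored on the former spare as the master segment.
    neighbour = next(
        n for n in nodes.values()
        if n.role == "active" and n is not spare and lost <= n.slave
    )
    spare.master |= lost
    steps.append(
        f"5 - segment(s) {sorted(lost)} restored from {neighbour.name} "
        f"onto {spare.name} as master"
    )
    return steps


if __name__ == "__main__":
    cluster = build_cluster()
    for line in fail_over(cluster, "Node 12"):
        print(line)
```

The sketch only models what is described in this post. It deliberately leaves out anything that happens afterwards, such as rebuilding the slave copy of segment 1 that Node 12 used to hold.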
The failover is recorded only in the log. Notifications of a failover are flagged at the Warning and Critical severity levels.