slug | id | title | date | comments | tags | description | references | |
---|---|---|---|---|---|---|---|---|
85-improving-availability-with-failover |
85-improving-availability-with-failover |
Improving availability with failover |
2018-10-26 12:02 |
true |
|
There are several ways to improve availability with failover: cold standby, hot standby, warm standby, checkpointing, and all-active. |
Cold Standby: Use heartbeats or metrics/alerts to detect failures. Provision new standby nodes when a failure occurs. Only suitable for stateless services.
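
As a rough illustration of heartbeat-based detection for cold standby, here is a minimal sketch. The names `HeartbeatMonitor`, `replace_failed`, and the `provision` callback are hypothetical, and a real deployment would call a cloud or orchestration API to provision nodes instead of a local function.

```python
import time


class HeartbeatMonitor:
    """Tracks the last heartbeat per node and reports nodes that went silent."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self.clock = clock  # injectable so the sketch can run without sleeping
        self.last_seen = {}

    def record_heartbeat(self, node_id):
        self.last_seen[node_id] = self.clock()

    def failed_nodes(self):
        # A node counts as failed once its last heartbeat is older than the timeout.
        now = self.clock()
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.timeout_seconds
        )


def replace_failed(monitor, provision):
    """Cold standby: provision a replacement only after a failure is detected."""
    replacements = {}
    for node in monitor.failed_nodes():
        replacements[node] = provision(node)  # hypothetical provisioning hook
        del monitor.last_seen[node]           # stop tracking the dead node
    return replacements
```

Because the replacement starts from nothing, this only works when the node holds no state that must survive the failure.
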
Hot Standby: Keep two active systems undertaking the same role. Data is mirrored in near real time, and both systems will have identical data.
Warm Standby: Keep two active systems, but the secondary one does not take traffic unless a failure occurs.
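
A warm standby switch-over can be sketched as a router that sends everything to the primary and promotes the secondary on the first connection failure. `WarmStandbyRouter` is a hypothetical name, and treating `ConnectionError` as the failure signal is an assumption made for this example.

```python
class WarmStandbyRouter:
    """Routes requests to the primary; fails over to the secondary on error."""

    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary
        self.active = primary  # the secondary is running but takes no traffic yet

    def handle(self, request):
        try:
            return self.active(request)
        except ConnectionError:
            if self.active is self.secondary:
                raise  # both systems are down; nothing left to fail over to
            self.active = self.secondary  # promote the warm standby
            return self.active(request)
```
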
Checkpointing (similar to a Redis snapshot): Use a write-ahead log (WAL) to record requests before processing them. The standby node recovers from the log during failover.
- cons
  - recovery is time-consuming for large logs
  - data written since the last checkpoint is lost
- use cases: Storm, MillWheel, Samza
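
The log-then-process and recover-by-replay steps can be sketched as follows. This is a toy model under stated assumptions: `CheckpointedCounter` is a hypothetical class, the Python list `wal` stands in for a durable append-only log, and the state is a simple per-key counter.

```python
import json


class CheckpointedCounter:
    """Per-key counter that logs each request before applying it."""

    def __init__(self):
        self.state = {}
        self.wal = []               # in-memory stand-in for a durable log
        self.checkpoint = ({}, 0)   # (state snapshot, log entries it covers)

    def apply(self, key, delta):
        self.wal.append(json.dumps({"key": key, "delta": delta}))  # log first
        self.state[key] = self.state.get(key, 0) + delta           # then process

    def take_checkpoint(self):
        self.checkpoint = (dict(self.state), len(self.wal))

    def recover(self):
        """Rebuild state as a standby would: load the snapshot, replay the log tail."""
        snapshot, offset = self.checkpoint
        state = dict(snapshot)
        for line in self.wal[offset:]:
            entry = json.loads(line)
            state[entry["key"]] = state.get(entry["key"], 0) + entry["delta"]
        return state
```

Replay cost grows with the number of entries after the last checkpoint, which is why recovery is slow for large logs. If that tail of the log is not available to the standby, the snapshot alone restores only the state as of the last checkpoint.
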
Active-active (or all active): Keep two active systems behind a load balancer. Both take traffic in parallel. Data replication is bi-directional.
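
A minimal sketch of the active-active setup, assuming round-robin load balancing and last-write-wins conflict resolution (neither is specified above; both are choices made for illustration). `Replica`, `connect`, and `RoundRobinBalancer` are hypothetical names, and replication here is a synchronous in-process call, whereas real systems usually replicate asynchronously over the network.

```python
import itertools


class Replica:
    """One active node; every local write is also pushed to its peers."""

    def __init__(self, name):
        self.name = name
        self.data = {}    # key -> (version, value); version = (counter, replica name)
        self.peers = []

    def write(self, key, value, counter):
        version = (counter, self.name)
        self._merge(key, version, value)
        for peer in self.peers:              # replicate in the other direction too
            peer._merge(key, version, value)

    def _merge(self, key, version, value):
        # Last-write-wins: keep the entry with the highest version.
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        return self.data[key][1]


def connect(a, b):
    """Set up bi-directional replication between two replicas."""
    a.peers.append(b)
    b.peers.append(a)


class RoundRobinBalancer:
    """Alternates requests across all active replicas."""

    def __init__(self, replicas):
        self._cycle = itertools.cycle(replicas)

    def pick(self):
        return next(self._cycle)
```

Since both nodes accept writes, some conflict-resolution rule is needed; the version comparison in `_merge` is the simplest option and silently discards the older of two concurrent writes.
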