Leader Election

Imagine we are designing a system that allows users to subscribe to the service on monthly or annual basis. They pay a monthly/annual fee regular basis. We could imagine for such a system would require a database, would store user subscriptions, where subscription was active, the price, inactive subscriptions. We could make of third party services which take care of charging users, debiting from their account, taking care of payment functionality (e.g paypal, stripe), so this third party service need somehow communicate with the database to know information like the amount to be debited, when to charge the user, etc. We may not want the third party service to interact with the database directly, as database is very sensitive part of the system. One solution would be to insert some service between 3rd party service and the database. This service would be responsible for interacting with the database ( maybe on perioudic basis), figureing out when certain users subscription is going to renew, and this service is going to third party service which would charge the user. So now what happens if this service, which is now crucial to the whole system supposedly fails and as a result entire payment system fails?

One way to solve this problem would be to introduce redundancy. So instead of having some such servers, we have five servers and these five servers responsible for the business logic. Another problem is determining which server among the five would perform the operation, sensitive functions such as the payment should be carried out only ones, instead of five times. Here is where leader election comes into play.

With leader election, if there group of servers/machines incharge of doing the same operation, and the nature of the operation is such that it should be performed only ones, for example the payment functionality, should happen only ones. Leader election has the servers is question elect one of themselves as a leader and that server and that server alone is going to be responsible for performing the action, and other servers wait. If the leader fails of comes offline, they servers would elect another leader and perform the next operation in question. The elected leader has a lease ( say every 5 seconds ) and this needs to be refreshed every five minutes by leader, if the leader fails to refresh, this means that leader is experiencing a network failure and a new leader needs to be elected.

Although the concept may seem trivial, but in reality this is a very difficult problem to solve. When we have large and complex distributed system, selecting and re-electing a leader in case of failure is a complex task. This is becasuse these machine / systems must share state ( in this case who the leader at this point in time ) at all times. And what happens when there is a network failure or a partition, what happens to the election then? The difficulty lies in the fact that multiple machine need to gain consensus or agree upon something together. In this case, finding the leader is the thing they are trying to finding consensus on. In order to actually achieve leader election, we have use a consensus algorithms. These consensus algotithms are extremely complex mathematical algoruthms that allow multiple servers in a groups ( or nodes in a cluster), help to agree in a single data value, in this case who is going to be leader in a cluster of nodes. Two examples of consensus algorithms are Paxos and Raft.

In the industry we use third party services - zookeeper and Etcd that arenty not primarily meant leader election, but these can be used to implement our own leader election in an easy way. These are very often used in the Industry. We see example with Etcd:

Etcd is a strongly consistent and high available key-value store that is used to implement leader election in system. Stongly consistency means that when we have multiple machines or even just one machine, reading and writing to the same key-value pair to the key-value store in etcd, you are always guaranteed to returned the correct value, no matter in what time you access the same key-value pair. Etcd achieve this by implmenting a consensus algorithm, here Etcd uses the Raft algorithm.

back