This is a free and brief translation of the article (original in English)
In this article we would like to highlight one of the most important mechanisms of work Zilliqa is a consensus mechanism. You can also read part one, which is dedicated to the details of how the protocol works and the concept sharding that Zilliqa uses to achieve network scalability.
The importance of the consensus protocol for high network performance
The use of the sharding mechanism alone is not enough to ensure high transaction processing speed. Performance is also determined by the speed at which each shard can verify transactions and generate a new block. This is where the need for an efficient consensus algorithm arises.
The ideal consensus protocol for Zilliqa would be one that could support a small shard size (approximately 600 nodes) and allow for rapid consensus.
What about the consensus protocols used in Bitcoin/Ethereum?
An attentive reader may have noticed that the shard size indicated earlier is much smaller than in standard P2P blockchain networks such as Bitcoin or Ethereum, where the network is formed by tens of thousands of nodes. And he may wonder why Zilliqa can’t use the same protocol in each of the shards? Moreover, since the size of the shard is much smaller than in other networks, consensus between nodes in the shard should be achieved faster.
In reality, shard size alone does not speed up the consensus mechanism.
Let's take a closer look at the consensus algorithm used in Bitcoin and Ethereum. These blockchain platforms use the so-called Nakamoto consensus. Under this consensus mechanism, the network "elects" a leader at certain intervals, and that leader proposes a block. Then, it transmits this block and nodes either accept or reject it.
The key to the Nakamoto consensus lies in the way this leader is defined. At approximately equal intervals, each node publishes the result of its calculations, and the one that performs the calculations faster becomes the leader. Of course, these calculations must be complex enough that there is a high probability that only one leader will emerge.
Performing calculations may slow down the consensus algorithm. Moreover, consensus does not depend on the number of nodes, but on the total computing power of the network. Applicable to Zilliqa, the PoW algorithm will not become more efficient with a small shard size, and therefore our system requires a different consensus protocol.
Practical Byzantine Fault Tolerance (PBFT) to the rescue
Zilliqa uses the PBFT (Practical Byzantine Fault Tolerance) protocol to achieve consensus in each shard. This protocol was proposed by Castro and Liskov in 1999. When working with this protocol, it is believed that up to a third of the nodes in each shard can be dishonest (fraudulent).
There are a number of reasons why we chose this particular consensus protocol. The main reason: the operation of this protocol depends on the size of the network, and the small size of shards can allow it to work more efficiently.
In the PBFT protocol, all nodes in a group (in a shard in our case) have a sequence. One node has the status of the main (leader), and the rest are auxiliary. The consensus process takes place in three stages:
1. Preliminary preparation. At this stage, the leader announces a new block of information by sending out a “pre-message”.
2. Preparation. Having received this message, each node checks it and the information in the block for correctness, and sends a message about readiness to all other nodes.
3. Execution. Having received messages about readiness from the vast majority, each node sends out a message about the execution of transactions in the block. Finally, each node expects an execution message from the vast majority of nodes in the shard, which will mean that the information proposed by the leader has been accepted.
The PBFT protocol is based on the fact that the leader is honest, because malicious actions will freeze the entire protocol. To deal with this problem, the protocol has a leader change process. If there is no progress within a given period of time, each node has the right to send a request to change the leader. The decision to change the leader is reached by the vast majority of nodes, and the next leader becomes a node from a pre-agreed list.
Other advantages of PBFT
1. Transaction Completeness. In the PBFT protocol, transactions are considered completed once consensus is reached. No transaction confirmation is required. Nakamoto consensus requires transaction verification to protect against double-spending attacks.
2. Low power consumption. Since the PBFT protocol is not based on computational processes, it does not require significant energy consumption. Zilliqa uses PoW only to protect against sybil attacks and to identify nodes, but not to achieve consensus. After identification, work on the PBFT protocol begins, and re-identification is carried out, say, every 100 blocks. With Nakamoto consensus, PoW is required for each block.
3. Less difference between miner rewards. In the PBFT protocol, consensus is achieved through collective voting and sending messages with a signature. This allows us to reduce the difference in rewards for miners.
Problems with the PBFT Protocol
The PBFT protocol is certainly a promising way to achieve consensus in Zilliqa. But it has one major drawback: it only works effectively when the group reaching consensus is small (eg, less than 50). However, Zilliqa uses a shard size of 600 nodes or more for security reasons. In this case, the communication process between nodes quickly becomes a bottleneck, because each node must exchange messages with all nodes in the shard.
Summary
The sharding mechanism alone is not sufficient for fast system operation, so an effective consensus algorithm is required. The PoW algorithm used in Bitcoin/Ethereum will not work more efficiently with a small shard size, as in Zilliqa, since it depends on the overall computing power of the network.
Zilliqa decided to use the PBFT (Practical Byzantine Fault Tolerance) protocol to achieve consensus, as it is optimal for a small network. In addition, PBFT has such advantages as transaction completeness, low energy consumption and less variation between miner rewards.
However, PBFT has one significant drawback: it only works effectively when the group reaching consensus is small (for example, less than 50), and Zilliqa uses a shard size of 600 nodes or more for security reasons.