From Traditional Fault Tolerance to Blockchain. Wenbing Zhao
Distributed consensus has been studied for several decades, since at least the 1970s. Distributed consensus is important because a distributed system spans many computing nodes, and these nodes must maintain a common view of the system state so that each can operate as planned towards the objectives of the system. Prolonged inconsistency among different components of the system would damage its integrity and ultimately result in system-level failures that are visible to end users.
The cost of system failures is enormous. If a data center is brought down by a system failure, the average cost of downtime may range from $42,000 to about $300,000 per hour [2, 6]. The cost can be estimated by summing the wasted expenses and the loss of revenue. While the labor cost of downtime may be estimated relatively easily (i.e., roughly, wasted expenses per hour = number of employees × average salary per hour) [13], it is much harder to estimate the loss of revenue, especially the part due to the damage to the business's reputation and the loyalty of its potential customers [2].
Ensuring high availability of distributed systems is not cheap. In [7], the cost of a data center is estimated to range from $450 per square foot for 99.671% availability (i.e., 28.8 hours of downtime per year) to $1,100 per square foot for 99.995% availability (i.e., 0.4 hours of downtime per year). That is perhaps one reason why about 59% of Fortune 500 companies suffer 1.6 hours or more of downtime per week [2].
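To make the figures above concrete, the sketch below (Python, with hypothetical staffing numbers) converts an availability target into annual downtime hours and applies the rough labor-cost estimate mentioned above; it is an illustration of the arithmetic only, not a cost model from the cited studies.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours

def annual_downtime_hours(availability_percent):
    """Hours of downtime per year implied by an availability target."""
    return (1 - availability_percent / 100.0) * HOURS_PER_YEAR

def downtime_labor_cost(num_employees, avg_hourly_salary, downtime_hours):
    """Rough wasted-expense estimate: employees idled during downtime."""
    return num_employees * avg_hourly_salary * downtime_hours

# 99.671% availability implies roughly 28.8 hours of downtime per year;
# 99.995% availability implies roughly 0.4 hours per year.
print(round(annual_downtime_hours(99.671), 1))  # 28.8
print(round(annual_downtime_hours(99.995), 1))  # 0.4

# Hypothetical example: 200 employees at $40/hour, one hour of downtime.
print(downtime_labor_cost(200, 40, 1))  # 8000
```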
All classical consensus algorithms rely on a concept referred to as membership: every node knows how many nodes are in the current membership, the logical role of each node, and how to reach the other nodes. Another important construct is voting, carried out by exchanging messages. Typically, one of the members assumes a special role, referred to as the primary or the coordinator. The coordinator might fail or become compromised, in which case a new coordinator is elected through voting. As such, classical distributed consensus algorithms are expensive, highly complex, and not scalable, due to the heavy use of multiple rounds of message exchanges among the members.
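As an illustration only, the following minimal Python sketch captures the membership, coordinator, and majority-voting constructs described above; the class and method names are hypothetical and are not taken from any particular algorithm.

```python
from dataclasses import dataclass

@dataclass
class Member:
    node_id: int
    address: str          # how to reach this node
    role: str = "backup"  # "coordinator" or "backup"

class Membership:
    """Every node knows all members, their roles, and how to reach them."""
    def __init__(self, members):
        self.members = {m.node_id: m for m in members}

    def coordinator(self):
        return next(m for m in self.members.values() if m.role == "coordinator")

    def elect_new_coordinator(self, failed_id):
        """If the coordinator fails, promote the surviving member with the lowest id."""
        survivors = [m for m in self.members.values() if m.node_id != failed_id]
        new_leader = min(survivors, key=lambda m: m.node_id)
        new_leader.role = "coordinator"
        return new_leader

    def majority_reached(self, votes):
        """A decision is reached once more than half of the members have voted for it."""
        return len(votes) > len(self.members) // 2
```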
In January 2009, the launch of the first practical cryptocurrency, Bitcoin [12], completely changed the picture. The most essential prerequisite for a cryptocurrency is the assurance that it is virtually impossible for anyone to double-spend the money (i.e., cryptocurrency) that one has. Bitcoin addressed this requirement by introducing an immutable distributed ledger in the form of a chain of blocks, where each block aggregates hundreds or even thousands of transactions. This distributed ledger is often referred to as the blockchain. The immutability of the blockchain is achieved by several means: (1) cryptographic protection of the blockchain, such as digital signatures, one-way hash functions, and the chaining of the blocks; (2) a massive degree of replication of the blockchain across many nodes in the Bitcoin network; and (3) a novel probabilistic consensus algorithm that is completely different from classical consensus algorithms.
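The chaining aspect of this immutability can be illustrated with a small sketch. This is a drastically simplified model (real Bitcoin blocks contain many more header fields, signatures, and a Merkle tree of transactions), but it shows why tampering with an earlier block is detectable: the altered block's hash no longer matches the previous-hash recorded in its successor.

```python
import hashlib
import json

def block_hash(block):
    """One-way hash over the block's contents, including the previous block's hash."""
    encoded = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()

def make_block(prev_hash, transactions):
    return {"prev_hash": prev_hash, "transactions": transactions}

# Chain two blocks together.
genesis = make_block("0" * 64, ["coinbase tx"])
block2 = make_block(block_hash(genesis), ["Alice pays Bob 1 BTC"])

# Tampering with the first block changes its hash, breaking the chain.
genesis["transactions"] = ["forged tx"]
print(block_hash(genesis) == block2["prev_hash"])  # False: tampering is detected
```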
The consensus algorithm used in Bitcoin does not involve any explicit form of voting; therefore, there is no extra message exchange among the nodes in the Bitcoin network for the purpose of reaching consensus. In Bitcoin, the consensus-building process is converted into a lottery-like stochastic process where the winner of the lottery gets the right to create a new block of transactions and collects a reward [22]. To ensure fairness and to make the process genuinely stochastic, every participating node works on a Proof-of-Work (PoW) based puzzle, and the first one that finds a solution becomes the winner. The PoW puzzle has a predefined target difficulty, and a participating node experiments with different ways of making the hash of the block header meet the target difficulty. This is a CPU-intensive process. Hence, the only way a node could gain an advantage over other nodes is to invest in better hardware that can perform the hash operation faster. The Bitcoin consensus algorithm is referred to as PoW and sometimes as the Nakamoto algorithm, named after Bitcoin's creator, which is apparently a pseudonym. This novel form of consensus algorithm has aroused huge interest in the research and application of blockchain technology [20]. Some have even expressed the hope that the blockchain technology would lead to a new form of economy, just as the Internet has transformed our society [16].
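The following toy sketch conveys the spirit of the PoW search. It is not Bitcoin's actual algorithm: Bitcoin double-hashes an 80-byte block header with SHA-256 and compares the result against a compact encoded target, whereas here the "difficulty" is simplified to a required number of leading zero hex digits.

```python
import hashlib

def meets_target(digest_hex, difficulty):
    """Toy target check: the hash must start with `difficulty` zero hex digits."""
    return digest_hex.startswith("0" * difficulty)

def mine(header_prefix, difficulty):
    """Try successive nonces until the hash of (header_prefix + nonce) meets the target."""
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{header_prefix}{nonce}".encode()).hexdigest()
        if meets_target(digest, difficulty):
            return nonce, digest
        nonce += 1

# The expected number of attempts grows exponentially with the difficulty,
# which is why the only way to win more often is to hash faster.
nonce, digest = mine("block header bytes", difficulty=4)
print(nonce, digest)
```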
This book consists of two parts. The first part comprises the first seven chapters and covers the most essential techniques for building dependable distributed systems. The last three chapters form the second part, which covers the blockchain technology.
Chapter 1 introduces the basic concepts and terminology of dependable distributed computing, as well as the primary means to achieve dependability.
Chapter 2 describes the checkpointing and logging mechanisms, which are widely used in practice to achieve some form of fault tolerance. Checkpointing and logging enable the recoverability of the system but do not prevent service disruption. These mechanisms are relatively simple to implement and understand, and they incur minimal runtime overhead while demanding only modest extra resources (stable storage). Furthermore, checkpointing and logging also serve as the foundation for more sophisticated dependability techniques.
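As a rough illustration of the idea (the file names and the apply callback below are hypothetical, not taken from the chapter): a checkpoint saves the full application state to stable storage, each message is logged before it is processed, and recovery restores the last checkpoint and replays the logged messages in order.

```python
import json
import os

CHECKPOINT_FILE = "state.ckpt"   # hypothetical stable-storage locations
LOG_FILE = "messages.log"

def take_checkpoint(state):
    """Persist the full application state to stable storage."""
    with open(CHECKPOINT_FILE, "w") as f:
        json.dump(state, f)
    open(LOG_FILE, "w").close()  # a new checkpoint supersedes the old log

def log_message(msg):
    """Record each incoming message before it is processed."""
    with open(LOG_FILE, "a") as f:
        f.write(json.dumps(msg) + "\n")

def recover(apply):
    """Restore the last checkpoint, then replay logged messages in order."""
    state = {}
    if os.path.exists(CHECKPOINT_FILE):
        with open(CHECKPOINT_FILE) as f:
            state = json.load(f)
    if os.path.exists(LOG_FILE):
        with open(LOG_FILE) as f:
            for line in f:
                state = apply(state, json.loads(line))
    return state
```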
Chapter 3 covers research on recovery-oriented computing, including fault detection and diagnosis, microreboot, and system-level undo and redo. Recovery-oriented computing aims to facilitate faster recovery after a system failure and thereby improve the availability of the system. Similar to checkpointing and logging, the mechanisms for recovery-oriented computing do not prevent service disruption; hence, it is a promising approach for many e-commerce applications, but not suitable for applications that require high reliability.
Chapter 4 outlines the replication technique for data and service fault tolerance. This is the fundamental technique for ensuring high reliability. Through active replication (i.e., the use of multiple redundant copies of the application processes), the system is able to mask the failure of a replica and continue to process clients' requests (this is actually not entirely true: as we will show in later chapters, some failures may cause extended periods of unavailability of the system). With replication comes the complexity of the consistency issue. Ideally, the replicas should always maintain consistency with each other. However, doing so might incur too much runtime overhead to be acceptable for some applications, or may cause extended periods of system unavailability. Hence, strict consistency may have to be compromised either for better performance [15] or for better availability [19].
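A minimal sketch of the idea behind active replication is given below; the Replica class and its process() method are hypothetical placeholders for real application replicas reached over the network, and real systems must also ensure that all replicas process requests in the same order.

```python
class Replica:
    """Hypothetical stand-in for one redundant copy of the application process."""
    def __init__(self):
        self.state = 0

    def process(self, request):
        # Deterministic processing: every correct replica produces the same reply.
        self.state += request
        return self.state

def invoke(replicas, request):
    """Send the request to all replicas; the failure of a single replica is
    masked as long as at least one replica returns a reply."""
    replies = []
    for replica in replicas:
        try:
            replies.append(replica.process(request))
        except ConnectionError:
            continue  # a crashed or unreachable replica is simply skipped
    if not replies:
        raise RuntimeError("all replicas failed")
    # With deterministic replicas processing requests in the same order,
    # all replies are identical; return the first one.
    return replies[0]

print(invoke([Replica(), Replica(), Replica()], 5))  # 5
```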
Chapter 5 explains group communication systems, which can be used to implement active replication. A group communication system typically offers a totally ordered reliable multicast service for messages, a membership service, and a view synchrony service. This set of services helps the replicas maintain consistency even in the presence of failures, which reduces the development cost of building dependable systems with active replication.
Chapter 6 discusses the consensus problem and describes several Paxos algorithms, including Classic Paxos, Dynamic Paxos, Cheap Paxos, and Fast Paxos. While it is easy for a group of processes to agree on the same value if all processes can communicate with each other promptly and none of them fails, distributed consensus is an incredibly hard problem when processes might fail and there might be extended delays in sending or receiving messages. The classical Paxos algorithm solves the consensus problem