From Traditional Fault Tolerance to Blockchain. Wenbing Zhao
Чтение книги онлайн.
Читать онлайн книгу From Traditional Fault Tolerance to Blockchain - Wenbing Zhao страница 26
Figure 2.14 shows the logging latency for various message sizes using the traditional disk (on the left), and the solid state disk (on the right), respectively. The experimental results are presented here in the form of a sequence of probability density functions (PDF) [12] of the logging latency for various message lengths. The PDFs give much more details on the cost of logging operation than a simple average value. As can be seen, on both the solid state disk and the traditional disk, the far majority of the logging operation (for each incoming message) can be completed within 1000 μs for messages as large as 100KB, which means the logging can be done with a rate of over 100MB per second, approaching the advertised upper limit of the data transfer rate of traditional disks. For small messages, the logging can be done within 100 μs.
Figure 2.15 A summary of the mean logging latency and mean end-to-end latency under various conditions.
It is somewhat surprising to see that the performance on the solid state disk is not significantly better than that on the traditional disk, especially for small messages. For large messages, the solid state disk does make the logging operations more predictable in its latency, that is, the standard deviation [12] is much smaller than that on the traditional disk, as can be seen in Figure 2.15.
Figure 2.16 Probability density function of the end-to-end latency.
The end-to-end latency results shown in Figure 2.16 prove that indeed the pessimistic logging contributes very moderate (often less than 10%) overhead to the performance of the system as observed by the client. For messages of up to 100KB, the end-to-end latency with and without pessimistic logging falls within 10ms. For small messages, the end-to-end latency can go down as low as about 100μs. In all circumstances, the end-to-end latency is significantly larger than the logging latency. For the message size of 100KB, the oneway transfer latency over the network is estimated to be around 2600μs (half of the end-to-end latency without logging). This implies that the network manages to offer slightly under 40MB per second transfer rate.
2.3.2 Sender-Based Message Logging
For distributed applications that do not wish to log messages synchronously in stable storage, the sender-based message logging protocol [13] can be used to achieve limited degree of robustness against process failures. The basic idea of the sender-based message logging protocol is to log the message at the sending side in volatile memory. Should the receiving process fail, it could obtain the messages logged at the sending processes for recovery. To avoid restarting from the initial state after a failure, a process can periodically checkpoint its local state and write the message log in stable storage (as part of the checkpoint) asynchronously.
Unlike the receiver-based message logging protocol introduced in section 2.3.1, where the relative ordering of the messages received can be implicitly logged, such ordering information (i.e., the determinant for the messages) must be explicitly supplied by the receiver of a message to the sender. Furthermore, after sending the ordering information, the receiver needs to wait for an explicit acknowledgment for the ordering message. Prior to receiving of the acknowledgment, the receiver must not send any message to other processes (however, it can execute the message received immediately without delay, similar to the optimization for pessimistic logging discussed in section 2.3.1.2. This restriction is put in place to prevent the formation of orphan messages and orphan processes [7], which would force the orphan processes to roll back their state during the recovery of another process.
An orphan message is one that was sent by a process prior to a failure, but cannot be guaranteed to be regenerated upon the recovery of the process [7]. An orphan process is a process that receives an orphan message. If a process sends out a message and subsequently fails before the determinants of the messages it has received are properly logged, the message sent becomes an orphan message.
2.3.2.1 Data Structures
In the sender-based message logging protocol, each process must maintain the following data structures:
◾ A counter, seq_counter, used to assign a sequence number (using the current value of the counter) to each outgoing (application) message. The counter is initialized to 0 and incremented by one for each message sent. The sequence number is needed for duplicate detection (at the receiving process).
◾ A table used to carry out duplicate detection on incoming messages. The table consists of a collection of entries, one for each process with which the current one communicates. Each entry has the form <process_id,max_seq>, where max_seq is the maximum sequence number that the current process has received from a process with an identifier of process_id. A message is deemed as a duplicate if it carries a sequence number lower or equal to max_seq for the corresponding process.
◾ Another counter, rsn_counter, used to record the receiving/execution order of an incoming message. The counter is initialized to 0 and incremented by one for each message received. The receiving order of a message is represented by the current value of the counter and it is sent back to the sending process of the message for logging.
◾ A message log (in volatile memory) for messages sent by the process. In addition to the message sent, the following meta data is also recorded for each message:- Destination process id, receiver_id;- Sending sequence number, seq;- Receiving sequence number, rsn.The destination process id, the sending sequence number, and the message will be logged prior to the sending of the message. However, the receiving order number will be logged after the process receives such information later.
◾ A history list for the messages received since the last checkpoint. Each entry in the list has the following information regarding each message received:- Sending process id, sender_id;- Sending sequence number, seq;- Receiving sequence number, rsn (assigned by the current process).The history list is used to find the receiving order number for a duplicate message received. Upon receiving a duplicate message, the process should supply the corresponding (original) receiving order number so that the sender of the message can log such ordering information properly.
All the data structures described above except the history list must be checkpointed together with the process state. The two counters, one for assigning the message sequence number and the other for assigning the message receiving order, are needed so that the process can continue doing so upon recovery using the checkpoint. The table for duplicate detection is needed for a similar reason. However, the saving of the message log as part of the checkpoint might appear to be counter-intuitive because a major benefit of doing checkpointing is to truncate the message log (i.e., garbage collect logged messages) for (receiver-based) pessimistic logging as described in section