Just Say NO to Paxos Overhead: Replacing Consensus with Network Ordering
Abstract
Distributed applications use replication, implemented by protocols like Paxos, to ensure data availability and transparently mask server failures. This paper presents a new approach to achieving replication in the data center without the performance cost of traditional methods. Our work carefully divides replication responsibility between the network and protocol layers. The network orders requests but does not ensure reliable delivery – using a new primitive we call ordered unreliable multicast (OUM). Implementing this primitive can be achieved with near-zero-cost in the data center. Our new replication protocol, Network- Ordered Paxos (NOPaxos), exploits network ordering to provide strongly consistent replication without coordination. The resulting system not only outperforms both latency- and throughput-optimized protocols on their respective metrics, but also yields throughput within 2% and latency within 16 μs of an unreplicated system – providing replication without the performance cost.