From 7153d6d0cb2b8272fc9b4c4141a7b298120c8cd4 Mon Sep 17 00:00:00 2001 From: Sergey Fedorov Date: Mon, 9 Sep 2024 14:55:18 +0200 Subject: [PATCH] Fix Disel framework name in post on distributed protocol frameworks There was a misspelling of the Disel framework name through the text of the blog post on frameworks for implementing distributed protocols: it was wrongly spelled "Diesel". Correct that mistake. Signed-off-by: Sergey Fedorov --- .../index.mdx | 32 +++++++++---------- 1 file changed, 16 insertions(+), 16 deletions(-) diff --git a/blog/2024-08-27-on-frameworks-for-implementing-distributed-protocols/index.mdx b/blog/2024-08-27-on-frameworks-for-implementing-distributed-protocols/index.mdx index b95ab6e..e01b0c9 100644 --- a/blog/2024-08-27-on-frameworks-for-implementing-distributed-protocols/index.mdx +++ b/blog/2024-08-27-on-frameworks-for-implementing-distributed-protocols/index.mdx @@ -32,7 +32,7 @@ Here is the full list of 7 frameworks, based on different programming languages, - [Babel](https://github.com/pfouto/babel-core) — a generic framework for implementing and executing distributed protocols (based on Java); - [DiStaL](https://distal.github.io/) — a framework for implementing and executing distributed protocols (based on Scala); - [PSync](https://github.com/dzufferey/psync) — a framework for implementing and verifying fault-tolerant distributed protocols (based on Scala); - - [Diesel](https://github.com/DistributedComponents/disel) — a framework for implementation and compositional machine-assisted verification of distributed systems and their clients (based on Coq); + - [Disel](https://github.com/DistributedComponents/disel) — a framework for implementation and compositional machine-assisted verification of distributed systems and their clients (based on Coq); - [Verdi](https://github.com/uwplse/verdi) — a framework for implementing and formally verifying distributed systems (based on Coq). [^other-frameworks]: If you know of some other framework for implementing distributed protocols that I should have looked into, please let me know. @@ -52,7 +52,7 @@ The way that systems and their components are modeled has a profound effect on t The common approach is to model distributed systems at high level as [state transition systems](https://en.wikipedia.org/wiki/Transition_system) where the global system state changes upon external and internal events, such as client requests, messages exchanged within the system, and timeouts. In this model, the global system state includes the states of all individual nodes and their components, as well as the environment state. The environment state usually includes the state of the network the nodes communicate through, in particular the messages in transit. Transitioning from one state to another happens according to a global _transition function_ triggered by events. The transition function receives the current state of the system together with the triggering event and returns the new system state. -The explored frameworks follow the [message-passing](https://en.wikipedia.org/wiki/Message_passing) approach, where the states of individual nodes and protocol components are disjoint, i.e. different parts do not share pieces of state. Interaction between nodes happens through the network by sending and receiving messages. Sending and receiving of messages are modeled as events modifying the global system state by updating the set of messages in transit in the network state and, in case of message receiving, the state of the target component in the destination node. For example, in Diesel, the system state includes a "message soup" for each protocol, which models the current state and the history of the network. In Diesel's abstract model, messages are never removed from the message soup, instead they are marked either as active or consumed. +The explored frameworks follow the [message-passing](https://en.wikipedia.org/wiki/Message_passing) approach, where the states of individual nodes and protocol components are disjoint, i.e. different parts do not share pieces of state. Interaction between nodes happens through the network by sending and receiving messages. Sending and receiving of messages are modeled as events modifying the global system state by updating the set of messages in transit in the network state and, in case of message receiving, the state of the target component in the destination node. For example, in Disel, the system state includes a "message soup" for each protocol, which models the current state and the history of the network. In Disel's abstract model, messages are never removed from the message soup, instead they are marked either as active or consumed. Nodes and network failures are modeled as special events, e.g. dropping or duplicating messages in the network, disabling normal event handling in faulty nodes or (partially) resetting their state. Some of the frameworks only consider crash faults, leaving [Byzantine faults](https://en.wikipedia.org/wiki/Byzantine_fault) for the future work. The system models in most of the frameworks do not seem to include timing assumption, so they can be considered asynchronous. In contrast, PSync employs the Heard-Of model based on communication-closed rounds, which provides an illusion of simple synchronous communication on top of the partial synchrony of the actual underlying network. In this model, protocol execution proceeds in explicit rounds, alternating communication with protocol state transition based on the set of messages received during the round. Network and node faults in this model are unified, and the network assumptions are specified in terms of the heard-of sets. @@ -60,7 +60,7 @@ Communication between nodes is predominantly modeled as fire-and-forget message Individual components within nodes are commonly modeled as sequential, event-driven state machines interacting with each other via reliable message passing. The message passing normally follows the one-to-one one-way or request-response patterns; however, some of the frameworks also support one-to-many notifications between components. Some frameworks (e.g., Verdi) explicitly distinguish between the events that are external to the distributed protocol, like client requests, and internal events, like exchanging messages between protocol components and nodes within the distributed system. -To overcome the limitations of strict state separation between protocol components in the abstract model, Diesel allows coupling protocols via inter-protocol behavioral dependencies, called send-hooks, which allow restricted logical access to other protocol's state. Diesel doesn't seem to strictly follow the message-passing model for protocol components within the same node, supporting generic composition of protocol components. Diesel also provides mechanisms to establish stronger properties of protocols and their combinations by strengthening them with additional inductive invariants. +To overcome the limitations of strict state separation between protocol components in the abstract model, Disel allows coupling protocols via inter-protocol behavioral dependencies, called send-hooks, which allow restricted logical access to other protocol's state. Disel doesn't seem to strictly follow the message-passing model for protocol components within the same node, supporting generic composition of protocol components. Disel also provides mechanisms to establish stronger properties of protocols and their combinations by strengthening them with additional inductive invariants. As we can see, although there are some interesting variations and extensions, the underlying system model in the explored frameworks is largely the same, namely the one of a state transition system composed of components, which are sequential, event-driven state machines with disjoint state, interacting via message passing. This is remarkably similar to how distributed protocols are usually implemented in real-world code bases, which do not use any framework, as I described in [the previous post](https://replica-io.dev/blog/2024/03/04/on-implementation-of-distributed-prtocols). This approach tends to shift the focus more to the operational rather than functional and logical aspects of the system. There I also pointed out, when discussing how protocol implementations attempt to [evade concurrency](https://replica-io.dev/blog/2024/03/04/on-implementation-of-distributed-prtocols#evading-concurrency), that this approach may complicate the implementation and cause fragmentation of the protocol logic. Perhaps, adopting the same kind of the underlying model is one of the reasons why the explored frameworks do not seem to have made a real breakthrough in designing and implementing distributed protocols. @@ -68,21 +68,21 @@ As we can see, although there are some interesting variations and extensions, th In all of the explored frameworks, protocol implementations rely on some kind of runtime or shim provided by the framework, hiding low-level concerns from implementation of the protocol logic. In most of the frameworks, the runtime is responsible for coordinating the execution of protocol components and interaction between them; some of the frameworks also provide there dedicated interfaces for communication over the network and setting timers. Protocol components normally need to be registered within the runtime before they can function within the system. -In most cases, protocol components within the same node can interact by simply sending some kind of internal messages to each other, without establishing any explicit connection. In contrast, components in Atlas require explicit orchestration of the interaction with the rest of the system. Diesel instead supports composition of components expressed as effectful functional programs; it also allows loosely coupling protocol components via inter-protocol behavioral dependencies, called send-hooks, at the formal specification level. +In most cases, protocol components within the same node can interact by simply sending some kind of internal messages to each other, without establishing any explicit connection. In contrast, components in Atlas require explicit orchestration of the interaction with the rest of the system. Disel instead supports composition of components expressed as effectful functional programs; it also allows loosely coupling protocol components via inter-protocol behavioral dependencies, called send-hooks, at the formal specification level. Since the runtime in Mir is only responsible for coordinating the execution of protocol components and interaction between them, there are special components that provide functionality for sending and receiving messages over the network, as well as for setting local timers and for cryptographic operations. Mir explicitly distinguishes between passive components, which can only produce events synchronously, as a result of processing input events provided by the runtime, and active components, which can asynchronously produce new events on their own. -There are frameworks that provide means for enhancing protocol components with additional properties. Diesel provides a protocol combinator `ProtocolWithIndInv`, which allows elaborating a protocol by strengthening it with additional inductive invariants. Verdi provides verified system transformers, which allow transforming protocols specified, implemented, and verified in an idealistic, reliable network semantics into an equivalent implementation, preserving the transformed system properties under a weaker fault model. PSync provides a class, which can be used to wrap protocol round instances to support updating progress conditions and synchronizing rounds in a Byzantine setting. +There are frameworks that provide means for enhancing protocol components with additional properties. Disel provides a protocol combinator `ProtocolWithIndInv`, which allows elaborating a protocol by strengthening it with additional inductive invariants. Verdi provides verified system transformers, which allow transforming protocols specified, implemented, and verified in an idealistic, reliable network semantics into an equivalent implementation, preserving the transformed system properties under a weaker fault model. PSync provides a class, which can be used to wrap protocol round instances to support updating progress conditions and synchronizing rounds in a Byzantine setting. There are two main approaches to structure interaction of distributed protocols with the rest of the application. Some frameworks provide a mechanism for interacting with protocol components by sending and receiving messages, same as protocol components interact with each other. Other frameworks allow defining dedicated interfaces for that purpose, featuring callable methods or special IO events. In some of the frameworks, protocol components are supplied with some kind of handles to trigger side effects, such as sending messages or setting up timers; in other frameworks, side effects only happen after returning the control back to the runtime, e.g. through the return value or using a monadic structure. -Protocol components commonly consist of the component's state and protocol logic structured as handlers modifying the state or state transition functions, which are triggered by the runtime upon certain events or conditions. In Distal and Diesel, the handlers/transitions can be augmented with guarding conditions that must hold in order to trigger the action. +Protocol components commonly consist of the component's state and protocol logic structured as handlers modifying the state or state transition functions, which are triggered by the runtime upon certain events or conditions. In Distal and Disel, the handlers/transitions can be augmented with guarding conditions that must hold in order to trigger the action. The round-based model in PSync imposes a particular structure of protocol components. Protocol components in PSync must specify a sequence of protocol rounds. Each round, in general, consists of methods to: initialize the round, send messages at the beginning of the round, process received messages, and finally update the internal state before transitioning into the next round. -PSync and Diesel explicitly separate the protocol specification from its implementation, whereby the specification is used to formally verify the implementation. Protocol specifications in PSync consist of protocol properties, as well as safety and liveness predicates (assumptions). In order to aid automated verification in PSync, the specification should also include round and/or phase invariants; round transition relations are automatically derived from the code, although this imposes certain limitations on the code. In Diesel, high-level abstract protocol specifications are defined in terms of state-space coherence predicates and send/receive transitions. +PSync and Disel explicitly separate the protocol specification from its implementation, whereby the specification is used to formally verify the implementation. Protocol specifications in PSync consist of protocol properties, as well as safety and liveness predicates (assumptions). In order to aid automated verification in PSync, the specification should also include round and/or phase invariants; round transition relations are automatically derived from the code, although this imposes certain limitations on the code. In Disel, high-level abstract protocol specifications are defined in terms of state-space coherence predicates and send/receive transitions. -The configuration of protocol components within nodes and their internal structure can be more static or dynamic. For example, in Babel, protocol components and various components' callbacks are registered within the runtime dynamically. The configuration of top-level components in Mir is rather static, for it cannot be changed after initialization, but there is a special component, called factory module, that supports creating sub-components dynamically. Such flexibility at runtime can make formal verification particularly hard, so the frameworks focused on formal verification (PSync, Diesel, and Verdi) tend to be rather static with respect to the configuration and internal structure of protocol components. +The configuration of protocol components within nodes and their internal structure can be more static or dynamic. For example, in Babel, protocol components and various components' callbacks are registered within the runtime dynamically. The configuration of top-level components in Mir is rather static, for it cannot be changed after initialization, but there is a special component, called factory module, that supports creating sub-components dynamically. Such flexibility at runtime can make formal verification particularly hard, so the frameworks focused on formal verification (PSync, Disel, and Verdi) tend to be rather static with respect to the configuration and internal structure of protocol components. Overall, the abstract model adopted by a framework, e.g. the model of a generic state transition system with message-passing or the Heard-Of model based on communication-closed rounds, largely determines how protocol specifications and implementations are structured within the framework. The framework's features like formal verification can add further restrictions, e.g. restricting runtime flexibility of the configuration and internal structure of protocol components. @@ -94,11 +94,11 @@ Using a regular general-purpose programming language as the basis allows tapping In most cases, the notational extensions introduced by the frameworks serve to make the code more declarative and concise. One target of such enhancements is providing a convenient and clear way of expressing the typical event-oriented pseudo-code notation with `upon` statements. For example, the protocol logic in Distal is implemented by defining rules and the corresponding actions, whereby the rules are expressed in a declarative style resembling typical pseudo-code found in the literature and specify event predicates, such as the message type and matching conditions, as well as the means to specify composite events, which are triggered by collections of messages. -Another area of applying notational enhancements is for expressing certain actions performed by the protocol logic and overcoming the limitations of the host language. For example, Distal provides a special notation for sending messages, discarding received messages, as well as for scheduling actions to be executed in future. Diesel extends Coq's specification language Gallina with effectful commands (actions), such as sending and receiving messages, reading from local state, monadic sequential composition, and general recursion. For message sending and receiving actions, Diesel provides transition wrappers, which lift low-level operations on the networking layer to the level of well-typed program primitives corresponding to the protocol specification. Similarly, Verdi provides a monad for expressing the actions of sending messages, emitting output event, and manipulating the current state, as well as convenience notation for various monadic bindings. +Another area of applying notational enhancements is for expressing certain actions performed by the protocol logic and overcoming the limitations of the host language. For example, Distal provides a special notation for sending messages, discarding received messages, as well as for scheduling actions to be executed in future. Disel extends Coq's specification language Gallina with effectful commands (actions), such as sending and receiving messages, reading from local state, monadic sequential composition, and general recursion. For message sending and receiving actions, Disel provides transition wrappers, which lift low-level operations on the networking layer to the level of well-typed program primitives corresponding to the protocol specification. Similarly, Verdi provides a monad for expressing the actions of sending messages, emitting output event, and manipulating the current state, as well as convenience notation for various monadic bindings. -Finally, expressing protocol properties, assumptions, and invariants, as well as the related annotations in the protocol implementation can also benefit from notational enhancements. PSync, for instance, defines a DSL for expressing properties, predicates and invariants, in which one of the main primitives is the notion of domain, representing a set of values of certain type with universal and existential quantification defined for it, as well as set comprehension. Diesel features a special notation for representing the higher-order Hoare types of program fragments. +Finally, expressing protocol properties, assumptions, and invariants, as well as the related annotations in the protocol implementation can also benefit from notational enhancements. PSync, for instance, defines a DSL for expressing properties, predicates and invariants, in which one of the main primitives is the notion of domain, representing a set of values of certain type with universal and existential quantification defined for it, as well as set comprehension. Disel features a special notation for representing the higher-order Hoare types of program fragments. -Many elements of a DSL can be implemented using common programming language techniques like polymorphism, inheritance, composition, higher-order functions, and the type system features. This way, Distal implements most of its DSL as ordinary methods and convenience aliases. However, this approach has certain limitations, and such techniques as metaprogramming (macros) or code generation are often required. For instance, in Distal, the key element of the DSL is implemented as a macro; in PSync, automatic derivation of transition relations from the code is also implemented as a macro. Code generation in Mir reduces the amount of hand-written boilerplate code by processing Protobuf definitions annotated with special extensions. Diesel and Verdi take advantage of Coq's syntax extension features. +Many elements of a DSL can be implemented using common programming language techniques like polymorphism, inheritance, composition, higher-order functions, and the type system features. This way, Distal implements most of its DSL as ordinary methods and convenience aliases. However, this approach has certain limitations, and such techniques as metaprogramming (macros) or code generation are often required. For instance, in Distal, the key element of the DSL is implemented as a macro; in PSync, automatic derivation of transition relations from the code is also implemented as a macro. Code generation in Mir reduces the amount of hand-written boilerplate code by processing Protobuf definitions annotated with special extensions. Disel and Verdi take advantage of Coq's syntax extension features. Clear and convenient notation for defining protocol specifications and their implementation is crucial for ergonomics and expressiveness. Building upon ordinary code, clever use of common programming language techniques, such as polymorphism, inheritance, composition, higher-order functions, and the type system features, should be the preferred approach for achieving notational expressiveness. Such techniques as metaprogramming and code generation can greatly help overcoming the limitations of that approach or further improving the notation, but they can make the framework more complicated and, therefore, should be employed judiciously. @@ -118,23 +118,23 @@ So orchestrating the execution and coordinating interaction of protocol componen ## Verification -Not all of the explored frameworks are concerned with verifying correctness of protocols and their implementations. Verification is the primary area of focus for Diesel and Verdi, as well as PSync. Verification is not a major concern in Mir, though it provides some mechanisms that can be considered as lightweight verification of protocol implementation correctness. +Not all of the explored frameworks are concerned with verifying correctness of protocols and their implementations. Verification is the primary area of focus for Disel and Verdi, as well as PSync. Verification is not a major concern in Mir, though it provides some mechanisms that can be considered as lightweight verification of protocol implementation correctness. Mir includes support for recording, inspecting, modifying, and replaying traces of events being passed by the node engine between the components, which can be very helpful for debugging, but also to perform correctness analysis. Mir also comes with a simple discrete-event simulation runtime that can be used for randomized reproducible testing with simulated time. -Full-fledged formal verification of protocol specifications and implementation in Diesel and Verdi relies on machine-assisted theorem proving, whereas PSync attempts to automate the process. Theorem proving, even if machine-assisted, is a very difficult, time consuming task and requires special expertise. Automated verification, on the other hand, is in general undecidable and can only be achieved with certain restrictions to the system model and the protocol implementation. +Full-fledged formal verification of protocol specifications and implementation in Disel and Verdi relies on machine-assisted theorem proving, whereas PSync attempts to automate the process. Theorem proving, even if machine-assisted, is a very difficult, time consuming task and requires special expertise. Automated verification, on the other hand, is in general undecidable and can only be achieved with certain restrictions to the system model and the protocol implementation. -Formal verification requires formal specification of the assumptions and the required properties. In Diesel, protocol specifications are defined in terms of state-space coherence predicates and send/receive transitions. In Verdi, the correct behavior of the protocol is specified as a logical predicate over its possible traces of events. Formal specifications in PSync are expressed in terms of protocol properties, expressed in a fragment of a first-order temporal logic, as well as safety and liveness predicates (assumptions), expressed in terms of cardinalities of heard-of sets. +Formal verification requires formal specification of the assumptions and the required properties. In Disel, protocol specifications are defined in terms of state-space coherence predicates and send/receive transitions. In Verdi, the correct behavior of the protocol is specified as a logical predicate over its possible traces of events. Formal specifications in PSync are expressed in terms of protocol properties, expressed in a fragment of a first-order temporal logic, as well as safety and liveness predicates (assumptions), expressed in terms of cardinalities of heard-of sets. In these frameworks, the method of formally proving the correctness is typically by induction, constructing inductive state invariants. Coming up with appropriate inductive invariants constitutes the greatest difficulty in the verification effort and requires special skills. In PSync, the simple round-based structure with lock-step semantics makes protocol implementation amenable to automated verification that can check safety and liveness properties. The verification, though, requires inductive invariants at the boundaries between rounds. However, the automated verification problem is decidable with certain constraints. -The correctness of protocol implementation in Verdi is proved directly, whereas Diesel employs a different approach. For implementing protocol specification, Diesel provides a DSL, embedded shallowly into Coq, that extends Coq's specification language Gallina. Diesel programs and their fragments are assigned types corresponding to the protocol specification in higher-order separation-style Hoare logic. For message sending and receiving actions, Diesel provides transition wrappers, which lift low-level operations on the networking layer to the level of well-typed program primitives corresponding to the protocol specification. Well-typed programs are guaranteed to be _correct by construction_ w.r.t. the protocol specifications. +The correctness of protocol implementation in Verdi is proved directly, whereas Disel employs a different approach. For implementing protocol specification, Disel provides a DSL, embedded shallowly into Coq, that extends Coq's specification language Gallina. Disel programs and their fragments are assigned types corresponding to the protocol specification in higher-order separation-style Hoare logic. For message sending and receiving actions, Disel provides transition wrappers, which lift low-level operations on the networking layer to the level of well-typed program primitives corresponding to the protocol specification. Well-typed programs are guaranteed to be _correct by construction_ w.r.t. the protocol specifications. In Verdi, once the protocol is specified, implemented, and verified in an idealistic reliable network semantics, it can be translated using _verified system transformers_ into an equivalent implementation, preserving the transformed system properties under a weaker fault model. -in Diesel, protocol specifications can be generic and parameterized. The generic protocol specifications with their proven invariants can be used in composition. Diesel provides mechanisms to establish stronger properties of generic protocols and their combinations by elaborating the protocol, i.e. strengthening it with additional inductive invariants. +in Disel, protocol specifications can be generic and parameterized. The generic protocol specifications with their proven invariants can be used in composition. Disel provides mechanisms to establish stronger properties of generic protocols and their combinations by elaborating the protocol, i.e. strengthening it with additional inductive invariants. Correctness is very important for the critical fault-tolerant protocols; therefore, the verification aspect deserves special attention. There are relatively easy-to-apply lightweight methods, such as randomized reproducible testing with discrete-event simulation, and there are sophisticated formal methods typically requiring special skills and expertise, such as machine-assisted theorem proving. However, some particularly structured system models may enable automated reasoning and make formal verification more practical, though at the cost of additional limitations.