-
Notifications
You must be signed in to change notification settings - Fork 203
Tasks
The Elrond Network is a peer-to-peer network of machines called Nodes. The Nodes perform all the tasks required by the Elrond protocol:
- they accept transactions from users in the outside world (see Receiving Transactions)
- they execute the transactions and put them into blocks (see Building a Block)
- they execute custom code (SmartContracts) deployed to the blockchain and called by transactions (see Executing SmartContracts)
- they run a consensus protocol to agree on what block needs to be permanently added to the blockchain (see Participating in Consensus)
- they must remain in permanent agreement with each other regarding the latest state of the blockchain (see Communicating with the Metachain)
- they produce blocks in lock-step with the entire network (see Timekeeping and Producing Blocks)
- they help each other by passing required information from one another (see Requesting information from the Network)
- they communicate constantly with each other and relay messages from one another, ensuring reliable and efficient peer-to-peer communication (see Propagation of information)
- they manage an impredictible, yet deterministic randomness source (see Generating randomness)
- they constantly supervise each other, to detect any accidental faults in processing or malicious activity as early as possible (see Ensuring tolerance to faults and Detecting malicious activity
- other tasks?
Because of the complicated tasks listed above must be performed reliably, in a reproducible manner and as efficient as possible, the Node must be a complex piece of software.
The main purpose of the Nodes in the Elrond Network is to execute the Transactions requested by its users. When a user wants to request the execution of a Transaction, they submit it to any Node's REST API, as doing this causes the Node to propagate the Transaction throughout the rest of the Network, which will lead to its execution and its eventual inclusion in the Blockchain. This means that the REST API of the Nodes collectively represents the "entry point" through which the Network receives information and requests from the outside world.
Note that people who run Nodes on their own machines may choose to disable the REST API, allowing their Nodes to focus strictly on producing blocks for the Network. On the other hand, there are Nodes in the Network that are dedicated "entry points" and never produce Blocks (but contribute meaningfully in other ways). See below for details.
While any user can submit a desired Transaction to any Node directly through its REST API, there is a convenient method of submitting Transactions to the Network at large: Elrond provides a Web application which, among other features, enables its users to easily submit Transactions to the Network without having to worry about technicalities such as composing a valid Transaction, signing it and selecting an appropriate Node to submit it to. This Web application is the Elrond Wallet, currently submitting Transactions to the Testnet and is accessible at testnet.elrond.com/#/wallet.
To create a Transaction, the Wallet takes user input, such as the destination Account and the sum of ERD to transfer, then uses this input to perform the operations described in the section Creating a Transaction. The resulting Transaction is passed to the Proxy, which is an intermediary application that handles incoming Transaction requests from the Wallet and submits them to certain dedicated Nodes in the Elrond Network.
While any Node in the Network can accept Transaction requests, the Elrond Wallet submits Transactions to the Proxy application, which maintains a list of Nodes to forward Transaction requests to - these Nodes are selected in such manner that any Transaction submitted to them will be processed by the Network as soon and as efficiently as possible.
The Proxy will submit a Transaction on behalf of the user to the REST API of one of its listed Nodes, selected for (1) being online at the moment and (2) being located within the Shard to which the Sender's Account belongs (see Executing Transactions for the reason of this second criterion). After receiving the Transaction on its REST API, that specific Node will propagate the Transaction throughout the Network, which will lead to its execution.
Note that the Nodes enlisted by the Proxy for Transaction submission are not just some random Nodes - they are specific Nodes which do not ever participate in Consensus, also known as Observer Nodes (as opposed to the normal Nodes, called Validator Nodes). Observer Nodes thus act as a default dedicated "entry point" into the Network. Moreover, Observer Nodes play an important role in the health and stability of the Network, because they act as fast propagators of information and have a lot of storage space. This means that the performance requirements of running a Node can be lowered for normal Nodes, and the machines that run normal Nodes are not as stressed as they would be without Observers.
It is worth repeating here, though, that submitting a Transaction through the Wallet, and implicitly to the Observer Nodes, is completely optional - any Node of the Network will accept Transactions to propagate, given it has not disabled its REST API.
In its very simplest form, a Transaction is the atomic transfer of ERD between two specific Accounts. After a Transaction is executed, commited to the Blockchain and declared final, the Transaction takes full effect and cannot ever be reverted anymore. More complex Transactions also require executing SmartContracts and storing the results, but the principle remains the same for all Transactions:
- Transactions must arrive at Nodes through Gossip
- Transactions involve two Accounts (Sender and Destination)
- Transactions are executed atomically (a Transaction has no effect until it is executed completely)
- Transactions are incremental changes of the global State
- Transactions are stored in bulk in the form of Blocks
- Transactions must be first declared final before taking full effect
- Transactions cannot be reverted once declared final
Nodes receive Transactions from the outside world on their REST API, which they propagate throughout the Network for execution. The execution of a Transaction involves a set of verifications against any faults it may contain, then its inclusion in a Block that was proposed and validated by the Nodes participating in a Consensus process.
Executing any kind of Transaction requires a non-negligible computational effort to be expended by the Nodes. This effort involves the consumption of time, computational power and electrical energy - real-world resources which cost real-world money. To compensate people who run Nodes on their machines, the Elrond Protocol dictates that Nodes will receive compensation for their work, paid in ERD.
But the exact amount of real-world resources consumed by a Node to execute a Transaction is difficult to quantify. A solution to this problem, popular across many Blockchain platforms, is to use an abstract unit called Gas for quantifying computational effort. Gas is then used to calculate the amount of ERD received by Nodes as compensation for their work. Measuring computational effort in Gas instead of CPU cycles, Joules or the market price of hardware is a simplification, of course, but it is very useful and flexible. The exact amounts of Gas consumed by Nodes for the tasks it performs is described in the section Gas consumed by operations.
In principle, executing a Transaction would be a straightforward process, regardless of its type, but in practice, it is complicated by the fact that the Elrond Network is sharded (i.e. fragmented) in order to achieve higher efficiency. More specifically, Sharding affects the execution of Transactions due to State Sharding, which separates and distributes the information required to execute Transactions across a carefully controlled number of Shards. As a direct consequence, information about the Accounts themselves is divided throughout the Network. And to execute Transactions, a Node requires information about both the Sender Account and Destination Account. But because a Node is assigned to a single Shard at a given moment in time, it might find itself in the situation where it must execute a Cross-Shard Transaction, i.e. a Transaction where the Sender Account and Destination Account are located in different Shards. In contrast, Transactions where the Sender Account and Destination Account are located in the same Shard are called Intra-Shard Transactions.
It is important to remember that the execution of any Transaction is always initiated by Nodes belonging to the Shard of the Sender Account, regardless of the type of Transaction. This approach enables a relatively simple solution to the challenge posed by Cross-Shard Transactions, because the Nodes that initiate their execution already have half the required information: they have the Sender Account information by virtue of belonging to the Shard of the Sender Account.
To keep the Blocks tidy and organized, Transactions they contain are grouped in Miniblocks, which are lists of Transactions of the same type. For example, there are Miniblocks containing value-transferring Cross-Shard Transactions where the current Shard is the Source Shard, and there are Miniblocks containing SmartContract results from Cross-Shard Transactions where current Shard is the Destination Shard. See Types of Miniblocks within a Block for more details.
As one would expect, Intra-Shard Transactions are easier to execute and will finalize earlier than Cross-Shard Transactions, which require extra steps. The Elrond Network aims to organize the Account Space in a way that minimizes the number of Cross-Shard Transactions and maximizes the number of Intra-Shard Transactions. See the subsections below for details on how both types of Transactions are executed.
An Intra-Shard Transaction is a Transaction between two Accounts assigned to the same Shard. This is the simpler kind of Transaction to execute, as opposed to executing Cross-Shard Transactions, which require extra steps.
As emphasized in section Executing Transactions, the execution of a Transaction is always initiated by Nodes that belong to the Shard of the Sender Account, regardless of the type of Transaction. Since an Intra-Shard Transaction involves Accounts assigned to the same Shard (as its name implies), the Nodes executing it already have all the information required.
The steps of execution are as follows:
- Certain Nodes of the Shard begin the Consensus process.
- The Leader Node of the Consensus group proposes a Block which includes the Intra-Shard Transaction. This results in the sum of ERD being deducted from the Sender Account and then added to the Destination Account.
- The rest of the Nodes in the Consensus group validate the proposed Block and validate the result available in the Block Header (e.g. root hashes).
- If all Nodes in the Consensus group agree on the same resulting Block, then the Block is notarized by the Metachain and the Consensus process ends.
- Later, after some more Blocks have been notarized by the Metachain, the Block that contained the executed Intra-Shard Transaction is declared final, which means that all the Transactions it contains are now final as well, including the Intra-Shard Transaction discussed here.
Note that Intra-Shard Transactions are stored in Miniblocks within a Block, separate from Cross-Shard Transactions (which are also stored in their own Miniblocks). See the linked sections for more details.
A Cross-Shard Transaction is a Transaction between two Accounts assigned to different Shard. This is the more complex kind of Transaction to execute, as opposed to executing Intra-Shard Transactions, which are simpler and require fewer steps.
As emphasized in section Executing Transactions, the execution of a Transaction is always initiated by Nodes that belong to the Shard of the Sender Account, regardless of the type of Transaction. This means that a Node executing a Cross-Shard Transaction has only half the information it requires to completely execute the Transaction, namely the Sender Account. To execute the Transaction in full, the Node would hypothetically require the Destination Account to be retrieved from the Shard it is assigned to. But this would go against the main advantage of Sharding, namely to keep information separated and process it separately in order to gain performance. To avoid this contradiction, Elrond Nodes employ an alternative method of executing Cross-Shard Transactions: they execute the Cross-Shard Transaction in half inside the Shard of the Sender Account, then send a partial result to the Shard of the Destination Account for completion. The steps are as follows:
- Certain Nodes of the Shard begin the Consensus process.
- The Leader Node of the Consensus group proposes a Block which includes the half-executed Cross-Shard Transaction. This results in the sum of ERD being deducted from the Sender Account, but the sum is not added to the Destination Account, because the information on the Destination Account is stored by Nodes of a different Shard.
- The rest of the Nodes in the Consensus group execute the proposed Block and validate the result available in the Block Header (e.g. root hashes).
- If all Nodes in the Consensus group agree on the same resulting Block, then the Block is notarized by the Metachain and the Consensus process ends.
- Nodes from the Shard of the Destination Account receive, through Gossip, the notarized Block from the Metachain
- The Nodes from the Shard of the Destination Account notice that the Block contains a half-executed Transaction, of which their Shard is the Destination.
- These Nodes then execute the second half of the Transaction, namely adding the transferred sum of ERD to the Destination Account during their own Consensus process and add the fully executed Transaction to the Block they produce.
- If all Nodes in the Consensus group of Shard of the Destination Account agree on the same resulting Block, then the block is notarized by the Metachain, ending their Consensus.
- Later, after some more Blocks have been notarized by the Metachain, the Block that contained the fully executed Cross-Shard Transaction is declared final, which means that all the Transactions it contains are now final as well, including the Cross-Shard Transaction discussed here.
Note that Cross-Shard Transactions are stored in Miniblocks within a Block, separate from Intra-Shard Transactions (which are also stored in their own Miniblocks). See the linked sections for more details.
- ☐ Describe how a Transaction with a SmartContract call looks like
- ☐ Describe how a Transaction with a SmartContract deployment looks like
- ☐ Connect this section with the Processor components
- ☐ The VM needs access to info such as Accounts, Blocks, Transactions etc because SCs might need such info
- ☐ Describe here what any SC VM must be able to do: take specialized input, provide specialized output, meter the execution in Gas units, provide "logs"
A Transaction is usually a value-transferring process, with a relatively simple execution, but there is a special kind of Transactions which, on top of transferring value, also request the execution of custom code. Such custom code must have already been published to the Blockchain in the form of a SmartContract (TODO link) before anyone could request its execution. Publishing a SmartContract to the Blockchain is called deploying the SmartContract (performed as a special Transaction), while requesting the execution of code from within a SmartContract is referred to as calling the SmartContract. Both the deployment of a SmartContract or a call to one will require a component dedicated to handling SmartContracts: the Virtual Machine.
TODO
Nodes of the Elrond Network append new Blocks to the Blockchain with consistent regularity: put briefly, a new Block is produced and validated every Round, regardless of how many Transactions had to be executed. This means that even if there are no user-requested Transactions, a Block will still be produced, containing Reward Transactions. Ideally, there will be one Block added to the Blockchain in each Round, although Nodes are built to handle unexpected situations where no Block was produced in a Round or even situations where multiple conflicting Blocks were accidentally produced in the same Round due to severe connectivity issues (see Fork resolution).
The Consensus process lies at the heart of the Protocol, because almost everything the Nodes do is either in preparation for, or as a consequence of a Consensus process. And because a Consensus process results in the next Block to be added to the Blockchain, its security is of high importance.
At any given moment, there are as many Blockchains in the Elrond Network as there are Shards (Blockchains and Shards may merge and split, though - see Adaptive State Sharding. This grants the Network its efficiency and high throughput, but it also means that there is a separate Consensus process executed within each Shard per Round, in order to produce new Blocks for each of the Blockchains managed by the Shards. The Metachain is no exception, although its Consensus process is only slightly different from the Consensus within Shards.
While each Shard performs a Consensus in each Round, this does not involve all the Nodes of the Shards. Only a selected subset of the Nodes in a Shard will take part in Consensus, and it's always a different subset of Nodes, as described in the following section. The selected Nodes form what is called the Consensus Group.
During Consensus, the Leader of Consensus produces a Block containing executed Transactions and their results, and the rest of the Consensus Group will validate it. The members of the Group then send their signed approval messages back to the Leader, which assembles an aggregated signature of the entire Consensus Group and seals the Block.
An entire Consensus process lasts for a single Round, but a Round is further divided into Subrounds, corresponding to the tasks that the Nodes must complete in sequence and in finite amounts of time, in order to successfuly conclude the Consensus. Keep in mind that Consensus is a real-time process, and exceeding the time budgeted for any Subround results in its failure.
At the beginning of the Consensus process, all Nodes must calculate which ones are part of the Consensus Group for that Round in their Shard. Some will, at this moment, discover that they will participate in Consensus, while the rest will discover that they're not part of the Consensus Group. Because this process relies on the randomness generated by the Nodes, they cannot know beforehand what the Consensus Group will be - it can only be calculated after the previous Block has been generated (see the linked section for details).
In short, the Consensus Group in a Shard is a random subset of the Nodes of that Shard. The probability of any Node to be selected for Consensus is not uniform, however: its Rating is taken into account, and Nodes with a higher Rating factor have a higher probability of being chosen for Consensus.
The size of the Consensus Group (i.e. the number of Nodes participating in a Consensus process) is currently a configurable
parameter of the Network, set in the nodesSetup.json
file, although in the future it will be a parameter subjected to the
Network's Governance system or it may be computed dynamically as a function of Shard size.
Of the Nodes in the selected Consensus Group, one is designated the Leader of the Consensus,
Selecting Nodes for the Consensus process of a Round is a task fulfilled by the NodesCoordinator component.
TODO
TODO
TODO
TODO
- ☐ initial source: genesis
- ☐ leader produces new random seed from prev block + own signature
- ☐ how does a new genesis block for a new epoch affect the seed? is it calculated just the same?
- ☐ TxFee = GasPrice ⨯ ConsumedGas
- ☐ TxFeeLimit = GasPrice ⨯ GasLimit
- ☐ Gas = quantum of computational work performed for the network, compensated in ERD
- ☐ Gas is "spent" executing Transactions and also storing data
- ☐ Describe how Gas is metered during the execution of Transactions
TODO
- ☐ epochs and rounds
- ☐ NTP and staying in sync with reality
TODO
Each Node is constantly listening to its connected Peers for incoming information of all kind, packaged as atomic Messages. A Message may arrive to the Node because it was either:
- sent by a Peer directly to the Node (thus the Message has a sender Node and a destination Node)
- broadcasted by a Node to all its Peers
Most often, Messages are propagated being broadcast by a Node to its Peers. Nodes will automatically re-broadcast many of the received Messages to their Peers, to ensure the widest propagation of information. This repeated broadcasting is called Gossiping. Furthermore, to ensure efficient and orderly communication, Messages are assigned to predefined Topics (i.e. "categories" of Messages). The Node that constructs a new Message must assign a single Topic to it, based on the type of information it contains and what audience it is directed at. There are many Topics defined by the Network, but a single Node will only listen to a specific subset of them. See Topics of interest for a Node for details.
Whenever a Message arrives at the Node, it firstly assesses whether the Message is of interest or not. This is easy to verify: if the Topic of the Message is not of interest, then the Message is promptly discarded and forgotten. But if the Message is assigned to a Topic of interest to the Node, a complex process begins: the Message is passed to the specific Interceptor that handles the corresponding Topic, and it is now the responsibility of the Interceptor to deal with the Message somehow. There are multiple types of Interceptors - one for each type of information that a Message can contain.
All (?) Interceptors have one task: to put the information contained by the received Message into the Data Pool of the Node. The Data Pool thus acts as a reservoir of information for most tasks performed by the Node. After being exposed to the Network for a while, a Node will have its Data Pool teeming with Transactions, Block Headers and other types of information propagated through the Network, all arrived as Messages at some time in the past. Whenever the Node must perform some task, e.g. to produce a Block, it will use the information found in the Data Pool.
In case the Data Pool does not contain certain pieces of information required by the Node at a given moment, the Node must create Request Messages (?), which it sends to some of its Peers. If a Peer has the information requested by the Node, it packages that piece of information as a Message and sends it back directly to the Node. If the requested information is missing from the Peers that have been asked for it, they sit silently and do nothing (they don't propagate the original request further). The Node that needs information must try again with some other Peers. See Requesting information from Peers for details.
A special case where a different form of propagation of information takes place is the Consensus process. See the linked section for details.
The Topic-broadcasting functionality itself is available in the libp2p library, which the Node uses for all its peer-to-peer communication.
A Node will automatically re-broadcast to its Peers most Messages that it receives (e.g. containing Transactions or other information), in order to propagate information throughout the Network. This is called Gossiping. Note that Direct Messages are never re-broadcast, thus they are not subject to Gossip. All other Messages are considered "of general interest", thus are broadcast to as many Nodes as possible. However, not all Peers of the Node will care about each and every Message - they will simply ignore those Messages that arrive on a Topic that doesn't interest them. Because of this selective interest on Topics, it would be a waste of bandwidth and processing power to always broadcast every Message to all Peers. Consider the following example:
- Message
m1
arrives at NodeA
on the TopicTx0
. Let's say that this Message is both valid and interesting toA
. -
A
wants to propagatem1
to its Peers: they are NodesB
,C
,D
andE
. - But
A
knows thatC
andE
do not care about any Message that arrives onTx0
, thus they would ignorem1
if it were to be sent to them. -
A
knows that onlyB
andD
would acceptm1
. - Therefore
A
will only sendm1
toB
andD
, to save resources.
Thus, to make communication more efficient, Nodes will be mindful about the Topics of interest of their Peers and will actually perform selective broadcasting: a Node will exclude a Peer from the broadcasting of a Message if it knows that the Message will be uninteresting to the respective Peer, as described in the example above.
To summarize: any Message m
on Topic T
will be sent by Node X
to a certain Node Y
, if the following conditions are met:
- The Nodes
X
andY
must be Peers - The Message
m
is valid. Invalid Messages are ignored and dropped. -
X
must have declared interest in TopicT
(see Topics of interest). Otherwise, the Message is ignored and dropped. -
Y
must also have declared interest in TopicT
-
X
must be aware thatY
has declared its interest in TopicT
TODO
TODO
TODO
- ☐ Requests: when and how are requests generated?
- ☐ Requests: generated (?) when a Processor needs info, queries the Data Pool, doesn't find that info, then generates a request and broadcasts it. But which component exactly generates the Request? Is it the Data Pool? And what does that specific Processor do when it finds out that the information is missing and must wait? If it enters Consensus and it suddenly realizes some information is missing, will it simply wait during the Consensus?
- ☐ Resolvers: they send a reply back directly to the requesting Peer (the PeerID is in the Message containing the initial request)
- ☐ retrieving the last notarized block header: when and why
- ☐ retrieving miniblock hashes: when and why
- ☐ sending information for notarization after Consensus
The Node is designed to behave properly in adverse situations, such as malicious Peers, missing information, weak connectivity and even forks in the Blockchain or a commandeered Consensus group. Most of these situations are naturally handled in the Protocol itself, mostly due to the resilience-first design of the Metachain and how the Nodes in Shards use the information stored in it.
A critical aspect of the stability of the Network is to avoid adding incorrect or maliciously crafted Blocks to the Blockchain. Correct Blocks are produced by a Consensus group that has majority of at least two-thirds of the Nodes acting honestly. But a malicious actor may try to determine the Nodes of a Shard to accept a crafted Block by broadcasting its Block Header throughout the Shard. A lot of effort went into building mechanisms that protect the Node against such Blocks.
These sections will first describe how Blockchain correctness is implemented for Shard Blockchains. Later, the Metachain will be discussed as well.
Architecture-wise, the features described here are implemented in the Block Header Interceptor (package process/interceptors
)
and in the Sync mechanism (package process/sync
). The interaction between these two components happens as follows:
- The Node receives Block Headers from the Network, either through Gossip or by request
- The newly received Block Header is preliminarily validated by the Block Header Interceptor for integrity and authenticity
- After the preliminary validation done by the Interceptor, the Block Header is saved into a caching component (Cacher) owned
by the Interceptor (see the
headers
Cacher of theHdrInterceptorProcessor
struct inprocess/interceptors/processor/hdrInterceptorProcessor.go
) - The Cacher will notify the components listening to its storage that a Block Header has been saved into it; one component
listening on the Cacher owned by the
HdrInterceptorProcessor
is theShardBootstrap
, the entry point into the Sync mechanism (process/sync/shardblock.go
) - The
ShardBootstrap
will react to any Block Header added to the aforementioned Cacher by adding the newly received Block Header into itsforkDetector
for analysis (see functionprocessReceivedHeader()
inprocess/sync/baseSync.go
) - The
forkDetector
must determine whether each received Block Header fits onto the Blockchain (according to the information held by the Node) and whether a choice must be made between conflicting Blocks - Based on information from the
forkDetector
, theShardBootstrap
must choose whether to accept or reject the Block Header (and which of the potentially conflicting Block Headers should be accepted)
The first line of defense against accepting an invalid or malicious Block are the Interceptors, specifically the Block Header Interceptor, which verifies the integrity and authenticity of a received Block Header. While integrity is easy to verify, the authenticity of the Block Header requires knowledge of the Consensus group that produced it. Retrieving this knowledge relies on the fact that every Node in a Shard can deterministically calculate which Nodes of the Shard were part of the Consensus group in the Round specified by the Block, and can also determine which Node was the Consensus Leader (i.e. the one which proposed the Block).
Calculating the entire Consensus group for any given Round is possible in the Elrond Network because of how the randomness source for Consensus selection is calculated, described in detail in this Medium post. This randomness source has three important properties: (1) it cannot be made available to the next Consensus group until the current Consensus process ends, (2) it is deterministic and known by every Node without requiring a randomness beacon and (3) it is unique in each Round.
The Block Header Interceptor will use both the information present in the Block Header itself and information about the Consensus groups in the past to evaluate the Block Header. Thus it will reject the received Block Header for any of the following reasons:
- The Block Header has already been blacklisted in the past (how?) (method
HdrInterceptorProcessor.Validate()
inprocess/interceptors/processor/hdrInterceptorProcessor.go
) - The Block Header does not belong to the Shard to which the Node belongs (method
InterceptedHeader.processFields()
inprocess/block/interceptedBlocks/interceptedBlockHeader.go
) - The Block Header contains a randomness source that has not been signed by the known Consensus Leader of the Round and Shard
which were specified in the Block Header (method
InterceptedHeader.CheckValidity()
inprocess/block/interceptedBlocks/interceptedBlockHeader.go
) - The Block Header contains a signature that does not belong to the known Consensus Leader of the Round and Shard specified in
the Block Header (method
InterceptedHeader.CheckValidity()
inprocess/block/interceptedBlocks/interceptedBlockHeader.go
) - The Block Header references an incorrect previous Block (add code link)
- The Block Header contains an aggregated signature that does not match the signatures of the Nodes known to have formed the
Consensus group of the Round and Shard specified by the Block Header (method
headerSigVerifier.verifySig()
inprocess/block/interceptedBlocks/common.go
, and called inInterceptedHeader.CheckValidity()
mentioned above)
Once a Block Header passes the validation checks above, it is sent to the Sync mechanism via the headers
Cacher owned by
ShardBootstrap
(mentioned in the previous section). The Sync mechanism must now determine whether to accept this Block as
valid not only on its own, but also in the greater context of the previous Blocks and the Metachain.
The Sync mechanism is a critical component of the Node which ensures that every Node has up-to-date information from Network and the Node's currently stored Blockchain is complete and correct. It is also responsible for triggering requests for missing information and for ensuring correctnes in case of conflicting information.
For Nodes assigned to Shards, the Sync mechanism is implemented in the ShardBootstrap
structure (file
process/sync/shardblock.go
). When a Node is started, the method ShardBootstrap.StartSync()
is called, which launches a
go-routine that regularily verifies the synchronization state of the Node (currently, every 5 milliseconds), and if the Node is
out-of-sync, missing Blocks will be requested and any forks will be resolved. This all happens in the method
baseBootstrap.syncBlock()
.
Detecting desynchronization is the first task of the method baseBootstrap.syncBlock()
. This is done in the method
baseBootstrap.ShouldSync()
, and can be summarized as having the latest correct Block for the current Round. To verify this, the
method starts by verifying if the current Round number is different than the number of the Round of last synchronization check
(boot.roundIndex
). Then, the forkDetector
is invoked to analyze whether there are forks in the Block Headers known to it
(which were provided by the Block Header Interceptor, see the previous section). If there are no forks, the Node simply needs to
have all the Blocks up to the current Round (this check always passes for Nodes that were in Consensus). But if the
forkDetector
detects a fork among the received Block Headers, it must also be determined if the Node finds itself on the
"correct" side of the fork or on the "incorrect" one. Being on the "correct" side of the fork implies no further action, but
being on the "incorrect" side implies the need for for a roll-back to the last known "correct" state of the Blockchain
(performed after returning from baseBootstrap.ShouldSync()
back to baseBootstrap.syncBlock()
).
After finishing fork detection (and eventual roll-backs), the method baseBootstrap.ShouldSync()
must continue the
synchronization process by requesting missing Block Headers from the Network, processing them and commiting them to the
Blockchain in its local storage.
Once the synchronization process ends, it will be no longer required during the current Round: all calls to
baseBootstrap.ShouldSync()
will return false
.
The forkDetector
(structure baseForkDetector
in process/sync/baseForkDetector.go
) is responsible for detecting conflicts
among received Block Headers and must determine which Block should be accepted in case of a conflict.
Most forks are short and trivial, and naturally appear as a result of network latency or disconnections.
- If the Node receives the Block Headers of two different Blocks but which have the same Nonce, produced in different Rounds,
it means that for some reason a Consensus group could not propagate its correctly produced Block fast enough through the Shard, so
after the Round expired, a new Consensus has produced another Block instead (same Nonce, different Round). But after some
time the first Block arrives at the Node, creating a conflict. In this case, the
forkDetector
will choose the first Block to be correct (earliest Round for the same Nonce). This is subject to change, though, and this criterion will be replaced with The Longest Chain criterion once development progresses, because the k-finality principle also applies to Shard Blockchains. - If the Node receives the Block Headers of two Blocks with the same Nonce, produced by the same Consensus group in the same Round, it means that the Leader is not a single Node, but a cloned one (two machines using the same BLS key, selected to be Leader of Consensus). In this case, they will produce very similar Blocks (might even contain the same Transactions), but due to network latencies, the Leaders will receive the signatures from the rest of the Consensus group in varying order. Even if they receive enought signatures, differences in the order in which the Leader clones received the signatures is enough to produce Blocks with different hashes. Since these Blocks are functionally identical, they are both correct, but the Block with the numerically lowest hash will be chosen as "correct" of the two.
For other cases, the forkDetector
will rely on the Metachain to determine "correct" Blocks, based on the k-finality principle.
Note that the implementation of the forkDetector
is currently in development, and will change once a new component will be
finalized (the BlockTracker
component).
The Metachain is far less vulnerable to forks, due to the configured size of the Consensus group, which is the entire Metashard. Still, trivial forks may appear due to network latencies, which are resolved by applying the same Longest Chain principle. Fork resolution on the Metachain is in development.
The primary defense against Shard takeover and malicious collusion among Nodes is the Shard-Shuffling mechanism, which moves a proportion of the Nodes in a Shard to other Shards, randomly, while bringing in Nodes from different Shards. This process happens at the beginning of a new Epoch. This process is currently in development.
When the majority of a Shard has been compromised during an Epoch, a remaining honest Node may detect this issue and send a Fisherman challenge to the Metashard. This process is currently in development.
Info to integrate:
- ☐ Synchronization: at the beginning of each round, the Node checks if it is synchronized with the rest of the peers. Synched = has all the current pieces of info (blocks, block headers, nonces etc) and can move on to processing. Not synched = must start requesting information from the Network.
Notes:
- Narrative entry-point The Node is ready and connected to the network. It has been busy processing transactions, proposing and validating blocks for some days now. Let's see, in more detail, what happens to a transaction once it arrives at the Node, from initial validation until being saved in a final block on the blockchain, notarized by the Metachain.