There’s more than one way to put code on a blockchain
In most discussions about blockchains, it doesn’t take long for the notion of “smart contracts” to come up. In the popular imagination, smart contracts automate the execution of interparty interactions, without requiring a trusted intermediary. By expressing legal relationships in code rather than words, they promise to enable transactions to take place directly and without error, whether deliberate or not.
From a technical viewpoint, a smart contract is something more specific: computer code that lives on a blockchain and defines the rules for that chain’s transactions. This description sounds simple enough, but behind it lies a great deal of variation in how these rules are expressed, executed and validated. When choosing a blockchain platform for a new application, the question “Does this platform support smart contracts?” isn’t the right one to ask. Instead, we need to be asking: “What type of smart contracts does this platform support?”
In this article, my goal is to examine some of the major differences between smart contract approaches and the trade-offs they represent. I’ll do this by looking at four popular enterprise blockchain platforms which support some form of customized on-chain code. First, IBM’s Hyperledger Fabric, which calls its contracts “chaincode”. Second, our MultiChain platform, which introduces smart filters in version 2.0. Third, Ethereum (and its permissioned Quorum and Burrow spin-offs), which popularized the “smart contract” name. And finally, R3 Corda, which references “contracts” in its transactions. Despite all of the different terminology, ultimately all of these refer to the same thing – application-specific code that defines the rules of a chain.
Before going any further, I should warn the reader that much of the following content is technical in nature, and assumes some familiarity with general programming and database concepts. For good or bad, this cannot be avoided – without getting into the details it’s impossible to make an informed decision about whether to use a blockchain for a particular project, and (if so) the right type of blockchain to use.
Let’s begin with some context. Imagine an application that is shared by multiple organizations, which is based on an underlying database. In a traditional centralized architecture, this database is hosted and administered by a single party which all of the participants trust, even if they do not trust each other. Transactions which modify the database are initiated only by applications on this central party’s systems, often in response to messages received from the participants. The database simply does what it’s told because the application is implicitly trusted to only send it transactions that make sense.
Blockchains provide an alternative way of managing a shared database, without a trusted intermediary. In a blockchain, each participant runs a “node” that holds a copy of the database and independently processes the transactions which modify it. Participants are identified using public keys or “addresses”, each of which has a corresponding private key known only to the identity owner. While transactions can be created by any node, they are “digitally signed” by their initiator’s private key in order to prove their origin.
Nodes connect to each other in a peer-to-peer fashion, rapidly propagating transactions and the “blocks” in which they are timestamped and confirmed across the network. The blockchain itself is literally a chain of these blocks, which forms an ordered log of every historical transaction. A “consensus algorithm” is used to ensure that all nodes reach agreement on the content of the blockchain, without requiring centralized control. (Note that some of this description does not apply to Corda, in which each node has only a partial copy of the database and there is no global blockchain. We’ll talk more about that later on.)
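As a rough illustration of what "a chain of blocks" means, the sketch below links each block to its predecessor by hash. The block layout and the function names (`block_hash`, `append_block`, `chain_is_intact`) are assumptions made for this example and do not reflect any particular platform's format.

```python
import hashlib
import json

def block_hash(block):
    # Hash a canonical JSON encoding so that every node computes the same digest.
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()

def append_block(chain, transactions):
    # Each new block records the hash of the block before it.
    previous = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"previous": previous, "transactions": transactions})

def chain_is_intact(chain):
    # Changing an earlier block breaks the link stored in its successor.
    return all(
        chain[i]["previous"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )

chain = []
append_block(chain, ["alice pays bob 8"])
append_block(chain, ["bob pays charlie 3"])
```

Because every block commits to the one before it, rewriting an old transaction would require rewriting every later block as well, which is what makes the log tamper-evident.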
In principle, any shared database application can be architected by using a blockchain at its core. But doing so creates a number of technical challenges which do not exist in a centralized scenario:
- Transaction rules. If any participant can directly change the database, how do we ensure that they follow the application’s rules? What stops one user from corrupting the database’s contents in a self-serving way?
- Determinism. Once these rules are defined, they will be applied multiple times by multiple nodes when processing transactions for their own copy of the database. How do we ensure that every node obtains exactly the same result?
- Conflict prevention. With no central coordination, how do we deal with two transactions that each follow the application’s rules, but nonetheless conflict with each other? Conflicts can stem from a deliberate attempt to game the system, or be the innocent result of bad luck and timing.
So where do smart contracts, smart filters and chaincode come in? Their core purpose is to work with a blockchain’s underlying infrastructure in order to solve these challenges. Smart contracts are the decentralized equivalent of application code – instead of running in one central place, they run on multiple nodes in the blockchain, creating or validating the transactions which modify that database’s contents.
Let’s begin with transaction rules, the first of these challenges, and see how they are expressed in Fabric, MultiChain, Ethereum and Corda respectively.
Transaction rules perform a specific function in blockchain-powered databases – restricting the transformations that can be performed on that database’s state. This is necessary because a blockchain’s transactions can be initiated by any of its participants, and these participants do not trust each other sufficiently to allow them to modify the database at will.
Let’s see two examples of why transaction rules are needed. First, imagine a blockchain designed to aggregate and timestamp PDF documents that are published by its participants. In this case, nobody should have the right to remove or change documents, since doing so would undermine the entire purpose of the system – document persistence. Second, consider a blockchain representing a shared financial ledger, which keeps track of the balances of its users. We cannot allow a participant to arbitrarily inflate their own balance, or take others’ money away.
Inputs and outputs
Our blockchain platforms rely on two broad approaches for expressing transaction rules. The first, which I call the “input–output model”, is used in MultiChain and Corda. Here, transactions explicitly list the database rows or “states” which they delete and create, forming a set of “inputs” and “outputs” respectively. Modifying a row is expressed as the equivalent operation of deleting that row and creating a new one in its place.
Since database rows are only deleted in inputs and only created in outputs, every input must “spend” a previous transaction’s output. The current state of the database is defined as the set of “unspent transaction outputs” or “UTXOs”, i.e. outputs from previous transactions which have not yet been used. Transactions may also contain additional information, called “metadata”, “commands” or “attachments”, which don’t become part of the database but help to define their meaning or purpose.
Given these three sets of inputs, outputs and metadata, the validity of a transaction in MultiChain or Corda is defined by some code which can perform arbitrary computations on those sets. This code can validate the transaction, or else return an error with a corresponding explanation. You can think of the input–output model as an automated “inspector” holding a checklist which ensures that transactions follow each and every rule. If the transaction fails any one of those checks, it will automatically be rejected by all of the nodes in the network.
It should be noted that, despite sharing the input–output model, MultiChain and Corda implement it very differently. In MultiChain, outputs can contain assets and/or data in JSON, text or binary format. The rules are defined in “transaction filters” or “stream filters”, which can be set to check all transactions, or only those involving particular assets or groupings of data. By contrast, a Corda output “state” is represented by an object in the Java or Kotlin programming language, with defined data fields. Corda’s rules are defined in “contracts” which are attached to specific states, and a state’s contract is only applied to transactions which contain that state in its inputs or outputs. This relates to Corda’s unusual visibility model, in which transactions can only be seen by their counterparties or those whose subsequent transactions they affect.
Contracts and messages
The second approach, which I call the “contract–message model”, is used in Hyperledger Fabric and Ethereum. Here, multiple “smart contracts” or “chaincodes” can be created on the blockchain, and each has its own database and associated code. A contract’s database can only be modified by its code, rather than directly by blockchain transactions. This design pattern is similar to the “encapsulation” of code and data in object-oriented programming.
With this model, a blockchain transaction begins as a message sent to a contract, with some optional parameters or data. The contract’s code is executed in reaction to the message and parameters, and is free to read and write its own database as part of that reaction. Contracts can also send messages to other contracts, but cannot access each other’s databases directly. In the language of relational databases, contracts act as enforced “stored procedures”, where all access to the database goes via some predefined code.
Both Fabric and Quorum, a variation on Ethereum, complicate this picture by allowing a network to define multiple “channels” or “private states”. The aim is to mitigate the problem of blockchain confidentiality by creating separate environments, each of which is only visible to a particular sub-group of participants. While this sounds promising in theory, in reality the contracts and data in each channel or private state are isolated from those in the others. As a result, in terms of smart contracts, these environments are equivalent to separate blockchains.
Let’s see how to implement the transaction rules for a single-asset financial ledger with these two models. Each row in our ledger’s database has two columns, containing the owner’s address and the quantity of the asset owned. In the input–output model, transactions must satisfy two conditions:
- The total quantity of assets in a transaction’s outputs has to match the total in its inputs. This prevents users from creating or deleting money arbitrarily.
- Every transaction has to be signed by the owner of each of its inputs. This stops users from spending each other’s money without permission.
Taken together, these two conditions are all that is needed to create a simple but viable financial system.
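The two conditions above can be sketched as a single validation function. This is a minimal illustration rather than MultiChain or Corda code: the `Output` type and `validate_transfer` name are hypothetical, quantities are assumed to be non-negative integers, and `signers` is assumed to hold the addresses whose signatures the platform has already verified.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Output:
    owner: str      # address of the owner
    quantity: int   # quantity of the asset, in whole units

def validate_transfer(inputs, outputs, signers):
    """Return None if the transaction is valid, otherwise an error message."""
    # Rule 1: the total quantity in the outputs must match the total in the inputs.
    if sum(o.quantity for o in outputs) != sum(i.quantity for i in inputs):
        return "output total does not match input total"
    # Rule 2: the owner of every input must have signed the transaction.
    for spent in inputs:
        if spent.owner not in signers:
            return f"missing signature from {spent.owner}"
    return None

# Alice spends her 10 units: 8 go to Bob and 2 come back to her as change.
error = validate_transfer(
    inputs=[Output("alice", 10)],
    outputs=[Output("bob", 8), Output("alice", 2)],
    signers={"alice"},
)
```

Note that the function only inspects the transaction it is given. It never needs to look up anyone's balance, which is characteristic of the input–output model.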
In the contract–message model, the asset’s contract supports a “send payment” message, which takes three parameters: the sender’s address, recipient’s address, and quantity to be sent. In response, the contract executes the following four steps:
- Verify that the transaction was signed by the sender.
- Check that the sender has sufficient funds.
- Deduct the requested quantity from the sender’s row.
- Add that quantity to the recipient’s row.
If either of the checks in the first two steps fails, the contract will abort and no payment will be made.
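The four steps might look as follows in a simplified contract. This is a sketch in plain Python, not Solidity or Fabric chaincode: the `AssetContract` class is hypothetical, `signers` stands in for the signatures the platform has verified, and the positive-quantity guard is an extra assumption not listed in the steps above.

```python
class AssetContract:
    def __init__(self, balances):
        # The contract's own database: owner address -> quantity owned.
        self._balances = dict(balances)

    def send_payment(self, sender, recipient, quantity, signers):
        # Extra guard (an assumption of this sketch): reject non-positive amounts.
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        # Step 1: verify that the transaction was signed by the sender.
        if sender not in signers:
            raise PermissionError("transaction not signed by sender")
        # Step 2: check that the sender has sufficient funds.
        if self._balances.get(sender, 0) < quantity:
            raise ValueError("insufficient funds")
        # Steps 3 and 4: move the quantity from the sender's row to the recipient's.
        self._balances[sender] -= quantity
        self._balances[recipient] = self._balances.get(recipient, 0) + quantity

    def balance_of(self, owner):
        return self._balances.get(owner, 0)

ledger = AssetContract({"alice": 10})
ledger.send_payment("alice", "bob", 8, signers={"alice"})
```

Unlike the input–output version, the outcome here depends on the contract's stored state at the moment the message is processed.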
So both the input–output and contract–message models are effective ways to define transaction rules and keep a shared database safe. Indeed, on a theoretical level, each of these models can be used to simulate the other. In practice, however, the most appropriate model will depend on the application being built. Does each transaction affect few or many pieces of information? Do we need to be able to guarantee transaction independence? Does each piece of data have a clear owner or is there some global state to be shared?
It is beyond our scope here to explore how the answers should influence a choice between these two models. But as a general guideline, when developing a new blockchain application, it’s worth trying to express its transaction rules in both forms, and seeing which fits more naturally. The difference will express itself in terms of: (a) ease of programming, (b) storage requirements and throughput, and (c) speed of conflict detection. We’ll talk more about this last issue later on.
When it comes to transaction rules, there is one way in which MultiChain specifically differs from Fabric, Ethereum and Corda. Unlike these other platforms, MultiChain has several built-in abstractions that provide some basic building blocks for blockchain-driven applications, without requiring developers to write their own code. These abstractions cover three areas that are commonly needed: (a) dynamic permissions, (b) transferable assets, and (c) data storage.
For example, MultiChain manages permissions for connecting to the network, sending and receiving transactions, creating assets or streams, or controlling the permissions of other users. Multiple fungible assets can be issued, transferred, retired or exchanged safely and atomically. Any number of “streams” can be created on a chain, for publishing, indexing and retrieving on-chain or off-chain data in JSON, text or binary formats. All of the transaction rules for these abstractions are available out-of-the-box.
When developing an application on MultiChain, it’s possible to ignore this built-in functionality, and express transaction rules using smart filters only. However, smart filters are designed to work together with its built-in abstractions, by enabling their default behavior to be restricted in customized ways. For example, the permission for certain activities might be controlled by specific administrators, rather than the default behavior where any administrator will do. The transfer of certain assets can be limited by time or require additional approval above a certain amount. The data in a particular stream can be validated to ensure that it consists only of JSON structures with required fields and values.
In all of these cases, smart filters create additional requirements for transactions to be validated, but do not remove the simple rules that are built in. This can help address one of the key challenges in blockchain applications: the fact that a bug in some on-chain code can lead to disastrous consequences. We’ve seen many examples of this problem in the public Ethereum blockchain, most famously in the demise of The DAO and the Parity multisignature bugs. Broader surveys have found a large number of common vulnerabilities in Ethereum smart contracts that enable attackers to steal or freeze other people’s funds.
Of course, MultiChain smart filters may contain bugs too, but their consequences are more limited in scope. For example, the built-in asset rules prevent one user from spending another’s money, or accidentally making their own money disappear, no matter what other logic a smart filter contains. If a bug is found in a smart filter, it can be deactivated and replaced with a corrected version, while the ledger’s basic integrity is protected. Philosophically, MultiChain is closer to traditional database architectures, where the database platform provides a number of built-in abstractions, such as columns, tables, indexes and constraints. More powerful features such as triggers and stored procedures can optionally be coded up by application developers, in cases where they are actually needed.
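The layering described here can be sketched in a few lines. This is plain Python used for illustration only and is not MultiChain's filter API: `builtin_asset_rules`, `large_transfer_filter` and `validate` are hypothetical names, the 1000-unit threshold and the "approver" address are invented, and signature verification is assumed to have happened already.

```python
def builtin_asset_rules(tx):
    # Stand-in for the platform's built-in checks, which custom code cannot switch off.
    if sum(q for _, q in tx["outputs"]) != sum(q for _, q in tx["inputs"]):
        return "asset quantities do not balance"
    if any(owner not in tx["signers"] for owner, _ in tx["inputs"]):
        return "input not signed by its owner"
    return None

def large_transfer_filter(tx):
    # Hypothetical custom filter: moving more than 1000 units needs an approver's signature.
    if sum(q for _, q in tx["outputs"]) > 1000 and "approver" not in tx["signers"]:
        return "large transfer requires approval"
    return None

def validate(tx, filters):
    # Built-in rules always run; filters can only add further reasons to reject.
    for check in [builtin_asset_rules, *filters]:
        error = check(tx)
        if error is not None:
            return error
    return None

# A buggy filter that approves everything still cannot authorize a theft.
buggy_filter = lambda tx: None
theft = {"inputs": [("alice", 10)], "outputs": [("mallory", 10)], "signers": {"mallory"}}
result = validate(theft, [buggy_filter])
```

Here `result` is the built-in rule's rejection message, even though the custom filter raised no objection.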
Let’s move on to the next part of our showdown. No matter which approach we choose, the custom transaction rules of a blockchain application are expressed as computer code written by application developers. And unlike centralized applications, this code is going to be executed more than one time and in more than one place for each transaction. This is because multiple blockchain nodes belonging to different participants have to each verify and/or execute that transaction for themselves.
This repeated and redundant code execution introduces a new requirement that is rarely found in centralized applications: determinism. In the context of computation, determinism means that a piece of code will always give the same answer for the same parameters, no matter where and when it is run. This is absolutely crucial for code that interacts with a blockchain because, without determinism, the consensus between the nodes on that chain can catastrophically break down.
Let’s see how this looks in practice, first in the input–output model. If two nodes have a different opinion about whether a transaction is valid, then one will accept a block containing that transaction and the other will not. Since every block explicitly links back to a previous block, this will create a permanent “fork” in the network, with one or more nodes not accepting the majority opinion about the entire blockchain’s contents from that point on. The nodes in the minority will be cut off from the database’s evolving state, and will no longer be able to effectively use the application.
Now let’s see what happens if consensus breaks down in the contract–message model. If two nodes have a different opinion about how a contract should respond to a particular message, this can lead to a difference in their databases’ contents. This in turn can affect the contract’s response to future messages, including messages it sends to other contracts. The end result is an increasing divergence between different nodes’ view of the database’s state. (The “state root” field in Ethereum blocks ensures that any difference in contracts’ responses leads immediately to a fully catastrophic blockchain fork, rather than risking staying hidden for a period of time.)
Sources of non-determinism
So non-determinism in blockchain code is clearly a problem. But if the basic building blocks of computation, such as arithmetic, are deterministic, what do we have to worry about? Well, it turns out, quite a few things:
- Most obviously, random number generators, since by definition these are designed to produce a different result every time.
- Checking the current time, since nodes won’t be processing transactions at exactly the same time, and in any event their clocks may be out of sync. (It’s still possible to implement time-dependent rules by making reference to timestamps within the blockchain itself.)
- Querying external resources such as the Internet, disk files, or other programs running on a computer. These resources cannot be guaranteed to always give the same response, and may become unavailable.
- Running multiple pieces of code in parallel “threads”, since this leads to a “race condition” where the order in which these processes finish cannot be predicted.
- Performing any floating point calculations which can give even minutely different answers on different computer processor architectures.
Our four blockchain platforms employ several different approaches to avoiding these pitfalls.
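To make two of these pitfalls concrete, the sketch below shows a time-dependent rule that takes the block's timestamp as a parameter instead of reading the local clock, and works in integer cents rather than floats. The function name and parameters are hypothetical. The floating point lines show a related fragility, sensitivity to the order of operations on a single machine, rather than a difference between processor architectures.

```python
def payment_allowed(balance_cents, amount_cents, unlock_time, block_time):
    # Deterministic: the result depends only on values recorded in the chain,
    # never on the local clock of the node running the check.
    return block_time >= unlock_time and 0 < amount_cents <= balance_cents

# Floating point addition gives different results depending on grouping,
# which hints at how easily float-based rules could diverge between nodes.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
floats_agree = left == right   # False

# The same sums in integer units are exact, whatever the grouping.
integers_agree = (10 + 20) + 30 == 10 + (20 + 30)   # True
```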
Determinism by endorsement
When it comes to determinism, Hyperledger Fabric adopts a completely different approach. In Fabric, when a “client” node wants to send a message to some chaincode, it first sends that message to some “endorser” nodes. Each of these nodes executes the chaincode independently, forming an opinion of the message’s effect on that chaincode’s database. These opinions are sent back to the client together with a digital signature which constitutes a formal “endorsement”. If the client receives enough endorsements of the intended outcome, it creates a transaction containing those endorsements, and broadcasts it for inclusion in the chain.
In order to guarantee determinism, each piece of chaincode has an “endorsement policy” which defines exactly what level of approval is required in order to render its transactions valid. For example, one chaincode’s policy might state that endorsements are required from at least half of the blockchain’s nodes. Another might require an endorsement from any one of three trusted parties. Either way, every node can independently check if the necessary endorsements were received.
To clarify the difference, determinism in most blockchain platforms is based on the question: “What is the result of running this code on this data?” – and we need to be absolutely sure that every node will answer this question identically. By contrast, determinism in Fabric is based on a different question: “Do enough endorsers agree on the result of running this code on this data?” Answering that is a rather simple matter of counting, and there’s no room for non-determinism to creep in.
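The counting step can be sketched as below. This is not Fabric's API: `policy_satisfied` is a hypothetical function, each endorsement is reduced to an (endorser, result) pair, and the endorsers' digital signatures are assumed to have been verified beforehand.

```python
def policy_satisfied(proposed_result, endorsements, eligible_endorsers, required):
    # Collect the distinct eligible endorsers who agreed with the proposed result.
    agreeing = {
        endorser
        for endorser, result in endorsements
        if endorser in eligible_endorsers and result == proposed_result
    }
    # The policy is met when enough of them agree. No contract code is re-run here.
    return len(agreeing) >= required

eligible = {"org1", "org2", "org3"}
endorsements = [("org1", "r1"), ("org2", "r1"), ("org3", "r2"), ("org1", "r1")]

two_of_three = policy_satisfied("r1", endorsements, eligible, required=2)    # True
all_three = policy_satisfied("r1", endorsements, eligible, required=3)       # False
```

Using a set means a repeated endorsement from the same party is counted only once.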
What price does Fabric pay for this flexibility? If the purpose of a blockchain is to remove intermediaries from a shared database-driven application, then Fabric’s reliance on endorsers takes a big step away from that goal. For the participants in the chain, it is no longer enough to follow the chaincode’s rules – they also need certain other nodes to agree that they have done so. Even worse, a malicious subset of endorsers could approve database changes that do not follow the chaincode’s rules at all. This gives endorsers much more power than the validators in regular blockchains, who can censor transactions but cannot violate the blockchain’s rules. Blockchain application developers must decide whether this trade-off makes sense in their particular case.
| | Fabric | MultiChain | Ethereum | Corda |
|---|---|---|---|---|
| Model | Endorsements | Adapted runtime | Purpose-built VM | Adapted runtime |
| Code visibility | Counterparties + | | | |
| Enforced | No | Yes | Yes | No (for now) |
So far, we’ve discussed how different blockchain platforms express transaction rules in code, and how they deterministically ensure that every node applies those rules identically. Now it’s time to talk about a third aspect of our showdown: How does each platform deal with the possibility that two transactions, which are valid in and of themselves, conflict with each other? In the simplest example, imagine that Alice has $10 in a financial ledger and broadcasts two transactions – one sending $8 to Bob, and the other sending $7 to Charlie. Clearly, only one of these transactions can be allowed to succeed.
We can begin by grouping MultiChain’s and Corda’s approach to this problem together. As described earlier, both of these use an input–output model for representing transactions and their rules, in which each transaction input spends a previous transaction output. This leads to a simple principle for preventing conflicts: Every output can only be spent once. MultiChain filters and Corda contracts can rely on their respective platforms to enforce this restriction absolutely. Since Alice’s $10 is represented by a previous transaction output, this single-spend rule automatically stops her sending it to both Bob and Charlie.
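The single-spend principle can be sketched with a small ledger that tracks unspent outputs. The `Ledger` class and its identifiers are hypothetical, and the checks on signatures and totals discussed earlier are left out to keep the focus on conflict prevention.

```python
class Ledger:
    def __init__(self):
        # The current state: output id -> (owner, quantity) for every unspent output.
        self.unspent = {}

    def apply(self, tx_id, input_ids, outputs):
        # A transaction may not list the same output twice.
        if len(set(input_ids)) != len(input_ids):
            raise ValueError("duplicate input")
        # Every input must refer to an output that has not yet been spent.
        for output_id in input_ids:
            if output_id not in self.unspent:
                raise ValueError(f"output {output_id} is already spent or unknown")
        # Spend the inputs and create the new outputs.
        for output_id in input_ids:
            del self.unspent[output_id]
        for index, output in enumerate(outputs):
            self.unspent[(tx_id, index)] = output

ledger = Ledger()
ledger.unspent[("genesis", 0)] = ("alice", 10)

# Alice's payment of 8 to Bob spends her only output and returns 2 as change.
ledger.apply("tx1", [("genesis", 0)], [("bob", 8), ("alice", 2)])
```

A second transaction that tries to spend `("genesis", 0)` again, such as Alice's payment of 7 to Charlie, now raises an error because that output no longer appears in the unspent set.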
Despite this similarity, it’s important to point out a key difference in how MultiChain and Corda prevent conflicts. In MultiChain, every node sees every transaction and so can independently verify that each output is only spent once. Any transaction which performs a double spend against a previously confirmed transaction will be instantly and automatically rejected. By contrast, in Corda there is no global blockchain, so “notaries” are required to prevent these double spends. Every Corda output state is assigned to a notary, who has to sign any transaction spending that output, confirming it has not been spent before. A blockchain’s participants must trust notaries to follow this rule honestly, and malicious notaries can cause havoc at will. As with endorsements in Fabric, this “single-spend as a service” design has advantages in terms of confidentiality but reintroduces intermediaries, going against the blockchain grain. (It’s important to clarify that Corda notaries can be run by groups of participants using a consensus algorithm, so the integrity of the ledger can still be protected against individual bad actors.)
Let’s move on to Ethereum. To recall, Ethereum uses contracts and messages rather than inputs and outputs. As a result, transaction conflicts such as Alice’s two payments are not immediately visible to the blockchain engine. Instead, they are detected and blocked by the contract which processes the transactions, after their order is confirmed on the chain. When processing each of Alice’s payments, the contract verifies whether her balance is sufficient. If the transaction paying $8 to Bob comes first, it will be processed as usual, leaving Alice with $2 in her account. As a result, when the contract processes the second transaction paying $7 to Charlie, it sees that Alice lacks the necessary funds and the transaction aborts.
Outputs vs contracts
So far we’ve seen two different techniques for preventing conflicting transactions – single-spend outputs in MultiChain and Corda, and contract-based verification in Ethereum. So which is better?
In order to help answer this question, let’s consider an example “1-of-2 multisignature” account which holds $100 on behalf of Gavin and Helen, and allows either of them to spend that money independently. Gavin instructs his application to pay $80 to Donna, and a few seconds later, Helen wants to send $40 to Edward. Since there are insufficient funds for both payments, these transactions would inevitably conflict. In the event that both transactions are broadcast, the outcome will be determined by whichever makes it first into the chain. Note that unlike Alice’s example, this conflict is accidental, since no one is trying to break the application’s rules – they simply had unlucky timing.
In considering the likelihood of this conflict occurring, the key question is this: After Gavin sends out his transaction, how long will it take Helen’s node to know that her payment might fail? The shorter this period is, the more likely Helen is to be stopped from attempting that payment, saving her and her application from a subsequent surprise.
With the input–output model, any conflict between transactions is directly visible to the blockchain platform, since the two transactions will be explicitly attempting to spend the same previous output. In MultiChain, this happens as soon as Gavin’s transaction has propagated to Helen’s node, usually in a second or less. In Corda, the output’s notary will refuse the request to sign Helen’s transaction, since it has already signed Gavin’s, so Helen will instantly know that her payment will fail. (Although if the Corda notary is itself distributed, she may have to wait a few seconds for a reply.) Either way, there is no need to wait for a transaction to be confirmed and ordered in the blockchain.
What about Ethereum’s model? In this case, there is no immediate way for the blockchain platform to know that a conflict will occur. While Helen’s node may see Gavin’s transaction on the network, it cannot know how this will affect Helen’s own transaction, since from its perspective these are simply two messages being sent to the same contract. Perhaps ten seconds later, once the final ordering of the conflicting transactions is confirmed on the blockchain, Helen’s node will recalculate the actual outcome, replacing the one it expected, and her application will update its display accordingly. In the meantime, both Gavin and Helen will be left in the dark.
But we shouldn’t conclude from this that the input–output model always works best. Consider a variation on our example scenario, where both Gavin and Helen request smaller $40 payments from the original balance of $100, at exactly the same time. In the input–output model these transactions would conflict, since they are both spending the same database row containing that $100, and only one of the payments would succeed. But in Ethereum, both transactions would be successfully processed, irrespective of their final order, since the account contains sufficient funds for both. In this case, Ethereum more faithfully fulfills Gavin’s and Helen’s intentions.
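The contrast between the two scenarios can be reproduced with a pair of toy functions. Both are deliberate simplifications with hypothetical names: `contract_model` checks each payment against a running balance in confirmed order, while `input_output_model` only asks whether the output a payment spends has been spent already.

```python
def contract_model(balance, payments):
    # Each payment is checked against the running balance, in confirmed order.
    results = []
    for amount in payments:
        ok = amount <= balance
        if ok:
            balance -= amount
        results.append(ok)
    return results

def input_output_model(spent_output_ids):
    # Each payment names the output it spends, and an output can be spent only once.
    spent = set()
    results = []
    for output_id in spent_output_ids:
        ok = output_id not in spent
        if ok:
            spent.add(output_id)
        results.append(ok)
    return results

# Two $40 payments from the $100 account:
both_fit = contract_model(100, [40, 40])                  # [True, True]
same_row = input_output_model(["row-100", "row-100"])     # [True, False]

# The earlier $80 and $40 payments conflict under either model:
too_much = contract_model(100, [80, 40])                  # [True, False]
```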
Finally, let’s talk about Fabric, whose endorsement-based approach is a hybrid of these two techniques. As explained earlier, when a Fabric “client” node wants to send a message to a contract, it first asks some endorsing nodes to execute that message on its behalf. The endorsing nodes do so in a similar way to Ethereum – running the contract against their local database – but this process is observed rather than immediately applied. Each endorser records the set of rows that would be read and written, noting also the exact version of those rows at that point in time. This “read-write set” of versioned rows is explicitly referenced in the endorsement, and included in the transaction which the client broadcasts.
Conflicts between Fabric transactions are resolved once their order is finalized in the chain. Every node processes each transaction independently, checking endorsement policies and applying the database changes specified. However, if a transaction reads or writes a database row version that has already been modified by a previous transaction, then that second transaction is ignored. To go back to Alice’s conflicting payments to Bob and Charlie, both of these transactions will read and modify the same row version, containing the $10 with which Alice started. So the second transaction will be safely and automatically aborted.
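This version check can be sketched as follows. It is a simplification and not Fabric's implementation: row versions are modeled as simple counters, `apply_in_order` is a hypothetical name, endorsement policy checks are omitted, and every row a transaction writes is assumed to appear in its read set as well.

```python
def apply_in_order(database, transactions):
    """database maps key -> (version, value).
    Each transaction has "reads" (key -> version seen by the endorsers)
    and "writes" (key -> new value)."""
    outcomes = []
    for tx in transactions:
        # Valid only if every row it read is still at the version the endorsers saw.
        fresh = all(
            database.get(key, (0, None))[0] == version
            for key, version in tx["reads"].items()
        )
        if fresh:
            for key, value in tx["writes"].items():
                version = database.get(key, (0, None))[0]
                database[key] = (version + 1, value)
        outcomes.append(fresh)
    return outcomes

database = {"alice": (1, 10), "bob": (1, 0), "charlie": (1, 0)}
pay_bob = {"reads": {"alice": 1, "bob": 1}, "writes": {"alice": 2, "bob": 8}}
pay_charlie = {"reads": {"alice": 1, "charlie": 1}, "writes": {"alice": 3, "charlie": 7}}

outcomes = apply_in_order(database, [pay_bob, pay_charlie])   # [True, False]
```

Both payments were endorsed against version 1 of Alice's row, so once the first one bumps that row to version 2, the second is skipped without the contract code ever being re-run.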
Fabric’s approach to conflict resolution works just fine, but in terms of performance and flexibility it combines the worst of the previous two models. Because endorsements convert transactions into specific read-write sets, Gavin and Helen’s simultaneous but compatible $40 payments would lead to a conflict that Ethereum avoids. However, Fabric does not gain the speed advantage of the input–output model, since endorsers execute contracts against the most recent version of the database confirmed by the blockchain, ignoring unconfirmed transactions. So if Helen initiates her payment a few seconds after Gavin, but before Gavin’s has been confirmed on the blockchain, Fabric will create conflicting transactions that a pure input–output model avoids.
| | Fabric | MultiChain | Ethereum | Corda |
|---|---|---|---|---|
| Model | Read-write sets | Single spend | Contract checks | Single spend |
| Speed | ~10s (confirmation) | ~1s (propagation) | ~10s (confirmation) | 0–5s (notary) |
A complex choice
In this piece, we reviewed many of the different ways in which Corda, Ethereum, Fabric and MultiChain address the key challenges of “smart contracts”, or application code that is embedded in a blockchain. And each platform has different answers to our three core questions: How are transaction rules represented? How is code executed deterministically? And how do we prevent conflicts?
So who is the winner of our smart contract showdown? It should be obvious by now that there is no simple answer. Each platform represents a complex multi-way trade-off between flexibility, simplicity, performance, disintermediation, safety and confidentiality. So the choice of platform for a particular application has to begin with a detailed understanding of that application’s trust model, the types of transactions it involves, and their likely patterns of conflict. If you find someone pushing a specific smart contract solution before they know the answers to these questions, I suggest politely but firmly insisting that they adopt a “smarter” approach.