Deploying multiple MultiChain nodes to minimize downtime

The general architecture of blockchains is well suited for high availability scenarios, because they are designed to avoid single points of failure. Nonetheless, individual blockchain participants can suffer downtime if their own node stops functioning, for example due to a power cut, system crash or loss of connectivity. For this reason participants should consider running two or more MultiChain nodes redundantly and simultaneously, in order to ensure high availability.

The basic principle for creating a high availability cluster is to run several nodes within the organization, preferably on separate systems with separate network connections in separate locations. All of these nodes should be connected to the blockchain’s peer-to-peer network in the usual way. In order to function as substitutes for each other, the nodes should share private keys and keep their watch-only addresses and subscriptions in sync.

Private key sharing

By sharing private keys between nodes, all nodes gain the ability to connect to the network and sign transactions as substitutes for each other. MultiChain offers a number of ways to share private keys between nodes:

  • When launching a node for the first time, the initprivkey runtime parameter tells the node to use an existing private key instead of generating a new random one. (Private keys can be obtained from other nodes using the dumpprivkey API command.)
  • After a node is running and connected to a network, additional private keys can be added to its wallet using the importprivkey command. For example, createkeypairs can be used on any node (including an offline or cold node) to obtain a private key and corresponding address without affecting that node’s wallet, then importprivkey can be used to import the new private key into every node in the cluster. If the private key and its corresponding address have not yet been used, call importprivkey with rescan=false to avoid an unnecessary rescan of the chain.
  • If necessary, use the dumpwallet and importwallet commands to copy all private keys from one node to another.
  • If a node has multiple private keys (with corresponding addresses) in its wallet, the handshakelocal runtime parameter selects which address is used for peer-to-peer handshaking.
  • Ensure that autocombining is switched off (to prevent double spends between cluster nodes) by setting the runtime parameter autocombineminconf=999999999. The combineunspent API can still be called explicitly if necessary.
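As a rough illustration of the importprivkey approach, the sketch below imports one private key into every node of a cluster over JSON-RPC, with rescan set to false. The node dictionaries (url, user, password), the helper names rpc_call and import_key_everywhere, and the error handling are assumptions made for this example and are not part of MultiChain. The empty label is passed only so that the rescan parameter can be supplied.

```python
import base64
import json
import urllib.request


def rpc_call(node, method, params):
    """POST one JSON-RPC request to a node and return its "result" field.

    `node` is a hypothetical dict with "url", "user" and "password" keys.
    """
    body = json.dumps({"id": 1, "method": method, "params": params}).encode()
    credentials = f"{node['user']}:{node['password']}".encode()
    token = base64.b64encode(credentials).decode()
    request = urllib.request.Request(
        node["url"],
        data=body,
        headers={
            "Authorization": "Basic " + token,
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read())["result"]


def import_key_everywhere(nodes, privkey, call=rpc_call):
    """Import privkey into every node, skipping the rescan.

    Returns a dict mapping node URL to error text for nodes that failed,
    so the caller can retry them before relying on the cluster.
    """
    failures = {}
    for node in nodes:
        try:
            # Parameters: private key, label, rescan.
            call(node, "importprivkey", [privkey, "", False])
        except Exception as exc:  # keep going so one dead node does not block the rest
            failures[node["url"]] = str(exc)
    return failures
```

The call argument exists so that the transport can be replaced, for instance with a stub during testing. A key that has already been used on the chain would need rescan left at its default instead.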

Watch-only addresses and subscriptions

If multiple nodes are to act as drop-in replacements for each other, it’s important to ensure that they share the same set of watch-only addresses and subscribed assets and streams:

  • When importaddress is used to add one or more watch-only addresses to one node in the cluster, make sure the same command is called for all of the others. You can safely set rescan=false to save rescanning the chain if the address has not yet been used.
  • When subscribe is used to subscribe to one or more assets or streams on one node, issue the same command to the others. If the asset or stream was created recently, rescanning the chain will be fast, so there’s no need to set rescan=false.
  • The autosubscribe runtime parameter can be used with all nodes in the cluster, to instruct them to automatically subscribe to every new asset and/or stream created.
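One way to detect drift between nodes is to compare what each node currently watches or subscribes to. The sketch below is a minimal example that assumes the application has already collected, for each node, a set of names (watch-only addresses, assets or streams). The function name missing_per_node and the node labels are hypothetical, and how the sets are gathered is left to the application.

```python
def missing_per_node(entries_by_node):
    """Return, per node, the sorted entries it lacks relative to the union.

    `entries_by_node` maps a node label to the set of addresses, assets or
    streams that node currently tracks. Nodes that lack nothing are omitted.
    """
    everything = set().union(*entries_by_node.values())
    report = {}
    for node, entries in entries_by_node.items():
        missing = everything - set(entries)
        if missing:
            report[node] = sorted(missing)
    return report


observed = {
    "node-a": {"stream1", "asset1"},
    "node-b": {"stream1"},
}
print(missing_per_node(observed))  # {'node-b': ['asset1']}
```

Each reported entry can then be passed to importaddress or subscribe on the node that lacks it.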

Load balancing for reading

Let’s assume that all of the nodes in the cluster: (a) are in sync with a non-forked blockchain, (b) have the same set of wallet addresses (whether watch-only or with private keys), (c) are subscribed to the same set of assets and streams, and (d) have the same values for the runtime parameters txindex, maxshowndata and hideknownopdrops. In this case, the following API commands, which read information from the chain, can be directed at any of the nodes in the cluster and will return the same result for the same input parameters:

  • Listing blockchain entities: listpermissions, listassets, liststreams and listupgrades. Note however that unconfirmed permission changes, assets, streams and upgrades may appear in different orders at the end of the list.
  • Querying balances: getaddressbalances, getmultibalances and gettotalbalances, so long as minconf>=1.
  • Reading specific transactions in various ways: getaddresstransaction, getwallettransaction, getassettransaction, getstreamitem, gettxoutdata, getrawtransaction and gettxout. Note however that time and timereceived fields may differ.
  • Listing subscribed assets or streams: listassettransactions and all the liststream* commands, so long as local-ordering=false. As in previous cases, time and timereceived fields may differ and unconfirmed items may appear in different orders at the end of the list.
  • Global blockchain characteristics: getblockchainparams, getblock, getblockhash and listblocks.

This list assumes that all nodes are updated with the latest block in the chain. In reality, not all nodes will receive and process new blocks at exactly the same time. Even for old confirmed transactions, this may lead to different values for confirmations fields.
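When checking that two nodes agree on a response, the fields noted above as node-dependent have to be ignored. The sketch below is one possible way to do so: it removes time, timereceived and confirmations at every nesting level before comparing. The helper names strip_volatile and same_result are assumptions for this example. Stripping time everywhere is deliberately coarse, so it would also hide a genuine difference in a field of that name.

```python
# Fields the text above identifies as possibly differing between nodes.
VOLATILE_FIELDS = {"time", "timereceived", "confirmations"}


def strip_volatile(value):
    """Return a copy of a decoded JSON value without the volatile fields."""
    if isinstance(value, dict):
        return {
            key: strip_volatile(item)
            for key, item in value.items()
            if key not in VOLATILE_FIELDS
        }
    if isinstance(value, list):
        return [strip_volatile(item) for item in value]
    return value


def same_result(first, second):
    """True if two API results match once volatile fields are ignored."""
    return strip_volatile(first) == strip_volatile(second)
```

Because list order is preserved, this comparison still treats differently ordered unconfirmed items at the end of a list as a mismatch.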

Approaches for writing

Particular care must be taken with operations that write to the chain, i.e. those that build and send transactions. If two nodes in a cluster build a transaction for the same address simultaneously, they are likely to spend the same unspent output from previous transactions. The two transactions will therefore be in "double-spend" conflict with each other, and only one will be accepted onto the blockchain.

In order to safely write to the chain using a cluster of nodes, one of the following strategies (from easiest to hardest) should be adopted:

  • Use a failover rather than load balancing strategy, in which transactions are only sent from one node at a time. If this node goes down, wait a few seconds (to allow any existing transactions to propagate) then start sending transactions from a different node in the cluster.
  • Use multiple addresses for the organization’s transactions. For each address, use only one node to send transactions from that address. If the node for an address goes down, wait a few seconds then start transacting from that address using a different node.
  • Track unspent outputs on the application level to ensure that different nodes do not try to spend the same outputs. To achieve this, all transactions should be built using createrawtransaction, in which the previous outputs to spend are explicitly specified.
  • Build transactions entirely outside of the node and only use signrawtransaction to sign them and/or sendrawtransaction to broadcast them. It’s also safe to send the same raw transaction via sendrawtransaction on multiple nodes simultaneously.
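The application-level tracking of unspent outputs could look roughly like the sketch below. It assumes a single coordinating process that holds a list of unspent outputs as dicts with txid and vout keys, and it hands each output to at most one caller. The class name OutputReservations and its methods are hypothetical. A deployment with several application processes would need a shared store in place of the in-memory set.

```python
import threading


class OutputReservations:
    """Tracks which unspent outputs have already been handed out."""

    def __init__(self):
        self._lock = threading.Lock()
        self._reserved = set()  # (txid, vout) pairs currently in use

    def reserve(self, unspent, count=1):
        """Reserve `count` free outputs and return them as input references.

        Returns None, reserving nothing, if too few free outputs remain.
        """
        with self._lock:
            free = [
                out for out in unspent
                if (out["txid"], out["vout"]) not in self._reserved
            ]
            if len(free) < count:
                return None
            chosen = free[:count]
            self._reserved.update((out["txid"], out["vout"]) for out in chosen)
        # Shape intended for the inputs argument of createrawtransaction.
        return [{"txid": out["txid"], "vout": out["vout"]} for out in chosen]

    def release(self, inputs):
        """Free outputs again, e.g. after a transaction failed to send."""
        with self._lock:
            for ref in inputs:
                self._reserved.discard((ref["txid"], ref["vout"]))
```

Each node's transaction is then built only from the inputs reserved for it, so two nodes never select the same output.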

Monitoring

The health of the nodes in the cluster can be monitored in a number of different ways:

  • Use getpeerinfo to check the number and status of each node’s connections to other peers. The lastsend and lastrecv fields contain Unix timestamps showing when a message was last sent and received over each connection. The pingtime field shows the last measurement of latency across each connection. If the number of peers drops, no new messages are being exchanged, or the latency becomes high, this likely points to a networking issue.
  • Use getblockchaininfo to check the number of blocks in each node’s blockchain. If one node falls significantly behind the others, this likely points to a networking or system overload issue.
  • Use getmempoolinfo to check how many transactions are in each node’s memory pool, i.e. have not yet been confirmed on that node’s copy of the blockchain. If the size field grows large, this points to a network or system overload issue, or a network-wide mining problem.
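The three checks above can be combined into a single evaluation per node, sketched below. It assumes the caller has already fetched the getpeerinfo list, the block count from getblockchaininfo and the size value from getmempoolinfo, and knows the highest block count seen across the cluster. The function name node_warnings and every threshold are arbitrary assumptions to be tuned per deployment.

```python
def node_warnings(peers, blocks, best_blocks, mempool_size, now,
                  min_peers=1, max_silence=120, max_ping=2.0,
                  max_lag=5, max_mempool=1000):
    """Return a list of warning strings for one node (empty if healthy).

    peers        -- list of getpeerinfo entries (dicts)
    blocks       -- this node's block count
    best_blocks  -- highest block count among the cluster's nodes
    mempool_size -- number of transactions in this node's memory pool
    now          -- current Unix timestamp
    """
    warnings = []
    if len(peers) < min_peers:
        warnings.append("too few peers")
    # Most recent send or receive on any connection, if there are any peers.
    latest = max(
        (max(p.get("lastsend", 0), p.get("lastrecv", 0)) for p in peers),
        default=None,
    )
    if latest is not None and now - latest > max_silence:
        warnings.append("no recent messages")
    if any(p.get("pingtime", 0) > max_ping for p in peers):
        warnings.append("high latency")
    if best_blocks - blocks > max_lag:
        warnings.append("behind on blocks")
    if mempool_size > max_mempool:
        warnings.append("large memory pool")
    return warnings
```

A node that reports warnings could be taken out of the read rotation, or skipped when choosing a failover target for writing.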