MultiChain performance drop at high transaction count after a short period of time

+1 vote

Hi.

I am running a performance test with MultiChain 1.0.6 on an AWS c5.4xlarge instance (single node).

I am using 10 workers on the same machine (so no network latency), which send as fast as possible (but synchronously), writing random key-value pairs to the root stream (key: 64 bytes, value: 128 bytes).
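Each worker is essentially doing the following (a simplified, self-contained sketch rather than my actual code; the RPC port, credentials and the stream name "root" are placeholders for whatever the node is configured with):

// Simplified sketch of one worker: a synchronous publish loop against the
// node's JSON-RPC interface. Port, credentials and stream name are placeholders.
package main

import (
    "bytes"
    "crypto/rand"
    "encoding/hex"
    "encoding/json"
    "fmt"
    "io"
    "net/http"
)

const rpcURL = "http://rpcuser:rpcpassword@127.0.0.1:7446/" // placeholder port/credentials

// randomHex returns n random bytes encoded as a hex string (2n characters).
func randomHex(n int) string {
    b := make([]byte, n)
    if _, err := rand.Read(b); err != nil {
        panic(err)
    }
    return hex.EncodeToString(b)
}

// rpcCall sends one synchronous JSON-RPC request and checks the HTTP status.
func rpcCall(client *http.Client, method string, params ...interface{}) error {
    body, err := json.Marshal(map[string]interface{}{
        "method": method,
        "params": params,
        "id":     1,
    })
    if err != nil {
        return err
    }
    resp, err := client.Post(rpcURL, "application/json", bytes.NewReader(body))
    if err != nil {
        return err
    }
    defer resp.Body.Close()
    io.Copy(io.Discard, resp.Body) // drain so the connection can be reused
    if resp.StatusCode != http.StatusOK {
        return fmt.Errorf("rpc error: %s", resp.Status)
    }
    return nil
}

func main() {
    client := &http.Client{}
    for {
        key := randomHex(32)    // 64-character key (64 bytes of ASCII)
        value := randomHex(128) // 128 random bytes, hex-encoded as publish expects
        if err := rpcCall(client, "publish", "root", key, value); err != nil {
            fmt.Println("publish failed:", err)
        }
    }
}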

 

I noticed a slow but steady drop in transaction throughput (tx/s).

Throughput started at 1,100 tx/s and keeps getting lower over time; 24 hours after I started the test, the rate is down to only 440 tx/s.

Tx/s: https://snapshot.raintank.io/dashboard/snapshot/3RYKP94zrQvc1Gbl8MPwSE9XmI7mu7YE

#TX: https://snapshot.raintank.io/dashboard/snapshot/avBNI3h1YM391Kz3KigVKPhqeUQ1Cbzn

CPU usage is constant at around 20%.

Memory is slowly rising: https://snapshot.raintank.io/dashboard/snapshot/eOEmIrenI3uQ00LKqvOktbk7UsmRp8HR

Disk usage is high (the blocks folder is small, so most of it seems to be indexing):

https://snapshot.raintank.io/dashboard/snapshot/UP8euPezGN2A1b3K4kmPr3Dhr6FDqNqR

Is this expected behaviour? I am quite curious how far the transaction rate will drop, since we are already below our minimum expectations.

asked Aug 4, 2018 by Alexoid

2 Answers

0 votes
Your problem here is unlikely to be related to the sending transaction side, but rather to the fact that this node is subscribed to the root stream. If a node is subscribed to a stream, it is indexing that stream's content in real time, in about 10 different ways. Inevitably, inserting into an index gets slower as that index increases in size.

If you remove this subscription, you'll still see some slowdown, as the number of transactions in the node's wallet increases over time, but it should be a much weaker effect. If you wanted to, you could also remove that effect by building transactions externally so that they are not stored in the node's wallet.
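For example, assuming the root stream has kept its default name "root", the subscription can be dropped with a single unsubscribe call (substitute your own chain name):

multichain-cli <chain-name> unsubscribe root

You can check whether the node is still subscribed by looking at the subscribed field in the output of liststreams.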
answered Aug 5, 2018 by MultiChain
But you do have to be subscribed to a stream to read its data, don't you? For many scenarios, writing to a stream you are not subscribed to is not an option.

I mean, it is OK that it has to do a lot of indexing, but it does not seem to scale over time...
You only have to be subscribed to a stream to use the stream APIs like liststreamitems, liststreamkeys, liststreamkeyitems, etc...

To look at it another way, if you have a database table with 10 different indexes, you would expect that insertion would be significantly slower after it contains 150 million rows than when it was empty, so you might not want all those indexes.

So the long term solution to this is to give more fine-grained control over the type of indexing that is performed by a stream subscriber. We expect to offer this kind of control in the premium/enterprise version of MultiChain.

But for now there is certainly no requirement in MultiChain to subscribe to a stream in order to write to it. For example, you can keep a record of each txid externally, and then retrieve transactions using getwallettransaction (if this node was the publisher) or getrawtransaction (if txindex=1, which is the default).
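As a rough illustration, if you have recorded a txid externally, either of the following retrieves the transaction without any stream subscription (the trailing 1 asks getrawtransaction for decoded JSON output; substitute your chain name and txid):

multichain-cli <chain-name> getwallettransaction <txid>
multichain-cli <chain-name> getrawtransaction <txid> 1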
>> But for now there is certainly no requirement in MultiChain to subscribe to a stream in order to write to it <<
Yes, but if you also want to read, you have to be subscribed. And most of the scenarios I am working on, of course, also include reading.

>>  For example you can keep a record of each txid externally, and then retrieve transactions using getwallettransaction (if this node was the publisher) or getrawtransaction (if txindex=1, as is default). <<
This is not really an option in a decentralized network, unless every node has this kind of logic locally.
What I also noticed: after a restart of MultiChain
- Memory consumption dropped from 30% back to 3%
- Tx/s rose from 330 back to 480
We're running some tests here to see if we can reproduce this behavior. It would be helpful if you could share the code that is generating the requests.
Nothing special. Just a Go program with 5 worker threads sending synchronous batch requests (50 publish calls each) to the root stream with random keys (64 bytes) and values (128 bytes).
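The structure is roughly the per-call loop from the sketch earlier in this thread, except that the 50 publish requests are marshalled into one JSON-RPC array and posted in a single HTTP request. A simplified, self-contained version (again with placeholder port/credentials and the stream name "root"):

// Simplified sketch of one batch: 50 publish calls posted as a single
// JSON-RPC array. Port, credentials and stream name are placeholders.
package main

import (
    "bytes"
    "crypto/rand"
    "encoding/hex"
    "encoding/json"
    "fmt"
    "io"
    "net/http"
)

const rpcURL = "http://rpcuser:rpcpassword@127.0.0.1:7446/" // placeholder

func randomHex(n int) string {
    b := make([]byte, n)
    if _, err := rand.Read(b); err != nil {
        panic(err)
    }
    return hex.EncodeToString(b)
}

// sendBatch posts 50 publish requests in one round trip and checks the HTTP status.
func sendBatch(client *http.Client) error {
    batch := make([]map[string]interface{}, 50)
    for i := range batch {
        batch[i] = map[string]interface{}{
            "method": "publish",
            "params": []interface{}{"root", randomHex(32), randomHex(128)},
            "id":     i,
        }
    }
    body, err := json.Marshal(batch)
    if err != nil {
        return err
    }
    resp, err := client.Post(rpcURL, "application/json", bytes.NewReader(body))
    if err != nil {
        return err
    }
    defer resp.Body.Close()
    io.Copy(io.Discard, resp.Body) // drain so the connection can be reused
    if resp.StatusCode != http.StatusOK {
        return fmt.Errorf("batch failed: %s", resp.Status)
    }
    return nil
}

func main() {
    client := &http.Client{}
    for { // each worker thread runs this loop
        if err := sendBatch(client); err != nil {
            fmt.Println(err)
        }
    }
}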

Something else I noticed: every time I restart MultiChain, the multichain folder drops by about 20 GB of disk usage. Is this expected?
(see: https://snapshot.raintank.io/dashboard/snapshot/mtEJzLmBD3Y9Su7TefqsitNMoUAtQP2C?orgId=2&from=now-2d&to=now )

The node is now at:
- 250 million transactions
- 200 tx/s
A question about the memory consumption: where are you getting this statistic from? Is it private reserved memory by the multichaind process, is it total memory consumption of the operating system, or something else? If it's total OS memory, this is normal behavior – it will be caching some disk in memory to improve performance, and the memory would be released if it was needed for regular usage.
Another question: our own tests in this scenario show the main performance bottleneck is I/O, rather than CPU. Are you using a hard disk drive or SSD for this Amazon instance?
Hi. At the moment I am using EBS gp2 volumes with the highest possible IOPS setting (10,000 IOPS), but I can also try an io1 volume with up to 30,000 IOPS.
For the memory usage, this is the relevant line from the top command:
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 6492 ubuntu    20   0 6445616 3.898g  35672 S  86.0 25.8   1108:21 multichaind
Yes, please try with the faster drives and let us know what you see. The Amazon website suggests these are better suited for high volume database workloads. As for the reserved memory size, I will ask about this and get back to you.
Could you also post the output of getwalletinfo and getblockchaininfo?
getwalletinfo:
{
    "walletversion" : 60000,
    "balance" : 0.00000000,
    "walletdbversion" : 2,
    "txcount" : 294408818,
    "utxocount" : 1,
    "keypoololdest" : 1533307082,
    "keypoolsize" : 2
}

getblockchaininfo:
{
    "chain" : "main",
    "chainname" : "performance",
    "description" : "MultiChain performance",
    "protocol" : "multichain",
    "setupblocks" : 60,
    "reindex" : false,
    "blocks" : 62622,
    "headers" : 62622,
    "bestblockhash" : "00389a339d0d84550b7b24b07058a9b2cffa5cd08f5fba7fb574ebf49ad23c17",
    "difficulty" : 0.00000006,
    "verificationprogress" : 1.00000000,
    "chainwork" : "0000000000000000000000000000000000000000000000000000000000f49f00",
    "chainrewards" : 0.00000000
}
I have now changed the volume type from gp2 to io1 and increased IOPS from 10,000 to 32,000. This will take a while to have an impact on performance. I have also checked the metrics of the EBS volume, but they looked OK to me. Let's see if the throughput changes now; the volume should no longer be a bottleneck, so at the very least tx/s should increase if disk I/O was the limiting factor.
I checked the volume metrics together with an AWS technician. There seem to be no issues, and there are no thresholds that would slow down the instance I/O-wise.

Nevertheless, I tripled the number of IOPS and used a high-performance io1 volume, but this had no impact on the tx/s.
The node is still running at ~200 tx/s.
Thanks for your reply. We would like to simulate your exact situation to see what is causing the bottleneck and excess memory usage. You can post the full details of the setup (including OS, MultiChain version, and the Go code) here, or, if you prefer private correspondence, please email multichain dot debug at gmail.
Just an update on the "autosubscribe" topic:
I also see a tx/s drop when not subscribed to the streams I write to:
https://snapshot.raintank.io/dashboard/snapshot/w85YT3GZGmpMRuv570JbRzHGrH501oAo?orgId=2
Yes, we would expect some drop if the node is not subscribed, since the transactions are still being included in the wallet and indexed there. But it seems to be a weaker effect, which we'd also expect.
OK, so we have to be aware of this behavior, since many scenarios use a blockchain to write a lot of hashes and check for existence / timestamp / publisher. It seems that we have an upper bound on the tx/s that depends on the target blockchain size (which is not always known upfront). I am also curious to see whether the tx/s stabilizes at some point or slowly approaches 0.

Thanks for the information and your support so far.
+2 votes
Just a note for anyone else following this thread. The issue with increasing memory usage has been resolved. There's an unofficial 1.0.6.1 release of MultiChain available here (Linux only):

https://www.multichain.com/download/multichain-1.0.6.1.tar.gz

And the issue was also fixed in the official release of MultiChain 2.0 alpha 4.
answered Aug 13, 2018 by MultiChain
...