Failure after less than 2 million transactions


Hi,

Could you please explain the cause of this? I was trying to verify the wallet results published on your blog (http://www.multichain.com/blog/2016/07/announcing-the-new-multichain-wallet/). I ran a loop of 6 million iterations, each transferring one unit of an asset from one node to another. I had up to 100 million units of that asset. After about 2 million transactions, I got this error on the receiving node (which is also the first node and the only mining node):

terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc

and this on the sending node (accessing multichain via python json-rpc):

ConnectionError: HTTPConnectionPool(host='localhost', port=8358): Max retries exceeded with url: / (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7f2911db9cd0>: Failed to establish a new connection: [Errno 111] Connection refused',))

Also, I confirmed that there was enough disk space on both nodes.

Is there a cap on the maximum number of transactions, or a similar setting, that I should have changed?

R.

 

asked Feb 7, 2017 by Rosevelt
edited Feb 7, 2017 by Rosevelt
When you say you got this error on the receiving node:

terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc

I assume this means that multichaind stopped running on the receiving node. If so, are you able to restart it from the command line, or do you keep seeing that error when you do so?

The issue with the sending node is probably that its memory pool of unconfirmed transactions became full, because the receiving node stopped running and it is the only node mining the blocks that confirm transactions. If so, it's a consequence of the other issue.
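One way to check whether the memory pool is backing up is to poll it over JSON-RPC from the test script. The following is only a sketch: it assumes the node answers the getmempoolinfo call inherited from Bitcoin Core (which reports a "size" field with the number of unconfirmed transactions), and that call is a function you supply which sends one JSON-RPC request to the node and returns its result. The name wait_for_mempool and its parameters are made up for illustration and are not part of MultiChain.

```python
import time


def wait_for_mempool(call, max_txs, poll_seconds=1.0, sleep=time.sleep):
    """Wait until the node's mempool holds at most max_txs transactions.

    call  -- assumed helper: call(method, *params) performs a JSON-RPC request
    sleep -- injectable so the function can be exercised without real delays
    Returns the number of times the mempool was polled.
    """
    polls = 0
    while True:
        # getmempoolinfo is assumed to return a dict with a "size" entry
        size = call("getmempoolinfo")["size"]
        polls += 1
        if size <= max_txs:
            return polls
        sleep(poll_seconds)
```

Calling this every few thousand sends would pause the load test whenever blocks stop being mined, instead of letting unconfirmed transactions pile up.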
I get the same error message a few seconds after restarting the first node, but the second node restarted fine. I also noticed that it took longer to start.
Can you point me in the right direction to read about the memory pool of unconfirmed transactions? Also, is there something that can be done about this? For instance, how do I recover my first node?
Before you recover the first node, we would really appreciate an opportunity to debug this, since it seems like a bug in MultiChain which is triggered by what you're doing, which we don't see in our own testing. If you're willing to help, could you please do the following:

a) stop both nodes if they are currently running, using this on both servers:

multichain-cli [chain-name] stop

b) (g)zip up the ~/.multichain/[chain-name] directory on both nodes, e.g. using tar -cvzf [file name] ~/.multichain/[chain-name]

c) email those two zipped files, along with the script you're using to generate the test transactions, to us at multichain dot debug at gmail dot com.

d) please confirm here which version of MultiChain you're using.

Once you've done that, you can try recovering the broken node by running multichaind with the -reindex flag, keeping your load testing paused until the node has reached the end of the blockchain as stored. But we'd really appreciate it if you could take the snapshot, as described above, before doing so. Thanks!
MultiChain Core Daemon build 1.0 alpha 27 protocol 10007
Thanks, we got the files and will take a look. For now, another question - did you ever stop multichaind on the receiving node by directly terminating the process or shutting down the server, rather than issuing a 'stop' command via the API?
No, I didn't. It was a new setup. However, I had exactly the same error the day before, caused by running out of space on my VM node, so I had to extend the VM disk space on all the nodes and delete the old experiment folders before starting again.

1 Answer


Thanks, we took a look at your files. This appears to be a simple out-of-memory issue. Every time you send the asset to the receiving node, a new unspent transaction output (UTXO) is created in the receiving node's wallet. These UTXOs take up memory and cause it to eventually run out. In a real pattern of usage, the receiving wallet would also presumably be sending out transactions, and this is how UTXOs get consumed.

If you want to test throughput for this kind of usage pattern, you will need to make heavy use of the combineunspent API on the receiving node to keep down the number of UTXOs in its wallet. If the receiving address also has send permissions, MultiChain also automatically combines UTXOs (see the autocombine runtime parameters) but the default settings will not be able to keep up with such an intensive test.
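As a rough illustration of the first approach, the test loop could ask the receiving node to combine its UTXOs at regular intervals. This is a sketch under assumptions, not a tuned benchmark: make_rpc and send_and_combine are hypothetical helper names, the combine_every interval is arbitrary, and the transfer uses sendasset (sendassettoaddress in older builds). combineunspent is called with only the address, leaving its other parameters at their defaults, and is assumed to return a list of the combining transaction IDs.

```python
import base64
import json
import urllib.request


def make_rpc(url, user, password):
    """Return call(method, *params) for one node's JSON-RPC endpoint (hypothetical helper)."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()

    def call(method, *params):
        body = json.dumps({"id": 1, "method": method, "params": list(params)}).encode()
        req = urllib.request.Request(
            url,
            data=body,
            headers={"Content-Type": "application/json",
                     "Authorization": "Basic " + token},
        )
        # urlopen raises urllib.error.HTTPError if the node answers with an error status
        with urllib.request.urlopen(req) as resp:
            reply = json.load(resp)
        if reply.get("error"):
            raise RuntimeError(reply["error"])
        return reply["result"]

    return call


def send_and_combine(sender, receiver, to_address, asset, total, combine_every=10000):
    """Send `total` single units and combine the receiver's UTXOs periodically.

    sender, receiver -- call(method, *params) functions, one per node
    Returns the number of combining transactions the receiver reported.
    """
    combines = 0
    for i in range(1, total + 1):
        sender("sendasset", to_address, asset, 1)
        if i % combine_every == 0:
            # assumed to return a list of txids, one per combining transaction
            combines += len(receiver("combineunspent", to_address))
    return combines
```

For the setup in the question, sender might be built with make_rpc("http://localhost:8358", user, password), with the credentials taken from that node's multichain.conf, and receiver built the same way against the first node's RPC port. Because combineunspent by default only combines confirmed outputs and is limited per call, the interval may need adjusting to keep the UTXO count flat.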

But the simpler option is to run a performance test that does not inflate the number of UTXOs. For example, you could use the publish API to create stream items, or send assets to the chain's burnaddress (shown by getinfo), which doesn't create UTXOs in any node's wallet.
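Both alternatives can be sketched in a few lines of Python. The function names here are hypothetical, and call again stands for whatever function you use to send a single JSON-RPC request and return its result. The sketch assumes that the stream already exists and the sending address may write to it, that the burn address is allowed to receive on this chain, and that publish takes the item data as a hexadecimal string.

```python
def send_to_burn_address(call, asset, total):
    """Send `total` single units of `asset` to the chain's burn address.

    Returns the burn address read from getinfo.
    """
    burn = call("getinfo")["burnaddress"]
    for _ in range(total):
        # sendasset is called sendassettoaddress in older builds
        call("sendasset", burn, asset, 1)
    return burn


def publish_items(call, stream, total):
    """Publish `total` small items to an existing stream; returns the item count."""
    for i in range(total):
        key = f"key{i}"
        data_hex = f"item {i}".encode().hex()  # item payload, hex-encoded
        call("publish", stream, key, data_hex)
    return total
```

Either loop exercises transaction throughput without adding outputs to the receiving node's wallet, so memory use there should stay roughly constant over the run.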

answered Feb 12, 2017 by MultiChain