Why do I sometimes get a stream publish error?

+2 votes

Hi

I have used MultiChain for years, but starting this morning, after an application restart, I receive the following error from time to time:

"this transaction was rejected. This may be because you are sharing private keys between nodes, and another node has spent the funds used by this transaction."

I have two Multichain nodes sharing a private key. They write very little to the blockchain, at most once per second. I only get the error on one of the nodes (and never on the other node), on about 50% of all transactions.

I use Multichain version 1.0.8. The multichaind process has not been restarted for over a year.

Both nodes write to streams, and sometimes to the same stream. Each node writes only a single transaction at a time.

The written JSON is small, about 1KB.

Why do I get this error?

The error has been reported earlier (https://www.multichain.com/qa/15147/publish-data-error), but there seems to have been no resolution for that.

 

asked Feb 18 by ilsundal

1 Answer

0 votes
You cannot send transactions (including publishing to a stream) simultaneously from two nodes, using the same address, with regular sending API commands. See the 'Approaches for writing' section on this page for a list of your options: https://www.multichain.com/developers/clustering-high-availability/
answered Feb 18 by MultiChain
Thanks, that's good to know, but this error also happens when the nodes are not sending transactions simultaneously, e.g. when just one of the nodes sends a transaction to the blockchain without any other activity.

Further, this error always happens on the same node, and only sometimes. It never happens on the other node.

My best theory is that the two nodes did once send transactions simultaneously from the same address, which left one of the nodes in a bad state that it cannot recover from. This theory is also supported by the observation that I have run the current code for two years without any problems (probably because the two nodes had not sent transactions simultaneously before).

I tried to restart both nodes, but that didn't help.
It is conceivable that this is the cause. You can test this theory by stopping the node that is causing problems, then restarting it with the -reindex command line parameter. Note that this reindexing will take some time. Please let us know what you find.
I tried applying the "-reindex" command line parameter, but that didn't help. The error still occurs regularly.

I am now also using only one node, not two. And the problem still occurs on that node.

I have enabled debugging and the error appears like this in the debug log:

2020-02-26 22:49:11 mchn: Asset Grouping. Group Size: 1. Group Count: 0
2020-02-26 22:49:11 CommitTransaction: 8a26599fd83b9857bf27b3d65ee0cdab68511ed8ef7bbcb51cd6f4d246d59ca8, vin: 1, vout: 2
2020-02-26 22:49:11 Missing tx (e3044df1651a21b635d5debfab70a1008cf789af32bace504d4b989d32413cf0)
2020-02-26 22:49:11 CommitTransaction() : Error: Transaction not valid:

Apparently some transaction is missing, but why is this? And more importantly, what can I do to fix it?
I have forwarded this to the dev team to take a look.
Please confirm the nodes are not at all connected to each other. You should see an empty output for getpeerinfo on the node causing problems.
I can confirm that. There is now only one node, with two applications writing transactions to it.

multichain@api1:~$ multichain-cli kyc getpeerinfo
{"method":"getpeerinfo","params":[],"id":"33220033-1582878018","chain_name":"kyc"}

[
]

And the issue is still there:

2020-02-28 08:00:00 mchn: Asset Grouping. Group Size: 1. Group Count: 0
2020-02-28 08:00:00 CommitTransaction: d4a8d4edab437a94299b1ccdc9bf0974f6e547e35e53d3d26e86102badfcaf7e, vin: 1, vout: 2
2020-02-28 08:00:00 mchn: Asset Grouping. Group Size: 1. Group Count: 0
2020-02-28 08:00:00 CommitTransaction: 0dfa8a2a3bcfa62e422ed1d98b5bd2bca2e690aff1607a511b69eb29668d34b9, vin: 1, vout: 2
2020-02-28 08:00:00 Missing tx (e3044df1651a21b635d5debfab70a1008cf789af32bace504d4b989d32413cf0)
2020-02-28 08:00:00 CommitTransaction() : Error: Transaction not valid:
2020-02-28 08:00:00 mchn: Asset Grouping. Group Size: 1. Group Count: 0
2020-02-28 08:00:00 CommitTransaction: 88fa0b8ac69a6d8ee2b8959fe95a1cc67a9aa9fc8a04a0c990a08ed1b618fdd7, vin: 1, vout: 2
2020-02-28 08:00:00 Missing tx (e3044df1651a21b635d5debfab70a1008cf789af32bace504d4b989d32413cf0)
2020-02-28 08:00:00 CommitTransaction() : Error: Transaction not valid:
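The log excerpts above can be summarized mechanically. The following is only a rough sketch: it assumes the debug.log line formats are exactly as quoted in this thread, and that a "Missing tx" line belongs to the most recent "CommitTransaction:" line before it. The function name summarize_missing_inputs is made up for illustration.

```python
import re
from collections import Counter

# Patterns mirror the debug.log lines quoted in this thread (assumption:
# other MultiChain versions may format these lines differently).
COMMIT_RE = re.compile(r"CommitTransaction: ([0-9a-f]{64}), vin: \d+, vout: \d+")
MISSING_RE = re.compile(r"Missing tx \(([0-9a-f]{64})\)")


def summarize_missing_inputs(log_text):
    """Return (Counter of missing txids, list of (rejected_txid, missing_txid)).

    Assumes each "Missing tx" line refers to the closest preceding
    "CommitTransaction:" line, as in the excerpts above.
    """
    missing_counts = Counter()
    rejected = []
    last_commit = None
    for line in log_text.splitlines():
        commit = COMMIT_RE.search(line)
        if commit:
            last_commit = commit.group(1)
            continue
        missing = MISSING_RE.search(line)
        if missing:
            missing_counts[missing.group(1)] += 1
            rejected.append((last_commit, missing.group(1)))
    return missing_counts, rejected
```

Run against the excerpts in this thread, every rejection points at the same missing transaction (e3044df1...), on both 2020-02-26 and 2020-02-28, which suggests that one unavailable input is being selected repeatedly rather than a new conflict arising each time.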
OK, there's not much more we can do at this point except take a look at your blockchain and node state here. Are you able to stop this node, zip up its blockchain directory, and send it to us at multichain dot debug at gmail dot com?
The directory is 370MB, not feasible to be sent over email.

Is there another way? Also, perhaps you can tell me exactly which files you need? Some files in the directory contain confidential information.
Without access to the blockchain directory or node, it is hard to get to the bottom of this. But here is another suggestion from the team for a more thorough reindexing:

1. Call getblockcount to see chain height
2. Stop ALL nodes - those with the same key and those with different keys (these nodes may have conflicting transactions in their mempools)
3. Restart ALL these nodes, ONE BY ONE, with -reindex: wait for the chain to reach the height obtained in step 1, stop the node, and ONLY THEN repeat the procedure for the next node
4. After step 3 is performed on ALL nodes, restart MultiChain normally
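For a single node, step 3 above might be scripted roughly as follows. This is a sketch under assumptions, not a tested procedure: it assumes multichaind and multichain-cli are on the PATH of the host running the node, that the node answers getblockcount while it reindexes, and that the block count is the last non-empty line of the CLI output. All function names are hypothetical, and the target height must be recorded (step 1) before the nodes are stopped.

```python
import subprocess
import time


def parse_block_count(cli_output):
    """Take the last non-empty line of multichain-cli output as the height."""
    lines = [line.strip() for line in cli_output.splitlines() if line.strip()]
    return int(lines[-1])


def get_height(chain, run=subprocess.run):
    """Return the node's block count, or None while RPC is not answering."""
    try:
        result = run(["multichain-cli", chain, "getblockcount"],
                     capture_output=True, text=True, check=True)
        return parse_block_count(result.stdout)
    except (subprocess.CalledProcessError, ValueError, IndexError):
        return None


def wait_for_height(read_height, target, poll_seconds=10, sleep=time.sleep):
    """Poll until read_height() reports at least `target` blocks."""
    while True:
        height = read_height()
        if height is not None and height >= target:
            return height
        sleep(poll_seconds)


def reindex_one_node(chain, target, run=subprocess.run, sleep=time.sleep):
    """Step 3 for one already-stopped node on this host."""
    # Start the node in the background with a full reindex.
    run(["multichaind", chain, "-reindex", "-daemon"], check=True)
    # Wait until the chain is back at the height recorded in step 1.
    wait_for_height(lambda: get_height(chain, run), target, sleep=sleep)
    # Stop this node before moving on to the next one.
    run(["multichain-cli", chain, "stop"], check=True)
```

A call such as reindex_one_node("kyc", target) would be made on each host in turn, with step 4 (a normal restart of every node) done only after all of them have finished.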
Thanks, but this is a production environment so I cannot stop everything.

As a solution, I chose to delete the faulty blockchain and recreate it from another non-faulty blockchain. This seems to have solved the issue, and without any downtime.
...