In a network where all nodes are generating transactions, why does one node perform much better than the others?

+2 votes
I have set up a small network of 5 nodes. Using a load testing tool, I make all 5 of them generate transactions at their maximum capacity for 5 minutes. In every experiment, one node outperforms the others, reaching around 100-110 tps, while all the other nodes achieve less than 15 tps. I have made sure that all the nodes start generating transactions at exactly the same time. All the nodes have exactly the same specs, and the same programs are running on them during the experiment. I don't understand why this is happening.
asked Oct 26, 2022 by maheen.ayesha
Which MultiChain version are you using?
multichain v2.1.2

1 Answer

0 votes
This is probably related to one node doing all the mining, and being run at saturated capacity so it doesn't get a chance to allow transactions from other nodes. (It also seems like you are running these nodes on weak cloud instances?) Anyway, version 2.3.1 which is coming out shortly should address this scenario.
answered Nov 2, 2022 by MultiChain
Thanks, I've discussed this with the development team. We have a few theories, you can say if any of them is true:

a) Some nodes are subscribed to the stream and some are not (the subscribed ones can be slower)
b) Some nodes have a full memory pool, see getmempoolinfo output
c) The nodes are not syncing, check that getblockcount matches

You can also see if there are any hints in debug.log as to why this is happening.
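Checks b) and c) can be scripted so that every node is sampled the same way. The following is only a rough sketch, assuming each node exposes the standard JSON-RPC interface with HTTP basic auth. The URL, credentials, and node names in the comment are hypothetical placeholders, and the helper names are made up for illustration.

```python
import base64
import json
import urllib.request


def rpc_call(url, user, password, method, params=()):
    """POST one JSON-RPC request and return its "result" field."""
    body = json.dumps({"id": 1, "method": method, "params": list(params)}).encode()
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json", "Authorization": f"Basic {token}"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)["result"]


def snapshot(call):
    """Collect block height and mempool size via a per-node call(method, params)."""
    mempool = call("getmempoolinfo", [])
    return {"blocks": call("getblockcount", []), "mempool_txs": mempool["size"]}


def lagging_nodes(snapshots):
    """Return the names of nodes whose block height is below the highest seen."""
    top = max(s["blocks"] for s in snapshots.values())
    return sorted(name for name, s in snapshots.items() if s["blocks"] < top)


# Hypothetical usage (addresses and credentials are placeholders):
# nodes = {"node1": ("http://10.0.0.1:8570", "multichainrpc", "secret")}
# snaps = {n: snapshot(lambda m, p, a=a: rpc_call(*a, m, p)) for n, a in nodes.items()}
# print(lagging_nodes(snaps))
```

Taking these snapshots at fixed intervals during the 5 minute run, rather than only once at the end, shows whether a node falls behind while under load.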

For our information, it would also be helpful to understand *exactly* what you're doing in these transactions. Which API are you calling and what are its parameters?
Thank you for these suggestions. We are using the publish API. We have not created a separate stream, so we pass "root", a key, and our JSON object to it. The API requests are sent automatically from JMeter, which we run for 5 minutes on all nodes simultaneously.
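For reference, a request of this shape might look like the sketch below. It assumes the MultiChain 2.x form of publish in which the data argument is an object with a "json" field; the thread does not say whether the payload is actually wrapped this way. The key and payload values are hypothetical.

```python
import json


def publish_request(stream, key, payload, request_id=1):
    """Build the JSON-RPC body for one publish call carrying a JSON data object."""
    return json.dumps({
        "id": request_id,
        "method": "publish",
        # Assumption: data is passed as {"json": ...}, not as a hex string.
        "params": [stream, key, {"json": payload}],
    })


# Hypothetical key and payload, for illustration only.
body = publish_request("root", "sensor-42", {"reading": 17})
print(body)
```

Posting the exact body that JMeter sends makes it easier to tell whether every node receives identical requests.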

a) We have not created a separate stream. We use root, and I think it already has permissions to write, read, and publish. None of the nodes are subscribed to any other stream.

b) We use blocknotify and run getmempoolinfo and getlastblockinfo whenever blocknotify is triggered. The getmempoolinfo results show that the mempool starts from 0, gradually increases, and then goes back to 0. The values differ between nodes, but all go down to 0. We start the next experiment only after it has gone back to 0 on all nodes.

c) We ran getlastblockinfo and we got the same value on all nodes. However, in our results of getlastblockinfo, there are some differences. Some blocks are skipped or out of order in some nodes.

For the debug file I am not sure what to look for. Mostly it says commit transaction.
Also, our mining diversity is set to 0.3, and mining turnover and mine-empty-round are both 0. Anyone-can-mine is also false.
Thanks. I assume nothing else is running on these systems which is slowing them down, i.e. you've checked the Windows Task Manager / Performance Monitor / etc...? Would you be interested in trying it out on Linux instead, or are you committed to using Windows?
Yes, no other programs are running on these PCs, only MultiChain, our backend server, and JMeter.
It will not be feasible for us to move to Linux at this stage. We are open to any suggestions to improve performance on this setup.
I also wanted to ask about parallel processing of v2.3.1. We have not noticed any changes. Does it need to be enabled specially?
Thanks for your reply. We are still mystified by this. Some more ideas from the team:

a) Perhaps the set of unspent transactions is larger on some nodes than others, which can slow down transaction creation. You can check this by running "listunspent 0" on each node and seeing the number of items in the result.

b) Please confirm all nodes are subscribed to the root stream, so the difference cannot be explained in terms of this subscription status. Look for the 'subscribed' field in the response from the liststreams command.

c) Please try running multichaind with the extra parameter -logcommittx=0 – this will reduce the amount of logging and this could make a difference depending on disk performance and disk driver configuration.

d) Please confirm, using an activity monitor / task manager on every Windows computer, that there is no process taking up a lot of CPU or memory. Background processes can sometimes cause surprises, especially on Windows.
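Checks a) and b) could be collected per node with something like the sketch below. It assumes a `call(method, params)` function that performs the JSON-RPC request against one node and returns the result; that function and both helper names are hypothetical.

```python
def unspent_count(call):
    """Number of unspent outputs, including unconfirmed ones (minconf 0)."""
    return len(call("listunspent", [0]))


def root_subscribed(call):
    """True if liststreams reports the root stream as subscribed on this node."""
    for stream in call("liststreams", []):
        if stream.get("name") == "root":
            return bool(stream.get("subscribed"))
    return False  # root stream not listed at all
```

Running both on all five nodes before and after a test run gives a side-by-side view of whether the fast node differs from the slow ones in either respect.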
...