Port number problem after a node's IP address was changed

+2 votes
I have 3 servers on AWS, let's call them S1, S2 and S3, which are all on the same chain. S1 used IP address 10.0.1.250 and port 2775, S2 uses 10.0.1.233:2775 and S3 uses 10.0.1.248:2775. The S1 instance crashed and had to be rebuilt, which caused its IP address to change to 10.0.3.238. I have tried connecting S2 and S3 to S1, and S1 to S2 and S3, by using the addnode command, but updates I perform on S1 do not seem to be propagating to S2 and S3. S2 and S3 are still communicating with each other, but S1 appears to be cut off completely.
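
For reference, the reconnection attempts were along these lines (the exact commands I ran may have varied slightly; the addresses are the ones listed above):

On S2 and S3, pointing at the new S1 address:

C:\multichain>multichain-cli chain1 addnode 10.0.3.238:2775 add

On S1, pointing at S2 and S3:

C:\multichain>multichain-cli chain1 addnode 10.0.1.233:2775 add
C:\multichain>multichain-cli chain1 addnode 10.0.1.248:2775 add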

When I run getpeerinfo on each server I get peculiar results: I expected all the port numbers to be 2775, but some of them are completely different. Here is the output from S1:

C:\multichain>multichain-cli chain1 getpeerinfo
{"method":"getpeerinfo","params":[],"id":"87620777-1576427198","chain_name":"chain1"}

[
{
"id" : 4736,
"addr" : "10.0.1.233:2775",
"addrlocal" : "10.0.3.238:60924",
"services" : "0000000000000001",
"lastsend" : 1576427194,
"lastrecv" : 1576427194,
"bytessent" : 2648,
"bytesrecv" : 6428,
"conntime" : 1576427161,
"pingtime" : 0.09599,
"version" : 70002,
"subver" : "/MultiChain:0.2.0.11/",
"handshakelocal" : "1E6yHM3qBbh7WBL2X8whecLJch6cB5eh9WcvX6",
"handshake" : "1Qon9APHsiY89PV4J5k2CRqmRGzTv1RqUgpTbz",
"inbound" : false,
"encrypted" : false,
"startingheight" : 1152,
"banscore" : 0,
"synced_headers" : 1152,
"synced_blocks" : 963,
"inflight" : [
964,
965,
966,
967,
968,
969,
970,
971,
972,
973,
974,
975,
976,
977,
978,
979
],
"whitelisted" : false
}
]

C:\multichain>

Here is the output from S2:

C:\multichain>multichain-cli chain1 getpeerinfo
{"method":"getpeerinfo","params":[],"id":"42786950-1576427533","chain_name":"chain1"}

[
{
"id" : 2,
"addr" : "10.0.1.248:2775",
"addrlocal" : "10.0.1.233:51214",
"services" : "0000000000000001",
"lastsend" : 1576427532,
"lastrecv" : 1576427532,
"bytessent" : 1146430,
"bytesrecv" : 1146518,
"conntime" : 1576248189,
"pingtime" : 0.046886,
"version" : 70002,
"subver" : "/MultiChain:0.2.0.11/",
"handshakelocal" : "1Qon9APHsiY89PV4J5k2CRqmRGzTv1RqUgpTbz",
"handshake" : "1VYiJ7xrc9X1cZxyPCwXf93x6bEpZ9xT7xjsLE",
"inbound" : false,
"encrypted" : false,
"startingheight" : 1152,
"banscore" : 0,
"synced_headers" : 1152,
"synced_blocks" : 1152,
"inflight" : [
],
"whitelisted" : false
},
{
"id" : 2374,
"addr" : "10.0.3.238:61014",
"addrlocal" : "10.0.1.233:2775",
"services" : "0000000000000001",
"lastsend" : 1576427524,
"lastrecv" : 1576427524,
"bytessent" : 6976,
"bytesrecv" : 2840,
"conntime" : 1576427463,
"pingtime" : 0.062524,
"version" : 70002,
"subver" : "/MultiChain:0.2.0.11/",
"handshakelocal" : "1Qon9APHsiY89PV4J5k2CRqmRGzTv1RqUgpTbz",
"handshake" : "1E6yHM3qBbh7WBL2X8whecLJch6cB5eh9WcvX6",
"inbound" : true,
"encrypted" : false,
"startingheight" : 1075,
"banscore" : 0,
"synced_headers" : -1,
"synced_blocks" : -1,
"inflight" : [
],
"whitelisted" : false
}
]

C:\multichain>

Here is the output from S3:

C:\multichain>multichain-cli chain1 getpeerinfo
{"method":"getpeerinfo","params":[],"id":"49302041-1576427671","chain_name":"chain1"}

[
{
"id" : 2,
"addr" : "10.0.1.233:51214",
"addrlocal" : "10.0.1.248:2775",
"services" : "0000000000000001",
"lastsend" : 1576427670,
"lastrecv" : 1576427671,
"bytessent" : 1147382,
"bytesrecv" : 1147294,
"conntime" : 1576248189,
"pingtime" : 0.03125,
"version" : 70002,
"subver" : "/MultiChain:0.2.0.11/",
"handshakelocal" : "1VYiJ7xrc9X1cZxyPCwXf93x6bEpZ9xT7xjsLE",
"handshake" : "1Qon9APHsiY89PV4J5k2CRqmRGzTv1RqUgpTbz",
"inbound" : true,
"encrypted" : false,
"startingheight" : 1152,
"banscore" : 0,
"synced_headers" : 1152,
"synced_blocks" : 1152,
"inflight" : [
],
"whitelisted" : false
},
{
"id" : 2378,
"addr" : "10.0.3.238:61069",
"addrlocal" : "10.0.1.248:2775",
"services" : "0000000000000001",
"lastsend" : 1576427668,
"lastrecv" : 1576427668,
"bytessent" : 1219,
"bytesrecv" : 1218,
"conntime" : 1576427648,
"pingtime" : 0.062477,
"version" : 70002,
"subver" : "/MultiChain:0.2.0.11/",
"handshakelocal" : "1VYiJ7xrc9X1cZxyPCwXf93x6bEpZ9xT7xjsLE",
"handshake" : "1E6yHM3qBbh7WBL2X8whecLJch6cB5eh9WcvX6",
"inbound" : true,
"encrypted" : false,
"startingheight" : 1075,
"banscore" : 0,
"synced_headers" : -1,
"synced_blocks" : -1,
"inflight" : [
],
"whitelisted" : false
}
]

C:\multichain>

How can I get all 3 nodes to communicate with each other?
asked Dec 15, 2019 by Tony Marston

2 Answers

0 votes

First, the surprising port numbers you're seeing are due to the fact that each connection between peers is initiated in one particular direction. The port number on the initiator's side will not be 2775, because it is an ephemeral outgoing port rather than the listening port, and cannot accept incoming connections. You'll see that for any connection where inbound is false, the addr port is 2775, and for any connection where inbound is true, the addrlocal port is 2775.
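
For instance, picking out just the relevant fields from S2's output above:

"addr" : "10.0.1.248:2775",  "inbound" : false     <- outbound connection from S2 to S3, so the addr port is the listening port 2775
"addr" : "10.0.3.238:61014", "addrlocal" : "10.0.1.233:2775", "inbound" : true     <- inbound connection from S1, so only the addrlocal port is 2775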

Second, it looks like the nodes are successfully connected to each other. My best guess as to the problem is that the clocks on the nodes are too far out of sync with each other, so they are not accepting each other's data. The output of these API calls shows a skew of 5+ minutes.

answered Dec 15, 2019 by MultiChain
That time difference was due to the delay between logging on to each server, running the getpeerinfo command, and copying the output. I have just logged on to each of the 3 servers and the clocks are all in perfect sync, so that cannot be the problem.

Is there any significance in the number of 'inflight' entries in the data for S1?

When I try to update the node labels for each of the 3 nodes, the S2 and S3 nodes show each other's updates but not the change for S1. Server S1 still does not show any changes for S2 and S3.

I have tried using S1 to update the permissions of S2 and S3, but these all show as 'pending'.

The 3 nodes may be able to see each other, but there is no traffic between S1 and the other 2 nodes.
I've forwarded this to the dev team to take a closer look.
If you see "inflight" entries in getpeerinfo, it means S1 is still not synchronized with S2 and S3. You can also check "synced_blocks" on all 3 nodes. While in this state, transactions sent by S1 are likely to be ignored by S2/S3, and vice versa, transactions made by S2/S3 may be ignored by S1. For example, you will not be able to change permissions for S2/S3 from S1.

But this out-of-sync state is temporary. If, for some reason, it takes too much time, you can try starting multichaind with -debug=net and checking in debug.log why the blocks are not being processed normally.
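
For example, on Windows you can stop the node and restart it in debug mode like this (debug.log is written in the chain's data directory):

C:\multichain>multichain-cli chain1 stop
C:\multichain>multichaind chain1 -debug=net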

Can you please also clarify what you mean by "update the node labels"?
By "update the node label" I mean using the "publishfrom" command to change the label for a node's address.

I have run -debug=net on S1 and produced a log file, but it is too large to include in this textbox. Do you have an email address so that I can send it as an attachment?
Yes, please send it to multichain dot debug at gmail dot com.
Thanks - the file has been received and we are discussing internally. It looks like the issue was caused by the fact that the protocol was upgraded while the network was disconnected, and you have mining-diversity set low enough to allow a long-distance fork. Will confirm more details shortly.
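
For reference, the chain's mining-diversity setting (along with the other blockchain parameters) can be checked on any node with:

multichain-cli chain1 getblockchainparams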
0 votes
Thanks for your patience. We have concluded that this is a bug in MultiChain that was triggered by the unusual case of the protocol being upgraded on a node while it was not in sync with the network. It will be fixed in the next maintenance release. In the meantime your workaround, assuming this is not a production network, is to destroy S1 and recreate it, attaching it to the network. Then perform the upgrade only once it is in sync. Please let me know if this resolves the situation.
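
A rough sketch of the recreate-and-reconnect step, using S2 as the seed node (the new S1 address below is a placeholder for whatever address the rebuilt node generates):

On the rebuilt S1:

multichaind chain1@10.0.1.233:2775

multichaind will print the new node's address, which then needs to be granted connect permission from an admin node, e.g. on S2:

grant <new-S1-address> connect,send,receive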
answered Dec 22, 2019 by MultiChain
Sure, send them over please and we'll take a look.
We checked the directories you sent to us. The problem is that your chain doesn't have a miner node. I suppose it was S1 (the first node you created), but after the crash the address with mining permission was permanently lost. MultiChain doesn't allow spending of unconfirmed outputs sent from other nodes (S2/S3), and since these outputs are never confirmed, you cannot publish from S1.

To fix the problem, you should grant mining permission to one of the remaining addresses, say, S2. Please note that you have 3 admins in your chain - S2, S3 and the "lost S1". You need at least two of them to approve a mining grant, so you should run this from both S2 and S3:

grant 1Qon9APHsiY89PV4J5k2CRqmRGzTv1RqUgpTbz mine

When the first block is mined (after several seconds), you should be able to see outputs in 'listunspent 1' and to publish from S1.
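
To verify that the grant has been confirmed, you can run, for example:

listpermissions mine
listunspent 1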

If you want to continue to use this chain, I would also suggest revoking permissions from the "lost S1" address:

On S2 and S3:

revoke 1E6yHM3qBbh7WBL2X8whecLJch6cB5eh9WcvX6 admin,mine
Thanks for that suggestion, but I have deleted all the multichain directories on all 3 nodes and restarted from scratch. Everything is now working as expected.
I have tried restoring the previous data and followed your suggestions, but any change to permissions on any node is marked as "pending", as shown in the following:

C:\multichain>multichain-cli chain1 listpermissions * 1VYiJ7xrc9X1cZxyPCwXf93x6bEpZ9xT7xjsLE true
{"method":"listpermissions","params":["*","1VYiJ7xrc9X1cZxyPCwXf93x6bEpZ9xT7xjsLE",true],"id":"11576586-1579174595","chain_name":"chain1"}

[
    {
        "address" : "1VYiJ7xrc9X1cZxyPCwXf93x6bEpZ9xT7xjsLE",
        "for" : null,
        "type" : "connect",
        "startblock" : 0,
        "endblock" : 4294967295,
        "admins" : [
            "1VYiJ7xrc9X1cZxyPCwXf93x6bEpZ9xT7xjsLE"
        ],
        "pending" : [
        ]
    },
    {
        "address" : "1VYiJ7xrc9X1cZxyPCwXf93x6bEpZ9xT7xjsLE",
        "for" : null,
        "type" : "send",
        "startblock" : 0,
        "endblock" : 4294967295,
        "admins" : [
            "1VYiJ7xrc9X1cZxyPCwXf93x6bEpZ9xT7xjsLE"
        ],
        "pending" : [
        ]
    },
    {
        "address" : "1VYiJ7xrc9X1cZxyPCwXf93x6bEpZ9xT7xjsLE",
        "for" : null,
        "type" : "receive",
        "startblock" : 0,
        "endblock" : 4294967295,
        "admins" : [
            "1VYiJ7xrc9X1cZxyPCwXf93x6bEpZ9xT7xjsLE"
        ],
        "pending" : [
        ]
    },
    {
        "address" : "1VYiJ7xrc9X1cZxyPCwXf93x6bEpZ9xT7xjsLE",
        "for" : null,
        "type" : "issue",
        "startblock" : 0,
        "endblock" : 4294967295,
        "admins" : [
            "1VYiJ7xrc9X1cZxyPCwXf93x6bEpZ9xT7xjsLE"
        ],
        "pending" : [
        ]
    },
    {
        "address" : "1VYiJ7xrc9X1cZxyPCwXf93x6bEpZ9xT7xjsLE",
        "for" : null,
        "type" : "create",
        "startblock" : 0,
        "endblock" : 4294967295,
        "admins" : [
            "1VYiJ7xrc9X1cZxyPCwXf93x6bEpZ9xT7xjsLE"
        ],
        "pending" : [
        ]
    },
    {
        "address" : "1VYiJ7xrc9X1cZxyPCwXf93x6bEpZ9xT7xjsLE",
        "for" : null,
        "type" : "mine",
        "startblock" : 0,
        "endblock" : 0,
        "admins" : [
            "1Qon9APHsiY89PV4J5k2CRqmRGzTv1RqUgpTbz",
            "1E6yHM3qBbh7WBL2X8whecLJch6cB5eh9WcvX6"
        ],
        "pending" : [
            {
                "startblock" : 0,
                "endblock" : 4294967295,
                "admins" : [
                    "1VYiJ7xrc9X1cZxyPCwXf93x6bEpZ9xT7xjsLE"
                ],
                "required" : 1
            }
        ]
    },
    {
        "address" : "1VYiJ7xrc9X1cZxyPCwXf93x6bEpZ9xT7xjsLE",
        "for" : null,
        "type" : "admin",
        "startblock" : 0,
        "endblock" : 4294967295,
        "admins" : [
            "1E6yHM3qBbh7WBL2X8whecLJch6cB5eh9WcvX6"
        ],
        "pending" : [
            {
                "startblock" : 0,
                "endblock" : 4294967295,
                "admins" : [
                    "1VYiJ7xrc9X1cZxyPCwXf93x6bEpZ9xT7xjsLE"
                ],
                "required" : 1
            }
        ]
    },
    {
        "address" : "1VYiJ7xrc9X1cZxyPCwXf93x6bEpZ9xT7xjsLE",
        "for" : null,
        "type" : "activate",
        "startblock" : 0,
        "endblock" : 0,
        "admins" : [
        ],
        "pending" : [
            {
                "startblock" : 0,
                "endblock" : 4294967295,
                "admins" : [
                    "1VYiJ7xrc9X1cZxyPCwXf93x6bEpZ9xT7xjsLE"
                ],
                "required" : 1
            }
        ]
    }
]

C:\multichain>
Sorry, my mistake. It is impossible to restore the chain if the private key of the only miner (1E6yHM3qBbh7WBL2X8whecLJch6cB5eh9WcvX6) is lost. This is because any new mining grant has to be confirmed in a block, and there is no miner left in the chain to confirm it.

I'm afraid that, unless you backed up the wallet of S1 before the crash, the chain is useless. You can still see the data published on that chain, but you cannot mine new blocks.
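
If such a backup did exist, the miner's key could in principle be restored on a new node with something like the following (the key itself is a placeholder here):

importprivkey <backed-up-private-key-of-1E6yHM3qBbh7WBL2X8whecLJch6cB5eh9WcvX6>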
...