Is there a formula for estimating disk usage from streams?

+2 votes
I'd like to determine how quickly 16 GB of disk space might be consumed on a device with limited storage capacity in the following scenario:

I create a seed node on a high capacity server, defining a chain and a stream. The seed node subscribes to the stream. No other node subscribes to the stream.
I create 10,000 additional nodes, each on a small device. The seed node grants chain connect, send, and receive permissions to each node, as well as stream publish permission. Each node is then expected to invoke the publish command 4 times every 15 seconds, with a hex data payload of approximately 400 bytes on each invocation. If run continuously, that would be 23,040 publish requests per device per day.

How quickly will the devices use up 16 GB? If it takes less than a year, is there a procedure for archiving and offloading old stream information to free up disk space?

TIA for any insights.
asked Mar 19, 2018 by Ed

1 Answer

+1 vote

To estimate this for nodes that are not subscribed to the stream, you would do something like the following:

  1. Take the total amount of actual payload data.
  2. Add an overhead of approximately 200 bytes per transaction.
  3. Add an overhead of approximately 2 kilobytes per block.

Based on a rough calculation, you should be OK as long as your block time is not too low, since the per-block overhead (item 3 above) is incurred with every block.
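As a rough illustration of the three steps above, here is a minimal Python sketch. Several inputs are assumptions and not figures from this thread: the 15-second block time, treating the 400-byte payload as 400 raw bytes, reading "2 kilobytes" as 2,000 bytes, and taking 16 GB as 16 × 10^9 bytes. The function names are made up for the example. The transaction count passed in is the per-device figure from the question; which transactions count toward a given device's total is not settled here, so that number is left as a plain parameter to adjust.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def publishes_per_day(per_interval, interval_seconds):
    """Number of publish calls per day at a fixed rate."""
    return per_interval * SECONDS_PER_DAY // interval_seconds


def chain_bytes_per_day(tx_per_day, payload_bytes, block_time_seconds,
                        tx_overhead=200, block_overhead=2000):
    """Payload plus ~200 bytes per transaction plus ~2 KB per block."""
    blocks_per_day = SECONDS_PER_DAY / block_time_seconds
    tx_bytes = tx_per_day * (payload_bytes + tx_overhead)
    block_bytes = blocks_per_day * block_overhead
    return tx_bytes + block_bytes


def days_until_full(capacity_bytes, bytes_per_day):
    """Days until the given capacity is consumed at a constant rate."""
    return capacity_bytes / bytes_per_day


# 4 publishes every 15 seconds, as in the question.
tx = publishes_per_day(4, 15)

# Assumed values: 400 raw bytes per payload, 15-second block time.
per_day = chain_bytes_per_day(tx, payload_bytes=400, block_time_seconds=15)
days = days_until_full(16 * 10**9, per_day)

print(f"{tx} transactions/day, {per_day / 1e6:.1f} MB/day, "
      f"{days:.0f} days to fill 16 GB")
```

Changing `block_time_seconds` shows how much of the total comes from per-block overhead, and raising `tx_per_day` shows how sensitive the result is to the number of transactions stored.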

Also note that you can save quite a bit of space on the per-transaction overhead (item 2 above) by combining multiple stream items in a single transaction. This is supported in MultiChain 2.0 alpha versions.

There is currently no way of archiving and offloading old stream information.

answered Mar 20, 2018 by MultiChain
To calculate the actual size of the data payload (let's say the hexadecimal encoded payload as supplied to the publish method is 0950ef8a5dfc0474f56f5d0d9dae8f0e19bf59c28771a4697ec3f1c4e6012d53), should I take it as 64 bytes, or 32 bytes since it is just a hexadecimal representation?
Take it as 32 bytes, since it's stored everywhere as raw binary and the hexadecimal is just used for the API requests and responses.
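A quick way to check the raw size of a hex payload is to decode it and count the bytes. This is a small Python sketch using the example string from the comment above; the variable names are only for illustration.

```python
hex_payload = "0950ef8a5dfc0474f56f5d0d9dae8f0e19bf59c28771a4697ec3f1c4e6012d53"

# Two hex characters encode one byte, so the raw size is half the string length.
raw = bytes.fromhex(hex_payload)

print(len(hex_payload), "hex characters ->", len(raw), "raw bytes")
```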
...