Compressing raw exchange transaction

+1 vote
The binary size of a raw MultiChain partial exchange transaction is about 239 bytes. Is there a good way to compress it, or to generate shorter raw transactions in the first place? Would changing the MultiChain source code to generate smaller transactions be recommended? What are the risks?
asked Mar 23, 2017 by kakkoiiman

2 Answers

+2 votes

Good question with a long answer, which we'll split into two. Here's a raw exchange offer created in MultiChain, 221 bytes in size, broken down in hexadecimal byte by byte:

01000000           transaction: version number
01                 transaction: number of inputs
 ee1881c148d8742b  first input: previous txid (bytes reversed)
 368c73e93f824914
 6ee40b86cef5e608
 3d86e1830e1f357a
 00000000          first input: previous vout
 6a                first input: script length
 47                first input: signature length
 30440220667d633d  first input: signature
 9088f413bbc5588f
 7a2c56b6fdf36c47
 1fd4dd0650e76805
 7a1502e502206700
 8fd5bf421e8e207c
 6e072750c698e877
 4709efa26ebbeccc
 d9300e87c68f83
 21                first input: public key length
 02840b3512eb1433  first input: public key
 2671d7ecba9d91db
 adab4c50b1d3daaa
 0573d60fc6eeb42b
 ca
 ffffffff          first input: "sequence number"
01                 transaction: number of outputs
 0000000000000000  first output: native currency quantity
 37                first output: script length
 76                first output: OP_DUP opcode
 a9                first output: OP_HASH160 opcode
 14                first output: address length byte
 ec1ba3cf752a7308  first output: destination address
 4343254fa4cad51c
 12ef2c9a
 88                first output: OP_EQUALVERIFY opcode
 ac                first output: OP_CHECKSIG opcode
 1c                first output: metadata length byte
 73706b71          first output: metadata: spkq prefix
 fbaf7dd0781bb053  first output: metadata: MultiChain asset identifier (first 16 bytes of issuance txid reversed)
 1cc81295813b40e8
 1027000000000000  first output: metadata: MultiChain asset quantity
 75                first output: OP_DROP opcode
00000000           transaction: "lock time"

 

The table above is encoded as follows:

  • Regular typeface: data that will be the same in every exchange offer (assuming the chain has no native currency, and only one asset is being requested) and can be removed if you want to compress that offer for storage, then reinserted of course after retrieval to recreate the raw offer.
  • Bold typeface: data that will be unique to every exchange offer and should be kept in full in the compressed version.
  • Italic typeface: data that can be removed or compressed under certain conditions, listed below.

Here are some further notes about individual items:

  • first input: previous vout – assuming the input to the exchange was created with preparelockunspent(from), this will easily be in the range 0...255 so can be represented as a single byte.
  • first input: script length – this can be calculated as the first input's signature length + 35 (in base 10).
  • first input: public key – if the input was created with preparelockunspent(from), this public key would have been used to sign the inputs of the previous transaction referred to by the input's previous txid, so it can be retrieved from there (this requires nodes to be running with -txindex=1, which is the default).
  • first output: destination address – this can be calculated from the first input's public key using the usual method of calculating an address from a public key (see the first few steps here: http://www.multichain.com/developers/address-key-format/)
  • first output: metadata: MultiChain asset identifier – if there are a limited number of known assets on the blockchain, this can be reduced to a byte or two, which your application resolves through a lookup table.
  • first output: metadata: MultiChain asset quantity – assuming the number of raw units is known to be smaller than 256^x, this can be reduced to x bytes.
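
To make the notes above concrete, here is a rough Python sketch that walks the 221-byte offer from the breakdown and checks the derivable fields. The names parse_offer and min_quantity_bytes are hypothetical helpers written for this illustration, not part of MultiChain, and the parser assumes exactly one input, one output and single-byte length fields, as in this offer. The constant 35 in the script length rule is the signature length byte, the public key length byte and the 33-byte public key.

```python
# The 221-byte offer from the breakdown, one string per field.
RAW_OFFER = bytes.fromhex(
    "01000000" "01"
    "ee1881c148d8742b" "368c73e93f824914" "6ee40b86cef5e608" "3d86e1830e1f357a"
    "00000000" "6a" "47"
    "30440220667d633d" "9088f413bbc5588f" "7a2c56b6fdf36c47"
    "1fd4dd0650e76805" "7a1502e502206700" "8fd5bf421e8e207c"
    "6e072750c698e877" "4709efa26ebbeccc" "d9300e87c68f83"
    "21"
    "02840b3512eb1433" "2671d7ecba9d91db" "adab4c50b1d3daaa" "0573d60fc6eeb42b" "ca"
    "ffffffff" "01" "0000000000000000" "37"
    "76" "a9" "14"
    "ec1ba3cf752a7308" "4343254fa4cad51c" "12ef2c9a"
    "88" "ac" "1c" "73706b71"
    "fbaf7dd0781bb053" "1cc81295813b40e8"
    "1027000000000000" "75"
    "00000000"
)


def parse_offer(raw: bytes) -> dict:
    """Split a one-input, one-output offer into the fields of the breakdown."""
    pos = 0

    def take(n: int) -> bytes:
        nonlocal pos
        chunk = raw[pos:pos + n]
        if len(chunk) != n:
            raise ValueError("offer is shorter than expected")
        pos += n
        return chunk

    fields = {"version": take(4)}
    if take(1) != b"\x01":
        raise ValueError("expected exactly one input")
    fields["prev_txid"] = take(32)
    fields["prev_vout"] = int.from_bytes(take(4), "little")
    fields["script_len"] = take(1)[0]
    fields["sig_len"] = take(1)[0]
    fields["signature"] = take(fields["sig_len"])
    fields["pubkey"] = take(take(1)[0])
    fields["sequence"] = take(4)
    if take(1) != b"\x01":
        raise ValueError("expected exactly one output")
    fields["native_qty"] = int.from_bytes(take(8), "little")
    script = take(take(1)[0])
    fields["lock_time"] = take(4)
    # Output script layout: 76 a9 14 <20> 88 ac 1c "spkq" <16> <8> 75
    fields["address"] = script[3:23]
    fields["asset_id"] = script[30:46]
    fields["asset_qty"] = int.from_bytes(script[46:54], "little")
    return fields


def min_quantity_bytes(quantity: int) -> int:
    """Smallest x such that quantity < 256**x (at least one byte)."""
    return max(1, (quantity.bit_length() + 7) // 8)


offer = parse_offer(RAW_OFFER)
print("script length rule holds:", offer["script_len"] == offer["sig_len"] + 35)
print("vout fits in one byte:", offer["prev_vout"] <= 0xFF)
print("bytes needed for quantity:", min_quantity_bytes(offer["asset_qty"]))
```

Looking up the public key from the previous transaction and deriving the address from it are left out here, since both depend on your node setup and on the address format described at the link above.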
answered Mar 23, 2017 by MultiChain
edited Mar 23, 2017 by MultiChain
+1 vote

So, a lot depends on the characteristics of your application, but if you apply all the techniques here, allowing up to 65536 assets on the blockchain, and up to 2^32 raw units per asset, the data can be reduced to 111 bytes as follows:

 ee1881c148d8742b  first input: previous txid (bytes reversed)
 368c73e93f824914
 6ee40b86cef5e608
 3d86e1830e1f357a
 00                first input: previous vout
 47                first input: signature length
 30440220667d633d  first input: signature
 9088f413bbc5588f
 7a2c56b6fdf36c47
 1fd4dd0650e76805
 7a1502e502206700
 8fd5bf421e8e207c
 6e072750c698e877
 4709efa26ebbeccc
 d9300e87c68f83
 0100              first output: metadata: MultiChain asset index
 10270000          first output: metadata: MultiChain asset quantity

 

There are absolutely no risks in this method – you're just removing redundant information from the raw hexadecimal for storage, then rebuilding that raw hexadecimal after retrieving this information.
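
As a sanity check on the strip-and-rebuild idea, here is a minimal Python sketch of the round trip for this particular offer layout (one input, one output, no native currency, 33-byte public key). The functions compress and decompress and the asset lookup dictionaries are hypothetical, written only for this illustration. The 2-byte asset index is assumed to be little-endian, like the quantity. The public key and the 20-byte address value are passed in by the caller, standing in for values your application would retrieve and derive as described in the first part.

```python
RAW = bytes.fromhex(
    "01000000" "01"
    "ee1881c148d8742b" "368c73e93f824914" "6ee40b86cef5e608" "3d86e1830e1f357a"
    "00000000" "6a" "47"
    "30440220667d633d" "9088f413bbc5588f" "7a2c56b6fdf36c47"
    "1fd4dd0650e76805" "7a1502e502206700" "8fd5bf421e8e207c"
    "6e072750c698e877" "4709efa26ebbeccc" "d9300e87c68f83"
    "21"
    "02840b3512eb1433" "2671d7ecba9d91db" "adab4c50b1d3daaa" "0573d60fc6eeb42b" "ca"
    "ffffffff" "01" "0000000000000000" "37"
    "76a914" "ec1ba3cf752a7308" "4343254fa4cad51c" "12ef2c9a" "88ac"
    "1c" "73706b71" "fbaf7dd0781bb053" "1cc81295813b40e8" "1027000000000000" "75"
    "00000000"
)


def compress(raw: bytes, asset_index: dict) -> bytes:
    """Keep txid, vout (1 byte), signature, asset index (2) and quantity (4)."""
    vout = int.from_bytes(raw[37:41], "little")
    sig_len = raw[42]
    # The offer ends with: asset id (16) + quantity (8) + OP_DROP (1) + lock time (4).
    asset_id = raw[-29:-13]
    qty = int.from_bytes(raw[-13:-5], "little")
    if vout > 0xFF or qty >= 2**32:
        raise ValueError("vout or quantity does not fit the compact form")
    return (
        raw[5:37]
        + bytes([vout, sig_len])
        + raw[43:43 + sig_len]
        + asset_index[asset_id].to_bytes(2, "little")
        + qty.to_bytes(4, "little")
    )


def decompress(blob: bytes, asset_ids: dict, pubkey: bytes, pubkey_hash: bytes) -> bytes:
    """Reinsert the constant and derivable bytes to rebuild the raw offer."""
    vout, sig_len = blob[32], blob[33]
    sig_end = 34 + sig_len
    asset_id = asset_ids[int.from_bytes(blob[sig_end:sig_end + 2], "little")]
    qty = blob[sig_end + 2:sig_end + 6] + b"\x00" * 4  # pad 4 bytes back to 8
    metadata = b"spkq" + asset_id + qty
    out_script = (
        bytes.fromhex("76a914") + pubkey_hash + bytes.fromhex("88ac")
        + bytes([len(metadata)]) + metadata + b"\x75"
    )
    return (
        bytes.fromhex("01000000" "01")
        + blob[:32]
        + vout.to_bytes(4, "little")
        + bytes([sig_len + 35, sig_len])  # script length = signature length + 35
        + blob[34:sig_end]
        + bytes([len(pubkey)]) + pubkey
        + bytes.fromhex("ffffffff" "01" "0000000000000000")
        + bytes([len(out_script)]) + out_script
        + bytes.fromhex("00000000")
    )


# For the demo only, the public key and address value are sliced out of RAW.
ASSET_ID = RAW[-29:-13]
PUBKEY, PUBKEY_HASH = RAW[115:148], RAW[165:185]

blob = compress(RAW, {ASSET_ID: 1})
rebuilt = decompress(blob, {1: ASSET_ID}, PUBKEY, PUBKEY_HASH)
print(len(RAW), len(blob), rebuilt == RAW)
```

Comparing the rebuilt bytes against the original before discarding it, as the last line does, is a cheap way to confirm that a given offer really fits the assumptions of the compact form.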

answered Mar 23, 2017 by MultiChain
edited Mar 23, 2017 by MultiChain
Thanks for the great answer. To complete my understanding, I just want to clarify the following.
The original metadata size of asset identifier and quantity is a total of 24 (16+8) bytes.
In your example, it was reduced to 2 bytes of index with 4 bytes for qty.
So on retrieval it has to be reconstructed back to the exact 24 bytes:
a) The 2 bytes of index refer to a 16-byte asset identifier stored in a lookup table, so during reconstruction the 16 bytes are restored.
b) The 4 bytes for qty are padded with 4 bytes of '00' to restore the 8 bytes for qty.
Yes, correct on both points.
...