How do streams work under the hood?

+5 votes

As far as I know, blockchains do not provide a way to retrieve data efficiently. For example, if I'm looking for txs with a desired property (e.g. an output to a particular address), I have to iterate over every block and every tx in those blocks and collect the matching txs.
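
To make the brute-force approach concrete, here is a minimal sketch of that full scan. The block and tx layout (`txs`, `outputs`, `address`, `txid`) is a made-up toy structure for illustration, not any real node's data format.

```python
# Toy chain layout (hypothetical): a list of blocks, each holding a list of txs.
chain = [
    {"txs": [
        {"txid": "t1", "outputs": [{"address": "addr_a"}]},
        {"txid": "t2", "outputs": [{"address": "addr_b"}]},
    ]},
    {"txs": [
        {"txid": "t3", "outputs": [{"address": "addr_b"}, {"address": "addr_c"}]},
    ]},
]


def find_txs_paying(chain, address):
    """Return every tx that has at least one output to `address`.

    Visits every tx in every block, so the cost grows with the
    size of the whole chain, not with the number of matches.
    """
    matches = []
    for block in chain:
        for tx in block["txs"]:
            if any(out["address"] == address for out in tx["outputs"]):
                matches.append(tx)
    return matches


matching_ids = [tx["txid"] for tx in find_txs_paying(chain, "addr_b")]
```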

However, if I understand correctly, MultiChain seems to provide a way to do the above task efficiently with streams. How does it do that?

Here it says it indexes data like regular DBs. Where are these indexes stored? Are they stored off-chain, in local memory?

In general, how do streams achieve efficient retrieval of data?

asked May 14, 2018 by TinfoilHat

1 Answer

+3 votes
Best answer
If a node is subscribed to a stream, it is indexing that stream's content in real-time, in many different ways. These indexes are stored by the local node in embedded LevelDB databases, and they are sufficient to allow rapid retrieval of stream items using any of the APIs offered.
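
As a rough illustration of the idea (not MultiChain's actual code or on-disk layout), the sketch below updates several lookup tables as each stream item arrives, so later queries read the index instead of rescanning blocks. Plain Python dicts stand in for the embedded LevelDB databases, and the class, method, and field names are hypothetical.

```python
from collections import defaultdict


class StreamIndex:
    """Toy per-stream index kept by a subscribed node (illustrative only)."""

    def __init__(self):
        # One table per access pattern; dicts stand in for LevelDB here.
        self.by_key = defaultdict(list)
        self.by_publisher = defaultdict(list)
        self.in_order = []

    def add_item(self, txid, key, publisher):
        """Called once per item as it is seen, i.e. indexing happens on write."""
        self.in_order.append(txid)
        self.by_key[key].append(txid)
        self.by_publisher[publisher].append(txid)

    def items_for_key(self, key):
        """Direct lookup: no block traversal needed."""
        return list(self.by_key.get(key, []))

    def items_for_publisher(self, publisher):
        return list(self.by_publisher.get(publisher, []))


index = StreamIndex()
index.add_item("t1", key="invoice-1", publisher="addr_a")
index.add_item("t2", key="invoice-2", publisher="addr_b")
index.add_item("t3", key="invoice-1", publisher="addr_b")
```

The trade-off is extra local storage and some work per incoming item in exchange for lookups that do not depend on chain length.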
answered May 15, 2018 by MultiChain
selected May 15, 2018 by TinfoilHat
Hello, and thank you for the answer. Just to make sure: LevelDB stores only the indexes, not the data itself (the transactions), right?
LevelDB stores indexes and transactions. But if the stream item data size is larger than 256 bytes, it is omitted from the LevelDB index, and replaced with a pointer to the data on disk (within the block). So large items are only stored once.
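
A minimal sketch of that inline-versus-pointer rule, assuming only the 256-byte threshold stated above. The entry layout and the `block_file` and `offset` parameters are hypothetical and do not reflect MultiChain's real record format.

```python
INLINE_LIMIT = 256  # bytes, the threshold mentioned in the answer above


def make_index_entry(data: bytes, block_file: str, offset: int) -> dict:
    """Build a toy index entry for one stream item.

    Items of up to INLINE_LIMIT bytes are copied into the entry;
    larger items are replaced by a pointer to where the bytes
    already sit inside the block on disk, so they are stored once.
    """
    if len(data) > INLINE_LIMIT:
        return {
            "kind": "pointer",
            "file": block_file,   # hypothetical: which block file holds the data
            "offset": offset,     # hypothetical: byte position within that file
            "length": len(data),
        }
    return {"kind": "inline", "data": data}


small = make_index_entry(b"x" * 256, "blk00000.dat", 1024)
large = make_index_entry(b"x" * 257, "blk00000.dat", 1024)
```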