Hi all,

I'm still doing a little more investigation before opening up a formal
bip PR, but getting close.  Here are some more findings.

After moving the compression from main.cpp to streams.h (CDataStream) it
was a simple matter to add compression to transactions as well. Results
as follows:

range = block size range
ubytes = average size of uncompressed transactions
cbytes = average size of compressed transactions
cmp_ratio% = compression ratio
datapoints = number of datapoints taken

range 	ubytes 	cbytes 	cmp_ratio% 	datapoints
0-250b 	220 	227 	-3.16 	23780
250-500b 	356 	354 	0.68 	20882
500-600 	534 	505 	5.29 	2772
600-700 	653 	608 	6.95 	1853
700-800 	757 	649 	14.22 	578
800-900  	822 	758 	7.77 	661
900-1KB 	954 	862 	9.69 	906
1KB-10KB  	2698 	2222 	17.64 	3370
10KB-100KB 	15463 	12092 	21.8 	15429


A couple of obvious observations.  Transactions don't compress well
below 500 bytes but do very well beyond 1KB where there are a great deal
of those large spam type transactions.   However, most transactions
happen to be in the < 500 byte range.  So the next step was to appy
bundling, or the creating of a "blob" for those smaller transactions, if
and only if there are multiple tx's in the getdata receive queue for a
peer.  Doing that yields some very good compression ratios.  Some
examples as follows:

The best one I've seen so far was the following where 175 transactions
were bundled into one blob before being compressed.  That yielded a 20%
compression ratio, but that doesn't take into account the savings from
the unneeded 174 message headers (24 bytes each) as well as 174 TCP
ACK's of 52 bytes each which yields and additional 76*174=13224 bytes,
making the overall bandwidth savings 32%, in this particular case.

*2015-11-18 01:09:09.002061 compressed blob from 79890 to 67426 txcount:175*

To be sure, this was an extreme example.  Most transaction blobs were in
the 2 to 10 transaction range.  Such as the following:

*2015-11-17 21:08:28.469313 compressed blob from 3199 to 2876 txcount:10*

But even here the savings are 10%, far better than the "nothing" we
would get without bundling, but add to that the 76 byte * 9 transaction
savings and we have a total 20% savings in bandwidth for transactions
that otherwise would not be compressible.

The same bundling was applied to blocks and very good compression ratios
are seen when sync'ing the blockchain.

Overall the bundling or blobbing of tx's and blocks seems to be a good
idea for improving bandwith use but also there is a scalability factor
here, when the system is busy, transactions are bundled more often,
compressed, sent faster, keeping message queue and network chatter to a
minimum.

I think I have enough information to put together a formal BIP with the
exception of which compression library to implement.  These tests were
done using ZLib but I'll also be running tests in the coming days with
LZO (Jeff Garzik's suggestion) and perhaps Snappy.  If there are any
other libraries that people would like me to get results for please let
me know and I'll pick maybe the top 2 or 3 and get results back to the
group.


On 13/11/2015 1:58 PM, Peter Tschipper wrote:
> Some further Block Compression tests results that compare performance
> when network latency is added to the mix.
>
> Running two nodes, windows 7, compressionlevel=6, syncing the first
> 200000 blocks from one node to another.  Running on a highspeed
> wireless LAN with no connections to the outside world.
> Network latency was added by using Netbalancer to induce the 30ms and
> 60ms latencies.
>
> From the data not only are bandwidth savings seen but also a small
> performance savings as well.  However, the overall the value in
> compressing blocks appears to be in terms of saving bandwidth.  
>
> I was also surprised to see that there was no real difference in
> performance when no latency was present; apparently the time it takes
> to compress is about equal to the performance savings in such a situation.
>
>
> The following results compare the tests in terms of how long it takes
> to sync the blockchain, compressed vs uncompressed and with varying
> latencies.
> uncmp = uncompressed
> cmp = compressed
>
> num blocks sync'd 	uncmp (secs) 	cmp (secs) 	uncmp 30ms (secs) 	cmp
> 30ms (secs) 	uncmp 60ms (secs) 	cmp 60ms (secs)
> 10000 	264 	269 	265 	257 	274 	275
> 20000 	482 	492 	479 	467 	499 	497
> 30000 	703 	717 	693 	676 	724 	724
> 40000 	918 	939 	902 	886 	947 	944
> 50000 	1140 	1157 	1114 	1094 	1171 	1167
> 60000 	1362 	1380 	1329 	1310 	1400 	1395
> 70000 	1583 	1597 	1547 	1526 	1637 	1627
> 80000 	1810 	1817 	1767 	1745 	1872 	1862
> 90000 	2031 	2036 	1985 	1958 	2109 	2098
> 100000 	2257 	2260 	2223 	2184 	2385 	2355
> 110000 	2553 	2486 	2478 	2422 	2755 	2696
> 120000 	2800 	2724 	2849 	2771 	3345 	3254
> 130000 	3078 	2994 	3356 	3257 	4125 	4006
> 140000 	3442 	3365 	3979 	3870 	5032 	4904
> 150000 	3803 	3729 	4586 	4464 	5928 	5797
> 160000 	4148 	4075 	5168 	5034 	6801 	6661
> 170000 	4509 	4479 	5768 	5619 	7711 	7557
> 180000 	4947 	4924 	6389 	6227 	8653 	8479
> 190000 	5858 	5855 	7302 	7107 	9768 	9566
> 200000 	6980 	6969 	8469 	8220 	10944 	10724
>
>