This is a really cool finding, thanks Pieter!

I did some more analysis on selecting a good P value to reduce total data downloaded considering both filters themselves and blocks in the case of false positive matches, using data from mainnet. The quantity it minimizes is:

filter_size(N, B) + block_size * false_positive_probability(C, N, B)

N is the number of filter elements per block
B is the Golomb-Rice coding parameter
C is the number of filter elements watched by the client

The main result is that:

For C = 10, B = 13 is optimal
For C = 100, B = 16 is optimal
For C = 1,000, B = 20 is optimal
For C = 10,000, B = 23 is optimal

So any value of B in the range 16 to 20 seems reasonable, with M = 1.4971 * 2^B for optimal compression, as Pieter derived. The selection of the parameter depends on the target number of elements that a client may watch.

I attached some of the results, and would be happy to share the CSV and raw notebook if people are interested.


On Fri, May 25, 2018 at 2:14 PM Gregory Maxwell via bitcoin-dev <bitcoin-dev@lists.linuxfoundation.org> wrote:
On Fri, May 25, 2018 at 6:42 PM, Gregory Maxwell <greg@xiph.org> wrote:
> configuration is roughly right, then M=1569861 and rice parameter 19
> should be used.

That should have been M=784931 B=19  ... paste error.
_______________________________________________
bitcoin-dev mailing list
bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev