public inbox for bitcoindev@googlegroups.com
From: Jim Posen <jim.posen@gmail•com>
To: Gregory Maxwell <greg@xiph•org>
Cc: Bitcoin Protocol Discussion <bitcoin-dev@lists•linuxfoundation.org>
Subject: Re: [bitcoin-dev] BIP 158 Flexibility and Filter Size
Date: Wed, 23 May 2018 20:48:00 -0700
Message-ID: <CADZtCShDzPK_jqeOrK4XBoB2uriU9c9T8Dm7By-8ew3XOoAeQg@mail.gmail.com>
In-Reply-To: <CAFfwr8F+ghYb2HYEgC7Lh7Z-ytNE7EABr6cxiVXYhWLk-TPO7A@mail.gmail.com>


Greg, I've attached a graph including the input scripts.

In the top graph, we can see how the input script filter compares to the
input outpoint filter. The script filter is definitely smaller, as a result
of address reuse. The bottom graph shows the ratio over time of combining
the input prev script and output script filters vs keeping them separate.
In more recent blocks, the savings from combining appear to be decreasing.
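
For anyone who wants to reproduce this kind of comparison, here is a rough sketch of how the separate-vs-combined size ratio could be estimated. It is not the script used for the attached graph. The parameters P = 20 and M = 2^20 are assumptions, SHA-256 stands in for the keyed SipHash a real filter would use, and the function names are made up for illustration.

```python
import hashlib

P = 20        # assumed Golomb-Rice parameter (remainder bits per element)
M = 1 << P    # assumed inverse false-positive rate


def gcs_size_bits(items, p=P, m=M):
    """Return the Golomb-Rice coded size, in bits, of a set of byte strings."""
    items = set(items)
    n = len(items)
    if n == 0:
        return 0
    f = n * m
    # Map each element uniformly into [0, n * m). SHA-256 is a stand-in here;
    # it is an assumption of this sketch, not what a real filter hashes with.
    values = sorted(
        (int.from_bytes(hashlib.sha256(x).digest()[:8], "big") * f) >> 64
        for x in items
    )
    bits = 0
    prev = 0
    for v in values:
        delta = v - prev
        # Unary-coded quotient, one terminator bit, then p remainder bits.
        bits += (delta >> p) + 1 + p
        prev = v
    return bits


def split_vs_combined_ratio(a, b):
    """Size of two separate filters divided by the size of one combined filter."""
    combined = gcs_size_bits(set(a) | set(b))
    if combined == 0:
        return 1.0  # both sets empty: nothing to compare
    return (gcs_size_bits(a) + gcs_size_bits(b)) / combined
```

With disjoint element sets the ratio stays close to 1, since each element costs between P + 1 and P + 2 bits on average either way. When the two sets overlap (as with address reuse between prev scripts and output scripts), the combined filter stores the shared elements once, so the ratio rises above 1.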

On Wed, May 23, 2018 at 6:04 PM Conner Fromknecht
<conner@lightning•engineering> wrote:

> Hi all,
>
> Jimpo, thanks for looking into those stats! I had always imagined that there
> would be more significant savings in having all filters in one bundle, as
> opposed to separate. These results are interesting, to say the least, and
> definitely offer us some flexibility in options for filter sharding.
>
> So far, the bulk of this discussion has centered around bandwidth. I am
> concerned, however, that splitting up the filters is at odds with the other
> goal of the proposal in offering improved privacy.
>
> Allowing clients to choose individual filter sets trivially exposes the type
> of data that client is interested in. This alone might be enough to
> fingerprint the function of a peer and reduce the anonymity set justifying
> their potential behavior.
>
> Furthermore, if a match is encountered and a block requested, full nodes have
> more targeted insight into what caused a particular match. They could infer
> that the client received funds in a particular block, e.g., if they are only
> requesting output scripts.
>
> This is above and beyond the additional complexity of now syncing, validating,
> and managing five or six distinct header/filter-header/filter/block chains.
>
> I agree that saving on bandwidth is an important goal, but bandwidth and
> privacy are always seemingly at odds. Strictly comparing the bandwidth
> requirements of a system that heavily weighs privacy to existing ones that
> don't, e.g. BIP37, is a losing battle IMO.
>
> I'm not fundamentally opposed to splitting the filters; I certainly see the
> arguments for flexibility. However, I also want to ensure we are considering
> the second-order effects that fall out of optimizing for one metric when
> others exist.
>
> Cheers,
> Conner
> On Wed, May 23, 2018 at 10:29 Gregory Maxwell via bitcoin-dev <
> bitcoin-dev@lists•linuxfoundation.org> wrote:
>
>> Any chance you could add a graph of input-scripts (instead of input
>> outpoints)?
>>
>> On Wed, May 23, 2018 at 7:38 AM, Jim Posen via bitcoin-dev
>> <bitcoin-dev@lists•linuxfoundation.org> wrote:
>> > So I checked filter sizes (as a proportion of block size) for each of the
>> > sub-filters. The graph is attached.
>> >
>> > As interpretation, the first ~120,000 blocks are so small that the
>> > Golomb-Rice coding can't compress the filters that well, which is why the
>> > filter sizes are so high proportional to the block size. The exception is
>> > the input filter: because the coinbase input is skipped, many of them have 0
>> > elements. But after block 120,000 or so, the filter compression converges
>> > pretty quickly to near the optimal value. The encouraging thing here is that
>> > if you look at the ratio of the combined size of the separated filters vs
>> > the size of a filter containing all of them (currently known as the basic
>> > filter), they are pretty much the same size. The mean of the ratio between
>> > them after block 150,000 is 99.4%. So basically, not much compression
>> > efficiency is lost by separating the basic filter into sub-filters.
>> >
>> > On Tue, May 22, 2018 at 5:42 PM, Jim Posen <jim.posen@gmail•com> wrote:
>> >>>
>> >>> My suggestion was to advertise a bitfield for each filter type the node
>> >>> serves, where the bitfield indicates what elements are part of the
>> >>> filters. This essentially removes the notion of decided filter types and
>> >>> instead leaves the decision to full-nodes.
>> >>
>> >>
>> >> I think it makes more sense to construct entirely separate filters for the
>> >> different types of elements and allow clients to download only the ones
>> >> they care about. If there are enough elements per filter, the compression
>> >> ratio shouldn't be much worse by splitting them up. This prevents the
>> >> exponential blowup in the number of filters that you mention, Johan, and it
>> >> works nicely with service bits for advertising different filter types
>> >> independently.
>> >>
>> >> So if we created three separate filter types, one for output scripts, one
>> >> for input outpoints, and one for TXIDs, each signaled with a separate
>> >> service bit, are people good with that? Or do you think there shouldn't be
>> >> a TXID filter at all, Matt? I didn't include the option of a prev output
>> >> script filter or rolling that into the block output script filter because
>> >> it changes the security model (cannot be proven to be correct/incorrect
>> >> succinctly).
>> >>
>> >> Then there's the question of whether to separate or combine the headers.
>> >> I'd lean towards keeping them separate because it's simpler that way.
>> >
>> >
>> >
>> > _______________________________________________
>> > bitcoin-dev mailing list
>> > bitcoin-dev@lists•linuxfoundation.org
>> > https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
>> >
>>
>
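
The idea quoted above, one filter type per element kind with each type signaled by its own service bit, could look roughly like this on the client side. This is only a sketch: the bit positions and names below are hypothetical and do not correspond to any assigned service bits.

```python
from enum import IntFlag


class FilterService(IntFlag):
    # Hypothetical bit positions, chosen only for illustration.
    OUTPUT_SCRIPTS = 1 << 0
    INPUT_OUTPOINTS = 1 << 1
    TXIDS = 1 << 2


def peer_serves(advertised: int, wanted: FilterService) -> bool:
    """Return True if the peer advertises every filter type in `wanted`."""
    return (advertised & wanted) == wanted
```

A client that only cares about output scripts and input outpoints would check `peer_serves(bits, FilterService.OUTPUT_SCRIPTS | FilterService.INPUT_OUTPOINTS)` before requesting filters from that peer. Note that the choice of bits a client asks for is itself visible to the peer, which is the privacy concern Conner raises.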

[-- Attachment #2: filter_sizes.svg --]
[-- Type: image/svg+xml, Size: 2833874 bytes --]
