public inbox for bitcoindev@googlegroups.com
 help / color / mirror / Atom feed
From: Olaoluwa Osuntokun <laolu32@gmail•com>
To: Pieter Wuille <pieter.wuille@gmail•com>
Cc: Bitcoin Dev <bitcoin-dev@lists•linuxfoundation.org>
Subject: Re: [bitcoin-dev] BIP 158 Flexibility and Filter Size
Date: Mon, 21 May 2018 18:15:22 -0700	[thread overview]
Message-ID: <CAO3Pvs_fSWN5mQeJzD302LuGyy-VFrNbv9r8JVPJ=v3-_c4kCQ@mail.gmail.com> (raw)
In-Reply-To: <CAPg+sBhL8ZV+kswgyQfQyhd0Qv5Mkt1cYxrfFV4H32s9QYLo0A@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 4101 bytes --]

Hi Y'all,

The script finished a few days ago with the following results:

reg-filter-prev-script total size:  161236078  bytes
reg-filter-prev-script avg:         16123.6078 bytes
reg-filter-prev-script median:      16584      bytes
reg-filter-prev-script max:         59480      bytes

Compared to the original median size of the same block range, but with the
current filter (has both txid, prev outpoint, output scripts), we see a
roughly 34% reduction in filter size (current median is 22258 bytes).
Compared to the suggested modified filter (no txid, prev outpoint, output
scripts), we see a 15% reduction in size (median of that was 19198 bytes).
This shows that script re-use is still pretty prevalent in the chain as of
recent.

One thing that occurred to me, is that on the application level, switching
to the input prev output script can make things a bit awkward. Observe that
when looking for matches in the filter, upon a match, one would need access
to an additional (outpoint -> script) map in order to locate _which_
particular transaction matched w/o access to an up-to-date UTOX set. In
contrast, as is atm, one can locate the matching transaction with no
additional information (as we're matching on the outpoint).

At this point, if we feel filter sizes need to drop further, then we may
need to consider raising the false positive rate.

Does anyone have any estimates or direct measures w.r.t how much bandwidth
current BIP 37 light clients consume? It would be nice to have a direct
comparison. We'd need to consider the size of their base bloom filter, the
accumulated bandwidth as a result of repeated filterload commands (to adjust
the fp rate), and also the overhead of receiving the merkle branch and
transactions in distinct messages (both due to matches and false positives).

Finally, I'd be open to removing the current "extended" filter from the BIP
as is all together for now. If a compelling use case for being able to
filter the sigScript/witness arises, then we can examine re-adding it with a
distinct service bit. After all it would be harder to phase out the filter
once wider deployment was already reached. Similarly, if the 16% savings
achieved by removing the txid is attractive, then we can create an
additional
filter just for the txids to allow those applications which need the
information to seek out that extra filter.

-- Laolu


On Fri, May 18, 2018 at 8:06 PM Pieter Wuille <pieter.wuille@gmail•com>
wrote:

> On Fri, May 18, 2018, 19:57 Olaoluwa Osuntokun via bitcoin-dev <
> bitcoin-dev@lists•linuxfoundation.org> wrote:
>
>> Greg wrote:
>> > What about also making input prevouts filter based on the scriptpubkey
>> being
>> > _spent_?  Layering wise in the processing it's a bit ugly, but if you
>> > validated the block you have the data needed.
>>
>> AFAICT, this would mean that in order for a new node to catch up the
>> filter
>> index (index all historical blocks), they'd either need to: build up a
>> utxo-set in memory during indexing, or would require a txindex in order to
>> look up the prev out's script. The first option increases the memory load
>> during indexing, and the second requires nodes to have a transaction index
>> (and would also add considerable I/O load). When proceeding from tip, this
>> doesn't add any additional load assuming that your synchronously index the
>> block as you validate it, otherwise the utxo set will already have been
>> updated (the spent scripts removed).
>>
>
> I was wondering about that too, but it turns out that isn't necessary. At
> least in Bitcoin Core, all the data needed for such a filter is in the
> block + undo files (the latter contain the scriptPubKeys of the outputs
> being spent).
>
> I have a script running to compare the filter sizes assuming the regular
>> filter switches to include the prev out's script rather than the prev
>> outpoint itself. The script hasn't yet finished (due to the increased I/O
>> load to look up the scripts when indexing), but I'll report back once it's
>> finished.
>>
>
> That's very helpful, thank you.
>
> Cheers,
>
> --
> Pieter
>
>

[-- Attachment #2: Type: text/html, Size: 6050 bytes --]

  reply	other threads:[~2018-05-22  1:15 UTC|newest]

Thread overview: 60+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-17 15:25 Matt Corallo
2018-05-17 15:43 ` Peter Todd
2018-05-17 15:46   ` Matt Corallo
2018-05-17 16:36 ` Gregory Maxwell
2018-05-17 16:59   ` Matt Corallo
2018-05-17 18:34     ` Gregory Maxwell
2018-05-17 20:19       ` Jim Posen
2018-05-17 20:45         ` Gregory Maxwell
2018-05-17 21:27           ` Jim Posen
2018-05-19  3:12             ` Olaoluwa Osuntokun
2018-05-21  8:35               ` Johan Torås Halseth
2018-05-22  1:16                 ` Olaoluwa Osuntokun
2018-05-22  9:23                   ` Johan Torås Halseth
2018-05-23  0:42                     ` Jim Posen
2018-05-23  7:38                       ` Jim Posen
2018-05-23  8:16                         ` Johan Torås Halseth
2018-05-23 17:28                         ` Gregory Maxwell
2018-05-24  1:04                           ` Conner Fromknecht
2018-05-24  3:48                             ` Jim Posen
2018-05-28 18:18                               ` Tamas Blummer
2018-05-28 18:28                                 ` Tamas Blummer
2018-05-28 19:24                                   ` Gregory Maxwell
2018-05-29  2:42                                     ` Jim Posen
2018-05-29  3:24                                       ` Gregory Maxwell
2018-05-29  4:01                                       ` Olaoluwa Osuntokun
2018-05-31 14:27                                         ` Tamas Blummer
2018-06-01  2:52                                         ` Olaoluwa Osuntokun
2018-06-01  4:15                                           ` Gregory Maxwell
     [not found]                                           ` <CAAS2fgSyVi0d_ixp-auRPPzPfFeffN=hsWhWT5=EzDO3O+Ue1g@mail.gmail.com>
2018-06-02  0:01                                             ` Olaoluwa Osuntokun
2018-06-02  0:22                                               ` Gregory Maxwell
2018-06-02  2:02                                                 ` Jim Posen
2018-06-02 12:41                                                   ` David A. Harding
2018-06-02 22:02                                                     ` Tamas Blummer
2018-06-03  0:28                                                       ` Gregory Maxwell
2018-06-03  5:14                                                         ` Tamas Blummer
2018-06-03  6:11                                                           ` Pieter Wuille
2018-06-03 16:44                                                             ` Tamas Blummer
2018-06-03 16:50                                                               ` Tamas Blummer
2018-06-08  5:03                                                             ` Olaoluwa Osuntokun
2018-06-08 16:14                                                               ` Gregory Maxwell
2018-06-08 23:35                                                                 ` Olaoluwa Osuntokun
2018-06-09 10:34                                                                   ` David A. Harding
2018-06-12 23:51                                                                     ` Olaoluwa Osuntokun
2018-06-09 15:45                                                                   ` Gregory Maxwell
2018-06-12 23:58                                                                     ` Olaoluwa Osuntokun
2018-05-17 18:34     ` Gregory Maxwell
2018-05-18  8:46   ` Riccardo Casatta
2018-05-19  3:08     ` Olaoluwa Osuntokun
2018-05-19  2:57   ` Olaoluwa Osuntokun
2018-05-19  3:06     ` Pieter Wuille
2018-05-22  1:15       ` Olaoluwa Osuntokun [this message]
2018-05-18  6:28 ` Karl-Johan Alm
2018-06-04  8:42   ` Riccardo Casatta
2018-06-05  1:08     ` Jim Posen
2018-06-05  4:33       ` Karl-Johan Alm
2018-06-05 17:22         ` Jim Posen
2018-06-05 17:52       ` Gregory Maxwell
2018-06-06  1:12     ` Olaoluwa Osuntokun
2018-06-06 15:14       ` Riccardo Casatta
2018-05-19  2:51 ` Olaoluwa Osuntokun

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAO3Pvs_fSWN5mQeJzD302LuGyy-VFrNbv9r8JVPJ=v3-_c4kCQ@mail.gmail.com' \
    --to=laolu32@gmail$(echo .)com \
    --cc=bitcoin-dev@lists$(echo .)linuxfoundation.org \
    --cc=pieter.wuille@gmail$(echo .)com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox