public inbox for bitcoindev@googlegroups.com
* [bitcoin-dev] "Compressed" headers stream
@ 2017-08-28 15:50 Riccardo Casatta
  2017-08-28 16:13 ` Greg Sanders
       [not found] ` <CAAS2fgS3uG=4vgFuObPKA_5MstoGm4AabO=60fhV3EU_0dvejg@mail.gmail.com>
  0 siblings, 2 replies; 16+ messages in thread
From: Riccardo Casatta @ 2017-08-28 15:50 UTC (permalink / raw)
  To: bitcoin-dev

Hi everyone,

the Bitcoin headers are probably the most condensed and important piece of
data in the world, and demand for them is expected to grow.

When sending a stream of consecutive block headers, a common case in IBD and
for disconnected clients, I think there is a possible optimization of the
transmitted data:
every header after the first could avoid transmitting the previous hash,
because the receiver can compute it by double-hashing the previous header
(an operation it needs to do anyway to verify PoW).
In a long stream, for example 2016 headers, the savings in bandwidth are
about 32/80 ~= 40%:
without compressed headers: 2016*80 = 161280 bytes
with compressed headers: 80 + 2015*48 = 96800 bytes
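
A rough Python sketch of the idea (illustrative only, not a wire-format
proposal): the sender strips the prev_block field from every header after the
first, and the receiver rebuilds it from the double-SHA256 it has to compute
anyway to verify PoW.

import hashlib

def dsha256(data):
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def compress(headers):
    # headers: consecutive 80-byte serialized block headers
    out = bytearray(headers[0])              # the first header stays complete (80 bytes)
    for h in headers[1:]:
        out += h[:4] + h[36:]                # drop prev_block (bytes 4..35) -> 48 bytes
    return bytes(out)

def decompress(stream):
    headers, pos = [stream[:80]], 80
    while pos < len(stream):
        chunk = stream[pos:pos + 48]
        prev_hash = dsha256(headers[-1])     # computed anyway for PoW verification
        headers.append(chunk[:4] + prev_hash + chunk[4:])
        pos += 48
    return headers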

What do you think?


In OpenTimestamps calendars we are going to use this compression to give
lite clients reasonably secure proofs (a full node gives higher security,
but isn't feasible in all situations, for example for in-browser
verification).
To speed up the sync of a new client, Electrum starts with the download of a
~36 MB file <https://headers.electrum.org/blockchain_headers> containing the
first 477637 headers.
For this kind of client, a common HTTP API with fixed-position chunks could be
useful to leverage HTTP caching. For example, /headers/2016/0 returns the
headers from the genesis block up to header 2015 included, while
/headers/2016/1 gives the headers from the 2016th to the 4031st.
Other endpoints could serve chunks of 20160 or 201600 blocks, such that with
about 10 HTTP requests a client could quickly sync the headers.
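
A hypothetical client for such an API could look like the sketch below; the
host and function names are made up, only the /headers/<chunk_size>/<index>
path scheme comes from the description above.

import urllib.request

BASE_URL = "https://example.org"             # placeholder host, not a real service

def fetch_chunk(chunk_size, index):
    # e.g. /headers/2016/0 -> headers 0..2015, /headers/2016/1 -> headers 2016..4031
    url = "%s/headers/%d/%d" % (BASE_URL, chunk_size, index)
    with urllib.request.urlopen(url) as resp:
        return resp.read()                   # chunk_size * 80 bytes expected

def fast_sync(tip_height):
    # Fetch the biggest fixed-position chunks first so most requests are cacheable;
    # the final partial chunk (fewer than 2016 headers) is left to a normal sync.
    headers, height = b"", 0
    for size in (201600, 20160, 2016):
        while height + size <= tip_height + 1:
            headers += fetch_chunk(size, height // size)
            height += size
    return headers

With chunk sizes of 201600, 20160 and 2016 this covers the first ~477k headers
in roughly 11 cacheable requests.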


-- 
Riccardo Casatta - @RCasatta <https://twitter.com/RCasatta>


* Re: [bitcoin-dev] "Compressed" headers stream
  2017-08-28 15:50 [bitcoin-dev] "Compressed" headers stream Riccardo Casatta
@ 2017-08-28 16:13 ` Greg Sanders
  2017-08-28 16:25   ` Riccardo Casatta
       [not found] ` <CAAS2fgS3uG=4vgFuObPKA_5MstoGm4AabO=60fhV3EU_0dvejg@mail.gmail.com>
  1 sibling, 1 reply; 16+ messages in thread
From: Greg Sanders @ 2017-08-28 16:13 UTC (permalink / raw)
  To: Riccardo Casatta, Bitcoin Protocol Discussion

Is there any reason to believe that you need Bitcoin "full security" at all
for timestamping?

On Mon, Aug 28, 2017 at 11:50 AM, Riccardo Casatta via bitcoin-dev <
bitcoin-dev@lists•linuxfoundation.org> wrote:

> Hi everyone,
>
> the Bitcoin headers are probably the most condensed and important piece of
> data in the world, their demand is expected to grow.
>
> When sending a stream of continuous block headers, a common case in IBD
> and in disconnected clients, I think there is a possible optimization of
> the transmitted data:
> The headers after the first could avoid transmitting the previous hash
> cause the receiver could compute it by double hashing the previous header
> (an operation he needs to do anyway to verify PoW).
> In a long stream, for example 2016 headers, the savings in bandwidth are
> about 32/80 ~= 40%
> without compressed headers 2016*80=161280 bytes
> with compressed headers 80+2015*48=96800 bytes
>
> What do you think?
>
>
> In OpenTimestamps calendars we are going to use this compression to give
> lite-client a reasonable secure proofs (a full node give higher security
> but isn't feasible in all situations, for example for in-browser
> verification)
> To speed up sync of a new client Electrum starts with the download of a
> file <https://headers.electrum.org/blockchain_headers> ~36MB containing
> the first 477637 headers.
> For this kind of clients could be useful a common http API with fixed
> position chunks to leverage http caching. For example /headers/2016/0
> returns the headers from the genesis to the 2015 header included while
> /headers/2016/1 gives the headers from the 2016th to the 4031.
> Other endpoints could have chunks of 20160 blocks or 201600 such that with
> about 10 http requests a client could fast sync the headers
>
>
> --
> Riccardo Casatta - @RCasatta <https://twitter.com/RCasatta>
>
> _______________________________________________
> bitcoin-dev mailing list
> bitcoin-dev@lists•linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
>
>


* Re: [bitcoin-dev] "Compressed" headers stream
  2017-08-28 16:13 ` Greg Sanders
@ 2017-08-28 16:25   ` Riccardo Casatta
  2017-08-28 16:26     ` Greg Sanders
  0 siblings, 1 reply; 16+ messages in thread
From: Riccardo Casatta @ 2017-08-28 16:25 UTC (permalink / raw)
  To: Greg Sanders; +Cc: Bitcoin Protocol Discussion

2017-08-28 18:13 GMT+02:00 Greg Sanders <gsanders87@gmail•com>:

> Is there any reason to believe that you need Bitcoin "full security" at
> all for timestamping?
>

This is a bit off the main topic of the email, which is the bandwidth
savings in transmitting headers; any comment about that?


P.S. In my personal experience, timestamping is nowadays used to prove the
date and integrity of private databases containing a lot of value, so yes,
in those cases I would go with Bitcoin "full security".


>
> On Mon, Aug 28, 2017 at 11:50 AM, Riccardo Casatta via bitcoin-dev <
> bitcoin-dev@lists•linuxfoundation.org> wrote:
>
>> Hi everyone,
>>
>> the Bitcoin headers are probably the most condensed and important piece
>> of data in the world, their demand is expected to grow.
>>
>> When sending a stream of continuous block headers, a common case in IBD
>> and in disconnected clients, I think there is a possible optimization of
>> the transmitted data:
>> The headers after the first could avoid transmitting the previous hash
>> cause the receiver could compute it by double hashing the previous header
>> (an operation he needs to do anyway to verify PoW).
>> In a long stream, for example 2016 headers, the savings in bandwidth are
>> about 32/80 ~= 40%
>> without compressed headers 2016*80=161280 bytes
>> with compressed headers 80+2015*48=96800 bytes
>>
>> What do you think?
>>
>>
>> In OpenTimestamps calendars we are going to use this compression to give
>> lite-client a reasonable secure proofs (a full node give higher security
>> but isn't feasible in all situations, for example for in-browser
>> verification)
>> To speed up sync of a new client Electrum starts with the download of a
>> file <https://headers.electrum.org/blockchain_headers> ~36MB containing
>> the first 477637 headers.
>> For this kind of clients could be useful a common http API with fixed
>> position chunks to leverage http caching. For example /headers/2016/0
>> returns the headers from the genesis to the 2015 header included while
>> /headers/2016/1 gives the headers from the 2016th to the 4031.
>> Other endpoints could have chunks of 20160 blocks or 201600 such that
>> with about 10 http requests a client could fast sync the headers
>>
>>
>> --
>> Riccardo Casatta - @RCasatta <https://twitter.com/RCasatta>
>>
>> _______________________________________________
>> bitcoin-dev mailing list
>> bitcoin-dev@lists•linuxfoundation.org
>> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
>>
>>
>


-- 
Riccardo Casatta - @RCasatta <https://twitter.com/RCasatta>


* Re: [bitcoin-dev] "Compressed" headers stream
  2017-08-28 16:25   ` Riccardo Casatta
@ 2017-08-28 16:26     ` Greg Sanders
  2017-09-04 14:10       ` Peter Todd
  0 siblings, 1 reply; 16+ messages in thread
From: Greg Sanders @ 2017-08-28 16:26 UTC (permalink / raw)
  To: Riccardo Casatta; +Cc: Bitcoin Protocol Discussion

Well, if anything, my question may bolster your use case. If there's a
heavier chain that is invalid, I kind of doubt it matters for timestamping
reasons.

/digression

On Mon, Aug 28, 2017 at 12:25 PM, Riccardo Casatta <
riccardo.casatta@gmail•com> wrote:

>
> 2017-08-28 18:13 GMT+02:00 Greg Sanders <gsanders87@gmail•com>:
>
>> Is there any reason to believe that you need Bitcoin "full security" at
>> all for timestamping?
>>
>
> This is a little bit out of the main topic of the email which is the
> savings in bandwidth in transmitting headers, any comment about that?
>
>
> P.S. As a personal experience timestamping is nowadays used to prove date
> and integrity of private databases containing a lot of value, so yes, in
> that cases I will go with Bitcoin "full security"
>
>
>>
>> On Mon, Aug 28, 2017 at 11:50 AM, Riccardo Casatta via bitcoin-dev <
>> bitcoin-dev@lists•linuxfoundation.org> wrote:
>>
>>> Hi everyone,
>>>
>>> the Bitcoin headers are probably the most condensed and important piece
>>> of data in the world, their demand is expected to grow.
>>>
>>> When sending a stream of continuous block headers, a common case in IBD
>>> and in disconnected clients, I think there is a possible optimization of
>>> the transmitted data:
>>> The headers after the first could avoid transmitting the previous hash
>>> cause the receiver could compute it by double hashing the previous header
>>> (an operation he needs to do anyway to verify PoW).
>>> In a long stream, for example 2016 headers, the savings in bandwidth are
>>> about 32/80 ~= 40%
>>> without compressed headers 2016*80=161280 bytes
>>> with compressed headers 80+2015*48=96800 bytes
>>>
>>> What do you think?
>>>
>>>
>>> In OpenTimestamps calendars we are going to use this compression to give
>>> lite-client a reasonable secure proofs (a full node give higher security
>>> but isn't feasible in all situations, for example for in-browser
>>> verification)
>>> To speed up sync of a new client Electrum starts with the download of a
>>> file <https://headers.electrum.org/blockchain_headers> ~36MB containing
>>> the first 477637 headers.
>>> For this kind of clients could be useful a common http API with fixed
>>> position chunks to leverage http caching. For example /headers/2016/0
>>> returns the headers from the genesis to the 2015 header included while
>>> /headers/2016/1 gives the headers from the 2016th to the 4031.
>>> Other endpoints could have chunks of 20160 blocks or 201600 such that
>>> with about 10 http requests a client could fast sync the headers
>>>
>>>
>>> --
>>> Riccardo Casatta - @RCasatta <https://twitter.com/RCasatta>
>>>
>>> _______________________________________________
>>> bitcoin-dev mailing list
>>> bitcoin-dev@lists•linuxfoundation.org
>>> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
>>>
>>>
>>
>
>
> --
> Riccardo Casatta - @RCasatta <https://twitter.com/RCasatta>
>


* [bitcoin-dev] Fwd:  "Compressed" headers stream
       [not found] ` <CAAS2fgS3uG=4vgFuObPKA_5MstoGm4AabO=60fhV3EU_0dvejg@mail.gmail.com>
@ 2017-08-28 17:12   ` Gregory Maxwell
  2017-08-28 17:54     ` Kalle Rosenbaum
  2017-09-04 14:06     ` Peter Todd
  0 siblings, 2 replies; 16+ messages in thread
From: Gregory Maxwell @ 2017-08-28 17:12 UTC (permalink / raw)
  To: Bitcoin Dev

On Mon, Aug 28, 2017 at 3:50 PM, Riccardo Casatta via bitcoin-dev
<bitcoin-dev@lists•linuxfoundation.org> wrote:
> Hi everyone,
>
> the Bitcoin headers are probably the most condensed and important piece of
> data in the world, their demand is expected to grow.
>
> When sending a stream of continuous block headers, a common case in IBD and
> in disconnected clients, I think there is a possible optimization of the
> transmitted data:
> The headers after the first could avoid transmitting the previous hash cause
> the receiver could compute it by double hashing the previous header (an
> operation he needs to do anyway to verify PoW).
> In a long stream, for example 2016 headers, the savings in bandwidth are
> about 32/80 ~= 40%
> without compressed headers 2016*80=161280 bytes
> with compressed headers 80+2015*48=96800 bytes
>
> What do you think?

You are leaving a lot of bytes on the table.

The bits field can only change every 2016 blocks (saving 4 bytes per header);
the timestamp cannot be less than the median of the last 11 and is
usually only a small amount above the last one (saving 2 bytes per
header); the block version is usually one of the last few values (saving 3
bytes per header).
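
As an illustration of the timestamp saving (a sketch, not a concrete
encoding): since a valid timestamp cannot be below the median of the previous
11, it can be sent as a small unsigned offset from that median; a real scheme
would need a full 4-byte escape for the rare large gap, omitted here.

def median_time_past(prev_timestamps):
    last11 = sorted(prev_timestamps[-11:])
    return last11[len(last11) // 2]

def encode_time(timestamp, prev_timestamps):
    offset = timestamp - median_time_past(prev_timestamps)
    assert 0 <= offset < 1 << 16             # otherwise fall back to the full 4-byte field
    return offset.to_bytes(2, "little")

def decode_time(data, prev_timestamps):
    return median_time_past(prev_timestamps) + int.from_bytes(data, "little")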

But all of these improvements are just a constant factor. I think
you want the compact SPV proofs described in the appendix of the
sidechains whitepaper, which yield log-scaling proofs.



* Re: [bitcoin-dev] Fwd: "Compressed" headers stream
  2017-08-28 17:12   ` [bitcoin-dev] Fwd: " Gregory Maxwell
@ 2017-08-28 17:54     ` Kalle Rosenbaum
  2017-09-04 14:06     ` Peter Todd
  1 sibling, 0 replies; 16+ messages in thread
From: Kalle Rosenbaum @ 2017-08-28 17:54 UTC (permalink / raw)
  To: Gregory Maxwell, Bitcoin Protocol Discussion

2017-08-28 19:12 GMT+02:00 Gregory Maxwell via bitcoin-dev <
bitcoin-dev@lists•linuxfoundation.org>:

>
> The bits field can only change every 2016 blocks (4 bytes per header),
> the timestamp can not be less than the median of the last 11 and is
> usually only a small amount over the last one (saves 2 bytes per
> header), the block version is usually one of the last few (save 3
> bytes per header).
>

 ... and I guess the nonce can be arbitrarily truncated as well, just brute
force the missing bits :-P.


> But all these things improvements are just a constant factor. I think
> you want the compact SPV proofs described in the appendix of the
> sidechains whitepaper which creates log scaling proofs.
>

I think my blog post on compact SPV proofs can be helpful as well. It
tries to make the rather terse formulations in the sidechains paper a bit
easier to grasp for normal people.

http://popeller.io/index.php/2016/09/15/compact-spv-proofs/

Kalle


* Re: [bitcoin-dev] Fwd:  "Compressed" headers stream
  2017-08-28 17:12   ` [bitcoin-dev] Fwd: " Gregory Maxwell
  2017-08-28 17:54     ` Kalle Rosenbaum
@ 2017-09-04 14:06     ` Peter Todd
  1 sibling, 0 replies; 16+ messages in thread
From: Peter Todd @ 2017-09-04 14:06 UTC (permalink / raw)
  To: Gregory Maxwell, Bitcoin Protocol Discussion

On Mon, Aug 28, 2017 at 05:12:15PM +0000, Gregory Maxwell via bitcoin-dev wrote:
> You are leaving a lot of bytes on the table.
> 
> The bits field can only change every 2016 blocks (4 bytes per header),
> the timestamp can not be less than the median of the last 11 and is
> usually only a small amount over the last one (saves 2 bytes per
> header), the block version is usually one of the last few (save 3
> bytes per header).
> 
> But all these things improvements are just a constant factor. I think
> you want the compact SPV proofs described in the appendix of the
> sidechains whitepaper which creates log scaling proofs.

Note that I'm already planning on OpenTimestamps having infrastructure for
trusted validity attestations; log scaling proofs alone only prove total work,
not validity. Timestamping has all kinds of very dubious security properties
when done via proof-of-work, due to various ways that miners can get away with
inaccurate block times. In particular, setting a block time backwards is
something that miners can do, particularly with majority hashing power, which
is the exact thing we're trying to prevent in a timestamp proof.

This all makes me dubious about risking further weakening of this already weak
security with compact SPV proofs; we'd need a lot more analysis to understand
what we're risking. Also note that we can ship a known-good
sum-merkle-tree tip hash with the software, which further reduces the initial
download bandwidth needed to get the block headers, on top of this obviously
safe eliding of redundant hashes.

-- 
https://petertodd.org 'peter'[:-1]@petertodd.org


* Re: [bitcoin-dev] "Compressed" headers stream
  2017-08-28 16:26     ` Greg Sanders
@ 2017-09-04 14:10       ` Peter Todd
  0 siblings, 0 replies; 16+ messages in thread
From: Peter Todd @ 2017-09-04 14:10 UTC (permalink / raw)
  To: Greg Sanders, Bitcoin Protocol Discussion

On Mon, Aug 28, 2017 at 12:26:48PM -0400, Greg Sanders via bitcoin-dev wrote:
> Well, if anything my question may bolster your use-case. If there's a
> heavier chain that is invalid, I kind of doubt it matters for timestamping
> reasons.

Timestamping can easily be *more* vulnerable to malicious miners than financial
applications for a number of reasons, including the fact that there's no
financial feedback loop of miners destroying the value of the coins they
produce - timestamping is a non-financial piggy-back application that doesn't
directly interact with the Bitcoin economy, beyond a trivial number of timestamp
transactions.

-- 
https://petertodd.org 'peter'[:-1]@petertodd.org


* Re: [bitcoin-dev] "Compressed" headers stream
  2017-12-12 21:07   ` Suhas Daftuar
  2017-12-13  0:01     ` Gregory Maxwell
@ 2017-12-13  0:12     ` Gregory Maxwell
  1 sibling, 0 replies; 16+ messages in thread
From: Gregory Maxwell @ 2017-12-13  0:12 UTC (permalink / raw)
  To: Suhas Daftuar; +Cc: Bitcoin Protocol Discussion

On Tue, Dec 12, 2017 at 9:07 PM, Suhas Daftuar <sdaftuar@gmail•com> wrote:
> I agree with this.  Specifically the way I envisioned this working is that
> we could introduce a new 'cmpctheaders'/'getcmpcthdrs' message pair for
> syncing using this new message type, while leaving the existing
> 'headers'/'getheaders' messages unchanged.  So when communicating with
> upgraded peers, we'd never use 'getheaders' messages, and we'd only use
> 'headers' messages for potentially announcing new blocks.

The question then becomes: how should it work?

In an ideal world, we'd decide what peers to fetch headers from based
on a compact proof of the total work in their chains... but we cannot
construct such proofs in Bitcoin today.

Instead, I think a weak heuristic is to fetch first from
the peers whose tip has the highest difficulty, then work backwards.

See: https://en.bitcoin.it/wiki/User:Gmaxwell/Reverse_header-fetching_sync

That page is the inspiration for the current headers-first sync, but
without the reverse part, because the protocol didn't permit it: you
can't request a linear run of headers that come before a given header. This
is the thing I was mostly thinking of when I mentioned that we may
want to change the interface.



* Re: [bitcoin-dev] "Compressed" headers stream
  2017-12-12 21:07   ` Suhas Daftuar
@ 2017-12-13  0:01     ` Gregory Maxwell
  2017-12-13  0:12     ` Gregory Maxwell
  1 sibling, 0 replies; 16+ messages in thread
From: Gregory Maxwell @ 2017-12-13  0:01 UTC (permalink / raw)
  To: Suhas Daftuar; +Cc: Bitcoin Protocol Discussion

On Tue, Dec 12, 2017 at 9:07 PM, Suhas Daftuar <sdaftuar@gmail•com> wrote:
> But I think we should be able to get nearly all the benefit just by
> including nBits in any messages where the value is ambiguous; ie we include
> it with the first header in a message, and whenever it changes from the
> previous header's nBits.

Yes, that is what I was thinking the last time we discussed it, just with
each header including a one-byte flag that lets you express:

bit: meaning
(0) whether nbits is the same as the last header's;
(1) whether the timestamp is a full field or a small offset (e.g. two bytes,
defined as an unsigned offset between the last time - 7200 and the new
time);
(2,3,4) whether the version is the same as the last distinct value .. the
7th-last, or is a new distinct 32-bit value;
(5,6) whether prev is entirely there, entirely gone, first 4 bytes
provided, or first 8 bytes provided (if provided, they override the
computed values).

That would be 7 bits in total; the 8th could be reserved or used to
signal "more headers follow" to make the encoding self-delimiting.

The downside of using "nbits the same as last" as the optimization is that
if we ever change the consensus rules to ones where difficulty management
works differently, it may be the case that nbits changes every block.

Alternatively, nbits could get a differential encoding that could be
opted into for small differences -- though I haven't thought much about
it, to see whether a one-byte difference would be that useful (e.g. can BCH's
differences usually be expressed with one byte?).

I'm kind of dubious of the consensus-layer anti-DoS separation: the nbits
minimum is so low compared to the speed of a mining device that virtually
any attack you might do with invalid headers could still be done
with headers at the minimum difficulty. But I'm fully willing to
accept that simpler is better...



>> I would rather not change the serialization of existing messages,
>> nodes are going to have to support speaking both messages for a long
>> time, and I think we already want a different protocol flow for
>> headers fetching in any case.
>
>
> I agree with this.  Specifically the way I envisioned this working is that
> we could introduce a new 'cmpctheaders'/'getcmpcthdrs' message pair for
> syncing using this new message type, while leaving the existing
> 'headers'/'getheaders' messages unchanged.  So when communicating with
> upgraded peers, we'd never use 'getheaders' messages, and we'd only use
> 'headers' messages for potentially announcing new blocks.
>
> Of course, we'll have to support the existing protocol for a long time.  But
> one downside I've discovered from deploying BIP 130, which introduced block
> announcements via headers messages, is that overloading a 'headers' message
> to be either a block announcement or a response to a 'getheaders' message
> results in p2p-handling logic which is more complicated than it needs to be.
> So splitting off the headers chain-sync functionality to a new message pair
> seems like a nice side-effect benefit, in the long run.
>



* Re: [bitcoin-dev] "Compressed" headers stream
  2017-12-11 21:04 ` Gregory Maxwell
  2017-12-11 21:56   ` Jim Posen
@ 2017-12-12 21:07   ` Suhas Daftuar
  2017-12-13  0:01     ` Gregory Maxwell
  2017-12-13  0:12     ` Gregory Maxwell
  1 sibling, 2 replies; 16+ messages in thread
From: Suhas Daftuar @ 2017-12-12 21:07 UTC (permalink / raw)
  To: Gregory Maxwell, Bitcoin Protocol Discussion

Hi,

First, thanks for resurrecting this; I agree this is worth pursuing.

On Mon, Dec 11, 2017 at 4:04 PM, Gregory Maxwell via bitcoin-dev <
bitcoin-dev@lists•linuxfoundation.org> wrote:

>
> Nbits _never_ needs to be sent even with other consensus rules because
> its more or less necessarily a strict function of the prior headers.
> This still holds in every clone of Bitcoin I'm aware of; sending it
> with the first header in a group probably makes sense so it can be
> checked independently.
>

I think it would be nice, though, to not require the consensus-correct
calculation of nBits in order to process p2p messages.  For instance, I
think there's a use for nBits at the p2p layer for calculating the work on
a chain, which can be used as an anti-DoS measure, even without verifying
that the difficulty adjustments are following the consensus rules.

Moreover I think it's a bit messy if the p2p layer depends on intricate
consensus rules in order to reconstruct a message -- either we'd need to
interact with existing consensus logic in a potentially new way, or we'd
reimplement the same logic in the p2p layer, neither of which is very
desirable imo.

But I think we should be able to get nearly all the benefit just by
including nBits in any messages where the value is ambiguous; i.e. we include
it with the first header in a message, and whenever it changes from the
previous header's nBits.
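
A minimal sketch of that rule (illustrative, not a serialization proposal):
each header carries a marker saying whether an explicit nBits value follows.

def encode_nbits_fields(nbits_list):
    out, prev = [], None
    for nbits in nbits_list:
        # send nBits with the first header and again only when it changes
        out.append((True, nbits) if nbits != prev else (False, None))
        prev = nbits
    return out

def decode_nbits_fields(encoded):
    values, prev = [], None
    for explicit, value in encoded:
        prev = value if explicit else prev
        values.append(prev)
    return values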

> I would rather not change the serialization of existing messages,
> nodes are going to have to support speaking both messages for a long
> time, and I think we already want a different protocol flow for
> headers fetching in any case.
>

I agree with this.  Specifically the way I envisioned this working is that
we could introduce a new 'cmpctheaders'/'getcmpcthdrs' message pair for
syncing using this new message type, while leaving the existing
'headers'/'getheaders' messages unchanged.  So when communicating with
upgraded peers, we'd never use 'getheaders' messages, and we'd only use
'headers' messages for potentially announcing new blocks.

Of course, we'll have to support the existing protocol for a long time.
But one downside I've discovered from deploying BIP 130, which introduced
block announcements via headers messages, is that overloading a 'headers'
message to be either a block announcement or a response to a 'getheaders'
message results in p2p-handling logic which is more complicated than it
needs to be.  So splitting off the headers chain-sync functionality to a
new message pair seems like a nice side-effect benefit, in the long run.


* Re: [bitcoin-dev] "Compressed" headers stream
  2017-12-11 22:41     ` Tier Nolan
@ 2017-12-11 23:11       ` Gregory Maxwell
  0 siblings, 0 replies; 16+ messages in thread
From: Gregory Maxwell @ 2017-12-11 23:11 UTC (permalink / raw)
  To: Tier Nolan, Bitcoin Protocol Discussion

On Mon, Dec 11, 2017 at 10:41 PM, Tier Nolan via bitcoin-dev
<bitcoin-dev@lists•linuxfoundation.org> wrote:
> There is a method called "high hash highway" that allows compact proofs of
> total POW.

That provides no security without additional consensus-enforced
commitments, so I think it's pretty off-topic for this discussion.



* Re: [bitcoin-dev] "Compressed" headers stream
  2017-12-11 21:56   ` Jim Posen
@ 2017-12-11 22:41     ` Tier Nolan
  2017-12-11 23:11       ` Gregory Maxwell
  0 siblings, 1 reply; 16+ messages in thread
From: Tier Nolan @ 2017-12-11 22:41 UTC (permalink / raw)
  To: Bitcoin Protocol Discussion

On Mon, Dec 11, 2017 at 9:56 PM, Jim Posen via bitcoin-dev <
bitcoin-dev@lists•linuxfoundation.org> wrote:

> Omitting nBits entirely seems reasonable, I wrote up a possible
> implementation here
> <https://github.com/bitcoin/bitcoin/compare/master...jimpo:compact-headers-difficulty>.
> The downside is that it is more complex because it leaks into the
> validation code. The extra 4 byte savings is certainly nice though.
>

A compromise would be to have 1 byte indicating the difference since the
last header.

Since the exponent doesn't use the full range, you could steal bits from
there to indicate the mode.

- no change
- mantissa offset (for small changes)
- full difficulty

This would support any nBits rule, and you save 3 of the 4 bytes.
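
One possible reading of this compromise, as an illustrative sketch (the mode
assignments and offset width are made up): the top two bits of the byte select
the mode, the low bits keep the compact-target exponent (which never exceeds
0x20 in practice), and extra bytes follow only when the difficulty changed.

MODE_NO_CHANGE, MODE_OFFSET, MODE_FULL = 0, 1, 2

def encode_nbits(prev_nbits, nbits):
    exponent, mantissa = nbits >> 24, nbits & 0xFFFFFF
    if nbits == prev_nbits:
        return bytes([(MODE_NO_CHANGE << 6) | exponent])                      # 1 byte
    delta = mantissa - (prev_nbits & 0xFFFFFF)
    if exponent == prev_nbits >> 24 and -128 <= delta < 128:
        return bytes([(MODE_OFFSET << 6) | exponent, delta & 0xFF])           # 2 bytes
    return bytes([(MODE_FULL << 6) | exponent]) + mantissa.to_bytes(3, "little")  # 4 bytes

def decode_nbits(prev_nbits, data):
    # returns (nbits, bytes consumed)
    mode, exponent = data[0] >> 6, data[0] & 0x3F
    if mode == MODE_NO_CHANGE:
        return prev_nbits, 1
    if mode == MODE_OFFSET:
        delta = data[1] - 256 if data[1] >= 128 else data[1]                  # sign-extend
        return (exponent << 24) | ((prev_nbits & 0xFFFFFF) + delta), 2
    return (exponent << 24) | int.from_bytes(data[1:4], "little"), 4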


> Can you elaborate on how parallel header fetching might work? getheaders
> requests could probably already be pipelined, where the node requests the
> next 2,000 headers before processing the current batch (though would make
> sense to check that they are all above min difficulty first).
>

I suggest adding a message where you can ask for the lowest N hashes
between 2 heights on the main chain.

The reply is an array of {height, header} pairs for the N headers with the
lowest hash in the specified range.

All peers should agree on which headers are in the array.  If there is
disagreement, then you can at least narrow down which segment the
disagreement is in.

It works kind of like a cut-and-choose: you recursively pick one of the
segments the peer gave you.

You can ask a peer for a proof for a segment between 2 headers, of the form:

- first header + coinbase with merkle branch
- all headers in the segment

This proves the segment has the correct height and that all the headers
link up.
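
A sketch of how a node might answer such a query (the message itself is
hypothetical): return the N headers whose hashes, interpreted the way PoW
compares them, are lowest in the range. Honest peers on the same chain return
identical arrays, so a client can recurse into any segment where the answers
differ.

import hashlib

def block_hash_as_int(header):
    h = hashlib.sha256(hashlib.sha256(header).digest()).digest()
    return int.from_bytes(h, "little")       # compare hashes as PoW does

def lowest_n_hashes(headers_by_height, start, end, n):
    # headers_by_height: height -> 80-byte header on this node's main chain
    heights = range(start, end + 1)
    by_hash = sorted(heights, key=lambda h: block_hash_as_int(headers_by_height[h]))
    return [(h, headers_by_height[h]) for h in sorted(by_hash[:n])]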

There is a method called "high hash highway" that allows compact proofs of
total POW.


* Re: [bitcoin-dev] "Compressed" headers stream
  2017-12-11 21:04 ` Gregory Maxwell
@ 2017-12-11 21:56   ` Jim Posen
  2017-12-11 22:41     ` Tier Nolan
  2017-12-12 21:07   ` Suhas Daftuar
  1 sibling, 1 reply; 16+ messages in thread
From: Jim Posen @ 2017-12-11 21:56 UTC (permalink / raw)
  To: Gregory Maxwell; +Cc: Bitcoin Protocol Discussion

On Mon, Dec 11, 2017 at 1:04 PM, Gregory Maxwell <greg@xiph•org> wrote:

> On Mon, Dec 11, 2017 at 8:40 PM, Jim Posen via bitcoin-dev
> <bitcoin-dev@lists•linuxfoundation.org> wrote:
> > Firstly, I don't like the idea of making the net header encoding
> dependent
> > on the specific header validation rules that Bitcoin uses (eg. the fact
> that
> > difficulty is only recalculated every 2016 blocks). This would be
> coupling
>
> In the last proposal I recall writing up, there was a one byte flag on
> each header to indicate what was included.
>
>
Is there a link somewhere to that proposal? The only thing I could find was
your forwarded email
<https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2017-September/014904.html>
on
this thread.


> Nbits _never_ needs to be sent even with other consensus rules because
> its more or less necessarily a strict function of the prior headers.
> This still holds in every clone of Bitcoin I'm aware of; sending it
> with the first header in a group probably makes sense so it can be
> checked independently.
>
> > with insufficient benefit.
>
> another >18% reduction in size beyond the removal of prev. is not
> insubstantial by any means.  I don't think it should lightly be
> ignored.
>
>
Omitting nBits entirely seems reasonable; I wrote up a possible
implementation here
<https://github.com/bitcoin/bitcoin/compare/master...jimpo:compact-headers-difficulty>.
The downside is that it is more complex because it leaks into the
validation code. The extra 4-byte savings is certainly nice, though.


> Prev omission itself is not, sadly, magically compatible:  I am quite
> confident that if there is a bitcoin hardfork it would recover the
> nbits/4-guarenteed always-zero bits of prev to use as extra nonce for
> miners. This has been proposed many times, implemented at least once,
> and the current requirement for mining infrastructure to reach inside
> the coinbase txn to increment a nonce has been a reliable source of
> failures.  So I think we'd want to have the encoding able to encode
> leading prev bits.
>
> Many altcoins also change the header structures. If the better thing
> is altcoin incompatible, we should still do it. Doing otherwise would
> competitively hobble Bitcoin especially considering the frequent
> recklessly incompetent moves made by various altcoins and the near
> total lack of useful novel development we've seen come out of the
> clones.
>
> Probably the most important change in a new header message wouldn't be
> the encoding, but it would be changing the fetching mechanism so that
> header sets could be pulled in parallel, etc.
>
> I would rather not change the serialization of existing messages,
> nodes are going to have to support speaking both messages for a long
> time, and I think we already want a different protocol flow for
> headers fetching in any case.
>

Can you elaborate on how parallel header fetching might work? getheaders
requests could probably already be pipelined, where the node requests the
next 2,000 headers before processing the current batch (though it would make
sense to check that they are all above the minimum difficulty first).

I'm open to more ideas on how to optimize the header download or design the
serialization format to be more flexible, but I'm concerned that we would
forgo a 40-45% bandwidth savings on the current protocol for a long time
because something better might be possible later on, or there might be a hard
fork that at some point requires another upgrade. I do recognize that
supporting multiple serialization formats simultaneously adds code complexity,
but in this case the change seems simple enough to me that the tradeoff is
worth it.


* Re: [bitcoin-dev] "Compressed" headers stream
  2017-12-11 20:40 [bitcoin-dev] " Jim Posen
@ 2017-12-11 21:04 ` Gregory Maxwell
  2017-12-11 21:56   ` Jim Posen
  2017-12-12 21:07   ` Suhas Daftuar
  0 siblings, 2 replies; 16+ messages in thread
From: Gregory Maxwell @ 2017-12-11 21:04 UTC (permalink / raw)
  To: Jim Posen, Bitcoin Protocol Discussion

On Mon, Dec 11, 2017 at 8:40 PM, Jim Posen via bitcoin-dev
<bitcoin-dev@lists•linuxfoundation.org> wrote:
> Firstly, I don't like the idea of making the net header encoding dependent
> on the specific header validation rules that Bitcoin uses (eg. the fact that
> difficulty is only recalculated every 2016 blocks). This would be coupling

In the last proposal I recall writing up, there was a one-byte flag on
each header to indicate what was included.

Nbits _never_ needs to be sent even with other consensus rules because
it's more or less necessarily a strict function of the prior headers.
This still holds in every clone of Bitcoin I'm aware of; sending it
with the first header in a group probably makes sense so it can be
checked independently.
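
As a simplified sketch of why that holds (mainnet rules, edge cases and the
compact-encoding sign bit omitted), the receiver can recompute nBits from the
prior headers alone:

POW_LIMIT = 0xFFFF << 208                    # mainnet proof-of-work limit
TARGET_TIMESPAN = 14 * 24 * 60 * 60          # two weeks

def compact_to_target(nbits):
    return (nbits & 0x007FFFFF) << (8 * ((nbits >> 24) - 3))

def target_to_compact(target):
    size = (target.bit_length() + 7) // 8
    mantissa = target >> (8 * (size - 3)) if size > 3 else target << (8 * (3 - size))
    if mantissa & 0x00800000:                # keep the mantissa's sign bit clear
        mantissa >>= 8
        size += 1
    return mantissa | (size << 24)

def next_nbits(height, prev_headers):
    # prev_headers: list of (timestamp, nbits) for heights 0 .. height-1
    last_time, last_nbits = prev_headers[height - 1]
    if height % 2016 != 0:
        return last_nbits                    # difficulty only changes on period boundaries
    first_time, _ = prev_headers[height - 2016]
    timespan = min(max(last_time - first_time, TARGET_TIMESPAN // 4), TARGET_TIMESPAN * 4)
    new_target = min(compact_to_target(last_nbits) * timespan // TARGET_TIMESPAN, POW_LIMIT)
    return target_to_compact(new_target)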

> with insufficient benefit.

Another >18% reduction in size beyond the removal of prev is not
insubstantial by any means.  I don't think it should be lightly
ignored.

Prev omission itself is not, sadly, magically compatible: I am quite
confident that if there is a Bitcoin hardfork, it would recover the
nbits/4-guaranteed always-zero bits of prev to use as extra nonce space for
miners. This has been proposed many times, implemented at least once,
and the current requirement for mining infrastructure to reach inside
the coinbase txn to increment a nonce has been a reliable source of
failures.  So I think we'd want the encoding to be able to encode
leading prev bits.

Many altcoins also change the header structures. If the better thing
is altcoin-incompatible, we should still do it. Doing otherwise would
competitively hobble Bitcoin, especially considering the frequent,
recklessly incompetent moves made by various altcoins and the near-total
lack of useful novel development we've seen come out of the clones.

Probably the most important change in a new header message wouldn't be
the encoding, but it would be changing the fetching mechanism so that
header sets could be pulled in parallel, etc.

I would rather not change the serialization of existing messages,
nodes are going to have to support speaking both messages for a long
time, and I think we already want a different protocol flow for
headers fetching in any case.



* [bitcoin-dev] "Compressed" headers stream
@ 2017-12-11 20:40 Jim Posen
  2017-12-11 21:04 ` Gregory Maxwell
  0 siblings, 1 reply; 16+ messages in thread
From: Jim Posen @ 2017-12-11 20:40 UTC (permalink / raw)
  To: bitcoin-dev

I want to resurrect this thread from August/September because it seems like
a significant improvement for light clients at very little cost. From the
mailing list, it seems like this got stalled on determining how many more
bytes could be saved in addition to the prev_block.

The ideas I've gathered from Greg Maxwell's forwarded email are:

1. Omit nBits altogether and have the receiving node determine it from
chain context.
2. Include nBits only on headers with a height that is a multiple of 2016
since it does not change in between.
3. Compress nTime to two bytes by using the bounds on allowed values from
the consensus rules.

I propose just moving ahead with only the exclusion of the prev_block, as
IMO the other savings are not worth the added complexity.

Firstly, I don't like the idea of making the net header encoding dependent
on the specific header validation rules that Bitcoin uses (e.g. the fact
that difficulty is only recalculated every 2016 blocks). This would be
coupling together the two layers, breaking net compatibility for some alts,
and possibly making consensus rule changes even more difficult, for a
savings with insufficient benefit. So if you buy that argument, I'm not in
favor of #2 or #3.

Option 1 is still viable, though it has some downsides. The implementation
leaks into the validation code, whereas calculating prev_block can occur
just at the net layer (see implementation below). Also, nodes would now be
*required* to sync the header chain from the genesis block, whereas they
had the option of starting from some checkpoint before.
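
A minimal sketch of that point (this is not the linked Core branches, just an
illustration): expanding a hypothetical 'cmpctheaders' payload needs nothing
but the payload itself, so the reconstruction can live entirely in message
deserialization, leaving validation untouched.

import hashlib

def deserialize_cmpctheaders(payload):
    # payload: one full 80-byte header followed by 48-byte headers missing prev_block
    headers, pos = [payload[:80]], 80
    while pos < len(payload):
        prev_hash = hashlib.sha256(hashlib.sha256(headers[-1]).digest()).digest()
        headers.append(payload[pos:pos + 4] + prev_hash + payload[pos + 4:pos + 48])
        pos += 48
    return headers                           # full 80-byte headers for the rest of the node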

So switching gears, I'd like to ask what the best way to actually implement
this change is. Solutions I can think of are:

1. New headers command name like "cmpctheaders" or "headersv2".
2. Change serialization of existing headers message in a new protocol
version.
3. Change serialization of existing headers message with new service bit.

I wrote up some proof-of-concept implementations in Core a) just omitting
prev_block
<https://github.com/bitcoin/bitcoin/compare/master...jimpo:compact-headers>
and b) omitting nBits as well
<https://github.com/bitcoin/bitcoin/compare/master...jimpo:compact-headers-difficulty>.
If people think a) is reasonable, I'll write up a BIP.


> Hi everyone, the Bitcoin headers are probably the most condensed and
> important piece of data in the world, their demand is expected to grow.
> When sending a stream of continuous block headers, a common case in IBD and
> in disconnected clients, I think there is a possible optimization of the
> transmitted data: The headers after the first could avoid transmitting the
> previous hash cause the receiver could compute it by double hashing the
> previous header (an operation he needs to do anyway to verify PoW). In a
> long stream, for example 2016 headers, the savings in bandwidth are about
> 32/80 ~= 40% without compressed headers 2016*80=161280 bytes with
> compressed headers 80+2015*48=96800 bytes What do you think? In
> OpenTimestamps calendars we are going to use this compression to give
> lite-client a reasonable secure proofs (a full node give higher security
> but isn't feasible in all situations, for example for in-browser
> verification) To speed up sync of a new client Electrum starts with the
> download of a file <https://headers.electrum.org/blockchain_headers>
> ~36MB containing the first 477637 headers. For this kind of clients could
> be useful a common http API with fixed position chunks to leverage http
> caching. For example /headers/2016/0 returns the headers from the genesis
> to the 2015 header included while /headers/2016/1 gives the headers from
> the 2016th to the 4031. Other endpoints could have chunks of 20160 blocks
> or 201600 such that with about 10 http requests a client could fast sync
> the headers


end of thread, other threads:[~2017-12-13  0:12 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-28 15:50 [bitcoin-dev] "Compressed" headers stream Riccardo Casatta
2017-08-28 16:13 ` Greg Sanders
2017-08-28 16:25   ` Riccardo Casatta
2017-08-28 16:26     ` Greg Sanders
2017-09-04 14:10       ` Peter Todd
     [not found] ` <CAAS2fgS3uG=4vgFuObPKA_5MstoGm4AabO=60fhV3EU_0dvejg@mail.gmail.com>
2017-08-28 17:12   ` [bitcoin-dev] Fwd: " Gregory Maxwell
2017-08-28 17:54     ` Kalle Rosenbaum
2017-09-04 14:06     ` Peter Todd
2017-12-11 20:40 [bitcoin-dev] " Jim Posen
2017-12-11 21:04 ` Gregory Maxwell
2017-12-11 21:56   ` Jim Posen
2017-12-11 22:41     ` Tier Nolan
2017-12-11 23:11       ` Gregory Maxwell
2017-12-12 21:07   ` Suhas Daftuar
2017-12-13  0:01     ` Gregory Maxwell
2017-12-13  0:12     ` Gregory Maxwell

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox