public inbox for bitcoindev@googlegroups.com
 help / color / mirror / Atom feed
* [bitcoin-dev] BIP30 and BIP34 interaction (was Re: [BIP Proposal] Buried Deployments)
@ 2016-11-17  0:06 Jorge Timón
  2016-11-17  0:10 ` Eric Voskuil
  0 siblings, 1 reply; 19+ messages in thread
From: Jorge Timón @ 2016-11-17  0:06 UTC (permalink / raw)
  To: Eric Voskuil; +Cc: Bitcoin Protocol Discussion, Thomas Kerin

On Thu, Nov 17, 2016 at 1:00 AM, Eric Voskuil <eric@voskuil•org> wrote:
> This is a misinterpretation of BIP30. Duplicate transaction hashes can
> and will happen and are perfectly valid in Bitcoin. BIP34 does not
> prevent this.

Sorry for moving the topic, but isn't duplication of tx hashes
precisely what BIP30 prevents?
That was my undesrtanding but should read it again.
Since regular txs take inputs, the collision is extremely unlikely
(again, this is my understanding, please correct me when wrong), the
worrying case is coinbase txs (which don't have input to take entropy
from). By introducing the committed height, collisions on coinbase txs
are prevented too.

If I'm wrong on any of this I'm more than happy to learn why.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [bitcoin-dev] BIP30 and BIP34 interaction (was Re: [BIP Proposal] Buried Deployments)
  2016-11-17  0:06 [bitcoin-dev] BIP30 and BIP34 interaction (was Re: [BIP Proposal] Buried Deployments) Jorge Timón
@ 2016-11-17  0:10 ` Eric Voskuil
  2016-11-17  0:31   ` Tier Nolan
  0 siblings, 1 reply; 19+ messages in thread
From: Eric Voskuil @ 2016-11-17  0:10 UTC (permalink / raw)
  To: Jorge Timón; +Cc: Bitcoin Protocol Discussion, Thomas Kerin

[-- Attachment #1: Type: text/plain, Size: 1353 bytes --]

No, BIP30 prevents duplicate tx hashes in the case where the new tx hash
duplicates that of a preceding tx with unspent outputs.

There was one such case that had already become buried in the chain at
the time, so it was exempted from validation. There was another case of
a duplicate hash, but it's predecessor was spent so it complied with the
new rule.

Both of these cases resulted from exact duplicate txs, which BIP34 now
precludes. However nothing precludes different txs from having the same
hash.

e

On 11/16/2016 04:06 PM, Jorge Timón wrote:
> On Thu, Nov 17, 2016 at 1:00 AM, Eric Voskuil <eric@voskuil•org> wrote:
>> This is a misinterpretation of BIP30. Duplicate transaction hashes can
>> and will happen and are perfectly valid in Bitcoin. BIP34 does not
>> prevent this.
> 
> Sorry for moving the topic, but isn't duplication of tx hashes
> precisely what BIP30 prevents?
> That was my undesrtanding but should read it again.
> Since regular txs take inputs, the collision is extremely unlikely
> (again, this is my understanding, please correct me when wrong), the
> worrying case is coinbase txs (which don't have input to take entropy
> from). By introducing the committed height, collisions on coinbase txs
> are prevented too.
> 
> If I'm wrong on any of this I'm more than happy to learn why.
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [bitcoin-dev] BIP30 and BIP34 interaction (was Re: [BIP Proposal] Buried Deployments)
  2016-11-17  0:10 ` Eric Voskuil
@ 2016-11-17  0:31   ` Tier Nolan
  2016-11-17  0:43     ` Eric Voskuil
  0 siblings, 1 reply; 19+ messages in thread
From: Tier Nolan @ 2016-11-17  0:31 UTC (permalink / raw)
  To: Bitcoin Protocol Discussion

[-- Attachment #1: Type: text/plain, Size: 1161 bytes --]

On Thu, Nov 17, 2016 at 12:10 AM, Eric Voskuil via bitcoin-dev <
bitcoin-dev@lists•linuxfoundation.org> wrote:

> Both of these cases resulted from exact duplicate txs, which BIP34 now
> precludes. However nothing precludes different txs from having the same
> hash.
>

The only way to have two transactions have the same txid is if their
parents are identical, since the txids of the parents are included in a
transaction.

Coinbases have no parents, so it used to be possible for two of them to be
identical.

Duplicate outputs weren't possible in the database, so the later coinbase
transaction effectively overwrote the earlier one.

The happened for two coinbases.  That is what the exceptions are for.

Neither of the those coinbases were spent before the overwrite happened.  I
don't even think those coinbases were spent at all.

This means that every activate coinbase transaction has a unique hash and
all new coinbases will be unique.

This means that all future transactions will have different txids.

There might not be an explicit rule that says that txids have to be unique,
but barring a break of the hash function, they rules do guarantee it.

[-- Attachment #2: Type: text/html, Size: 1638 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [bitcoin-dev] BIP30 and BIP34 interaction (was Re: [BIP Proposal] Buried Deployments)
  2016-11-17  0:31   ` Tier Nolan
@ 2016-11-17  0:43     ` Eric Voskuil
  2016-11-17  0:53       ` Eric Voskuil
                         ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Eric Voskuil @ 2016-11-17  0:43 UTC (permalink / raw)
  To: Tier Nolan, Bitcoin Protocol Discussion

[-- Attachment #1: Type: text/plain, Size: 1829 bytes --]

> This means that all future transactions will have different txids...
rules do guarantee it.

No, it means that the chance is small, there is a difference.

If there is an address collision, someone may lose some money. If there
is a tx hash collision, and implementations handle this differently, it
will produce a chain split. As such this is not something that a node
can just dismiss. If they do they are implementing a hard fork.

e

On 11/16/2016 04:31 PM, Tier Nolan via bitcoin-dev wrote:
> 
> 
> On Thu, Nov 17, 2016 at 12:10 AM, Eric Voskuil via bitcoin-dev
> <bitcoin-dev@lists•linuxfoundation.org
> <mailto:bitcoin-dev@lists•linuxfoundation.org>> wrote:
> 
>     Both of these cases resulted from exact duplicate txs, which BIP34 now
>     precludes. However nothing precludes different txs from having the same
>     hash.
> 
> 
> The only way to have two transactions have the same txid is if their
> parents are identical, since the txids of the parents are included in a
> transaction.
> 
> Coinbases have no parents, so it used to be possible for two of them to
> be identical.
> 
> Duplicate outputs weren't possible in the database, so the later
> coinbase transaction effectively overwrote the earlier one.
> 
> The happened for two coinbases.  That is what the exceptions are for.
> 
> Neither of the those coinbases were spent before the overwrite
> happened.  I don't even think those coinbases were spent at all.
> 
> This means that every activate coinbase transaction has a unique hash
> and all new coinbases will be unique.
> 
> This means that all future transactions will have different txids.
> 
> There might not be an explicit rule that says that txids have to be
> unique, but barring a break of the hash function, they rules do
> guarantee it.


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [bitcoin-dev] BIP30 and BIP34 interaction (was Re: [BIP Proposal] Buried Deployments)
  2016-11-17  0:43     ` Eric Voskuil
@ 2016-11-17  0:53       ` Eric Voskuil
  2016-11-17  8:44       ` Peter Todd
  2016-11-17 10:22       ` Tier Nolan
  2 siblings, 0 replies; 19+ messages in thread
From: Eric Voskuil @ 2016-11-17  0:53 UTC (permalink / raw)
  To: Tier Nolan, Bitcoin Protocol Discussion

[-- Attachment #1: Type: text/plain, Size: 2322 bytes --]

Also, it's important to take note of the motivation behind not banning
duplicate tx hashes outright. Doing so would require that spent tx
hashes are retained forever. A pruning node will have no way of knowing
whether a new tx duplicates the hash of a preceding tx. Any
implementation that does retain such hashes and dismisses new txs on
that basis would fork against pruning nodes.

e

On 11/16/2016 04:43 PM, Eric Voskuil wrote:
>> This means that all future transactions will have different txids...
> rules do guarantee it.
> 
> No, it means that the chance is small, there is a difference.
> 
> If there is an address collision, someone may lose some money. If there
> is a tx hash collision, and implementations handle this differently, it
> will produce a chain split. As such this is not something that a node
> can just dismiss. If they do they are implementing a hard fork.
> 
> e
> 
> On 11/16/2016 04:31 PM, Tier Nolan via bitcoin-dev wrote:
>>
>>
>> On Thu, Nov 17, 2016 at 12:10 AM, Eric Voskuil via bitcoin-dev
>> <bitcoin-dev@lists•linuxfoundation.org
>> <mailto:bitcoin-dev@lists•linuxfoundation.org>> wrote:
>>
>>     Both of these cases resulted from exact duplicate txs, which BIP34 now
>>     precludes. However nothing precludes different txs from having the same
>>     hash.
>>
>>
>> The only way to have two transactions have the same txid is if their
>> parents are identical, since the txids of the parents are included in a
>> transaction.
>>
>> Coinbases have no parents, so it used to be possible for two of them to
>> be identical.
>>
>> Duplicate outputs weren't possible in the database, so the later
>> coinbase transaction effectively overwrote the earlier one.
>>
>> The happened for two coinbases.  That is what the exceptions are for.
>>
>> Neither of the those coinbases were spent before the overwrite
>> happened.  I don't even think those coinbases were spent at all.
>>
>> This means that every activate coinbase transaction has a unique hash
>> and all new coinbases will be unique.
>>
>> This means that all future transactions will have different txids.
>>
>> There might not be an explicit rule that says that txids have to be
>> unique, but barring a break of the hash function, they rules do
>> guarantee it.
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [bitcoin-dev] BIP30 and BIP34 interaction (was Re: [BIP Proposal] Buried Deployments)
  2016-11-17  0:43     ` Eric Voskuil
  2016-11-17  0:53       ` Eric Voskuil
@ 2016-11-17  8:44       ` Peter Todd
  2016-11-17  9:58         ` Eric Voskuil
  2016-11-17 10:22       ` Tier Nolan
  2 siblings, 1 reply; 19+ messages in thread
From: Peter Todd @ 2016-11-17  8:44 UTC (permalink / raw)
  To: Eric Voskuil, Bitcoin Protocol Discussion

[-- Attachment #1: Type: text/plain, Size: 1754 bytes --]

On Wed, Nov 16, 2016 at 04:43:08PM -0800, Eric Voskuil via bitcoin-dev wrote:
> > This means that all future transactions will have different txids...
> rules do guarantee it.
> 
> No, it means that the chance is small, there is a difference.
> 
> If there is an address collision, someone may lose some money. If there
> is a tx hash collision, and implementations handle this differently, it
> will produce a chain split. As such this is not something that a node
> can just dismiss. If they do they are implementing a hard fork.

If there is a tx hash collision it is almost certainly going to be because
SHA256 has become weak through advances in cryptography, much like MD5. If that
is the case, Bitcoin is fundementally broken because the blockchain no longer
can be relied upon to commit to a unique transaction history: miners would be
able to generate blocks that have SHA256 collisions in transactions and even
the merkle tree itself, making it possible to simultaneously mine two (or more)
contradictory transaction histories at once.

Meanwhile the probability of SHA256 _not_ being broken and a collision being
found is low enough that we should be more worried about earth-killing
asteroids and mutant sharks, among other things.

Quoting Bruce Schneier:

    These numbers have nothing to do with the technology of the devices; they are
    the maximums that thermodynamics will allow. And they strongly imply that
    brute-force attacks against 256-bit keys will be infeasible until computers are
    built from something other than matter and occupy something other than space.

-https://www.schneier.com/blog/archives/2009/09/the_doghouse_cr.html

-- 
https://petertodd.org 'peter'[:-1]@petertodd.org

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [bitcoin-dev] BIP30 and BIP34 interaction (was Re: [BIP Proposal] Buried Deployments)
  2016-11-17  8:44       ` Peter Todd
@ 2016-11-17  9:58         ` Eric Voskuil
  0 siblings, 0 replies; 19+ messages in thread
From: Eric Voskuil @ 2016-11-17  9:58 UTC (permalink / raw)
  To: Peter Todd, Bitcoin Protocol Discussion

[-- Attachment #1: Type: text/plain, Size: 1497 bytes --]

On 11/17/2016 12:44 AM, Peter Todd wrote:
> On Wed, Nov 16, 2016 at 04:43:08PM -0800, Eric Voskuil via bitcoin-dev wrote:
>>> This means that all future transactions will have different txids...
>> rules do guarantee it.
>>
>> No, it means that the chance is small, there is a difference.
>>
>> If there is an address collision, someone may lose some money. If there
>> is a tx hash collision, and implementations handle this differently, it
>> will produce a chain split. As such this is not something that a node
>> can just dismiss. If they do they are implementing a hard fork.
> 
> If there is a tx hash collision it is almost certainly going to be because
> SHA256 has become weak..

Almost certainly is not certainly. Hash collisions happen because of chance.

BIP30 is quite explicit:

> "Fully-spent transactions are allowed to be duplicated in order not to
hinder pruning at some point in the future. Not allowing any transaction
to be duplicated would require evidence to be kept for each transaction
ever made."

BIP34 motivations:

> "2. Enforce block and transaction uniqueness, and assist unconnected
block validation."

But it only specifies making collisions harder, not impossible (i.e.
explicitly rejected by consensus).

Are you proposing that we draft a new BIP that allows us all to not have
to code for this? If we do so it will be impossible to guard against a
chain split due to pruning, but you seem to think that's unimportant.

e


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [bitcoin-dev] BIP30 and BIP34 interaction (was Re: [BIP Proposal] Buried Deployments)
  2016-11-17  0:43     ` Eric Voskuil
  2016-11-17  0:53       ` Eric Voskuil
  2016-11-17  8:44       ` Peter Todd
@ 2016-11-17 10:22       ` Tier Nolan
  2016-11-17 11:22         ` Eric Voskuil
  2 siblings, 1 reply; 19+ messages in thread
From: Tier Nolan @ 2016-11-17 10:22 UTC (permalink / raw)
  Cc: Bitcoin Protocol Discussion

[-- Attachment #1: Type: text/plain, Size: 995 bytes --]

On Thu, Nov 17, 2016 at 12:43 AM, Eric Voskuil <eric@voskuil•org> wrote:

> > This means that all future transactions will have different txids...
> rules do guarantee it.
>
> No, it means that the chance is small, there is a difference.
>

I think we are mostly in agreement then?  It is just terminology.

In terms of discussing the BIP, barring a hash collision, it does make
duplicate txids impossible.

Given that a hash collision is so unlikely, the qualifier should be added
to those making claims that require hash collisions rather than those who
assume that they aren't possible.

You could have said "However nothing precludes different txs from having
the same hash, but it requires a hash collision".

Thinking about it, a re-org to before the enforcement height could allow
it.  The checkpoints protect against that though.


> As such this is not something that a node
> can just dismiss.


The security of many parts of the system is based on hash collisions not
being possible.

[-- Attachment #2: Type: text/html, Size: 1638 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [bitcoin-dev] BIP30 and BIP34 interaction (was Re: [BIP Proposal] Buried Deployments)
  2016-11-17 10:22       ` Tier Nolan
@ 2016-11-17 11:22         ` Eric Voskuil
  2016-11-17 11:38           ` Alex Morcos
  0 siblings, 1 reply; 19+ messages in thread
From: Eric Voskuil @ 2016-11-17 11:22 UTC (permalink / raw)
  To: bitcoin-dev

[-- Attachment #1: Type: text/plain, Size: 2605 bytes --]

On 11/17/2016 02:22 AM, Tier Nolan via bitcoin-dev wrote:
> On Thu, Nov 17, 2016 at 12:43 AM, Eric Voskuil <eric@voskuil•org
> <mailto:eric@voskuil•org>> wrote:
> 
>     > This means that all future transactions will have different txids...
>     rules do guarantee it.
> 
>     No, it means that the chance is small, there is a difference.
> 
> I think we are mostly in agreement then?  It is just terminology.

Sure, if you accept that mostly is not fully - just as unlikely is not
impossible.

> In terms of discussing the BIP, barring a hash collision, it does make
> duplicate txids impossible.

That's like saying, as long as we exclude car accidents from
consideration, car accidents are impossible.

> Given that a hash collision is so unlikely, the qualifier should be
> added to those making claims that require hash collisions rather than
> those who assume that they aren't possible.
> 
> You could have said "However nothing precludes different txs from having
> the same hash, but it requires a hash collision".

I generally try to avoid speaking in tautologies :)

> Thinking about it, a re-org to before the enforcement height could allow
> it.  The checkpoints protect against that though.
>  
>     As such this is not something that a node
>     can just dismiss. 
> 
> The security of many parts of the system is based on hash collisions not
> being possible.

This is not the case.

Block hash duplicates within the same chain are invalid as a matter of
consensus, which is the opposite of assuming impossibility.

Tx hash collisions are explicitly allowed in the case that preceding tx
with the same hash is unspent. This is also not a reliance on the
impossibility of hash collision. Core certainly implements this distinction:

https://github.com/bitcoin/bitcoin/blob/master/src/main.cpp#L2419-L2426

Address hashes and script hashes can collide without harming the
security of Bitcoin (although address owner(s) may experience harm).
Rare in this case is sufficient because of this distinction.

Compact blocks contemplates hash collisions:

https://github.com/bitcoin/bips/blob/master/bip-0152.mediawiki#Random_collision_probabilty

Checkpoints aren't part of Bitcoin security, so even the remote
possibility of two different potential blocks, with the same hash, at
the same height in the same chain, does not indicate a problem.

There is no case where the security of Bitcoin assumes that hashes never
collide. Consensus rules have specific handling for both block hash
collisions and tx hash collisions.

e


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [bitcoin-dev] BIP30 and BIP34 interaction (was Re: [BIP Proposal] Buried Deployments)
  2016-11-17 11:22         ` Eric Voskuil
@ 2016-11-17 11:38           ` Alex Morcos
  2016-11-17 12:22             ` Eric Voskuil
  0 siblings, 1 reply; 19+ messages in thread
From: Alex Morcos @ 2016-11-17 11:38 UTC (permalink / raw)
  To: Eric Voskuil, Bitcoin Protocol Discussion

[-- Attachment #1: Type: text/plain, Size: 3356 bytes --]

I think this conversation has gone off the rails and is no longer really
appropriate for the list.

But just to be clear to any readers.  Bitcoin Core absolutely does rely on
the impossibility of a hash collision for maintaining consensus.  This
happens in multiple places in the code but in particular we don't check
BIP30 any more since the only way it could get violated is by a hash
collision.





On Thu, Nov 17, 2016 at 6:22 AM, Eric Voskuil via bitcoin-dev <
bitcoin-dev@lists•linuxfoundation.org> wrote:

> On 11/17/2016 02:22 AM, Tier Nolan via bitcoin-dev wrote:
> > On Thu, Nov 17, 2016 at 12:43 AM, Eric Voskuil <eric@voskuil•org
> > <mailto:eric@voskuil•org>> wrote:
> >
> >     > This means that all future transactions will have different
> txids...
> >     rules do guarantee it.
> >
> >     No, it means that the chance is small, there is a difference.
> >
> > I think we are mostly in agreement then?  It is just terminology.
>
> Sure, if you accept that mostly is not fully - just as unlikely is not
> impossible.
>
> > In terms of discussing the BIP, barring a hash collision, it does make
> > duplicate txids impossible.
>
> That's like saying, as long as we exclude car accidents from
> consideration, car accidents are impossible.
>
> > Given that a hash collision is so unlikely, the qualifier should be
> > added to those making claims that require hash collisions rather than
> > those who assume that they aren't possible.
> >
> > You could have said "However nothing precludes different txs from having
> > the same hash, but it requires a hash collision".
>
> I generally try to avoid speaking in tautologies :)
>
> > Thinking about it, a re-org to before the enforcement height could allow
> > it.  The checkpoints protect against that though.
> >
> >     As such this is not something that a node
> >     can just dismiss.
> >
> > The security of many parts of the system is based on hash collisions not
> > being possible.
>
> This is not the case.
>
> Block hash duplicates within the same chain are invalid as a matter of
> consensus, which is the opposite of assuming impossibility.
>
> Tx hash collisions are explicitly allowed in the case that preceding tx
> with the same hash is unspent. This is also not a reliance on the
> impossibility of hash collision. Core certainly implements this
> distinction:
>
> https://github.com/bitcoin/bitcoin/blob/master/src/main.cpp#L2419-L2426
>
> Address hashes and script hashes can collide without harming the
> security of Bitcoin (although address owner(s) may experience harm).
> Rare in this case is sufficient because of this distinction.
>
> Compact blocks contemplates hash collisions:
>
> https://github.com/bitcoin/bips/blob/master/bip-0152.
> mediawiki#Random_collision_probabilty
>
> Checkpoints aren't part of Bitcoin security, so even the remote
> possibility of two different potential blocks, with the same hash, at
> the same height in the same chain, does not indicate a problem.
>
> There is no case where the security of Bitcoin assumes that hashes never
> collide. Consensus rules have specific handling for both block hash
> collisions and tx hash collisions.
>
> e
>
>
> _______________________________________________
> bitcoin-dev mailing list
> bitcoin-dev@lists•linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
>
>

[-- Attachment #2: Type: text/html, Size: 4766 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [bitcoin-dev] BIP30 and BIP34 interaction (was Re: [BIP Proposal] Buried Deployments)
  2016-11-17 11:38           ` Alex Morcos
@ 2016-11-17 12:22             ` Eric Voskuil
  2016-11-17 15:40               ` Johnson Lau
  0 siblings, 1 reply; 19+ messages in thread
From: Eric Voskuil @ 2016-11-17 12:22 UTC (permalink / raw)
  To: Alex Morcos, Bitcoin Protocol Discussion

[-- Attachment #1: Type: text/plain, Size: 1401 bytes --]

On 11/17/2016 03:38 AM, Alex Morcos wrote:
> I think this conversation has gone off the rails and is no longer really
> appropriate for the list.

If this discussion is not appropriate for the Bitcoin Protocol
Discussion list then the list is pointless.

> But just to be clear to any readers.  Bitcoin Core absolutely does rely
> on the impossibility of a hash collision for maintaining consensus. 
> This happens in multiple places in the code but in particular we don't
> check BIP30 any more since the only way it could get violated is by a
> hash collision.

So the protocol change that I suggested to Peter an hour or so ago was
actually implemented, a year ago, by you:

https://github.com/bitcoin/bitcoin/commit/06d81ad516f1d136da9f03ca2ae823211c0f6988

Given that hash collisions are unquestionably possible, this is a clear
break with BIP30 (irrespective of BIP34) and constitutes a hard fork. Is
there going to be a retroactive BIP for this one at some point as well?

I'm aware that the block hash check is performed against the full chain,
as opposed to the candidate block fork height, and as a result is
insufficient to guard against a block hash collision causing a chain
split (though until now I assumed this was a bug).

Would you care to share the other consensus critical reliances on the
impossibility of hash collision that you are implying?

e


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [bitcoin-dev] BIP30 and BIP34 interaction (was Re: [BIP Proposal] Buried Deployments)
  2016-11-17 12:22             ` Eric Voskuil
@ 2016-11-17 15:40               ` Johnson Lau
  2016-11-17 17:01                 ` Eric Voskuil
  0 siblings, 1 reply; 19+ messages in thread
From: Johnson Lau @ 2016-11-17 15:40 UTC (permalink / raw)
  To: Eric Voskuil, bitcoin-dev


> On 17 Nov 2016, at 20:22, Eric Voskuil via bitcoin-dev <bitcoin-dev@lists•linuxfoundation.org> wrote:
> 
> 
> Given that hash collisions are unquestionably possible, 

Everything you said after this point is irrelevant.

Having hash collision is **by definition** a consensus failure, or a hardfork. You could replace the already-on-chain tx with the collision and create 2 different versions of UTXOs (if the colliding tx is valid), or make some nodes to accept a fork with less PoW (if the colliding tx is invalid, or making the block invalid, such as being to big). To put it simply, the Bitcoin protocol is broken. So with no doubt, Bitcoin Core and any implementation of the Bitcoin protocol should assume SHA256 collision is unquestionably **impossible**. If some refuse to make such assumption, they should have introduced an alternative hash algorithm and somehow run it in parallel with SHA256 to prevent the consensus failure.

jl2012


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [bitcoin-dev] BIP30 and BIP34 interaction (was Re: [BIP Proposal] Buried Deployments)
  2016-11-17 15:40               ` Johnson Lau
@ 2016-11-17 17:01                 ` Eric Voskuil
  2016-11-17 17:22                   ` Johnson Lau
  0 siblings, 1 reply; 19+ messages in thread
From: Eric Voskuil @ 2016-11-17 17:01 UTC (permalink / raw)
  To: Johnson Lau, bitcoin-dev

[-- Attachment #1: Type: text/plain, Size: 4481 bytes --]

On 11/17/2016 07:40 AM, Johnson Lau wrote:
>
>> On 17 Nov 2016, at 20:22, Eric Voskuil via bitcoin-dev
<bitcoin-dev@lists•linuxfoundation.org> wrote:
>>
>> Given that hash collisions are unquestionably possible,
>
> Everything you said after this point is irrelevant.

So... you think hash collisions are not possible, or that it's moot
because Core has broken its ability to handle them.


> Having hash collision is **by definition** a consensus failure,

I suppose if you take fairly recent un-BIPped consensus changes in Core
to be the definition of consensus, you would be right about that.


> or a hardfork.

And those changes could definitely result in a chain split. So right
about that too.


> You could replace the already-on-chain tx with the collision and
create 2 different versions of UTXOs (if the colliding tx is valid), or
make some nodes to accept a fork with less PoW (if the colliding tx is
invalid, or making the block invalid, such as being to big).


Not in accordance with BIP30 and not according to the implementation of
it that existed in Core until Nov 2015. A tx was only valid as a
"replacement" if it did not collide with the hash of an existing tx with
unspent outputs. The collision would have been rejected. And an invalid
colliding tx would not be accepted in any case (since nodes presumably
validate blocks and don't rely on checkpoints as a security measure).

A transaction duplicating the hash of another and taking its place in a
block would not only have to collide the hash, but it would have to be
fully valid in the context of the block you are suggesting it is
substituted into. In that case it's simply a fully valid block. This is
not just the case of a hash collision, this is the case of a hash
collision where both transactions are fully valid in the context of the
same block parent. Even if that unlikely event did occur, it's not a
hard fork, it's a reorg. The chain that builds on this block will be
valid to all nodes but necessarily deviates from the other block's valid
chain. This is true whether the magical block is assembled via compact
blocks or otherwise.

Transaction "replacement" is an implementation detail of Core. Once Core
accepted a replacement of a previously spent transaction it would be
unable to provide the previous block/spent-tx, but that would be a
wallet failure and an inability to provide valid historical blocks, not
a consensus/validation failure. The previously spent outputs no longer
contribute to validation, unless there is a reorg back to before the
original tx's block, and at that point it would be moot, since neither
transaction is on the chain.

You are referring to the *current* behavior ("replacement" without
concern for collision). That was an unpublished hard fork, and is the
very source of the problems you are describing.

> To put it simply, the Bitcoin protocol is broken. So with no doubt,
Bitcoin Core and any implementation of the Bitcoin protocol should
assume SHA256 collision is unquestionably **impossible**.

I'm not disagreeing with you that it is broken. I'm pointing out that it
was broken by code that was merged recently - an undocumented hard fork
that reverted the documented BIP30 behavior that was previously
implemented correctly, based on the assumption that hash collisions
cannot occur, for the modest performance boost of not having to check
for unspent duplicates (sounds sort of familiar).

> If some refuse to make such assumption, they should have introduced an
alternative hash algorithm and somehow run it in parallel with SHA256 to
prevent the consensus failure.

No hash algorithm can prevent hash collisions, including one that is
just two running in parallel. A better resolution would be to fix the
problem.

There is no need to replace the BIP30 rule. That resolves the TX hash
collision problem from a consensus standpoint. In order to serve up
whole blocks in the circumstance requires a more robust store than I
believe is exists in Core, but that has nothing to do with validity.

The block hash check and signature validation caching splits caused by
collision can easily be avoided, and doing so doesn't break with
consensus. I'm not aware of any other aspects of consensus that are
effected by an implementation assumption of non-colliding hashes. But in
any case I'm pretty sure there aren't any that are necessary to consensus.

e



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [bitcoin-dev] BIP30 and BIP34 interaction (was Re: [BIP Proposal] Buried Deployments)
  2016-11-17 17:01                 ` Eric Voskuil
@ 2016-11-17 17:22                   ` Johnson Lau
  2016-11-17 17:49                     ` Eric Voskuil
  0 siblings, 1 reply; 19+ messages in thread
From: Johnson Lau @ 2016-11-17 17:22 UTC (permalink / raw)
  To: Eric Voskuil; +Cc: bitcoin-dev

I’m not sure if you really understand what you and I am talking. It has nothing to do with BIP30, 34, nor any other BIPs.

Say tx1 is confirmed 3 years ago in block X. An attacker finds a valid tx2 which (tx1 != tx2) and (SHA256(tx1) == SHA256(tx2)). Now he could replace tx1 with tx2 in block X and the block is still perfectly valid. Anyone trying to download the blockchain from the beginning may end up with a different ledger. The consensus is irrevocably broken as soon as tx1 or tx2 is spent.

Or, alternatively, an attacker finds an invalid tx3 which (tx1 != tx3) and (SHA256(tx1) == SHA256(tx3)). Now he could replace tx1 with tx3 in block X. Anyone trying to download the blockchain from the beginning will permanently reject the hash of block X. They will instead accept a fork built on top of block X-1. The chain will be permanently forked.

jl2012


> On 18 Nov 2016, at 01:01, Eric Voskuil <eric@voskuil•org> wrote:
> 
> On 11/17/2016 07:40 AM, Johnson Lau wrote:
>> 
>>> On 17 Nov 2016, at 20:22, Eric Voskuil via bitcoin-dev
> <bitcoin-dev@lists•linuxfoundation.org> wrote:
>>> 
>>> Given that hash collisions are unquestionably possible,
>> 
>> Everything you said after this point is irrelevant.
> 
> So... you think hash collisions are not possible, or that it's moot
> because Core has broken its ability to handle them.
> 
> 
>> Having hash collision is **by definition** a consensus failure,
> 
> I suppose if you take fairly recent un-BIPped consensus changes in Core
> to be the definition of consensus, you would be right about that.
> 
> 
>> or a hardfork.
> 
> And those changes could definitely result in a chain split. So right
> about that too.
> 
> 
>> You could replace the already-on-chain tx with the collision and
> create 2 different versions of UTXOs (if the colliding tx is valid), or
> make some nodes to accept a fork with less PoW (if the colliding tx is
> invalid, or making the block invalid, such as being to big).
> 
> 
> Not in accordance with BIP30 and not according to the implementation of
> it that existed in Core until Nov 2015. A tx was only valid as a
> "replacement" if it did not collide with the hash of an existing tx with
> unspent outputs. The collision would have been rejected. And an invalid
> colliding tx would not be accepted in any case (since nodes presumably
> validate blocks and don't rely on checkpoints as a security measure).
> 
> A transaction duplicating the hash of another and taking its place in a
> block would not only have to collide the hash, but it would have to be
> fully valid in the context of the block you are suggesting it is
> substituted into. In that case it's simply a fully valid block. This is
> not just the case of a hash collision, this is the case of a hash
> collision where both transactions are fully valid in the context of the
> same block parent. Even if that unlikely event did occur, it's not a
> hard fork, it's a reorg. The chain that builds on this block will be
> valid to all nodes but necessarily deviates from the other block's valid
> chain. This is true whether the magical block is assembled via compact
> blocks or otherwise.
> 
> Transaction "replacement" is an implementation detail of Core. Once Core
> accepted a replacement of a previously spent transaction it would be
> unable to provide the previous block/spent-tx, but that would be a
> wallet failure and an inability to provide valid historical blocks, not
> a consensus/validation failure. The previously spent outputs no longer
> contribute to validation, unless there is a reorg back to before the
> original tx's block, and at that point it would be moot, since neither
> transaction is on the chain.
> 
> You are referring to the *current* behavior ("replacement" without
> concern for collision). That was an unpublished hard fork, and is the
> very source of the problems you are describing.
> 
>> To put it simply, the Bitcoin protocol is broken. So with no doubt,
> Bitcoin Core and any implementation of the Bitcoin protocol should
> assume SHA256 collision is unquestionably **impossible**.
> 
> I'm not disagreeing with you that it is broken. I'm pointing out that it
> was broken by code that was merged recently - an undocumented hard fork
> that reverted the documented BIP30 behavior that was previously
> implemented correctly, based on the assumption that hash collisions
> cannot occur, for the modest performance boost of not having to check
> for unspent duplicates (sounds sort of familiar).
> 
>> If some refuse to make such assumption, they should have introduced an
> alternative hash algorithm and somehow run it in parallel with SHA256 to
> prevent the consensus failure.
> 
> No hash algorithm can prevent hash collisions, including one that is
> just two running in parallel. A better resolution would be to fix the
> problem.
> 
> There is no need to replace the BIP30 rule. That resolves the TX hash
> collision problem from a consensus standpoint. In order to serve up
> whole blocks in the circumstance requires a more robust store than I
> believe is exists in Core, but that has nothing to do with validity.
> 
> The block hash check and signature validation caching splits caused by
> collision can easily be avoided, and doing so doesn't break with
> consensus. I'm not aware of any other aspects of consensus that are
> effected by an implementation assumption of non-colliding hashes. But in
> any case I'm pretty sure there aren't any that are necessary to consensus.
> 
> e
> 
> 




^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [bitcoin-dev] BIP30 and BIP34 interaction (was Re: [BIP Proposal] Buried Deployments)
  2016-11-17 17:22                   ` Johnson Lau
@ 2016-11-17 17:49                     ` Eric Voskuil
  2016-11-17 18:08                       ` Johnson Lau
  0 siblings, 1 reply; 19+ messages in thread
From: Eric Voskuil @ 2016-11-17 17:49 UTC (permalink / raw)
  To: Johnson Lau; +Cc: bitcoin-dev

Actually both possibilities were specifically covered in my description. Sorry if it wasn't clear.

If you create a new valid block out of an old one it's has potential to cause a reorg. The blocks that previously built on the original are still able to do so but presumably cannot build forever on the *new* block as it has a different tx. But other new blocks can. There is no chain split due to a different interpretation of valid, there are simply two valid competing chains.

Note that this scenario requires not only block and tx validity with a tx hash collision, but also that the tx be valid within the block. Pretty far to reach to not even get a chain split, but it could produce a deep reorg with a very low chance of success. As I keep telling people, deep reorgs can happen, they are just unlikely, as is this scenario.

If you create a new invalid block it is discarded by everyone. That does not invalidate the hash of that block. Permanent blocking as you describe it would be a p2p protocol design choice, having nothing to do with consensus. Libbitcoin for example does not ban invalidated hashes at all. It just discards the block and drops the peer.

e

> On Nov 17, 2016, at 9:22 AM, Johnson Lau <jl2012@xbt•hk> wrote:
> 
> I’m not sure if you really understand what you and I am talking. It has nothing to do with BIP30, 34, nor any other BIPs.
> 
> Say tx1 is confirmed 3 years ago in block X. An attacker finds a valid tx2 which (tx1 != tx2) and (SHA256(tx1) == SHA256(tx2)). Now he could replace tx1 with tx2 in block X and the block is still perfectly valid. Anyone trying to download the blockchain from the beginning may end up with a different ledger. The consensus is irrevocably broken as soon as tx1 or tx2 is spent.
> 
> Or, alternatively, an attacker finds an invalid tx3 which (tx1 != tx3) and (SHA256(tx1) == SHA256(tx3)). Now he could replace tx1 with tx3 in block X. Anyone trying to download the blockchain from the beginning will permanently reject the hash of block X. They will instead accept a fork built on top of block X-1. The chain will be permanently forked.
> 
> jl2012
> 
> 
>> On 18 Nov 2016, at 01:01, Eric Voskuil <eric@voskuil•org> wrote:
>> 
>> On 11/17/2016 07:40 AM, Johnson Lau wrote:
>>> 
>>>> On 17 Nov 2016, at 20:22, Eric Voskuil via bitcoin-dev
>> <bitcoin-dev@lists•linuxfoundation.org> wrote:
>>>> 
>>>> Given that hash collisions are unquestionably possible,
>>> 
>>> Everything you said after this point is irrelevant.
>> 
>> So... you think hash collisions are not possible, or that it's moot
>> because Core has broken its ability to handle them.
>> 
>> 
>>> Having hash collision is **by definition** a consensus failure,
>> 
>> I suppose if you take fairly recent un-BIPped consensus changes in Core
>> to be the definition of consensus, you would be right about that.
>> 
>> 
>>> or a hardfork.
>> 
>> And those changes could definitely result in a chain split. So right
>> about that too.
>> 
>> 
>>> You could replace the already-on-chain tx with the collision and
>> create 2 different versions of UTXOs (if the colliding tx is valid), or
>> make some nodes to accept a fork with less PoW (if the colliding tx is
>> invalid, or making the block invalid, such as being to big).
>> 
>> 
>> Not in accordance with BIP30 and not according to the implementation of
>> it that existed in Core until Nov 2015. A tx was only valid as a
>> "replacement" if it did not collide with the hash of an existing tx with
>> unspent outputs. The collision would have been rejected. And an invalid
>> colliding tx would not be accepted in any case (since nodes presumably
>> validate blocks and don't rely on checkpoints as a security measure).
>> 
>> A transaction duplicating the hash of another and taking its place in a
>> block would not only have to collide the hash, but it would have to be
>> fully valid in the context of the block you are suggesting it is
>> substituted into. In that case it's simply a fully valid block. This is
>> not just the case of a hash collision, this is the case of a hash
>> collision where both transactions are fully valid in the context of the
>> same block parent. Even if that unlikely event did occur, it's not a
>> hard fork, it's a reorg. The chain that builds on this block will be
>> valid to all nodes but necessarily deviates from the other block's valid
>> chain. This is true whether the magical block is assembled via compact
>> blocks or otherwise.
>> 
>> Transaction "replacement" is an implementation detail of Core. Once Core
>> accepted a replacement of a previously spent transaction it would be
>> unable to provide the previous block/spent-tx, but that would be a
>> wallet failure and an inability to provide valid historical blocks, not
>> a consensus/validation failure. The previously spent outputs no longer
>> contribute to validation, unless there is a reorg back to before the
>> original tx's block, and at that point it would be moot, since neither
>> transaction is on the chain.
>> 
>> You are referring to the *current* behavior ("replacement" without
>> concern for collision). That was an unpublished hard fork, and is the
>> very source of the problems you are describing.
>> 
>>> To put it simply, the Bitcoin protocol is broken. So with no doubt,
>> Bitcoin Core and any implementation of the Bitcoin protocol should
>> assume SHA256 collision is unquestionably **impossible**.
>> 
>> I'm not disagreeing with you that it is broken. I'm pointing out that it
>> was broken by code that was merged recently - an undocumented hard fork
>> that reverted the documented BIP30 behavior that was previously
>> implemented correctly, based on the assumption that hash collisions
>> cannot occur, for the modest performance boost of not having to check
>> for unspent duplicates (sounds sort of familiar).
>> 
>>> If some refuse to make such assumption, they should have introduced an
>> alternative hash algorithm and somehow run it in parallel with SHA256 to
>> prevent the consensus failure.
>> 
>> No hash algorithm can prevent hash collisions, including one that is
>> just two running in parallel. A better resolution would be to fix the
>> problem.
>> 
>> There is no need to replace the BIP30 rule. That resolves the TX hash
>> collision problem from a consensus standpoint. In order to serve up
>> whole blocks in the circumstance requires a more robust store than I
>> believe is exists in Core, but that has nothing to do with validity.
>> 
>> The block hash check and signature validation caching splits caused by
>> collision can easily be avoided, and doing so doesn't break with
>> consensus. I'm not aware of any other aspects of consensus that are
>> effected by an implementation assumption of non-colliding hashes. But in
>> any case I'm pretty sure there aren't any that are necessary to consensus.
>> 
>> e
> 
> 


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [bitcoin-dev] BIP30 and BIP34 interaction (was Re: [BIP Proposal] Buried Deployments)
  2016-11-17 17:49                     ` Eric Voskuil
@ 2016-11-17 18:08                       ` Johnson Lau
  2016-11-18  3:20                         ` Eric Voskuil
  0 siblings, 1 reply; 19+ messages in thread
From: Johnson Lau @ 2016-11-17 18:08 UTC (permalink / raw)
  To: Eric Voskuil; +Cc: bitcoin-dev

The fact that some implementations ban an invalid block hash and some do not, suggests that it’s not a pure p2p protocol issue. A pure p2p split should be unified by a bridge node. However, a bridge node is not helpful in this case. Banning an invalid block hash is an implicit “first seen” consensus rule.

jl2012

> On 18 Nov 2016, at 01:49, Eric Voskuil <eric@voskuil•org> wrote:
> 
> Actually both possibilities were specifically covered in my description. Sorry if it wasn't clear.
> 
> If you create a new valid block out of an old one it's has potential to cause a reorg. The blocks that previously built on the original are still able to do so but presumably cannot build forever on the *new* block as it has a different tx. But other new blocks can. There is no chain split due to a different interpretation of valid, there are simply two valid competing chains.
> 
> Note that this scenario requires not only block and tx validity with a tx hash collision, but also that the tx be valid within the block. Pretty far to reach to not even get a chain split, but it could produce a deep reorg with a very low chance of success. As I keep telling people, deep reorgs can happen, they are just unlikely, as is this scenario.
> 
> If you create a new invalid block it is discarded by everyone. That does not invalidate the hash of that block. Permanent blocking as you describe it would be a p2p protocol design choice, having nothing to do with consensus. Libbitcoin for example does not ban invalidated hashes at all. It just discards the block and drops the peer.
> 
> e




^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [bitcoin-dev] BIP30 and BIP34 interaction (was Re: [BIP Proposal] Buried Deployments)
  2016-11-17 18:08                       ` Johnson Lau
@ 2016-11-18  3:20                         ` Eric Voskuil
  2016-11-18 14:43                           ` Johnson Lau
  0 siblings, 1 reply; 19+ messages in thread
From: Eric Voskuil @ 2016-11-18  3:20 UTC (permalink / raw)
  To: Johnson Lau; +Cc: bitcoin-dev

You are suggesting that, since a node implements a denial of service policy that actually denies itself otherwise valid blocks, those blocks are conditionally invalid. And that, since the validity condition is based on order of arrival and therefore independently unverifiable, Bitcoin consensus is broken in the face of a hash collision.

I am aware of two other hash collision scenarios that cause Core to declare blocks invalid based on ordering. The block hash duplicate check (it's not fork-point relative) and signature verification caching. Like the "block banning" issue above, the latter is related to an internal optimization. I would categorize the former as a simple oversight that presumably goes way back.

What then is the consequence of validity that is unverifiable? You believe this means that Bitcoin consensus is broken. This is incorrect. First understand that it is not possible for consensus rules to invalidate blocks based on order of arrival. As such any *implementation* that invalidates blocks based on order of arrival is broken. It is an error to claim that these behaviors are part of consensus, despite being implemented in the satoshi node(s).

Validity must be verifiable independent of the state of other nodes. Consensus is a function of block history and time alone. Time is presumed to be universally consistent. To be a consensus rule all nodes must be able to independently reach the same validity conclusion, given the same set of blocks, independent of order. If this is not the case the behavior is not a consensus rule, it is simply a bug. 

Deviating from such bugs is not a break with consensus, since such non-rules cannot be part of consensus. One node implementation can behave deterministically while others are behaving non-deterministically, with the two nodes remaining consistent from a consensus standpoint (deterministic produces a subset of non-deterministic results). But, unlike arbitrary nodes, deterministic nodes will not cause disruption on the network.

You imply that these determinism bugs are necessary, that there is no fix. This is also incorrect.

The block banning hash collision bug is avoided by not using non-chain/clock state to determine validity. Doing otherwise is clearly a bug. The hash of a block is not the block itself, a logically-correct ban would be to compare the wire serialization of the block as opposed to the hash, or not maintain the feature at all.

The signature verification caching hash collision bug is the same problem, an optimization based on an invalid assumption. A full serialization comparison (true identity), or elimination of the feature resolves the  bug.

The block hash check collision bug is trivially resolved by checking at the fork point as opposed to the tip. This prevents arbitrary (and irrational) invalidity based on conflict with irrelevant blocks that may or may not exist above the fork point.

Libbitcoin is deterministic in all three cases (although the third issue is not made consistent until v3). I am not aware of any other non-determinism in Core, but I don't spend a lot of time there. There is no need to study other implementations to ensure determinism, as that can be verified independently.

Any situation in which a node cannot provide deterministic validation of unordered blocks constitutes a non-consensus bug, as the behavior is not consistently verifiable by others under any conditions. Fixing/preventing these bugs is responsible development behavior, and does not require forks or BIPs, since Bitcoin doesn't inherently contain any such bugs. They are the consequence of incorrect implementation, and in two of the three cases above have resulted from supposed optimizations. But any code that creates non-determinism in exchange for speed, etc. is not an optimization, it's a bug. A node must implement its optimizations in a manner that does not alter consensus.

The BIP30 regression hard fork is not a case of non-determinism. This will produce deterministic results (apart from the impact of unrelated bugs). However the results are both a clear break from previous (and documented) consensus but also produce a very undesirable outcome - destruction of all unspent outputs in the "replaced" transaction for starters. So this is a distinct category, not a determinism bug but a hard fork that produces undesired consequences.

The BIP30 regression hard fork actually enables the various pathological scenarios that you were describing, where no such issues existed in Bitcoin consensus previously. It is now possible to produce a block that mutates another arbitrarily deep block, and forces a reorg all the way back to the mutated block. This was done to save microseconds per block. Despite the improbability of hash collisions, I find this deplorable and the lack of public discussion on the decision concerning.

With respect to the original post, the point at issue is the introduction of another hard fork, with some odd behaviors, but without any justification apart from tidying up the small amount of necessary code. These issues are related in that they are both consensus forks that have been introduced as supposed optimizations, with no public discussion prior to release (or at least merging to master with the presumption of shipping in the latter case). Two of the three hash collision issues above are also related in that they are bugs introduced by a desire to optimize internals.

The engineering lesson here should be clear - watch out for developers bearing optimizations. A trade against correctness is not an optimization, it's a break. Satoshi was clearly a fan of the premature optimization. FindAndDelete is a howler. So this is a tradition in Bitcoin. My intent is not to sling mud but to improve the situation.

It is very possible to produce straightforward and deterministic code that abides consensus and materially outperforms Core, without any of the above optimization breaks, even avoiding the utxo set optimization. Even the tx (memory) and block (orphan) pools are complex store denormalizations implemented as optimizations. Optimizing before producing a clean conceptual model architecture and design is a software development anti-pattern (premature optimization). The proposed fork is a premature optimization. There are much more significant opportunities to better organize code (and improve performance). I cannot support the decision to advance it.

I was unaware Core had regressed BIP30. Given that the behavior is catastrophic and that it introduces the *only* hash-collision consensus misbehavior (unless we consider a deep reorg sans the otherwise necessary proof of work desirable behavior), I strongly recommend it be reverted, with a post-mortem BIP.

Finally I recommend people contemplate the difference between unlikely and impossible. The chance of random collision is very small, but not zero. Colliding hashes is extremely difficult, but not impossible. But Bitcoin does not rely on impossibility for correct behavior. It relies of difficulty. This is a subtle but important distinction that people are missing.

Difficulty is a knowable quantity - a function of computing power.  If hash operations remain difficult, Bitcoin is undeterred. Collisions will have no impact, even if they happen with unexpected frequency (which would still be vanishingly infrequent). If the difficulty of producing a collision is reduced to the point where people cannot rely on addresses (for example), then Bitcoin has a problem, as it has become a leaky ship (and then there's mining). But with the unnecessary problems described above, a single hash collision can be catastrophic. Unlike difficulty, which is known, nobody can know when a single collision will show up. Betting Bitcoin, and potentially the world's money, on the unknowable is poor reasoning, especially given that the cost of not doing so is so very low.

e

> On Nov 17, 2016, at 10:08 AM, Johnson Lau <jl2012@xbt•hk> wrote:
> 
> The fact that some implementations ban an invalid block hash and some do not, suggests that it’s not a pure p2p protocol issue. A pure p2p split should be unified by a bridge node. However, a bridge node is not helpful in this case. Banning an invalid block hash is an implicit “first seen” consensus rule.
> 
> jl2012
> 
>> On 18 Nov 2016, at 01:49, Eric Voskuil <eric@voskuil•org> wrote:
>> 
>> Actually both possibilities were specifically covered in my description. Sorry if it wasn't clear.
>> 
>> If you create a new valid block out of an old one it's has potential to cause a reorg. The blocks that previously built on the original are still able to do so but presumably cannot build forever on the *new* block as it has a different tx. But other new blocks can. There is no chain split due to a different interpretation of valid, there are simply two valid competing chains.
>> 
>> Note that this scenario requires not only block and tx validity with a tx hash collision, but also that the tx be valid within the block. Pretty far to reach to not even get a chain split, but it could produce a deep reorg with a very low chance of success. As I keep telling people, deep reorgs can happen, they are just unlikely, as is this scenario.
>> 
>> If you create a new invalid block it is discarded by everyone. That does not invalidate the hash of that block. Permanent blocking as you describe it would be a p2p protocol design choice, having nothing to do with consensus. Libbitcoin for example does not ban invalidated hashes at all. It just discards the block and drops the peer.
>> 
>> e
> 
> 


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [bitcoin-dev] BIP30 and BIP34 interaction (was Re: [BIP Proposal] Buried Deployments)
  2016-11-18  3:20                         ` Eric Voskuil
@ 2016-11-18 14:43                           ` Johnson Lau
  2016-11-18 16:47                             ` Eric Voskuil
  0 siblings, 1 reply; 19+ messages in thread
From: Johnson Lau @ 2016-11-18 14:43 UTC (permalink / raw)
  To: Eric Voskuil; +Cc: bitcoin-dev

In this case I don’t understand how your implementation won’t be DoS-ed. An attacker could keep sending you inv for the same block / transaction. Since you don’t assume the hash is unique, each time you have to download the block/tx again before you could tell if that is the same one you have already known. Otherwise, you are implementing the “first seen” rule.

Also, you can’t ban a peer just because you get an invalid tx from him, because he might be referring to a hash-colliding UTXO that you don’t know. In that case you need to request for the parent tx to verify. I wonder if you are really doing that.

> On 18 Nov 2016, at 11:20, Eric Voskuil <eric@voskuil•org> wrote:
> 
> You are suggesting that, since a node implements a denial of service policy that actually denies itself otherwise valid blocks, those blocks are conditionally invalid. And that, since the validity condition is based on order of arrival and therefore independently unverifiable, Bitcoin consensus is broken in the face of a hash collision.
> 
> I am aware of two other hash collision scenarios that cause Core to declare blocks invalid based on ordering. The block hash duplicate check (it's not fork-point relative) and signature verification caching. Like the "block banning" issue above, the latter is related to an internal optimization. I would categorize the former as a simple oversight that presumably goes way back.
> 
> What then is the consequence of validity that is unverifiable? You believe this means that Bitcoin consensus is broken. This is incorrect. First understand that it is not possible for consensus rules to invalidate blocks based on order of arrival. As such any *implementation* that invalidates blocks based on order of arrival is broken. It is an error to claim that these behaviors are part of consensus, despite being implemented in the satoshi node(s).
> 
> Validity must be verifiable independent of the state of other nodes. Consensus is a function of block history and time alone. Time is presumed to be universally consistent. To be a consensus rule all nodes must be able to independently reach the same validity conclusion, given the same set of blocks, independent of order. If this is not the case the behavior is not a consensus rule, it is simply a bug. 
> 
> Deviating from such bugs is not a break with consensus, since such non-rules cannot be part of consensus. One node implementation can behave deterministically while others are behaving non-deterministically, with the two nodes remaining consistent from a consensus standpoint (deterministic produces a subset of non-deterministic results). But, unlike arbitrary nodes, deterministic nodes will not cause disruption on the network.
> 
> You imply that these determinism bugs are necessary, that there is no fix. This is also incorrect.
> 
> The block banning hash collision bug is avoided by not using non-chain/clock state to determine validity. Doing otherwise is clearly a bug. The hash of a block is not the block itself, a logically-correct ban would be to compare the wire serialization of the block as opposed to the hash, or not maintain the feature at all.
> 
> The signature verification caching hash collision bug is the same problem, an optimization based on an invalid assumption. A full serialization comparison (true identity), or elimination of the feature resolves the  bug.
> 
> The block hash check collision bug is trivially resolved by checking at the fork point as opposed to the tip. This prevents arbitrary (and irrational) invalidity based on conflict with irrelevant blocks that may or may not exist above the fork point.
> 
> Libbitcoin is deterministic in all three cases (although the third issue is not made consistent until v3). I am not aware of any other non-determinism in Core, but I don't spend a lot of time there. There is no need to study other implementations to ensure determinism, as that can be verified independently.
> 
> Any situation in which a node cannot provide deterministic validation of unordered blocks constitutes a non-consensus bug, as the behavior is not consistently verifiable by others under any conditions. Fixing/preventing these bugs is responsible development behavior, and does not require forks or BIPs, since Bitcoin doesn't inherently contain any such bugs. They are the consequence of incorrect implementation, and in two of the three cases above have resulted from supposed optimizations. But any code that creates non-determinism in exchange for speed, etc. is not an optimization, it's a bug. A node must implement its optimizations in a manner that does not alter consensus.
> 
> The BIP30 regression hard fork is not a case of non-determinism. This will produce deterministic results (apart from the impact of unrelated bugs). However the results are both a clear break from previous (and documented) consensus but also produce a very undesirable outcome - destruction of all unspent outputs in the "replaced" transaction for starters. So this is a distinct category, not a determinism bug but a hard fork that produces undesired consequences.
> 
> The BIP30 regression hard fork actually enables the various pathological scenarios that you were describing, where no such issues existed in Bitcoin consensus previously. It is now possible to produce a block that mutates another arbitrarily deep block, and forces a reorg all the way back to the mutated block. This was done to save microseconds per block. Despite the improbability of hash collisions, I find this deplorable and the lack of public discussion on the decision concerning.
> 
> With respect to the original post, the point at issue is the introduction of another hard fork, with some odd behaviors, but without any justification apart from tidying up the small amount of necessary code. These issues are related in that they are both consensus forks that have been introduced as supposed optimizations, with no public discussion prior to release (or at least merging to master with the presumption of shipping in the latter case). Two of the three hash collision issues above are also related in that they are bugs introduced by a desire to optimize internals.
> 
> The engineering lesson here should be clear - watch out for developers bearing optimizations. A trade against correctness is not an optimization, it's a break. Satoshi was clearly a fan of the premature optimization. FindAndDelete is a howler. So this is a tradition in Bitcoin. My intent is not to sling mud but to improve the situation.
> 
> It is very possible to produce straightforward and deterministic code that abides consensus and materially outperforms Core, without any of the above optimization breaks, even avoiding the utxo set optimization. Even the tx (memory) and block (orphan) pools are complex store denormalizations implemented as optimizations. Optimizing before producing a clean conceptual model architecture and design is a software development anti-pattern (premature optimization). The proposed fork is a premature optimization. There are much more significant opportunities to better organize code (and improve performance). I cannot support the decision to advance it.
> 
> I was unaware Core had regressed BIP30. Given that the behavior is catastrophic and that it introduces the *only* hash-collision consensus misbehavior (unless we consider a deep reorg sans the otherwise necessary proof of work desirable behavior), I strongly recommend it be reverted, with a post-mortem BIP.
> 
> Finally I recommend people contemplate the difference between unlikely and impossible. The chance of random collision is very small, but not zero. Colliding hashes is extremely difficult, but not impossible. But Bitcoin does not rely on impossibility for correct behavior. It relies of difficulty. This is a subtle but important distinction that people are missing.
> 
> Difficulty is a knowable quantity - a function of computing power.  If hash operations remain difficult, Bitcoin is undeterred. Collisions will have no impact, even if they happen with unexpected frequency (which would still be vanishingly infrequent). If the difficulty of producing a collision is reduced to the point where people cannot rely on addresses (for example), then Bitcoin has a problem, as it has become a leaky ship (and then there's mining). But with the unnecessary problems described above, a single hash collision can be catastrophic. Unlike difficulty, which is known, nobody can know when a single collision will show up. Betting Bitcoin, and potentially the world's money, on the unknowable is poor reasoning, especially given that the cost of not doing so is so very low.
> 
> e
> 
>> On Nov 17, 2016, at 10:08 AM, Johnson Lau <jl2012@xbt•hk> wrote:
>> 
>> The fact that some implementations ban an invalid block hash and some do not, suggests that it’s not a pure p2p protocol issue. A pure p2p split should be unified by a bridge node. However, a bridge node is not helpful in this case. Banning an invalid block hash is an implicit “first seen” consensus rule.
>> 
>> jl2012
>> 
>>> On 18 Nov 2016, at 01:49, Eric Voskuil <eric@voskuil•org> wrote:
>>> 
>>> Actually both possibilities were specifically covered in my description. Sorry if it wasn't clear.
>>> 
>>> If you create a new valid block out of an old one it's has potential to cause a reorg. The blocks that previously built on the original are still able to do so but presumably cannot build forever on the *new* block as it has a different tx. But other new blocks can. There is no chain split due to a different interpretation of valid, there are simply two valid competing chains.
>>> 
>>> Note that this scenario requires not only block and tx validity with a tx hash collision, but also that the tx be valid within the block. Pretty far to reach to not even get a chain split, but it could produce a deep reorg with a very low chance of success. As I keep telling people, deep reorgs can happen, they are just unlikely, as is this scenario.
>>> 
>>> If you create a new invalid block it is discarded by everyone. That does not invalidate the hash of that block. Permanent blocking as you describe it would be a p2p protocol design choice, having nothing to do with consensus. Libbitcoin for example does not ban invalidated hashes at all. It just discards the block and drops the peer.
>>> 
>>> e
>> 
>> 




^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [bitcoin-dev] BIP30 and BIP34 interaction (was Re: [BIP Proposal] Buried Deployments)
  2016-11-18 14:43                           ` Johnson Lau
@ 2016-11-18 16:47                             ` Eric Voskuil
  0 siblings, 0 replies; 19+ messages in thread
From: Eric Voskuil @ 2016-11-18 16:47 UTC (permalink / raw)
  To: Johnson Lau; +Cc: bitcoin-dev

What is the difference between downloading a hash and comparing it to a hash vs downloading a hash and then a block and comparing it to a block?

You are talking about breaking a system in order to make it run faster. Using the hash is an non-optimization trade against correctness.

There is no "first seen" rule, there is only valid and invalid. Even the name exposes the error of this thinking as "first" requires order.

Caching invalidity for DOS protection is fine. It should be quite obvious that the blockchain is nothing more than a coach of validity. If it's faster in some cases to store both validity and all invalidity that you are aware of it is fine, you are trading space for time.

But caching information that is neither validity nor invalidity, and using it to validate blocks is a break.

I cannot emphasize this point enough. A technique that provides varied results based on communication history, such as this "rule", is an attack vector. It allows the attacker to place information into your cache and read it back later from another connection. Even optimizing correct results based on communication history exposes the node in this manner. These sort of attacks have been shown to be very effective at deanonymizing hidden nodes.

The p2p protocol actually makes this sort of attack a matter of communication standard via the sharing of address information, but this can be disabled without impacting correctness. Due to such non-optimizations as the first seen "rule" however, a node becomes a candy store of fingerprinting attack vectors.

Bitcoin provides the mechanism to reject cheaply-produced invalid blocks quickly. This is after all the fundamental principle of hash cash - force the attacker to pay to spam attack. By obtaining headers first a node can obtain proof of work and perform correct and fast validation before ever obtaining the block's transactions. This technique is probably no more time-costly than the incorrect technique of checking a cache of hashes (ironically, a "hash cache" is an incorrect "hash cash"), and avoids the extra space of a secondary cache (the blockchain is the primary cache). It also avoids the varied time response that a secondary cache creates.

So once again, premature optimization erupts from the underlying design flaw, and creates more problems than proper design. The p2p network standard didn't have headers first at one point, making correct checks more costly. That is no longer the case. But nevertheless, one cannot trade correctness for time.

The tx pool, like the orphan pool, as I mentioned previously, is an optimization. It is not a part of consensus, so it isn't relevant to a discussion about forks. It is also a design flaw that nodes are expected to hold invalid transactions. It exposes nodes to both DOS and fingerprinting attacks. Proper tx handling implies that a tx connect to a valid block. There is no "header" for a transaction so correctness requires that the tx be downloaded before it can be validated.

e

> On Nov 18, 2016, at 8:43 AM, Johnson Lau <jl2012@xbt•hk> wrote:
> 
> In this case I don’t understand how your implementation won’t be DoS-ed. An attacker could keep sending you inv for the same block / transaction. Since you don’t assume the hash is unique, each time you have to download the block/tx again before you could tell if that is the same one you have already known. Otherwise, you are implementing the “first seen” rule.
> 
> Also, you can’t ban a peer just because you get an invalid tx from him, because he might be referring to a hash-colliding UTXO that you don’t know. In that case you need to request for the parent tx to verify. I wonder if you are really doing that.
> 
>> On 18 Nov 2016, at 11:20, Eric Voskuil <eric@voskuil•org> wrote:
>> 
>> You are suggesting that, since a node implements a denial of service policy that actually denies itself otherwise valid blocks, those blocks are conditionally invalid. And that, since the validity condition is based on order of arrival and therefore independently unverifiable, Bitcoin consensus is broken in the face of a hash collision.
>> 
>> I am aware of two other hash collision scenarios that cause Core to declare blocks invalid based on ordering. The block hash duplicate check (it's not fork-point relative) and signature verification caching. Like the "block banning" issue above, the latter is related to an internal optimization. I would categorize the former as a simple oversight that presumably goes way back.
>> 
>> What then is the consequence of validity that is unverifiable? You believe this means that Bitcoin consensus is broken. This is incorrect. First understand that it is not possible for consensus rules to invalidate blocks based on order of arrival. As such any *implementation* that invalidates blocks based on order of arrival is broken. It is an error to claim that these behaviors are part of consensus, despite being implemented in the satoshi node(s).
>> 
>> Validity must be verifiable independent of the state of other nodes. Consensus is a function of block history and time alone. Time is presumed to be universally consistent. To be a consensus rule all nodes must be able to independently reach the same validity conclusion, given the same set of blocks, independent of order. If this is not the case the behavior is not a consensus rule, it is simply a bug. 
>> 
>> Deviating from such bugs is not a break with consensus, since such non-rules cannot be part of consensus. One node implementation can behave deterministically while others are behaving non-deterministically, with the two nodes remaining consistent from a consensus standpoint (deterministic produces a subset of non-deterministic results). But, unlike arbitrary nodes, deterministic nodes will not cause disruption on the network.
>> 
>> You imply that these determinism bugs are necessary, that there is no fix. This is also incorrect.
>> 
>> The block banning hash collision bug is avoided by not using non-chain/clock state to determine validity. Doing otherwise is clearly a bug. The hash of a block is not the block itself, a logically-correct ban would be to compare the wire serialization of the block as opposed to the hash, or not maintain the feature at all.
>> 
>> The signature verification caching hash collision bug is the same problem, an optimization based on an invalid assumption. A full serialization comparison (true identity), or elimination of the feature resolves the  bug.
>> 
>> The block hash check collision bug is trivially resolved by checking at the fork point as opposed to the tip. This prevents arbitrary (and irrational) invalidity based on conflict with irrelevant blocks that may or may not exist above the fork point.
>> 
>> Libbitcoin is deterministic in all three cases (although the third issue is not made consistent until v3). I am not aware of any other non-determinism in Core, but I don't spend a lot of time there. There is no need to study other implementations to ensure determinism, as that can be verified independently.
>> 
>> Any situation in which a node cannot provide deterministic validation of unordered blocks constitutes a non-consensus bug, as the behavior is not consistently verifiable by others under any conditions. Fixing/preventing these bugs is responsible development behavior, and does not require forks or BIPs, since Bitcoin doesn't inherently contain any such bugs. They are the consequence of incorrect implementation, and in two of the three cases above have resulted from supposed optimizations. But any code that creates non-determinism in exchange for speed, etc. is not an optimization, it's a bug. A node must implement its optimizations in a manner that does not alter consensus.
>> 
>> The BIP30 regression hard fork is not a case of non-determinism. This will produce deterministic results (apart from the impact of unrelated bugs). However the results are both a clear break from previous (and documented) consensus but also produce a very undesirable outcome - destruction of all unspent outputs in the "replaced" transaction for starters. So this is a distinct category, not a determinism bug but a hard fork that produces undesired consequences.
>> 
>> The BIP30 regression hard fork actually enables the various pathological scenarios that you were describing, where no such issues existed in Bitcoin consensus previously. It is now possible to produce a block that mutates another arbitrarily deep block, and forces a reorg all the way back to the mutated block. This was done to save microseconds per block. Despite the improbability of hash collisions, I find this deplorable and the lack of public discussion on the decision concerning.
>> 
>> With respect to the original post, the point at issue is the introduction of another hard fork, with some odd behaviors, but without any justification apart from tidying up the small amount of necessary code. These issues are related in that they are both consensus forks that have been introduced as supposed optimizations, with no public discussion prior to release (or at least merging to master with the presumption of shipping in the latter case). Two of the three hash collision issues above are also related in that they are bugs introduced by a desire to optimize internals.
>> 
>> The engineering lesson here should be clear - watch out for developers bearing optimizations. A trade against correctness is not an optimization, it's a break. Satoshi was clearly a fan of the premature optimization. FindAndDelete is a howler. So this is a tradition in Bitcoin. My intent is not to sling mud but to improve the situation.
>> 
>> It is very possible to produce straightforward and deterministic code that abides consensus and materially outperforms Core, without any of the above optimization breaks, even avoiding the utxo set optimization. Even the tx (memory) and block (orphan) pools are complex store denormalizations implemented as optimizations. Optimizing before producing a clean conceptual model architecture and design is a software development anti-pattern (premature optimization). The proposed fork is a premature optimization. There are much more significant opportunities to better organize code (and improve performance). I cannot support the decision to advance it.
>> 
>> I was unaware Core had regressed BIP30. Given that the behavior is catastrophic and that it introduces the *only* hash-collision consensus misbehavior (unless we consider a deep reorg sans the otherwise necessary proof of work desirable behavior), I strongly recommend it be reverted, with a post-mortem BIP.
>> 
>> Finally I recommend people contemplate the difference between unlikely and impossible. The chance of random collision is very small, but not zero. Colliding hashes is extremely difficult, but not impossible. But Bitcoin does not rely on impossibility for correct behavior. It relies of difficulty. This is a subtle but important distinction that people are missing.
>> 
>> Difficulty is a knowable quantity - a function of computing power.  If hash operations remain difficult, Bitcoin is undeterred. Collisions will have no impact, even if they happen with unexpected frequency (which would still be vanishingly infrequent). If the difficulty of producing a collision is reduced to the point where people cannot rely on addresses (for example), then Bitcoin has a problem, as it has become a leaky ship (and then there's mining). But with the unnecessary problems described above, a single hash collision can be catastrophic. Unlike difficulty, which is known, nobody can know when a single collision will show up. Betting Bitcoin, and potentially the world's money, on the unknowable is poor reasoning, especially given that the cost of not doing so is so very low.
>> 
>> e
>> 
>>> On Nov 17, 2016, at 10:08 AM, Johnson Lau <jl2012@xbt•hk> wrote:
>>> 
>>> The fact that some implementations ban an invalid block hash and some do not, suggests that it’s not a pure p2p protocol issue. A pure p2p split should be unified by a bridge node. However, a bridge node is not helpful in this case. Banning an invalid block hash is an implicit “first seen” consensus rule.
>>> 
>>> jl2012
>>> 
>>>> On 18 Nov 2016, at 01:49, Eric Voskuil <eric@voskuil•org> wrote:
>>>> 
>>>> Actually both possibilities were specifically covered in my description. Sorry if it wasn't clear.
>>>> 
>>>> If you create a new valid block out of an old one it's has potential to cause a reorg. The blocks that previously built on the original are still able to do so but presumably cannot build forever on the *new* block as it has a different tx. But other new blocks can. There is no chain split due to a different interpretation of valid, there are simply two valid competing chains.
>>>> 
>>>> Note that this scenario requires not only block and tx validity with a tx hash collision, but also that the tx be valid within the block. Pretty far to reach to not even get a chain split, but it could produce a deep reorg with a very low chance of success. As I keep telling people, deep reorgs can happen, they are just unlikely, as is this scenario.
>>>> 
>>>> If you create a new invalid block it is discarded by everyone. That does not invalidate the hash of that block. Permanent blocking as you describe it would be a p2p protocol design choice, having nothing to do with consensus. Libbitcoin for example does not ban invalidated hashes at all. It just discards the block and drops the peer.
>>>> 
>>>> e
>>> 
>>> 
> 
> 


^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2016-11-18 16:47 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-17  0:06 [bitcoin-dev] BIP30 and BIP34 interaction (was Re: [BIP Proposal] Buried Deployments) Jorge Timón
2016-11-17  0:10 ` Eric Voskuil
2016-11-17  0:31   ` Tier Nolan
2016-11-17  0:43     ` Eric Voskuil
2016-11-17  0:53       ` Eric Voskuil
2016-11-17  8:44       ` Peter Todd
2016-11-17  9:58         ` Eric Voskuil
2016-11-17 10:22       ` Tier Nolan
2016-11-17 11:22         ` Eric Voskuil
2016-11-17 11:38           ` Alex Morcos
2016-11-17 12:22             ` Eric Voskuil
2016-11-17 15:40               ` Johnson Lau
2016-11-17 17:01                 ` Eric Voskuil
2016-11-17 17:22                   ` Johnson Lau
2016-11-17 17:49                     ` Eric Voskuil
2016-11-17 18:08                       ` Johnson Lau
2016-11-18  3:20                         ` Eric Voskuil
2016-11-18 14:43                           ` Johnson Lau
2016-11-18 16:47                             ` Eric Voskuil

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox