[bitcoindev] Great Consensus Cleanup Revival

public inbox for bitcoindev@googlegroups.com
 help / color / mirror / Atom feed

* [bitcoindev] Great Consensus Cleanup Revival
@ 2024-03-24 18:10 'Antoine Poinsot' via Bitcoin Development Mailing List
  2024-03-26 19:11 ` [bitcoindev] " Antoine Riard
  2024-06-17 22:15 ` Eric Voskuil
  0 siblings, 2 replies; 33+ messages in thread
From: 'Antoine Poinsot' via Bitcoin Development Mailing List @ 2024-03-24 18:10 UTC (permalink / raw)
  To: bitcoindev

Hey all,

I've recently posted about the Great Consensus Cleanup there: https://delvingbitcoin.org/t/great-consensus-cleanup-revival/710.

I'm starting a thread on the mailing list as well to get comments and opinions from people who are not on Delving.

TL;DR:
- i think the worst block validation time is concerning. The mitigations proposed by Matt are effective, but i think we should also limit the maximum size of legacy transactions for an additional safety margin;
- i believe it's more important to fix the timewarp bug than people usually think;
- it would be nice to include a fix to make coinbase transactions unique once and for all, to avoid having to resort back to doing BIP30 validation after block 1,983,702;
- 64 bytes transactions should definitely be made invalid, but i don't think there is a strong case for making less than 64 bytes transactions invalid.

Anything in there that people disagree with conceptually?
Anything in there that people think shouldn't (or don't need to) be fixed?
Anything in there which can be improved (a simpler, or better fix)?
Anything NOT in there that people think should be fixed?


Antoine Poinsot

-- 
You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups•com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/gnM89sIQ7MhDgI62JciQEGy63DassEv7YZAMhj0IEuIo0EdnafykF6RH4OqjTTHIHsIoZvC2MnTUzJI7EfET4o-UQoD-XAQRDcct994VarE%3D%40protonmail.com.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [bitcoindev] Re: Great Consensus Cleanup Revival
  2024-03-24 18:10 [bitcoindev] Great Consensus Cleanup Revival 'Antoine Poinsot' via Bitcoin Development Mailing List
@ 2024-03-26 19:11 ` Antoine Riard
  2024-03-27 10:35   ` 'Antoine Poinsot' via Bitcoin Development Mailing List
  2024-06-17 22:15 ` Eric Voskuil
  1 sibling, 1 reply; 33+ messages in thread
From: Antoine Riard @ 2024-03-26 19:11 UTC (permalink / raw)
  To: Bitcoin Development Mailing List

[-- Attachment #1.1: Type: text/plain, Size: 4549 bytes --]

Hi Poinsot,

I think fixing the timewarp attack is a good idea, especially w.r.t safety 
implications of long-term timelocks usage.

The only beneficial case I can remember about the timewarp issue is 
"forwarding blocks" by maaku for on-chain scaling:
http://freico.in/forward-blocks-scalingbitcoin-paper.pdf

Shall we as a community completely disregard this approach for on-chain 
settlement throughput scaling ?
Personally, I think you can still design extension-block / side-chains like 
protocols by using other today available
Bitcoin Script mechanisms and get roughly (?) the same security / 
scalability trade-offs. Shall be fine to me to fix timewarp.

Worst-block validation time is concerning. I bet you can do worst than your 
examples if you're playing with other vectors like
low-level ECC tricks and micro-architectural layout of modern processors.

Consensus invalidation of old legacy scripts was quite controversial last 
time a consensus cleanup was proposed:
https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2019-March/016714.html

Only making scripts invalid after a given block height (let's say the 
consensus cleanup activation height) is obviously a
way to solve the concern and any remaining sleeping DoSy unspent coins can 
be handled with  newly crafted and dedicated
transaction-relay rules (e.g at max 1000 DoSy coins can be spent per block 
for a given IBT span).

I think any consensus boundaries on the minimal transaction size would need 
to be done carefully and have all lightweight
clients update their own transaction acceptance logic to enforce the check 
to avoid years-long transitory massive double-spend
due to software incoordination. I doubt `MIN_STANDARD_TX_NON_WITNESS_SIZE` 
is implemented correctly by all transaction-relay
backends and it's a mess in this area. Quid if we have  < 64 bytes 
transaction where the only witness is enforced to be a minimal 1-byte
as witness elements are only used for higher layers protocols semantics  ? 
Shall get its own "only-after-height-X" exemption, I think.

Making coinbase unique by requesting the block height to be enforced in 
nLocktime, sounds more robust to take a monotonic counter
in the past in case of accidental or provoked shallow reorgs. I can see of 
you would have to re-compute a block template, loss a round-trip
compare to your mining competitors. Better if it doesn't introduce a new 
DoS vector at mining job distribution and control.

Beyond, I don't deny other mentioned issues (e.g UTXO entries growth limit) 
could be source of denial-of-service but a) I think it's hard
to tell if they're economically neutral on modern Bitcoin use-cases and 
their plausible evolvability and b) it's already a lot of careful consensus
code to get right :)

Best,
Antoine

Le dimanche 24 mars 2024 à 19:06:57 UTC, Antoine Poinsot a écrit :

> Hey all,
>
> I've recently posted about the Great Consensus Cleanup there: 
> https://delvingbitcoin.org/t/great-consensus-cleanup-revival/710.
>
> I'm starting a thread on the mailing list as well to get comments and 
> opinions from people who are not on Delving.
>
> TL;DR:
> - i think the worst block validation time is concerning. The mitigations 
> proposed by Matt are effective, but i think we should also limit the 
> maximum size of legacy transactions for an additional safety margin;
> - i believe it's more important to fix the timewarp bug than people 
> usually think;
> - it would be nice to include a fix to make coinbase transactions unique 
> once and for all, to avoid having to resort back to doing BIP30 validation 
> after block 1,983,702;
> - 64 bytes transactions should definitely be made invalid, but i don't 
> think there is a strong case for making less than 64 bytes transactions 
> invalid.
>
> Anything in there that people disagree with conceptually?
> Anything in there that people think shouldn't (or don't need to) be fixed?
> Anything in there which can be improved (a simpler, or better fix)?
> Anything NOT in there that people think should be fixed?
>
>
> Antoine Poinsot
>

-- 
You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups•com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/dc2cc46f-e697-4b14-91b3-34cf11de29a3n%40googlegroups.com.

[-- Attachment #1.2: Type: text/html, Size: 5738 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [bitcoindev] Re: Great Consensus Cleanup Revival
  2024-03-26 19:11 ` [bitcoindev] " Antoine Riard
@ 2024-03-27 10:35   ` 'Antoine Poinsot' via Bitcoin Development Mailing List
  2024-03-27 18:57     ` Antoine Riard
                       ` (2 more replies)
  0 siblings, 3 replies; 33+ messages in thread
From: 'Antoine Poinsot' via Bitcoin Development Mailing List @ 2024-03-27 10:35 UTC (permalink / raw)
  To: Antoine Riard; +Cc: Bitcoin Development Mailing List

[-- Attachment #1: Type: text/plain, Size: 5840 bytes --]

> Hi Poinsot,

Hi Riard,

> The only beneficial case I can remember about the timewarp issue is "forwarding blocks" by maaku for on-chain scaling:
> http://freico.in/forward-blocks-scalingbitcoin-paper.pdf

I would not qualify this hack of "beneficial". Besides the centralization pressure of an increased block frequency, leveraging the timewarp to achieve it would put the network constantly on the Brink of being seriously (fatally?) harmed. And this sets pernicious incentives too. Every individual user has a short-term incentive to get lower fees by the increased block space, at the expense of all users longer term. And every individual miner has an incentive to get more block reward at the expense of future miners. (And of course bigger miners benefit from an increased block frequency.)

> I think any consensus boundaries on the minimal transaction size would need to be done carefully and have all lightweight
> clients update their own transaction acceptance logic to enforce the check to avoid years-long transitory massive double-spend
> due to software incoordination.

Note in my writeup i suggest we do not introduce a minimum transaction, but we instead only make 64 bytes transactions invalid. See https://delvingbitcoin.org/t/great-consensus-cleanup-revival/710#can-we-come-up-with-a-better-fix-10:

> However the BIP proposes to also make less-than-64-bytes transactions invalid. Although they are of no (or little) use, such transactions are not harmful. I believe considering a type of transaction useless is not sufficient motivation for making them invalid through a soft fork.
>
> Making (exactly) 64 bytes long transactions invalid is also what AJ implemented in [his pull request to Bitcoin-inquisition](https://github.com/bitcoin-inquisition/bitcoin/pull/24).

> I doubt `MIN_STANDARD_TX_NON_WITNESS_SIZE` is implemented correctly by all transaction-relay backends and it's a mess in this area.

What type of backend are you referring to here? Bitcoin full nodes reimplementations? These transactions have been non-standard in Bitcoin Core for the past 6 years (commit 7485488e907e236133a016ba7064c89bf9ab6da3).

> Quid if we have < 64 bytes transaction where the only witness is enforced to be a minimal 1-byte
> as witness elements are only used for higher layers protocols semantics ?

This restriction is on the size of the transaction serialized without witness. So this particular instance would not be affected and whatever the witness is isn't relevant.

> Making coinbase unique by requesting the block height to be enforced in nLocktime, sounds more robust to take a monotonic counter
> in the past in case of accidental or provoked shallow reorgs. I can see of you would have to re-compute a block template, loss a round-trip
> compare to your mining competitors. Better if it doesn't introduce a new DoS vector at mining job distribution and control.

Could you clarify? Are you suggesting something else than to set the nLockTime in the coinbase transaction to the height of the block? If so, what exactly are you referring to by "monotonic counter in the past"?

At any rate in my writeup i suggested making the coinbase commitment mandatory (even when empty) instead for compatibility reasons.

That said, since we could make this rule kick in in 25 years from now, we might want to just do the Obvious Thing and just require the height in nLockTime.

> and b) it's already a lot of careful consensus
> code to get right :)

Definitely. I just want to make sure we are not missing anything important if a soft fork gets proposed along these lines in the future.

> Best,
> Antoine
>
> Le dimanche 24 mars 2024 à 19:06:57 UTC, Antoine Poinsot a écrit :
>
>> Hey all,
>>
>> I've recently posted about the Great Consensus Cleanup there: https://delvingbitcoin.org/t/great-consensus-cleanup-revival/710.
>>
>> I'm starting a thread on the mailing list as well to get comments and opinions from people who are not on Delving.
>>
>> TL;DR:
>> - i think the worst block validation time is concerning. The mitigations proposed by Matt are effective, but i think we should also limit the maximum size of legacy transactions for an additional safety margin;
>> - i believe it's more important to fix the timewarp bug than people usually think;
>> - it would be nice to include a fix to make coinbase transactions unique once and for all, to avoid having to resort back to doing BIP30 validation after block 1,983,702;
>> - 64 bytes transactions should definitely be made invalid, but i don't think there is a strong case for making less than 64 bytes transactions invalid.
>>
>> Anything in there that people disagree with conceptually?
>> Anything in there that people think shouldn't (or don't need to) be fixed?
>> Anything in there which can be improved (a simpler, or better fix)?
>> Anything NOT in there that people think should be fixed?
>>
>> Antoine Poinsot
>
> --
> You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups•com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/dc2cc46f-e697-4b14-91b3-34cf11de29a3n%40googlegroups.com.

-- 
You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups•com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/1KbVdD952_XRfsKzMKaX-y4lrPOxYiknn8xXOMDQGt2Qz2fHFM-KoSplL-A_GRE1yuUkgNMeoEBHZiEDlMYwiqOiITFQTKEm5u1p1oVlL9I%3D%40protonmail.com.

[-- Attachment #2: Type: text/html, Size: 11943 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [bitcoindev] Re: Great Consensus Cleanup Revival
  2024-03-27 10:35   ` 'Antoine Poinsot' via Bitcoin Development Mailing List
@ 2024-03-27 18:57     ` Antoine Riard
  2024-04-18  0:46     ` Mark F
  2024-07-20 21:39     ` Murad Ali
  2 siblings, 0 replies; 33+ messages in thread
From: Antoine Riard @ 2024-03-27 18:57 UTC (permalink / raw)
  To: Antoine Poinsot; +Cc: Bitcoin Development Mailing List

[-- Attachment #1: Type: text/plain, Size: 8001 bytes --]

Hi Darosior,

> I would not qualify this hack of "beneficial". Besides the centralization
pressure of an increased block frequency, leveraging the timewarp to
achieve it would put the network constantly on the Brink of being seriously
(fatally?) harmed. And this sets pernicious incentives too. Every
individual user has a short-term incentive to get lower fees by the
increased block space, at the expense of all users longer term. And every
individual miner has an incentive to get more block reward at the expense
of future miners. (And of course bigger miners benefit from an increased
block frequency.)

I'm not saying the hack is beneficial either. The "forward block" paper is
just good to provide more context around timewarp.

> Note in my writeup i suggest we do not introduce a minimum transaction,
but we instead only make 64 bytes transactions invalid

I think it's easier for the sake of analysis.
See this mailing list issue for 60-byte example transaction use-case:
https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2020-May/017883.html
Only I'm aware of to the best of my knowledge.

> What type of backend are you referring to here?

I can't find where `MIN_STANDARD_TX_NON_WITNESS_SIZE` is checked in btcd's
`maybeAcceptTransaction()`.

> This restriction is on the size of the transaction serialized without
witness.

Oky.

> Could you clarify? Are you suggesting something else than to set the
nLockTime in the coinbase transaction to the height of the block? If so,
what exactly are you referring to by "monotonic counter in the past"?

Thinking more, I believe it's okay to use the nLocktime in the coinbase
transaction, as the wtxid of the coinbase is assumed to be 0x00.
To be checked if it doesn't break anything w.rt Stratum V2 / mining job
distribution.

Best,
Antoine












Le mer. 27 mars 2024 à 10:36, Antoine Poinsot <darosior@protonmail•com> a
écrit :

>
> Hi Poinsot,
>
>
> Hi Riard,
>
>
> The only beneficial case I can remember about the timewarp issue is
> "forwarding blocks" by maaku for on-chain scaling:
> http://freico.in/forward-blocks-scalingbitcoin-paper.pdf
>
>
> I would not qualify this hack of "beneficial". Besides the centralization
> pressure of an increased block frequency, leveraging the timewarp to
> achieve it would put the network constantly on the Brink of being seriously
> (fatally?) harmed. And this sets pernicious incentives too. Every
> individual user has a short-term incentive to get lower fees by the
> increased block space, at the expense of all users longer term. And every
> individual miner has an incentive to get more block reward at the expense
> of future miners. (And of course bigger miners benefit from an increased
> block frequency.)
>
>
> I think any consensus boundaries on the minimal transaction size would
> need to be done carefully and have all lightweight
> clients update their own transaction acceptance logic to enforce the check
> to avoid years-long transitory massive double-spend
> due to software incoordination.
>
>
> Note in my writeup i suggest we do not introduce a minimum transaction,
> but we instead only make 64 bytes transactions invalid. See
> https://delvingbitcoin.org/t/great-consensus-cleanup-revival/710#can-we-come-up-with-a-better-fix-10
> :
>
> However the BIP proposes to also make less-than-64-bytes transactions
> invalid. Although they are of no (or little) use, such transactions are not
> harmful. I believe considering a type of transaction useless is not
> sufficient motivation for making them invalid through a soft fork.
>
> Making (exactly) 64 bytes long transactions invalid is also what AJ
> implemented in his pull request to Bitcoin-inquisition
> <https://github.com/bitcoin-inquisition/bitcoin/pull/24>.
>
>
>
> I doubt `MIN_STANDARD_TX_NON_WITNESS_SIZE` is implemented correctly by all
> transaction-relay backends and it's a mess in this area.
>
>
> What type of backend are you referring to here? Bitcoin full nodes
> reimplementations? These transactions have been non-standard in Bitcoin
> Core for the past 6 years (commit 7485488e907e236133a016ba7064c89bf9ab6da3
> ).
>
>
> Quid if we have < 64 bytes transaction where the only witness is enforced
> to be a minimal 1-byte
> as witness elements are only used for higher layers protocols semantics ?
>
>
> This restriction is on the size of the transaction serialized without
> witness. So this particular instance would not be affected and whatever the
> witness is isn't relevant.
>
>
> Making coinbase unique by requesting the block height to be enforced in
> nLocktime, sounds more robust to take a monotonic counter
> in the past in case of accidental or provoked shallow reorgs. I can see of
> you would have to re-compute a block template, loss a round-trip
> compare to your mining competitors. Better if it doesn't introduce a new
> DoS vector at mining job distribution and control.
>
>
> Could you clarify? Are you suggesting something else than to set the
> nLockTime in the coinbase transaction to the height of the block? If so,
> what exactly are you referring to by "monotonic counter in the past"?
>
> At any rate in my writeup i suggested making the coinbase commitment
> mandatory (even when empty) instead for compatibility reasons.
>
> That said, since we could make this rule kick in in 25 years from now, we
> might want to just do the Obvious Thing and just require the height in
> nLockTime.
>
>
>  and b) it's already a lot of careful consensus
> code to get right :)
>
>
> Definitely. I just want to make sure we are not missing anything important
> if a soft fork gets proposed along these lines in the future.
>
>
> Best,
> Antoine
>
> Le dimanche 24 mars 2024 à 19:06:57 UTC, Antoine Poinsot a écrit :
>
>> Hey all,
>>
>> I've recently posted about the Great Consensus Cleanup there:
>> https://delvingbitcoin.org/t/great-consensus-cleanup-revival/710.
>>
>> I'm starting a thread on the mailing list as well to get comments and
>> opinions from people who are not on Delving.
>>
>> TL;DR:
>> - i think the worst block validation time is concerning. The mitigations
>> proposed by Matt are effective, but i think we should also limit the
>> maximum size of legacy transactions for an additional safety margin;
>> - i believe it's more important to fix the timewarp bug than people
>> usually think;
>> - it would be nice to include a fix to make coinbase transactions unique
>> once and for all, to avoid having to resort back to doing BIP30 validation
>> after block 1,983,702;
>> - 64 bytes transactions should definitely be made invalid, but i don't
>> think there is a strong case for making less than 64 bytes transactions
>> invalid.
>>
>> Anything in there that people disagree with conceptually?
>> Anything in there that people think shouldn't (or don't need to) be
>> fixed?
>> Anything in there which can be improved (a simpler, or better fix)?
>> Anything NOT in there that people think should be fixed?
>>
>>
>> Antoine Poinsot
>>
> --
> You received this message because you are subscribed to the Google Groups
> "Bitcoin Development Mailing List" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to bitcoindev+unsubscribe@googlegroups•com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/bitcoindev/dc2cc46f-e697-4b14-91b3-34cf11de29a3n%40googlegroups.com
> .
>
>
>

-- 
You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups•com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/CALZpt%2BGeEonE08V6tBoY0gsc1hj6r3y1yTUri_nCJ-%3DLyq6jLA%40mail.gmail.com.

[-- Attachment #2: Type: text/html, Size: 16860 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [bitcoindev] Re: Great Consensus Cleanup Revival
  2024-03-27 10:35   ` 'Antoine Poinsot' via Bitcoin Development Mailing List
  2024-03-27 18:57     ` Antoine Riard
@ 2024-04-18  0:46     ` Mark F
  2024-04-18 10:04       ` 'Antoine Poinsot' via Bitcoin Development Mailing List
  2024-07-20 21:39     ` Murad Ali
  2 siblings, 1 reply; 33+ messages in thread
From: Mark F @ 2024-04-18  0:46 UTC (permalink / raw)
  To: Bitcoin Development Mailing List

[-- Attachment #1.1: Type: text/plain, Size: 3245 bytes --]

On Wednesday, March 27, 2024 at 4:00:34 AM UTC-7 Antoine Poinsot wrote:

The only beneficial case I can remember about the timewarp issue is 
"forwarding blocks" by maaku for on-chain scaling:
http://freico.in/forward-blocks-scalingbitcoin-paper.pdf

I would not qualify this hack of "beneficial". Besides the centralization 
pressure of an increased block frequency, leveraging the timewarp to 
achieve it would put the network constantly on the Brink of being seriously 
(fatally?) harmed. And this sets pernicious incentives too. Every 
individual user has a short-term incentive to get lower fees by the 
increased block space, at the expense of all users longer term. And every 
individual miner has an incentive to get more block reward at the expense 
of future miners. (And of course bigger miners benefit from an increased 
block frequency.)

Every single concern mentioned here is addressed prominently in the 
paper/presentation for Forward Blocks:

* Increased block frequency is only on the compatibility chain, where the 
content of blocks is deterministic anyway. There is no centralization 
pressure from the frequency of blocks on the compatibility chain, as the 
content of the blocks is not miner-editable in economically meaningful 
ways. Only the block frequency of the forward block chain matters, and here 
the block frequency is actually *reduced*, thereby decreasing 
centralization pressure.

* The elastic block size adjustment mechanism proposed in the paper is 
purposefully constructed so that users or miners wanting to increase the 
block size beyond what is currently provided for will have to pay 
significantly (multiple orders of magnitude) more than they could possibly 
acquire from larger blocks, and the block size would re-adjust downward 
shortly after the cessation of that artificial fee pressure.

* Increased block frequency of compatibility blocks has no effect on the 
total issuance, so miners are not rewarded by faster blocks.

You are free to criticize Forward Blocks, but please do so by actually 
addressing the content of the proposal. Let's please hold a standard of 
intellectual excellence on this mailing list in which ideas are debated 
based on content-level arguments rather than repeating inaccurate takes 
from Reddit/Twitter.

To the topic of the thread, disabling time-warp will close off an unlikely 
and difficult to pull off subsidy draining attack that to activate would 
necessarily require weeks of forewarning and could be easily countered in 
other ways, with the tradeoff of removing the only known mechanism for 
upgrading the bitcoin protocol to larger effective block sizes while 
staying 100% compatible with un-upgraded nodes (all nodes see all 
transactions).

I think we should keep our options open.

-Mark

-- 
You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups•com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/62640263-077c-4ac7-98a6-d9c17913fca0n%40googlegroups.com.

[-- Attachment #1.2: Type: text/html, Size: 4245 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [bitcoindev] Re: Great Consensus Cleanup Revival
  2024-04-18  0:46     ` Mark F
@ 2024-04-18 10:04       ` 'Antoine Poinsot' via Bitcoin Development Mailing List
  2024-04-25  6:08         ` Antoine Riard
  0 siblings, 1 reply; 33+ messages in thread
From: 'Antoine Poinsot' via Bitcoin Development Mailing List @ 2024-04-18 10:04 UTC (permalink / raw)
  To: Mark F; +Cc: Bitcoin Development Mailing List

[-- Attachment #1: Type: text/plain, Size: 6057 bytes --]

> You are free to criticize Forward Blocks, but please do so by actually addressing the content of the proposal. Let's please hold a standard of intellectual excellence on this mailing list in which ideas are debated based on content-level arguments rather than repeating inaccurate takes from Reddit/Twitter.

You are the one being dishonest here. Look, i understand you came up with a fun hack exploiting bugs in Bitcoin and you are biased against fixing them. Yet, the cost of not fixing timewarp objectively far exceeds the cost of making "forward blocks" impossible.

As already addressed in the DelvingBitcoin post:

- The timewarp bug significantly changes the 51% attacker threat model. Without exploiting it a censoring miner needs to continuously keep more hashrate than the rest of the network combined for as long as he wants to prevent some people from using Bitcoin. By exploiting timewarp the attacker can prevent everybody from using Bitcoin within 40 days.
- The timewarp bug allows an attacking miner to force on full nodes more block data than they agreed to. This is actually the attack leveraged by your proposal. I believe this variant of the attack is more likely to happen, simply for the reason that all participants of the system have a short term incentive to exploit this (yay lower fees! yay more block subsidy!), at the expense of the long term health of the system. As the block subsidy exponentially decreases miners are likely to start playing more games and that's a particularly attractive one. Given the level of mining centralization we are witnessing [0] i believe this is particularly worrisome.
- I'm very skeptical of arguments about how "we" can stop an attack which requires "weeks of forewarning". Who's we? How do we proceed, all Bitcoin users coordinate and arbitrarily decide of the validity of a block? A few weeks is very little time if this is at all achievable. If you add on top of that the political implications of the previous point it gets particularly messy.

I've got better things to do than to play "you are being dishonest! -no it's you -no you" games. So unless you bring something new to the table this will be my last reply to your accusations.

Antoine

[0] https://x.com/0xB10C/status/1780611768081121700
On Thursday, April 18th, 2024 at 2:46 AM, Mark F <mark@friedenbach•org> wrote:

> On Wednesday, March 27, 2024 at 4:00:34 AM UTC-7 Antoine Poinsot wrote:
>
>>> The only beneficial case I can remember about the timewarp issue is "forwarding blocks" by maaku for on-chain scaling:
>>> http://freico.in/forward-blocks-scalingbitcoin-paper.pdf
>>
>> I would not qualify this hack of "beneficial". Besides the centralization pressure of an increased block frequency, leveraging the timewarp to achieve it would put the network constantly on the Brink of being seriously (fatally?) harmed. And this sets pernicious incentives too. Every individual user has a short-term incentive to get lower fees by the increased block space, at the expense of all users longer term. And every individual miner has an incentive to get more block reward at the expense of future miners. (And of course bigger miners benefit from an increased block frequency.)
>
> Every single concern mentioned here is addressed prominently in the paper/presentation for Forward Blocks:
>
> * Increased block frequency is only on the compatibility chain, where the content of blocks is deterministic anyway. There is no centralization pressure from the frequency of blocks on the compatibility chain, as the content of the blocks is not miner-editable in economically meaningful ways. Only the block frequency of the forward block chain matters, and here the block frequency is actually *reduced*, thereby decreasing centralization pressure.
>
> * The elastic block size adjustment mechanism proposed in the paper is purposefully constructed so that users or miners wanting to increase the block size beyond what is currently provided for will have to pay significantly (multiple orders of magnitude) more than they could possibly acquire from larger blocks, and the block size would re-adjust downward shortly after the cessation of that artificial fee pressure.
>
> * Increased block frequency of compatibility blocks has no effect on the total issuance, so miners are not rewarded by faster blocks.
>
> You are free to criticize Forward Blocks, but please do so by actually addressing the content of the proposal. Let's please hold a standard of intellectual excellence on this mailing list in which ideas are debated based on content-level arguments rather than repeating inaccurate takes from Reddit/Twitter.
>
> To the topic of the thread, disabling time-warp will close off an unlikely and difficult to pull off subsidy draining attack that to activate would necessarily require weeks of forewarning and could be easily countered in other ways, with the tradeoff of removing the only known mechanism for upgrading the bitcoin protocol to larger effective block sizes while staying 100% compatible with un-upgraded nodes (all nodes see all transactions).
>
> I think we should keep our options open.
>
> -Mark
>
> --
> You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups•com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/62640263-077c-4ac7-98a6-d9c17913fca0n%40googlegroups.com.

-- 
You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups•com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/8fFFuAU-SN2NrQ2SKhS2eOeLkHIdCQtnivE4LzWe32vk5gejNEwNvr9IIa3JJ-sII2UUIpOx8oRMslzmA1ZL6y1kBuQEB1fpTaXku2QGAC0%3D%40protonmail.com.

[-- Attachment #2: Type: text/html, Size: 8615 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [bitcoindev] Re: Great Consensus Cleanup Revival
  2024-04-18 10:04       ` 'Antoine Poinsot' via Bitcoin Development Mailing List
@ 2024-04-25  6:08         ` Antoine Riard
  2024-04-30 22:20           ` Mark F
  0 siblings, 1 reply; 33+ messages in thread
From: Antoine Riard @ 2024-04-25  6:08 UTC (permalink / raw)
  To: Bitcoin Development Mailing List


[-- Attachment #1.1: Type: text/plain, Size: 10647 bytes --]

Hi Maaku,

> Every single concern mentioned here is addressed prominently in the 
paper/presentation for Forward Blocks:
>
> * Increased block frequency is only on the compatibility chain, where the 
content of blocks is deterministic anyway. There is no centralization 
pressure from the frequency > of blocks on the compatibility chain, as the 
content of the blocks is not miner-editable in economically meaningful 
ways. Only the block frequency of the forward block > chain matters, and 
here the block frequency is actually *reduced*, thereby decreasing 
centralization pressure.
>
> * The elastic block size adjustment mechanism proposed in the paper is 
purposefully constructed so that users or miners wanting to increase the 
block size beyond what > is currently provided for will have to pay 
significantly (multiple orders of magnitude) more than they could possibly 
acquire from larger blocks, and the block size would re-> adjust downward 
shortly after the cessation of that artificial fee pressure.

> * Increased block frequency of compatibility blocks has no effect on the 
total issuance, so miners are not rewarded by faster blocks.

> You are free to criticize Forward Blocks, but please do so by actually 
addressing the content of the proposal. Let's please hold a standard of 
intellectual excellence on this > mailing list in which ideas are debated 
based on content-level arguments rather than repeating inaccurate takes 
from Reddit/Twitter.

> To the topic of the thread, disabling time-warp will close off an 
unlikely and difficult to pull off subsidy draining attack that to activate 
would necessarily require weeks of > forewarning and could be easily 
countered in other ways, with the tradeoff of removing the only known 
mechanism for upgrading the bitcoin protocol to larger effective > block 
sizes while staying 100% compatible with un-upgraded nodes (all nodes see 
all transactions).

> I think we should keep our options open.

Somehow, I'm sharing your concerns on preserving the long-term evolvability 
w.r.t scalability options
of bitcoin under the security model as very roughly describer in the paper. 
Yet, from my understanding
of the forwarding block proposal as described in your paper, I wonder if 
the forward block chain could
be re-pegged to the main bitcoin chain using the BIP141 extensible 
commitment structure (assuming
a future hypothetical soft-fork).

From my understanding, it's like doubly linked-list in C, you just need a 
pointer in the BIP141 extensible
commitment structure referencing back the forward chain headers. If one 
wishes no logically authoritative
cross-chain commitment, one could leverage some dynamic-membership 
multi-party signature. This
DMMS could even be backup by proof-of-work based schemes.

The forward block chain can have higher block-rate frequency and the number 
of block headers be
compressed in a merkle tree committed in the BIP141 extensible commitment 
structure. Compression
structure can only be defined by the forward chain consensus algorithm to 
allow more efficient accumulator
than merkle tree to be used".

The forward block chain can have elastic block size consensus-bounded by 
miners fees on long period
of time. Transaction elements can be just committed in the block headers 
themselves, so no centralization
pressure on the main chain. Increased block frequency or block size on the 
forward block chain have not
effect on the total issuance (modulo the game-theory limits of the known 
empirical effects of colored coins
on miners incentives).

I think the time-warp issues opens the door to economically non-null 
exploitation under some scenarios
over some considered time periods. If one can think to other ways to 
mitigate the issue in minimal and
non-invasive way w.r.t current Bitcoin consensus rules and respecting 
un-upgraded node ressources
consumption, I would say you're free to share them.

I can only share your take on maintaining a standard of intellectual 
excellence on the mailing list,
and avoid faltering in Reddit / Twitter-style "madness of the crowd"-like 
conversations.

Best,
Antoine

Le vendredi 19 avril 2024 à 01:19:23 UTC+1, Antoine Poinsot a écrit :

> You are free to criticize Forward Blocks, but please do so by actually 
> addressing the content of the proposal. Let's please hold a standard of 
> intellectual excellence on this mailing list in which ideas are debated 
> based on content-level arguments rather than repeating inaccurate takes 
> from Reddit/Twitter.
>
>
> You are the one being dishonest here. Look, i understand you came up with 
> a fun hack exploiting bugs in Bitcoin and you are biased against fixing 
> them. Yet, the cost of not fixing timewarp objectively far exceeds the 
> cost of making "forward blocks" impossible.
>
> As already addressed in the DelvingBitcoin post:
>
>    1. The timewarp bug significantly changes the 51% attacker threat 
>    model. Without exploiting it a censoring miner needs to continuously keep 
>    more hashrate than the rest of the network combined for as long as he wants 
>    to prevent some people from using Bitcoin. By exploiting timewarp the 
>    attacker can prevent everybody from using Bitcoin within 40 days.
>    2. The timewarp bug allows an attacking miner to force on full nodes 
>    more block data than they agreed to. This is actually the attack leveraged 
>    by your proposal. I believe this variant of the attack is more likely to 
>    happen, simply for the reason that all participants of the system have a 
>    short term incentive to exploit this (yay lower fees! yay more block 
>    subsidy!), at the expense of the long term health of the system. As the 
>    block subsidy exponentially decreases miners are likely to start playing 
>    more games and that's a particularly attractive one. Given the level of 
>    mining centralization we are witnessing [0] i believe this is particularly 
>    worrisome.
>    3. I'm very skeptical of arguments about how "we" can stop an attack 
>    which requires "weeks of forewarning". Who's we? How do we proceed, all 
>    Bitcoin users coordinate and arbitrarily decide of the validity of a block? 
>    A few weeks is very little time if this is at all achievable. If you add on 
>    top of that the political implications of the previous point it gets 
>    particularly messy.
>
>
> I've got better things to do than to play "you are being dishonest! -no 
> it's you -no you" games. So unless you bring something new to the table 
> this will be my last reply to your accusations.
>
> Antoine
>
> [0] https://x.com/0xB10C/status/1780611768081121700
> On Thursday, April 18th, 2024 at 2:46 AM, Mark F <ma...@friedenbach•org> 
> wrote:
>
> On Wednesday, March 27, 2024 at 4:00:34 AM UTC-7 Antoine Poinsot wrote:
>
> The only beneficial case I can remember about the timewarp issue is 
> "forwarding blocks" by maaku for on-chain scaling:
> http://freico.in/forward-blocks-scalingbitcoin-paper.pdf
>
>
> I would not qualify this hack of "beneficial". Besides the centralization 
> pressure of an increased block frequency, leveraging the timewarp to 
> achieve it would put the network constantly on the Brink of being seriously 
> (fatally?) harmed. And this sets pernicious incentives too. Every 
> individual user has a short-term incentive to get lower fees by the 
> increased block space, at the expense of all users longer term. And every 
> individual miner has an incentive to get more block reward at the expense 
> of future miners. (And of course bigger miners benefit from an increased 
> block frequency.)
>
> Every single concern mentioned here is addressed prominently in the 
> paper/presentation for Forward Blocks:
>
> * Increased block frequency is only on the compatibility chain, where the 
> content of blocks is deterministic anyway. There is no centralization 
> pressure from the frequency of blocks on the compatibility chain, as the 
> content of the blocks is not miner-editable in economically meaningful 
> ways. Only the block frequency of the forward block chain matters, and here 
> the block frequency is actually *reduced*, thereby decreasing 
> centralization pressure.
>
> * The elastic block size adjustment mechanism proposed in the paper is 
> purposefully constructed so that users or miners wanting to increase the 
> block size beyond what is currently provided for will have to pay 
> significantly (multiple orders of magnitude) more than they could possibly 
> acquire from larger blocks, and the block size would re-adjust downward 
> shortly after the cessation of that artificial fee pressure.
>
> * Increased block frequency of compatibility blocks has no effect on the 
> total issuance, so miners are not rewarded by faster blocks.
>
> You are free to criticize Forward Blocks, but please do so by actually 
> addressing the content of the proposal. Let's please hold a standard of 
> intellectual excellence on this mailing list in which ideas are debated 
> based on content-level arguments rather than repeating inaccurate takes 
> from Reddit/Twitter.
>
> To the topic of the thread, disabling time-warp will close off an unlikely 
> and difficult to pull off subsidy draining attack that to activate would 
> necessarily require weeks of forewarning and could be easily countered in 
> other ways, with the tradeoff of removing the only known mechanism for 
> upgrading the bitcoin protocol to larger effective block sizes while 
> staying 100% compatible with un-upgraded nodes (all nodes see all 
> transactions).
>
> I think we should keep our options open.
>
> -Mark
>
> -- 
>
> You received this message because you are subscribed to the Google Groups 
> "Bitcoin Development Mailing List" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to bitcoindev+...@googlegroups•com.
>
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/bitcoindev/62640263-077c-4ac7-98a6-d9c17913fca0n%40googlegroups.com
> .
>
>
>

-- 
You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups•com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/3e93b83e-f0ea-43b9-8f77-f7b044fb3187n%40googlegroups.com.

[-- Attachment #1.2: Type: text/html, Size: 14107 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [bitcoindev] Re: Great Consensus Cleanup Revival
  2024-04-25  6:08         ` Antoine Riard
@ 2024-04-30 22:20           ` Mark F
  2024-05-06  1:10             ` Antoine Riard
  0 siblings, 1 reply; 33+ messages in thread
From: Mark F @ 2024-04-30 22:20 UTC (permalink / raw)
  To: Bitcoin Development Mailing List


[-- Attachment #1.1: Type: text/plain, Size: 13244 bytes --]

Hi Antoine,

That's a reasonable suggestion, and one which has been discussed in the 
past under various names. Concrete ideas for a pegged extension-block side 
chain go back to 2014 at the very least. However there is one concrete way 
in which these proposals differ from forward blocks: the replay of 
transactions to the compatibility block chain. With forward blocks, even 
ancient versions of bitcoind that have been running since 2013 (picked as a 
cutoff because of the probabilistic fork caused by v0.8) will see all 
blocks, and have a complete listing of all UTXOs, and the content of 
transactions as they appear.

Does this matter? In principle you can just upgrade all nodes to understand 
the extension block, but in practice for a system as diverse as bitcoin 
support of older node versions is often required in critical 
infrastructure. Think of all the block explorer and mempool websites out 
there, for example, and various network monitoring and charting tools. Many 
of which are poorly maintained and probably running on two or three year 
old versions of Bitcoin Core.

The forward blocks proposal uses the timewarp bug to enable (1) a 
proof-of-work change, (2) sharding, (3) subsidy schedule smoothing, and (4) 
a flexible block size, all without forcing any non-mining nodes to *have* 
to upgrade in order to regain visibility into the network. Yes it's an 
everything-and-the-kitchen-sink straw man proposal, but that was on purpose 
to show that all these so-called “hard-fork” changes can in fact be done as 
a soft-fork on vanilla bitcoin, while supporting even the oldest 
still-running nodes.

That changes if we "fix" the timewarp bug though. At the very least, the 
flexible block size and subsidy schedule smoothing can't be accomplished 
without exploiting the timewarp bug, as far as anyone can tell. Therefore 
fixing the timewarp bug will _permanently_ cutoff the bitcoin community 
from ever having the ability to scale on-chain in a backwards-compatible 
way, now or decades or centuries into the future.

Once thrown, this fuse switch can't be undone. We should be damn sure we 
will never, ever need that capability before giving it up.

Mark

On Thursday, April 25, 2024 at 3:46:40 AM UTC-7 Antoine Riard wrote:

> Hi Maaku,
>
> > Every single concern mentioned here is addressed prominently in the 
> paper/presentation for Forward Blocks:
> >
> > * Increased block frequency is only on the compatibility chain, where 
> the content of blocks is deterministic anyway. There is no centralization 
> pressure from the frequency > of blocks on the compatibility chain, as the 
> content of the blocks is not miner-editable in economically meaningful 
> ways. Only the block frequency of the forward block > chain matters, and 
> here the block frequency is actually *reduced*, thereby decreasing 
> centralization pressure.
> >
> > * The elastic block size adjustment mechanism proposed in the paper is 
> purposefully constructed so that users or miners wanting to increase the 
> block size beyond what > is currently provided for will have to pay 
> significantly (multiple orders of magnitude) more than they could possibly 
> acquire from larger blocks, and the block size would re-> adjust downward 
> shortly after the cessation of that artificial fee pressure.
>
> > * Increased block frequency of compatibility blocks has no effect on the 
> total issuance, so miners are not rewarded by faster blocks.
>
> > You are free to criticize Forward Blocks, but please do so by actually 
> addressing the content of the proposal. Let's please hold a standard of 
> intellectual excellence on this > mailing list in which ideas are debated 
> based on content-level arguments rather than repeating inaccurate takes 
> from Reddit/Twitter.
>
> > To the topic of the thread, disabling time-warp will close off an 
> unlikely and difficult to pull off subsidy draining attack that to activate 
> would necessarily require weeks of > forewarning and could be easily 
> countered in other ways, with the tradeoff of removing the only known 
> mechanism for upgrading the bitcoin protocol to larger effective > block 
> sizes while staying 100% compatible with un-upgraded nodes (all nodes see 
> all transactions).
>
> > I think we should keep our options open.
>
> Somehow, I'm sharing your concerns on preserving the long-term 
> evolvability w.r.t scalability options
> of bitcoin under the security model as very roughly describer in the 
> paper. Yet, from my understanding
> of the forwarding block proposal as described in your paper, I wonder if 
> the forward block chain could
> be re-pegged to the main bitcoin chain using the BIP141 extensible 
> commitment structure (assuming
> a future hypothetical soft-fork).
>
> From my understanding, it's like doubly linked-list in C, you just need a 
> pointer in the BIP141 extensible
> commitment structure referencing back the forward chain headers. If one 
> wishes no logically authoritative
> cross-chain commitment, one could leverage some dynamic-membership 
> multi-party signature. This
> DMMS could even be backup by proof-of-work based schemes.
>
> The forward block chain can have higher block-rate frequency and the 
> number of block headers be
> compressed in a merkle tree committed in the BIP141 extensible commitment 
> structure. Compression
> structure can only be defined by the forward chain consensus algorithm to 
> allow more efficient accumulator
> than merkle tree to be used".
>
> The forward block chain can have elastic block size consensus-bounded by 
> miners fees on long period
> of time. Transaction elements can be just committed in the block headers 
> themselves, so no centralization
> pressure on the main chain. Increased block frequency or block size on the 
> forward block chain have not
> effect on the total issuance (modulo the game-theory limits of the known 
> empirical effects of colored coins
> on miners incentives).
>
> I think the time-warp issues opens the door to economically non-null 
> exploitation under some scenarios
> over some considered time periods. If one can think to other ways to 
> mitigate the issue in minimal and
> non-invasive way w.r.t current Bitcoin consensus rules and respecting 
> un-upgraded node ressources
> consumption, I would say you're free to share them.
>
> I can only share your take on maintaining a standard of intellectual 
> excellence on the mailing list,
> and avoid faltering in Reddit / Twitter-style "madness of the crowd"-like 
> conversations.
>
> Best,
> Antoine
>
> Le vendredi 19 avril 2024 à 01:19:23 UTC+1, Antoine Poinsot a écrit :
>
>> You are free to criticize Forward Blocks, but please do so by actually 
>> addressing the content of the proposal. Let's please hold a standard of 
>> intellectual excellence on this mailing list in which ideas are debated 
>> based on content-level arguments rather than repeating inaccurate takes 
>> from Reddit/Twitter.
>>
>>
>> You are the one being dishonest here. Look, i understand you came up with 
>> a fun hack exploiting bugs in Bitcoin and you are biased against fixing 
>> them. Yet, the cost of not fixing timewarp objectively far exceeds the 
>> cost of making "forward blocks" impossible.
>>
>> As already addressed in the DelvingBitcoin post:
>>
>>    1. The timewarp bug significantly changes the 51% attacker threat 
>>    model. Without exploiting it a censoring miner needs to continuously keep 
>>    more hashrate than the rest of the network combined for as long as he wants 
>>    to prevent some people from using Bitcoin. By exploiting timewarp the 
>>    attacker can prevent everybody from using Bitcoin within 40 days.
>>    2. The timewarp bug allows an attacking miner to force on full nodes 
>>    more block data than they agreed to. This is actually the attack leveraged 
>>    by your proposal. I believe this variant of the attack is more likely to 
>>    happen, simply for the reason that all participants of the system have a 
>>    short term incentive to exploit this (yay lower fees! yay more block 
>>    subsidy!), at the expense of the long term health of the system. As the 
>>    block subsidy exponentially decreases miners are likely to start playing 
>>    more games and that's a particularly attractive one. Given the level of 
>>    mining centralization we are witnessing [0] i believe this is particularly 
>>    worrisome.
>>    3. I'm very skeptical of arguments about how "we" can stop an attack 
>>    which requires "weeks of forewarning". Who's we? How do we proceed, all 
>>    Bitcoin users coordinate and arbitrarily decide of the validity of a block? 
>>    A few weeks is very little time if this is at all achievable. If you add on 
>>    top of that the political implications of the previous point it gets 
>>    particularly messy.
>>
>>
>> I've got better things to do than to play "you are being dishonest! -no 
>> it's you -no you" games. So unless you bring something new to the table 
>> this will be my last reply to your accusations.
>>
>> Antoine
>>
>> [0] https://x.com/0xB10C/status/1780611768081121700
>> On Thursday, April 18th, 2024 at 2:46 AM, Mark F <ma...@friedenbach•org> 
>> wrote:
>>
>> On Wednesday, March 27, 2024 at 4:00:34 AM UTC-7 Antoine Poinsot wrote:
>>
>> The only beneficial case I can remember about the timewarp issue is 
>> "forwarding blocks" by maaku for on-chain scaling:
>> http://freico.in/forward-blocks-scalingbitcoin-paper.pdf
>>
>>
>> I would not qualify this hack of "beneficial". Besides the centralization 
>> pressure of an increased block frequency, leveraging the timewarp to 
>> achieve it would put the network constantly on the Brink of being seriously 
>> (fatally?) harmed. And this sets pernicious incentives too. Every 
>> individual user has a short-term incentive to get lower fees by the 
>> increased block space, at the expense of all users longer term. And every 
>> individual miner has an incentive to get more block reward at the expense 
>> of future miners. (And of course bigger miners benefit from an increased 
>> block frequency.)
>>
>> Every single concern mentioned here is addressed prominently in the 
>> paper/presentation for Forward Blocks:
>>
>> * Increased block frequency is only on the compatibility chain, where the 
>> content of blocks is deterministic anyway. There is no centralization 
>> pressure from the frequency of blocks on the compatibility chain, as the 
>> content of the blocks is not miner-editable in economically meaningful 
>> ways. Only the block frequency of the forward block chain matters, and here 
>> the block frequency is actually *reduced*, thereby decreasing 
>> centralization pressure.
>>
>> * The elastic block size adjustment mechanism proposed in the paper is 
>> purposefully constructed so that users or miners wanting to increase the 
>> block size beyond what is currently provided for will have to pay 
>> significantly (multiple orders of magnitude) more than they could possibly 
>> acquire from larger blocks, and the block size would re-adjust downward 
>> shortly after the cessation of that artificial fee pressure.
>>
>> * Increased block frequency of compatibility blocks has no effect on the 
>> total issuance, so miners are not rewarded by faster blocks.
>>
>> You are free to criticize Forward Blocks, but please do so by actually 
>> addressing the content of the proposal. Let's please hold a standard of 
>> intellectual excellence on this mailing list in which ideas are debated 
>> based on content-level arguments rather than repeating inaccurate takes 
>> from Reddit/Twitter.
>>
>> To the topic of the thread, disabling time-warp will close off an 
>> unlikely and difficult to pull off subsidy draining attack that to activate 
>> would necessarily require weeks of forewarning and could be easily 
>> countered in other ways, with the tradeoff of removing the only known 
>> mechanism for upgrading the bitcoin protocol to larger effective block 
>> sizes while staying 100% compatible with un-upgraded nodes (all nodes see 
>> all transactions).
>>
>> I think we should keep our options open.
>>
>> -Mark
>>
>> -- 
>>
>> You received this message because you are subscribed to the Google Groups 
>> "Bitcoin Development Mailing List" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to bitcoindev+...@googlegroups•com.
>>
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/bitcoindev/62640263-077c-4ac7-98a6-d9c17913fca0n%40googlegroups.com
>> .
>>
>>
>>

-- 
You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups•com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/67ec72f6-b89f-4f8d-8629-0ebc8bdb7acfn%40googlegroups.com.

[-- Attachment #1.2: Type: text/html, Size: 16621 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [bitcoindev] Re: Great Consensus Cleanup Revival
  2024-04-30 22:20           ` Mark F
@ 2024-05-06  1:10             ` Antoine Riard
  0 siblings, 0 replies; 33+ messages in thread
From: Antoine Riard @ 2024-05-06  1:10 UTC (permalink / raw)
  To: Bitcoin Development Mailing List


[-- Attachment #1.1: Type: text/plain, Size: 16902 bytes --]

Hi Maaku,

From reading back the "forward block" paper, while it effectively 
guarantees an on-chain settlment throughput increases without the
necessity to upgrade old clients, one could argue the proof-of-work change 
on the forward chain (unless it's a no-op double-sha256)
coupled with the subsidy schedule smoothing, constitutes a substantial 
change of the already-mined UTXO security model. You can
use a lot of hash functions as proof-of-work primitive, though it doesn't 
mean they are relying on as strong assumptions or level
of cryptanalysis.

In fine, you could have poorly picked up hash function for the forward 
chain resulting in a lowering of everyone coins security
(the > 100 TH/s of today is securing years old coins from block mined under 
< 1 TH/s). I hold the opinion that fundamental changes
affecting the security of everyone coins should be better to be opted-in by 
the super-economic majority of nodes, including non-mining
nodes. At the contrary, the "forward block" proposal sounds to make the 
point it's okay to update proof-of-work algorithm by a
combined set of mining nodes and upgraded non-mining nodes, which could 
hypothetically lead to a "security downgrade" due to weaker
proof-of-work algorithm used on the forward chain.

While your papers introduce formalization of both full-node cost of 
validation and censorship resistance concepts, one could also
add "hardness to change" as a property of the Bitcoin network we all 
cherishes. If tomorrow, 10% of the hahrate was able to enforce
proof-of-work upgrade to the broken SHA-1, I think we would all consider as 
a security downgrade.

Beyond, this is corect we have a diversity of old nodes used in the 
ecosystem, probably for block explorer and mempool websites.
Yet in practice, they're more certainly vectors of weakness for their 
end-users, as Bitcoin Core has sadly a limited security fixes
backport policy, which doesn't go as far as v0.8 for sure. That we can all 
deplore the lack of half-decade old LTS release policy for
Bitcoin Core, like done by the Linux kernel is a legitimate conversation to 
have (and it would be indeed make it easier with
libbitcoinkernel progress). I think we shall rather invite operators of 
oldest still-running nodes to upgrade to more recent
versions, before to ask them to go through the analytical process of 
weighting all the security / scalability trade-offs of a
proposal like "forward block".

Finally, on letting options open to bump block inter-val as a soft-fork on 
the compatibility chain, I think one could still have
a multi-stage "forward block" deployment, where a) a new difficutly 
adjustment algoritm with parameters is introduced bumping block
inter-val for upgraded mining nodes e.g a block every 400 s in average and 
the b) re-use this block inter-val capacity increase for
the forward chain flexible block size. Now why a miner would opt-in in such 
block-interval constraining soft-fork is a good question,
in a paradigm where they still get the same block subsidy distribution.

This is just a thought experiment aiming to invalidate the "as far as 
anyone can tell" statement on forclosing forever on-chain
settlement throughput increase, if we fix the timewarp bug.

Best,
Antoine
Le mercredi 1 mai 2024 à 09:58:48 UTC+1, Mark F a écrit :

> Hi Antoine,
>
> That's a reasonable suggestion, and one which has been discussed in the 
> past under various names. Concrete ideas for a pegged extension-block side 
> chain go back to 2014 at the very least. However there is one concrete way 
> in which these proposals differ from forward blocks: the replay of 
> transactions to the compatibility block chain. With forward blocks, even 
> ancient versions of bitcoind that have been running since 2013 (picked as a 
> cutoff because of the probabilistic fork caused by v0.8) will see all 
> blocks, and have a complete listing of all UTXOs, and the content of 
> transactions as they appear.
>
> Does this matter? In principle you can just upgrade all nodes to 
> understand the extension block, but in practice for a system as diverse as 
> bitcoin support of older node versions is often required in critical 
> infrastructure. Think of all the block explorer and mempool websites out 
> there, for example, and various network monitoring and charting tools. Many 
> of which are poorly maintained and probably running on two or three year 
> old versions of Bitcoin Core.
>
> The forward blocks proposal uses the timewarp bug to enable (1) a 
> proof-of-work change, (2) sharding, (3) subsidy schedule smoothing, and (4) 
> a flexible block size, all without forcing any non-mining nodes to *have* 
> to upgrade in order to regain visibility into the network. Yes it's an 
> everything-and-the-kitchen-sink straw man proposal, but that was on purpose 
> to show that all these so-called “hard-fork” changes can in fact be done as 
> a soft-fork on vanilla bitcoin, while supporting even the oldest 
> still-running nodes.
>
> That changes if we "fix" the timewarp bug though. At the very least, the 
> flexible block size and subsidy schedule smoothing can't be accomplished 
> without exploiting the timewarp bug, as far as anyone can tell. Therefore 
> fixing the timewarp bug will _permanently_ cutoff the bitcoin community 
> from ever having the ability to scale on-chain in a backwards-compatible 
> way, now or decades or centuries into the future.
>
> Once thrown, this fuse switch can't be undone. We should be damn sure we 
> will never, ever need that capability before giving it up.
>
> Mark
>
> On Thursday, April 25, 2024 at 3:46:40 AM UTC-7 Antoine Riard wrote:
>
>> Hi Maaku,
>>
>> > Every single concern mentioned here is addressed prominently in the 
>> paper/presentation for Forward Blocks:
>> >
>> > * Increased block frequency is only on the compatibility chain, where 
>> the content of blocks is deterministic anyway. There is no centralization 
>> pressure from the frequency > of blocks on the compatibility chain, as the 
>> content of the blocks is not miner-editable in economically meaningful 
>> ways. Only the block frequency of the forward block > chain matters, and 
>> here the block frequency is actually *reduced*, thereby decreasing 
>> centralization pressure.
>> >
>> > * The elastic block size adjustment mechanism proposed in the paper is 
>> purposefully constructed so that users or miners wanting to increase the 
>> block size beyond what > is currently provided for will have to pay 
>> significantly (multiple orders of magnitude) more than they could possibly 
>> acquire from larger blocks, and the block size would re-> adjust downward 
>> shortly after the cessation of that artificial fee pressure.
>>
>> > * Increased block frequency of compatibility blocks has no effect on 
>> the total issuance, so miners are not rewarded by faster blocks.
>>
>> > You are free to criticize Forward Blocks, but please do so by actually 
>> addressing the content of the proposal. Let's please hold a standard of 
>> intellectual excellence on this > mailing list in which ideas are debated 
>> based on content-level arguments rather than repeating inaccurate takes 
>> from Reddit/Twitter.
>>
>> > To the topic of the thread, disabling time-warp will close off an 
>> unlikely and difficult to pull off subsidy draining attack that to activate 
>> would necessarily require weeks of > forewarning and could be easily 
>> countered in other ways, with the tradeoff of removing the only known 
>> mechanism for upgrading the bitcoin protocol to larger effective > block 
>> sizes while staying 100% compatible with un-upgraded nodes (all nodes see 
>> all transactions).
>>
>> > I think we should keep our options open.
>>
>> Somehow, I'm sharing your concerns on preserving the long-term 
>> evolvability w.r.t scalability options
>> of bitcoin under the security model as very roughly describer in the 
>> paper. Yet, from my understanding
>> of the forwarding block proposal as described in your paper, I wonder if 
>> the forward block chain could
>> be re-pegged to the main bitcoin chain using the BIP141 extensible 
>> commitment structure (assuming
>> a future hypothetical soft-fork).
>>
>> From my understanding, it's like doubly linked-list in C, you just need a 
>> pointer in the BIP141 extensible
>> commitment structure referencing back the forward chain headers. If one 
>> wishes no logically authoritative
>> cross-chain commitment, one could leverage some dynamic-membership 
>> multi-party signature. This
>> DMMS could even be backup by proof-of-work based schemes.
>>
>> The forward block chain can have higher block-rate frequency and the 
>> number of block headers be
>> compressed in a merkle tree committed in the BIP141 extensible commitment 
>> structure. Compression
>> structure can only be defined by the forward chain consensus algorithm to 
>> allow more efficient accumulator
>> than merkle tree to be used".
>>
>> The forward block chain can have elastic block size consensus-bounded by 
>> miners fees on long period
>> of time. Transaction elements can be just committed in the block headers 
>> themselves, so no centralization
>> pressure on the main chain. Increased block frequency or block size on 
>> the forward block chain have not
>> effect on the total issuance (modulo the game-theory limits of the known 
>> empirical effects of colored coins
>> on miners incentives).
>>
>> I think the time-warp issues opens the door to economically non-null 
>> exploitation under some scenarios
>> over some considered time periods. If one can think to other ways to 
>> mitigate the issue in minimal and
>> non-invasive way w.r.t current Bitcoin consensus rules and respecting 
>> un-upgraded node ressources
>> consumption, I would say you're free to share them.
>>
>> I can only share your take on maintaining a standard of intellectual 
>> excellence on the mailing list,
>> and avoid faltering in Reddit / Twitter-style "madness of the crowd"-like 
>> conversations.
>>
>> Best,
>> Antoine
>>
>> Le vendredi 19 avril 2024 à 01:19:23 UTC+1, Antoine Poinsot a écrit :
>>
>>> You are free to criticize Forward Blocks, but please do so by actually 
>>> addressing the content of the proposal. Let's please hold a standard of 
>>> intellectual excellence on this mailing list in which ideas are debated 
>>> based on content-level arguments rather than repeating inaccurate takes 
>>> from Reddit/Twitter.
>>>
>>>
>>> You are the one being dishonest here. Look, i understand you came up 
>>> with a fun hack exploiting bugs in Bitcoin and you are biased against 
>>> fixing them. Yet, the cost of not fixing timewarp objectively far 
>>> exceeds the cost of making "forward blocks" impossible.
>>>
>>> As already addressed in the DelvingBitcoin post:
>>>
>>>    1. The timewarp bug significantly changes the 51% attacker threat 
>>>    model. Without exploiting it a censoring miner needs to continuously keep 
>>>    more hashrate than the rest of the network combined for as long as he wants 
>>>    to prevent some people from using Bitcoin. By exploiting timewarp the 
>>>    attacker can prevent everybody from using Bitcoin within 40 days.
>>>    2. The timewarp bug allows an attacking miner to force on full nodes 
>>>    more block data than they agreed to. This is actually the attack leveraged 
>>>    by your proposal. I believe this variant of the attack is more likely to 
>>>    happen, simply for the reason that all participants of the system have a 
>>>    short term incentive to exploit this (yay lower fees! yay more block 
>>>    subsidy!), at the expense of the long term health of the system. As the 
>>>    block subsidy exponentially decreases miners are likely to start playing 
>>>    more games and that's a particularly attractive one. Given the level of 
>>>    mining centralization we are witnessing [0] i believe this is particularly 
>>>    worrisome.
>>>    3. I'm very skeptical of arguments about how "we" can stop an attack 
>>>    which requires "weeks of forewarning". Who's we? How do we proceed, all 
>>>    Bitcoin users coordinate and arbitrarily decide of the validity of a block? 
>>>    A few weeks is very little time if this is at all achievable. If you add on 
>>>    top of that the political implications of the previous point it gets 
>>>    particularly messy.
>>>
>>>
>>> I've got better things to do than to play "you are being dishonest! -no 
>>> it's you -no you" games. So unless you bring something new to the table 
>>> this will be my last reply to your accusations.
>>>
>>> Antoine
>>>
>>> [0] https://x.com/0xB10C/status/1780611768081121700
>>> On Thursday, April 18th, 2024 at 2:46 AM, Mark F <ma...@friedenbach•org> 
>>> wrote:
>>>
>>> On Wednesday, March 27, 2024 at 4:00:34 AM UTC-7 Antoine Poinsot wrote:
>>>
>>> The only beneficial case I can remember about the timewarp issue is 
>>> "forwarding blocks" by maaku for on-chain scaling:
>>> http://freico.in/forward-blocks-scalingbitcoin-paper.pdf
>>>
>>>
>>> I would not qualify this hack of "beneficial". Besides the 
>>> centralization pressure of an increased block frequency, leveraging the 
>>> timewarp to achieve it would put the network constantly on the Brink of 
>>> being seriously (fatally?) harmed. And this sets pernicious incentives too. 
>>> Every individual user has a short-term incentive to get lower fees by the 
>>> increased block space, at the expense of all users longer term. And every 
>>> individual miner has an incentive to get more block reward at the expense 
>>> of future miners. (And of course bigger miners benefit from an increased 
>>> block frequency.)
>>>
>>> Every single concern mentioned here is addressed prominently in the 
>>> paper/presentation for Forward Blocks:
>>>
>>> * Increased block frequency is only on the compatibility chain, where 
>>> the content of blocks is deterministic anyway. There is no centralization 
>>> pressure from the frequency of blocks on the compatibility chain, as the 
>>> content of the blocks is not miner-editable in economically meaningful 
>>> ways. Only the block frequency of the forward block chain matters, and here 
>>> the block frequency is actually *reduced*, thereby decreasing 
>>> centralization pressure.
>>>
>>> * The elastic block size adjustment mechanism proposed in the paper is 
>>> purposefully constructed so that users or miners wanting to increase the 
>>> block size beyond what is currently provided for will have to pay 
>>> significantly (multiple orders of magnitude) more than they could possibly 
>>> acquire from larger blocks, and the block size would re-adjust downward 
>>> shortly after the cessation of that artificial fee pressure.
>>>
>>> * Increased block frequency of compatibility blocks has no effect on the 
>>> total issuance, so miners are not rewarded by faster blocks.
>>>
>>> You are free to criticize Forward Blocks, but please do so by actually 
>>> addressing the content of the proposal. Let's please hold a standard of 
>>> intellectual excellence on this mailing list in which ideas are debated 
>>> based on content-level arguments rather than repeating inaccurate takes 
>>> from Reddit/Twitter.
>>>
>>> To the topic of the thread, disabling time-warp will close off an 
>>> unlikely and difficult to pull off subsidy draining attack that to activate 
>>> would necessarily require weeks of forewarning and could be easily 
>>> countered in other ways, with the tradeoff of removing the only known 
>>> mechanism for upgrading the bitcoin protocol to larger effective block 
>>> sizes while staying 100% compatible with un-upgraded nodes (all nodes see 
>>> all transactions).
>>>
>>> I think we should keep our options open.
>>>
>>> -Mark
>>>
>>> -- 
>>>
>>> You received this message because you are subscribed to the Google 
>>> Groups "Bitcoin Development Mailing List" group.
>>> To unsubscribe from this group and stop receiving emails from it, send 
>>> an email to bitcoindev+...@googlegroups•com.
>>>
>>> To view this discussion on the web visit 
>>> https://groups.google.com/d/msgid/bitcoindev/62640263-077c-4ac7-98a6-d9c17913fca0n%40googlegroups.com
>>> .
>>>
>>>
>>>

-- 
You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups•com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/5c2b1a47-5a7a-48f3-9904-c17fa5ece5a6n%40googlegroups.com.

[-- Attachment #1.2: Type: text/html, Size: 20303 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [bitcoindev] Re: Great Consensus Cleanup Revival
  2024-03-27 10:35   ` 'Antoine Poinsot' via Bitcoin Development Mailing List
  2024-03-27 18:57     ` Antoine Riard
  2024-04-18  0:46     ` Mark F
@ 2024-07-20 21:39     ` Murad Ali
  2 siblings, 0 replies; 33+ messages in thread
From: Murad Ali @ 2024-07-20 21:39 UTC (permalink / raw)
  To: Bitcoin Development Mailing List


[-- Attachment #1.1: Type: text/plain, Size: 6155 bytes --]


ok
On Wednesday 27 March 2024 at 04:00:34 UTC-7 Antoine Poinsot wrote:

>
> Hi Poinsot,
>
>
> Hi Riard,
>
>
> The only beneficial case I can remember about the timewarp issue is 
> "forwarding blocks" by maaku for on-chain scaling:
> http://freico.in/forward-blocks-scalingbitcoin-paper.pdf
>
>
> I would not qualify this hack of "beneficial". Besides the centralization 
> pressure of an increased block frequency, leveraging the timewarp to 
> achieve it would put the network constantly on the Brink of being seriously 
> (fatally?) harmed. And this sets pernicious incentives too. Every 
> individual user has a short-term incentive to get lower fees by the 
> increased block space, at the expense of all users longer term. And every 
> individual miner has an incentive to get more block reward at the expense 
> of future miners. (And of course bigger miners benefit from an increased 
> block frequency.)
>
>
> I think any consensus boundaries on the minimal transaction size would 
> need to be done carefully and have all lightweight
> clients update their own transaction acceptance logic to enforce the check 
> to avoid years-long transitory massive double-spend
> due to software incoordination.
>
>
> Note in my writeup i suggest we do not introduce a minimum transaction, 
> but we instead only make 64 bytes transactions invalid. See 
> https://delvingbitcoin.org/t/great-consensus-cleanup-revival/710#can-we-come-up-with-a-better-fix-10
> :
>
> However the BIP proposes to also make less-than-64-bytes transactions 
> invalid. Although they are of no (or little) use, such transactions are not 
> harmful. I believe considering a type of transaction useless is not 
> sufficient motivation for making them invalid through a soft fork.
>
> Making (exactly) 64 bytes long transactions invalid is also what AJ 
> implemented in his pull request to Bitcoin-inquisition 
> <https://github.com/bitcoin-inquisition/bitcoin/pull/24>.
>
>
>
> I doubt `MIN_STANDARD_TX_NON_WITNESS_SIZE` is implemented correctly by all 
> transaction-relay backends and it's a mess in this area.
>
>
> What type of backend are you referring to here? Bitcoin full nodes 
> reimplementations? These transactions have been non-standard in Bitcoin 
> Core for the past 6 years (commit 7485488e907e236133a016ba7064c89bf9ab6da3
> ).
>
>
> Quid if we have < 64 bytes transaction where the only witness is enforced 
> to be a minimal 1-byte
> as witness elements are only used for higher layers protocols semantics ?
>
>
> This restriction is on the size of the transaction serialized without 
> witness. So this particular instance would not be affected and whatever the 
> witness is isn't relevant.
>
>
> Making coinbase unique by requesting the block height to be enforced in 
> nLocktime, sounds more robust to take a monotonic counter
> in the past in case of accidental or provoked shallow reorgs. I can see of 
> you would have to re-compute a block template, loss a round-trip
> compare to your mining competitors. Better if it doesn't introduce a new 
> DoS vector at mining job distribution and control.
>
>
> Could you clarify? Are you suggesting something else than to set the 
> nLockTime in the coinbase transaction to the height of the block? If so, 
> what exactly are you referring to by "monotonic counter in the past"?
>
> At any rate in my writeup i suggested making the coinbase commitment 
> mandatory (even when empty) instead for compatibility reasons.
>
> That said, since we could make this rule kick in in 25 years from now, we 
> might want to just do the Obvious Thing and just require the height in 
> nLockTime.
>
>
>  and b) it's already a lot of careful consensus
> code to get right :)
>
>
> Definitely. I just want to make sure we are not missing anything important 
> if a soft fork gets proposed along these lines in the future.
>
>
> Best,
> Antoine
>
> Le dimanche 24 mars 2024 à 19:06:57 UTC, Antoine Poinsot a écrit :
>
>> Hey all, 
>>
>> I've recently posted about the Great Consensus Cleanup there: 
>> https://delvingbitcoin.org/t/great-consensus-cleanup-revival/710. 
>>
>> I'm starting a thread on the mailing list as well to get comments and 
>> opinions from people who are not on Delving. 
>>
>> TL;DR: 
>> - i think the worst block validation time is concerning. The mitigations 
>> proposed by Matt are effective, but i think we should also limit the 
>> maximum size of legacy transactions for an additional safety margin; 
>> - i believe it's more important to fix the timewarp bug than people 
>> usually think; 
>> - it would be nice to include a fix to make coinbase transactions unique 
>> once and for all, to avoid having to resort back to doing BIP30 validation 
>> after block 1,983,702; 
>> - 64 bytes transactions should definitely be made invalid, but i don't 
>> think there is a strong case for making less than 64 bytes transactions 
>> invalid. 
>>
>> Anything in there that people disagree with conceptually? 
>> Anything in there that people think shouldn't (or don't need to) be 
>> fixed? 
>> Anything in there which can be improved (a simpler, or better fix)? 
>> Anything NOT in there that people think should be fixed? 
>>
>>
>> Antoine Poinsot 
>>
> -- 
> You received this message because you are subscribed to the Google Groups 
> "Bitcoin Development Mailing List" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to bitcoindev+...@googlegroups•com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/bitcoindev/dc2cc46f-e697-4b14-91b3-34cf11de29a3n%40googlegroups.com
> .
>
>
>

-- 
You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups•com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/34a43375-a08e-4a8d-b409-0fd67730d753n%40googlegroups.com.

[-- Attachment #1.2: Type: text/html, Size: 12740 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [bitcoindev] Re: Great Consensus Cleanup Revival
  2024-03-24 18:10 [bitcoindev] Great Consensus Cleanup Revival 'Antoine Poinsot' via Bitcoin Development Mailing List
  2024-03-26 19:11 ` [bitcoindev] " Antoine Riard
@ 2024-06-17 22:15 ` Eric Voskuil
  2024-06-18  8:13   ` 'Antoine Poinsot' via Bitcoin Development Mailing List
  1 sibling, 1 reply; 33+ messages in thread
From: Eric Voskuil @ 2024-06-17 22:15 UTC (permalink / raw)
  To: Bitcoin Development Mailing List


[-- Attachment #1.1: Type: text/plain, Size: 901 bytes --]

Hi Antoine,

Regarding potential malleability pertaining to blocks with only 64 byte 
transactions, why is not a deserialization phase check for the coinbase 
input as a null point not sufficient mitigation (computational 
infeasibility) for any implementation that desires to perform permanent 
invalidity marking?

Best,
Eric

ref: Weaknesses in Bitcoin’s Merkle Root Construction 
<https://lists.linuxfoundation.org/pipermail/bitcoin-dev/attachments/20190225/a27d8837/attachment-0001.pdf>

-- 
You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups•com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/72e83c31-408f-4c13-bff5-bf0789302e23n%40googlegroups.com.

[-- Attachment #1.2: Type: text/html, Size: 1211 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [bitcoindev] Re: Great Consensus Cleanup Revival
  2024-06-17 22:15 ` Eric Voskuil
@ 2024-06-18  8:13   ` 'Antoine Poinsot' via Bitcoin Development Mailing List
  2024-06-18 13:02     ` Eric Voskuil
  0 siblings, 1 reply; 33+ messages in thread
From: 'Antoine Poinsot' via Bitcoin Development Mailing List @ 2024-06-18  8:13 UTC (permalink / raw)
  To: Eric Voskuil; +Cc: Bitcoin Development Mailing List

[-- Attachment #1: Type: text/plain, Size: 2315 bytes --]

Hi Eric,

It is. This is what is implemented in Bitcoin Core, see [this snippet](https://github.com/bitcoin/bitcoin/blob/41544b8f96dbc9c6b8998acd6522200d67cdc16d/src/validation.cpp#L4547-L4552) and section 4.1 of the document you reference:

> Another check that was also being done in CheckBlock() relates to the coinbase transaction: if the first transaction in a block fails the required structure of a coinbase – one input, with previous output hash of all zeros and index of all ones – then the block will fail validation. The side effect of this test being in CheckBlock() was that even though the block malleability discussed in section 3.1 was unknown, we were effectively protected against it – as described above, it would take at least 224 bits of work to produce a malleated block that passed the coinbase check.

Best,
Antoine
On Tuesday, June 18th, 2024 at 12:15 AM, Eric Voskuil <eric@voskuil•org> wrote:

> Hi Antoine,
>
> Regarding potential malleability pertaining to blocks with only 64 byte transactions, why is not a deserialization phase check for the coinbase input as a null point not sufficient mitigation (computational infeasibility) for any implementation that desires to perform permanent invalidity marking?
>
> Best,
> Eric
>
> ref: [Weaknesses in Bitcoin’s Merkle Root Construction](https://lists.linuxfoundation.org/pipermail/bitcoin-dev/attachments/20190225/a27d8837/attachment-0001.pdf)
>
> --
> You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups•com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/72e83c31-408f-4c13-bff5-bf0789302e23n%40googlegroups.com.

-- 
You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups•com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/heKH68GFJr4Zuf6lBozPJrb-StyBJPMNvmZL0xvKFBnBGVA3fVSgTLdWc-_8igYWX8z3zCGvzflH-CsRv0QCJQcfwizNyYXlBJa_Kteb2zg%3D%40protonmail.com.

[-- Attachment #2: Type: text/html, Size: 4370 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [bitcoindev] Re: Great Consensus Cleanup Revival
  2024-06-18  8:13   ` 'Antoine Poinsot' via Bitcoin Development Mailing List
@ 2024-06-18 13:02     ` Eric Voskuil
  2024-06-21 13:09       ` 'Antoine Poinsot' via Bitcoin Development Mailing List
  0 siblings, 1 reply; 33+ messages in thread
From: Eric Voskuil @ 2024-06-18 13:02 UTC (permalink / raw)
  To: Bitcoin Development Mailing List


[-- Attachment #1.1: Type: text/plain, Size: 2720 bytes --]

Right, a fairly obvious resolution. My question is why is that not 
sufficient - especially given that a similar (context free) check is 
required for duplicated tx malleability? We'd just be swapping one trivial 
check (first input not null) for another (tx size not 64 bytes).

Best,
Eric
On Tuesday, June 18, 2024 at 7:46:28 AM UTC-4 Antoine Poinsot wrote:

> Hi Eric,
>
> It is. This is what is implemented in Bitcoin Core, see this snippet 
> <https://github.com/bitcoin/bitcoin/blob/41544b8f96dbc9c6b8998acd6522200d67cdc16d/src/validation.cpp#L4547-L4552> 
> and section 4.1 of the document you reference:
>
> Another check that was also being done in CheckBlock() relates to the 
> coinbase transaction: if the first transaction in a block fails the 
> required structure of a coinbase – one input, with previous output hash 
> of all zeros and index of all ones – then the block will fail validation. 
> The side effect of this test being in CheckBlock() was that even though 
> the block malleability discussed in section 3.1 was unknown, we were 
> effectively protected against it – as described above, it would take at 
> least 224 bits of work to produce a malleated block that passed the 
> coinbase check.
>
>
> Best,
> Antoine
> On Tuesday, June 18th, 2024 at 12:15 AM, Eric Voskuil <er...@voskuil•org> 
> wrote:
>
> Hi Antoine,
>
> Regarding potential malleability pertaining to blocks with only 64 byte 
> transactions, why is not a deserialization phase check for the coinbase 
> input as a null point not sufficient mitigation (computational 
> infeasibility) for any implementation that desires to perform permanent 
> invalidity marking?
>
> Best,
> Eric
>
> ref: Weaknesses in Bitcoin’s Merkle Root Construction 
> <https://lists.linuxfoundation.org/pipermail/bitcoin-dev/attachments/20190225/a27d8837/attachment-0001.pdf>
>
> -- 
>
> You received this message because you are subscribed to the Google Groups 
> "Bitcoin Development Mailing List" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to bitcoindev+...@googlegroups•com.
>
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/bitcoindev/72e83c31-408f-4c13-bff5-bf0789302e23n%40googlegroups.com
> .
>
>
>

-- 
You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups•com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/5b0331a5-4e94-465d-a51d-02166e2c1937n%40googlegroups.com.

[-- Attachment #1.2: Type: text/html, Size: 5603 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [bitcoindev] Re: Great Consensus Cleanup Revival
  2024-06-18 13:02     ` Eric Voskuil
@ 2024-06-21 13:09       ` 'Antoine Poinsot' via Bitcoin Development Mailing List
  2024-06-24  0:35         ` Eric Voskuil
  0 siblings, 1 reply; 33+ messages in thread
From: 'Antoine Poinsot' via Bitcoin Development Mailing List @ 2024-06-21 13:09 UTC (permalink / raw)
  To: Eric Voskuil; +Cc: Bitcoin Development Mailing List

[-- Attachment #1: Type: text/plain, Size: 4548 bytes --]

Making 64-bytes transactions invalid is indeed not the most pressing bug fix, but i believe it's still a very nice cleanup to include if such a soft fork ends up being seriously proposed.

As discussed here it would let node implementations cache block failures at an earlier stage of validation. Not a large gain, but still nice to have.

As discussed in the DelvingBitcoin post it would also be a small gain of bandwidth for SPV verifiers as they wouldn't have to query a merkle proof for the coinbase transaction in addition to the one for the transaction they're interested in. It would also avoid a large footgun for anyone implementing a software verifying an SPV proof verifier and not knowing the intricacies of the protocol which make such proofs not secure on their own today.

Finally, it would get rid of a large footgun in general. Certainly, unique block hashes would be a useful property for Bitcoin to have. It's not far-fetched to expect current or future Bitcoin-related software to rely on this.

Outlawing 64-bytes transactions is also a very narrow and straightforward change, with trivial confiscatory effect as any 64-bytes transactions would either be unspendable or an anyone-can-spend. Therefore i believe the benefits of making them illegal outweigh the costs.

Best,
Antoine

On Thursday, June 20th, 2024 at 6:57 PM, Eric Voskuil <eric@voskuil•org> wrote:

> Right, a fairly obvious resolution. My question is why is that not sufficient - especially given that a similar (context free) check is required for duplicated tx malleability? We'd just be swapping one trivial check (first input not null) for another (tx size not 64 bytes).
>
> Best,
> Eric
> On Tuesday, June 18, 2024 at 7:46:28 AM UTC-4 Antoine Poinsot wrote:
>
>> Hi Eric,
>>
>> It is. This is what is implemented in Bitcoin Core, see [this snippet](https://github.com/bitcoin/bitcoin/blob/41544b8f96dbc9c6b8998acd6522200d67cdc16d/src/validation.cpp#L4547-L4552) and section 4.1 of the document you reference:
>>
>>> Another check that was also being done in CheckBlock() relates to the coinbase transaction: if the first transaction in a block fails the required structure of a coinbase – one input, with previous output hash of all zeros and index of all ones – then the block will fail validation. The side effect of this test being in CheckBlock() was that even though the block malleability discussed in section 3.1 was unknown, we were effectively protected against it – as described above, it would take at least 224 bits of work to produce a malleated block that passed the coinbase check.
>>
>> Best,
>> Antoine
>>
>> On Tuesday, June 18th, 2024 at 12:15 AM, Eric Voskuil <er...@voskuil•org> wrote:
>>
>>> Hi Antoine,
>>>
>>> Regarding potential malleability pertaining to blocks with only 64 byte transactions, why is not a deserialization phase check for the coinbase input as a null point not sufficient mitigation (computational infeasibility) for any implementation that desires to perform permanent invalidity marking?
>>>
>>> Best,
>>> Eric
>>>
>>> ref: [Weaknesses in Bitcoin’s Merkle Root Construction](https://lists.linuxfoundation.org/pipermail/bitcoin-dev/attachments/20190225/a27d8837/attachment-0001.pdf)
>>
>>> --
>>
>>> You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
>>> To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+...@googlegroups•com.
>>
>>> To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/72e83c31-408f-4c13-bff5-bf0789302e23n%40googlegroups.com.
>
> --
> You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups•com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/5b0331a5-4e94-465d-a51d-02166e2c1937n%40googlegroups.com.

-- 
You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups•com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/yt1O1F7NiVj-WkmnYeta1fSqCYNFx8h6OiJaTBmwhmJ2MWAZkmmjPlUST6FM7t6_-2NwWKdglWh77vcnEKA8swiAnQCZJY2SSCAh4DOKt2I%3D%40protonmail.com.

[-- Attachment #2: Type: text/html, Size: 9243 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [bitcoindev] Re: Great Consensus Cleanup Revival
  2024-06-21 13:09       ` 'Antoine Poinsot' via Bitcoin Development Mailing List
@ 2024-06-24  0:35         ` Eric Voskuil
  2024-06-27  9:35           ` 'Antoine Poinsot' via Bitcoin Development Mailing List
  0 siblings, 1 reply; 33+ messages in thread
From: Eric Voskuil @ 2024-06-24  0:35 UTC (permalink / raw)
  To: Bitcoin Development Mailing List

[-- Attachment #1.1: Type: text/plain, Size: 2780 bytes --]

Thanks for the responses Antoine.

>  As discussed here it would let node implementations cache block failures 
at an earlier stage of validation. Not a large gain, but still nice to have.

It is not clear to me how determining the coinbase size can be done at an 
earlier stage of validation than detection of the non-null coinbase. The 
former requires parsing the coinbase to determine its size, the latter 
requires parsing it to know if the point is null. Both of these can be 
performed as early as immediately following the socket read.

size check

(1) requires new consensus rule: 64 byte transactions (or coinbases?) are 
invalid.
(2) creates a consensus "seam"  (complexity) in txs, where < 64 bytes and > 
64 bytes are potentially valid.
(3) can be limited to reading/skipping header (80 bytes) plus parsing 0 - 
65 coinbase bytes.

point check

(1) requires no change.
(2) creates no consensus seam.
(3) can be limited to reading/skipping header (80 bytes) plus parsing 6 - 
43 coinbase bytes.

Not only is this not a large (performance) gain, it's not one at all.

> It would also avoid a large footgun for anyone implementing a software 
verifying an SPV proof verifier and not knowing the intricacies of the 
protocol...

It seems to me that introducing an arbitrary tx size validity may create 
more potential implementation bugs than it resolves. And certainly anyone 
implementing such a verifier must know many intricacies of the protocol. 
This does not remove one, it introduces another - as there is not only a 
bifurcation around tx size but one around the question of whether this rule 
is active.

> Finally, it would get rid of a large footgun in general. 

I do not see this. I see a very ugly perpetual seam which will likely 
result in unexpected complexities over time.

> Certainly, unique block hashes would be a useful property for Bitcoin to 
have. It's not far-fetched to expect current or future Bitcoin-related 
software to rely on this.

This does not produce unmalleable block hashes. Duplicate tx hash 
malleation remains in either case, to the same effect. Without a resolution 
to both issues this is an empty promise.

The only possible benefit that I can see here is the possible very small 
bandwidth savings pertaining to SPV proofs. I would have a very hard time 
justifying adding any consensus rule to achieve only that result.

Best,
Eric

-- 
You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups•com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/be78e733-6e9f-4f4e-8dc2-67b79ddbf677n%40googlegroups.com.

[-- Attachment #1.2: Type: text/html, Size: 3281 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [bitcoindev] Re: Great Consensus Cleanup Revival
  2024-06-24  0:35         ` Eric Voskuil
@ 2024-06-27  9:35           ` 'Antoine Poinsot' via Bitcoin Development Mailing List
  2024-06-28 17:14             ` Eric Voskuil
  0 siblings, 1 reply; 33+ messages in thread
From: 'Antoine Poinsot' via Bitcoin Development Mailing List @ 2024-06-27  9:35 UTC (permalink / raw)
  To: Eric Voskuil; +Cc: Bitcoin Development Mailing List

[-- Attachment #1: Type: text/plain, Size: 5016 bytes --]

> It is not clear to me how determining the coinbase size can be done at an earlier stage of validation than detection of the non-null coinbase.

My point wasn't about checking the coinbase size, it was about being able to cache the hash of a (non-malleated) invalid block as permanently invalid to avoid re-downloading and re-validating it.

> It seems to me that introducing an arbitrary tx size validity may create more potential implementation bugs than it resolves.

The potential for implementation bugs is a fair point to raise, but in this case i don't think it's a big concern. Verifying no transaction in a block is 64 bytes is as simple a check as you can get.

> And certainly anyone implementing such a verifier must know many intricacies of the protocol.

They need to know some, but i don't think it's reasonable to expect them to realize the merkle tree construction is such that an inner node may be confused with a 64 bytes transaction.

> I do not see this. I see a very ugly perpetual seam which will likely result in unexpected complexities over time.

What makes you think making 64 bytes transactions invalid could result in unexpected complexities? And why do you think it's likely?

> This does not produce unmalleable block hashes. Duplicate tx hash malleation remains in either case, to the same effect. Without a resolution to both issues this is an empty promise.

Duplicate txids have been invalid since 2012 (CVE-2012-2459). If 64 bytes transactions are also made invalid, this would make it impossible for two valid blocks to have the same hash.

Best,
Antoine
On Monday, June 24th, 2024 at 2:35 AM, Eric Voskuil <eric@voskuil•org> wrote:

> Thanks for the responses Antoine.
>
>> As discussed here it would let node implementations cache block failures at an earlier stage of validation. Not a large gain, but still nice to have.
>
> It is not clear to me how determining the coinbase size can be done at an earlier stage of validation than detection of the non-null coinbase. The former requires parsing the coinbase to determine its size, the latter requires parsing it to know if the point is null. Both of these can be performed as early as immediately following the socket read.
>
> size check
>
> (1) requires new consensus rule: 64 byte transactions (or coinbases?) are invalid.
> (2) creates a consensus "seam" (complexity) in txs, where < 64 bytes and > 64 bytes are potentially valid.
> (3) can be limited to reading/skipping header (80 bytes) plus parsing 0 - 65 coinbase bytes.
>
> point check
>
> (1) requires no change.
> (2) creates no consensus seam.
> (3) can be limited to reading/skipping header (80 bytes) plus parsing 6 - 43 coinbase bytes.
>
> Not only is this not a large (performance) gain, it's not one at all.
>
>> It would also avoid a large footgun for anyone implementing a software verifying an SPV proof verifier and not knowing the intricacies of the protocol...
>
> It seems to me that introducing an arbitrary tx size validity may create more potential implementation bugs than it resolves. And certainly anyone implementing such a verifier must know many intricacies of the protocol. This does not remove one, it introduces another - as there is not only a bifurcation around tx size but one around the question of whether this rule is active.
>
>> Finally, it would get rid of a large footgun in general.
>
> I do not see this. I see a very ugly perpetual seam which will likely result in unexpected complexities over time.
>
>> Certainly, unique block hashes would be a useful property for Bitcoin to have. It's not far-fetched to expect current or future Bitcoin-related software to rely on this.
>
> This does not produce unmalleable block hashes. Duplicate tx hash malleation remains in either case, to the same effect. Without a resolution to both issues this is an empty promise.
>
> The only possible benefit that I can see here is the possible very small bandwidth savings pertaining to SPV proofs. I would have a very hard time justifying adding any consensus rule to achieve only that result.
>
> Best,
> Eric
>
> --
> You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups•com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/be78e733-6e9f-4f4e-8dc2-67b79ddbf677n%40googlegroups.com.

-- 
You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups•com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/jJLDrYTXvTgoslhl1n7Fk9-pL1mMC-0k6gtoniQINmioJpzgtqrJ_WqyFZkLltsCUusnQ4jZ6HbvRC-mGuaUlDi3kcqcFHALd10-JQl-FMY%3D%40protonmail.com.

[-- Attachment #2: Type: text/html, Size: 10287 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [bitcoindev] Re: Great Consensus Cleanup Revival
  2024-06-27  9:35           ` 'Antoine Poinsot' via Bitcoin Development Mailing List
@ 2024-06-28 17:14             ` Eric Voskuil
  2024-06-29  1:06               ` Antoine Riard
  2024-07-02 10:23               ` 'Antoine Poinsot' via Bitcoin Development Mailing List
  0 siblings, 2 replies; 33+ messages in thread
From: Eric Voskuil @ 2024-06-28 17:14 UTC (permalink / raw)
  To: Bitcoin Development Mailing List

[-- Attachment #1.1: Type: text/plain, Size: 8301 bytes --]

>> It is not clear to me how determining the coinbase size can be done at 
an earlier stage of validation than detection of the non-null coinbase.
> My point wasn't about checking the coinbase size, it was about being able 
to cache the hash of a (non-malleated) invalid block as permanently invalid 
to avoid re-downloading and re-validating it.

This I understood, but I think you misunderstood me. Your point was 
specifically that, "it would let node implementations cache block failures 
at an earlier stage of validation." Since you have not addressed that 
aspect I assume you agree with my assertion above that the proposed rule 
does not actually achieve this.

Regarding the question of checking coinbase size, the issue is of detecting 
(or preventing) hashes mallied via the 64 byte tx technique. A rule against 
64 byte txs would allow this determination by checking the coinbase alone. 
If the coinbase is 64 bytes the block is invalid, if it is not the block 
hash cannot have been mallied (all txs must have been 64 bytes, see 
previous reference).

In that case if the block is invalid the invalidity can be cached. But 
block invalidity cannot actually be cached until the block is fully 
validated. A rule to prohibit *all* 64 byte txs is counterproductive as it 
only adds additional checks on typically thousands of txs per block, 
serving no purpose.

>> It seems to me that introducing an arbitrary tx size validity may create 
more potential implementation bugs than it resolves.
> The potential for implementation bugs is a fair point to raise, but in 
this case i don't think it's a big concern. Verifying no transaction in a 
block is 64 bytes is as simple a check as you can get.

You appear to be making the assumption that the check is performed after 
the block is fully parsed (contrary to your "earlier" criterion above). The 
only way to determine the tx sizes is to parse each tx for witness marker, 
input count, output count, input script sizes, output script sizes, witness 
sizes, and skipping over the header, several constants, and associated 
buffers. Doing this "early" to detect malleation is an extraordinarily 
complex and costly process. On the other hand, as I pointed out, a rational 
implementation would only do this early check for the coinbase.

Yet even determining the size of the coinbase is significantly more complex 
and costly than checking its first input point against null. That check 
(which is already necessary for validation) resolves the malleation 
question, can be performed on the raw unparsed block buffer by simply 
skipping header, version, reading input count and witness marker as 
necessary, offsetting to the 36 byte point buffer, and performing a byte 
comparison against 
[0000000000000000000000000000000000000000000000000000000000000000ffffffff].

This is:

(1) earlier
(2) faster
(3) simpler
(4) already consensus

>> And certainly anyone implementing such a verifier must know many 
intricacies of the protocol.
> They need to know some, but i don't think it's reasonable to expect them 
to realize the merkle tree construction is such that an inner node may be 
confused with a 64 bytes transaction.

A protocol developer needs to understand that the hash of an invalid block 
cannot be cached unless at least the coinbase has been restricted in size 
(under the proposal) -or- that the coinbase is a null point (presently or 
under the proposal). In the latter case the check is already performed in 
validation, so there is no way a block would presently be cached as invalid 
without checking it. The proposal adds a redundant check, even if limited 
to just the coinbase. [He must also understand the second type of 
malleability, discussed below.]

If this proposed rule was to activate we would implement it in a late stage 
tx.check, after txs/blocks had been fully deserialized. We would not check 
it an all in the case where the block is under checkpoint or milestone 
("assume valid"). In this case we would retain the early null point 
malleation check (along with the hash duplication malleation check) that we 
presently have, would validate tx commitments, and commit the block. In 
other words, the proposal adds unnecessary late stage checks only. 
Implementing it otherwise would just add complexity and hurt performance.

>> I do not see this. I see a very ugly perpetual seam which will likely 
result in unexpected complexities over time.
> What makes you think making 64 bytes transactions invalid could result in 
unexpected complexities? And why do you think it's likely?

As described above, it's later, slower, more complex, unnecessarily broad, 
and a consensus change. Beyond that it creates an arbitrary size limit - 
not a lower or upper bound, but a slice out of the domain. Discontinuities 
are inherent complexities in computing. The "unexpected" part speaks for 
itself.

>> This does not produce unmalleable block hashes. Duplicate tx hash 
malleation remains in either case, to the same effect. Without a resolution 
to both issues this is an empty promise.
> Duplicate txids have been invalid since 2012 (CVE-2012-2459).

I think again here you may have misunderstood me. I was not making a point 
pertaining to BIP30. I was referring to the other form of block hash 
malleability, which results from duplicating sets of trailing txs in a 
single block (see previous reference). This malleation vector remains, even 
with invalid 64 byte txs. As I pointed out, this has the "same effect" as 
the 64 byte tx issue. Merkle hashing the set of txs is insufficient to 
determine identity. In one case the coinbase must be checked (null point or 
size) and in the other case the set of tx hashes must be checked for 
trailing duplicated sets. [Core performs this second check within the 
Merkle hashing algorithm (with far more comparisons than necessary), though 
this can be performed earlier and independently to avoid any hashing in the 
malleation case.]

I would also point out in the interest of correctness that Core reverted 
its BIP30 soft fork implementation as a consequence of the BIP90 hard fork, 
following and requiring the BIP34 soft fork that presumably precluded it 
but didn't, so it is no longer the case that duplicate tx hashes are 
invalid in implementation. As you have proposed in this rollup, this 
requires fixing again.

> If 64 bytes transactions are also made invalid, this would make it 
impossible for two valid blocks to have the same hash.

Aside from the BIP30/34/90 issue addressed above, it is already 
"impossible" (cannot be stronger than computationally infeasible) for two 
*valid* blocks to have the same hash. The proposal does not enable that 
objective, it is already the case. No malleated block is a valid block.

The proposal aims only to make it earlier or easier or faster to check for 
block hash malleation. And as I've pointed out above, it doesn't achieve 
those objectives. Possibly the perception that this would be the case is a 
consequence of implementation details, but as I have shown above, it is not 
in fact the case.

Given either type of malleation, the malleated block can be determined to 
be invalid by a context free check. But this knowledge cannot ever be 
cached against the block hash, since the same hash may be valid. Invalidity 
can only be cached once a non-mallied block is validated and determined to 
be invalid. Block hash malleations are and will remain invalid blocks with 
or without the proposal, and it will continue to be necessary to avoid 
caching invalid against the malleation. As you said:

> it was about being able to cache the hash of a (non-malleated) invalid 
block as permanently invalid to avoid re-downloading and re-validating it.

This is already the case, and requires validating the full non-malleated 
block. Adding a redundant invalidity check doesn't improve this in any way.

Best,
Eric

-- 
You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups•com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/9a4c4151-36ed-425a-a535-aa2837919a04n%40googlegroups.com.

[-- Attachment #1.2: Type: text/html, Size: 8800 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [bitcoindev] Re: Great Consensus Cleanup Revival
  2024-06-28 17:14             ` Eric Voskuil
@ 2024-06-29  1:06               ` Antoine Riard
  2024-06-29  1:31                 ` Eric Voskuil
  2024-07-02 10:23               ` 'Antoine Poinsot' via Bitcoin Development Mailing List
  1 sibling, 1 reply; 33+ messages in thread
From: Antoine Riard @ 2024-06-29  1:06 UTC (permalink / raw)
  To: Bitcoin Development Mailing List


[-- Attachment #1.1: Type: text/plain, Size: 10702 bytes --]

Hi Eric,

> It is not clear to me how determining the coinbase size can be done at an 
earlier stage of validation than
> detection of the non-null coinbase. The former requires parsing the 
coinbase to determine its size, the latter
> requires parsing it to know if the point is null. Both of these can be 
performed as early as immediately following the socket read.

If you have code in pure C with variables on the stack no malloc, doing a 
check of the coinbase size after the socket
read can be certainly more robust than checking a non-null pointer. And 
note the attacking game we're solving is a peer
passing a sequence of malleated blocks for which the headers have been 
already verified, so there we can only have weaker
assumptions on the computational infeasibility.

Introducing a discontinuity like ensuring that both leaf / non-leaf merkle 
tree nodes are belonging to different domains
can be obviously a source of additional software complexity, however from a 
security perspective discontinuities if they're
computational asymmetries at the advantage of validating nodes I think they 
can be worthy of considerations for soft-fork extensions.

After looking on the proposed implementation in bitcoin inquisition, I 
think this is correct that the efficiency
of the 64 byte technique transaction to check full block malleability is 
very implementation dependent. Sadly, I
cannot think about other directions to alleviate this dependence on the 
ordering of the block validation checks
from socket read.

In my reasonable opinion, it would be more constructive to come out with a 
full-fleshout "fast block malleability
validation" algorithm in the sense of SipHash (-- and see to have this 
implemented and benchmarked in core) before to
consider more the 64 byte transaction invalidity at the consensus level.

Best,
Antoine (the other one).

Le vendredi 28 juin 2024 à 19:49:39 UTC+1, Eric Voskuil a écrit :

> >> It is not clear to me how determining the coinbase size can be done at 
> an earlier stage of validation than detection of the non-null coinbase.
> > My point wasn't about checking the coinbase size, it was about being 
> able to cache the hash of a (non-malleated) invalid block as permanently 
> invalid to avoid re-downloading and re-validating it.
>
> This I understood, but I think you misunderstood me. Your point was 
> specifically that, "it would let node implementations cache block failures 
> at an earlier stage of validation." Since you have not addressed that 
> aspect I assume you agree with my assertion above that the proposed rule 
> does not actually achieve this.
>
> Regarding the question of checking coinbase size, the issue is of 
> detecting (or preventing) hashes mallied via the 64 byte tx technique. A 
> rule against 64 byte txs would allow this determination by checking the 
> coinbase alone. If the coinbase is 64 bytes the block is invalid, if it is 
> not the block hash cannot have been mallied (all txs must have been 64 
> bytes, see previous reference).
>
> In that case if the block is invalid the invalidity can be cached. But 
> block invalidity cannot actually be cached until the block is fully 
> validated. A rule to prohibit *all* 64 byte txs is counterproductive as it 
> only adds additional checks on typically thousands of txs per block, 
> serving no purpose.
>
> >> It seems to me that introducing an arbitrary tx size validity may 
> create more potential implementation bugs than it resolves.
> > The potential for implementation bugs is a fair point to raise, but in 
> this case i don't think it's a big concern. Verifying no transaction in a 
> block is 64 bytes is as simple a check as you can get.
>
> You appear to be making the assumption that the check is performed after 
> the block is fully parsed (contrary to your "earlier" criterion above). The 
> only way to determine the tx sizes is to parse each tx for witness marker, 
> input count, output count, input script sizes, output script sizes, witness 
> sizes, and skipping over the header, several constants, and associated 
> buffers. Doing this "early" to detect malleation is an extraordinarily 
> complex and costly process. On the other hand, as I pointed out, a rational 
> implementation would only do this early check for the coinbase.
>
> Yet even determining the size of the coinbase is significantly more 
> complex and costly than checking its first input point against null. That 
> check (which is already necessary for validation) resolves the malleation 
> question, can be performed on the raw unparsed block buffer by simply 
> skipping header, version, reading input count and witness marker as 
> necessary, offsetting to the 36 byte point buffer, and performing a byte 
> comparison against 
> [0000000000000000000000000000000000000000000000000000000000000000ffffffff].
>
> This is:
>
> (1) earlier
> (2) faster
> (3) simpler
> (4) already consensus
>
> >> And certainly anyone implementing such a verifier must know many 
> intricacies of the protocol.
> > They need to know some, but i don't think it's reasonable to expect them 
> to realize the merkle tree construction is such that an inner node may be 
> confused with a 64 bytes transaction.
>
> A protocol developer needs to understand that the hash of an invalid block 
> cannot be cached unless at least the coinbase has been restricted in size 
> (under the proposal) -or- that the coinbase is a null point (presently or 
> under the proposal). In the latter case the check is already performed in 
> validation, so there is no way a block would presently be cached as invalid 
> without checking it. The proposal adds a redundant check, even if limited 
> to just the coinbase. [He must also understand the second type of 
> malleability, discussed below.]
>
> If this proposed rule was to activate we would implement it in a late 
> stage tx.check, after txs/blocks had been fully deserialized. We would not 
> check it an all in the case where the block is under checkpoint or 
> milestone ("assume valid"). In this case we would retain the early null 
> point malleation check (along with the hash duplication malleation check) 
> that we presently have, would validate tx commitments, and commit the 
> block. In other words, the proposal adds unnecessary late stage checks 
> only. Implementing it otherwise would just add complexity and hurt 
> performance.
>
> >> I do not see this. I see a very ugly perpetual seam which will likely 
> result in unexpected complexities over time.
> > What makes you think making 64 bytes transactions invalid could result 
> in unexpected complexities? And why do you think it's likely?
>
> As described above, it's later, slower, more complex, unnecessarily broad, 
> and a consensus change. Beyond that it creates an arbitrary size limit - 
> not a lower or upper bound, but a slice out of the domain. Discontinuities 
> are inherent complexities in computing. The "unexpected" part speaks for 
> itself.
>
> >> This does not produce unmalleable block hashes. Duplicate tx hash 
> malleation remains in either case, to the same effect. Without a resolution 
> to both issues this is an empty promise.
> > Duplicate txids have been invalid since 2012 (CVE-2012-2459).
>
> I think again here you may have misunderstood me. I was not making a point 
> pertaining to BIP30. I was referring to the other form of block hash 
> malleability, which results from duplicating sets of trailing txs in a 
> single block (see previous reference). This malleation vector remains, even 
> with invalid 64 byte txs. As I pointed out, this has the "same effect" as 
> the 64 byte tx issue. Merkle hashing the set of txs is insufficient to 
> determine identity. In one case the coinbase must be checked (null point or 
> size) and in the other case the set of tx hashes must be checked for 
> trailing duplicated sets. [Core performs this second check within the 
> Merkle hashing algorithm (with far more comparisons than necessary), though 
> this can be performed earlier and independently to avoid any hashing in the 
> malleation case.]
>
> I would also point out in the interest of correctness that Core reverted 
> its BIP30 soft fork implementation as a consequence of the BIP90 hard fork, 
> following and requiring the BIP34 soft fork that presumably precluded it 
> but didn't, so it is no longer the case that duplicate tx hashes are 
> invalid in implementation. As you have proposed in this rollup, this 
> requires fixing again.
>
> > If 64 bytes transactions are also made invalid, this would make it 
> impossible for two valid blocks to have the same hash.
>
> Aside from the BIP30/34/90 issue addressed above, it is already 
> "impossible" (cannot be stronger than computationally infeasible) for two 
> *valid* blocks to have the same hash. The proposal does not enable that 
> objective, it is already the case. No malleated block is a valid block.
>
> The proposal aims only to make it earlier or easier or faster to check for 
> block hash malleation. And as I've pointed out above, it doesn't achieve 
> those objectives. Possibly the perception that this would be the case is a 
> consequence of implementation details, but as I have shown above, it is not 
> in fact the case.
>
> Given either type of malleation, the malleated block can be determined to 
> be invalid by a context free check. But this knowledge cannot ever be 
> cached against the block hash, since the same hash may be valid. Invalidity 
> can only be cached once a non-mallied block is validated and determined to 
> be invalid. Block hash malleations are and will remain invalid blocks with 
> or without the proposal, and it will continue to be necessary to avoid 
> caching invalid against the malleation. As you said:
>
> > it was about being able to cache the hash of a (non-malleated) invalid 
> block as permanently invalid to avoid re-downloading and re-validating it.
>
> This is already the case, and requires validating the full non-malleated 
> block. Adding a redundant invalidity check doesn't improve this in any way.
>
> Best,
> Eric

-- 
You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups•com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/3f0064f9-54bd-46a7-9d9a-c54b99aca7b2n%40googlegroups.com.

[-- Attachment #1.2: Type: text/html, Size: 11117 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [bitcoindev] Re: Great Consensus Cleanup Revival
  2024-06-29  1:06               ` Antoine Riard
@ 2024-06-29  1:31                 ` Eric Voskuil
  2024-06-29  1:53                   ` Antoine Riard
  0 siblings, 1 reply; 33+ messages in thread
From: Eric Voskuil @ 2024-06-29  1:31 UTC (permalink / raw)
  To: Bitcoin Development Mailing List


[-- Attachment #1.1: Type: text/plain, Size: 831 bytes --]

Hello Antoine (other),

>  If you have code in pure C with variables on the stack no malloc, doing 
a check of the coinbase size after the socket read can be certainly more 
robust than checking a non-null pointer. 

Can you please clarify this for me? When you say "non-null pointer" do you 
mean C pointer or transaction input "null point" (sequence of 32 repeating 
0x00 bytes and 4 0xff)? What do you mean by "more robust"?

Thanks,
Eric

-- 
You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups•com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/26b7321b-cc64-44b9-bc95-a4d8feb701e5n%40googlegroups.com.

[-- Attachment #1.2: Type: text/html, Size: 1145 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [bitcoindev] Re: Great Consensus Cleanup Revival
  2024-06-29  1:31                 ` Eric Voskuil
@ 2024-06-29  1:53                   ` Antoine Riard
  2024-06-29 20:29                     ` Eric Voskuil
  0 siblings, 1 reply; 33+ messages in thread
From: Antoine Riard @ 2024-06-29  1:53 UTC (permalink / raw)
  To: Eric Voskuil; +Cc: Bitcoin Development Mailing List

[-- Attachment #1: Type: text/plain, Size: 2041 bytes --]

Hi Eric,

I meant C pointer and by "more robust" any kind of memory / CPU DoS arising
due to memory management (e.g. hypothetical rule checking the 64 bytes size
for all block transactions).

In my understanding, the validation logic equivalent of core's CheckBlock
is libbitcoin's block::check():
https://github.com/libbitcoin/libbitcoin-system/blob/master/src/chain/block.cpp#L751

Best,
Antoine

Le sam. 29 juin 2024 à 02:33, Eric Voskuil <eric@voskuil•org> a écrit :

> Hello Antoine (other),
>
> >  If you have code in pure C with variables on the stack no malloc, doing
> a check of the coinbase size after the socket read can be certainly more
> robust than checking a non-null pointer.
>
> Can you please clarify this for me? When you say "non-null pointer" do you
> mean C pointer or transaction input "null point" (sequence of 32 repeating
> 0x00 bytes and 4 0xff)? What do you mean by "more robust"?
>
> Thanks,
> Eric
>
> --
> You received this message because you are subscribed to a topic in the
> Google Groups "Bitcoin Development Mailing List" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/bitcoindev/CAfm7D5ppjo/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to
> bitcoindev+unsubscribe@googlegroups•com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/bitcoindev/26b7321b-cc64-44b9-bc95-a4d8feb701e5n%40googlegroups.com
> <https://groups.google.com/d/msgid/bitcoindev/26b7321b-cc64-44b9-bc95-a4d8feb701e5n%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>

-- 
You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups•com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/CALZpt%2BEwVyaz1%3DA6hOOycqFGJs%2BzxyYYocZixTJgVmzZezUs9Q%40mail.gmail.com.

[-- Attachment #2: Type: text/html, Size: 3091 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [bitcoindev] Re: Great Consensus Cleanup Revival
  2024-06-29  1:53                   ` Antoine Riard
@ 2024-06-29 20:29                     ` Eric Voskuil
  2024-06-29 20:40                       ` Eric Voskuil
  0 siblings, 1 reply; 33+ messages in thread
From: Eric Voskuil @ 2024-06-29 20:29 UTC (permalink / raw)
  To: Bitcoin Development Mailing List

[-- Attachment #1.1: Type: text/plain, Size: 1578 bytes --]

> I meant C pointer and by "more robust" any kind of memory / CPU DoS 
arising due to memory management (e.g. hypothetical rule checking the 64 
bytes size for all block transactions).

Ok, thanks for clarifying. I'm still not making the connection to "checking 
a non-null [C] pointer" but that's prob on me.

> In my understanding, the validation logic equivalent of core's CheckBlock 
is libbitcoin's block::check(): 
https://github.com/libbitcoin/libbitcoin-system/blob/master/src/chain/block.cpp#L751

Yes, a rough correlation but not necessarily equivalence. Note that 
block.check has context free and contextual overrides.

The 'bypass' parameter indicates a block under checkpoint or milestone 
("assume valid"). In this case we must check Merkle root, witness 
commitment, and both types of malleation - as the purpose is to establish 
identity. Absent 'bypass' the typical checks are performed, and therefore a 
malleation check is not required here. The "type64" malleation is subsumed 
by the is_first_non_coinbase check and the "type32" malleation is subsumed 
by the is_internal_double_spend check.

I have some other thoughts on this that I'll post separately.

Best,
Eric

-- 
You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups•com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/607a2233-ac12-4a80-ae4a-08341b3549b3n%40googlegroups.com.

[-- Attachment #1.2: Type: text/html, Size: 1906 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [bitcoindev] Re: Great Consensus Cleanup Revival
  2024-06-29 20:29                     ` Eric Voskuil
@ 2024-06-29 20:40                       ` Eric Voskuil
  2024-07-02  2:36                         ` Antoine Riard
  0 siblings, 1 reply; 33+ messages in thread
From: Eric Voskuil @ 2024-06-29 20:40 UTC (permalink / raw)
  To: Bitcoin Development Mailing List

[-- Attachment #1.1: Type: text/plain, Size: 5861 bytes --]

Caching identity in the case of invalidity is more interesting question 
than it might seem.

Background: A fully-validated block has established identity in its block 
hash. However an invalid block message may include the same block header, 
producing the same hash, but with any kind of nonsense following the 
header. The purpose of the transaction and witness commitments is of course 
to establish this identity, so these two checks are therefore necessary 
even under checkpoint/milestone. And then of course the two Merkle tree 
issues complicate the tx commitment (the integrity of the witness 
commitment is assured by that of the tx commitment).

So what does it mean to speak of a block hash derived from:

(1) a block message with an unparseable header?
(2) a block message with parseable but invalid header?
(3) a block message with valid header but unparseable tx data?
(4) a block message with valid header but parseable invalid uncommitted tx 
data?
(5) a block message with valid header but parseable invalid malleated 
committed tx data?
(6) a block message with valid header but parseable invalid unmalleated 
committed tx data?
(7) a block message with valid header but uncommitted valid tx data?
(8) a block message with valid header but malleated committed valid tx data?
(9) a block message with valid header but unmalleated committed valid tx 
data?

Note that only the #9 p2p block message contains an actual Bitcoin block, 
the others are bogus messages. In all cases the message can be sha256 
hashed to establish the identity of the *message*. And if one's objective 
is to reject repeating bogus messages, this might be a useful strategy. 
It's already part of the p2p protocol, is orders of magnitude cheaper to 
produce than a Merkle root, and has no identity issues.

The concept of Bitcoin block hash as unique identifier for invalid p2p 
block messages is problematic. Apart from the malleation question, what is 
the Bitcoin block hash for a message with unparseable data (#1 and #3)? 
Such messages are trivial to produce and have no block hash. What is the 
useful identifier for a block with malleated commitments (#5 and #8) or 
invalid commitments (#4 and #7) - valid txs or otherwise?

The stated objective for a consensus rule to invalidate all 64 byte txs is:

> being able to cache the hash of a (non-malleated) invalid block as 
permanently invalid to avoid re-downloading and re-validating it.

This seems reasonable at first glance, but given the list of scenarios 
above, which does it apply to? Presumably the invalid header (#2) doesn't 
get this far because of headers-first. That leaves just invalid blocks with 
useful block hash identifiers (#6). In all other cases the message is 
simply discarded. In this case the attempt is to move category #5 into 
category #6 by prohibiting 64 byte txs.

The requirement to "avoid re-downloading and re-validating it" is about 
performance, presumably minimizing initial block download/catch-up time. 
There is a computational cost to producing 64 byte malleations and none for 
any of the other bogus block message categories above, including the other 
form of malleation. Furthermore, 64 byte malleation has almost zero cost to 
preclude. No hashing and not even true header or tx parsing are required. 
Only a handful of bytes must be read from the raw message before it can be 
discarded presently.

That's actually far cheaper than any of the other scenarios that again, 
have no cost to produce. The other type of malleation requires parsing all 
of the txs in the block and hashing and comparing some or all of them. In 
other words, if there is an attack scenario, that must be addressed before 
this can be meaningful. In fact all of the other bogus message scenarios 
(with tx data) will remain more expensive to discard than this one.

The problem arises from trying to optimize dismissal by storing an 
identifier. Just *producing* the identifier is orders of magnitude more 
costly than simply dismissing this bogus message. I can't imagine why any 
implementation would want to compute and store and retrieve and recompute 
and compare hashes when the alterative is just dismissing the bogus 
messages with no hashing at all.

Bogus messages will arrive, they do not even have to be requested. The 
simplest are dealt with by parse failure. What defines a parse is entirely 
subjective. Generally it's "structural" but nothing precludes incorporating 
a requirement for a necessary leading pattern in the stream, sort of like 
how the witness pattern is identified. If we were going to prioritize early 
dismissal this is where we would put it.

However, there is a tradeoff in terms of early dismissal. Looking up 
invalid hashes is a costly tradeoff, which becomes multiplied by every 
block validated. For example, expending 1 millisecond in hash/lookup to 
save 1 second of validation time in the failure case seems like a 
reasonable tradeoff, until you multiply across the whole chain. 1 ms 
becomes 14 minutes across the chain, just to save a second for each mallied 
block encountered. That means you need to have encountered 840 such mallied 
blocks just to break even. Early dismissing the block for non-null coinbase 
point (without hashing anything) would be on the order of 1000x faster than 
that (breakeven at 1 encounter). So why the block hash cache requirement? 
It cannot be applied to many scenarios, and cannot be optimal in this one.

Eric

-- 
You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups•com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/3dceca4d-03a8-44f3-be64-396702247fadn%40googlegroups.com.

[-- Attachment #1.2: Type: text/html, Size: 6253 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [bitcoindev] Re: Great Consensus Cleanup Revival
  2024-06-29 20:40                       ` Eric Voskuil
@ 2024-07-02  2:36                         ` Antoine Riard
  2024-07-03  1:07                           ` Larry Ruane
  2024-07-03  1:13                           ` Eric Voskuil
  0 siblings, 2 replies; 33+ messages in thread
From: Antoine Riard @ 2024-07-02  2:36 UTC (permalink / raw)
  To: Bitcoin Development Mailing List


[-- Attachment #1.1: Type: text/plain, Size: 16921 bytes --]

Hi Eric,

> Ok, thanks for clarifying. I'm still not making the connection to 
"checking a non-null [C] pointer" but that's prob on me.

A C pointer, which is a language idiome assigning to a memory address A the 
value o memory address B can be 0 (or NULL a standard macro defined in 
stddef.h).

Here a snippet example of linked list code checking the pointer 
(`*begin_list`) is non null before the comparison operation to find the 
target element list.

```
pointer_t       ft_list_find(pointer_t **start_list, void *data_ref, int 
(*cmp)())
{
        while (*start_list)
        {
                if (cmp((*start_list)->data, data_ref) == 0)
                        return (*start_list);
                *start_list = (*start_list)->next;
        }
        return (0);
}
```

While both libbitcoin and bitcoin core are both written in c++, you still 
have underlying pointer derefencing playing out to access the coinbase
transaction, and all underlying implications in terms of memory management.

> Yes, a rough correlation but not necessarily equivalence. Note that 
block.check has context free and contextual overrides.
> 
> The 'bypass' parameter indicates a block under checkpoint or milestone 
("assume valid"). In this case we must check Merkle root, witness 
commitment, and both types of malleation - as the purpose is to establish 
identity. Absent 'bypass' the typical checks are performed, and therefore a 
malleation check is not required here. The "type64" malleation is subsumed 
by the is_first_non_coinbase check and the "type32" malleation is subsumed 
by the is_internal_double_spend check.

Yes, I understand it's not a 1-to-1 compatibility, just a rough logical 
equivalence.

I think it's interesting to point out the two types of malleation that a 
bitcoin consensus validation logic should respect w.r.t block validity 
checks.

Like you said the first one on the merkle root committed in the headers's 
`hashMerkleRoot` due to the lack of domain separation between leaf and 
merkle tree nodes.
The second one is the bip141 wtxid commitment in one of the coinbase 
transaction `scriptpubkey` output, which is itself covered by a txid in the 
merkle tree.

> Caching identity in the case of invalidity is more interesting question 
than it might seem.
> 
> Background: A fully-validated block has established identity in its block 
hash. However an invalid block message may include the same block header, 
producing the same hash, but with any kind of nonsense following the 
header. The purpose of the transaction and witness commitments is of course 
to establish this identity, so these two checks are therefore necessary 
even under checkpoint/milestone. And then of course the two Merkle tree 
issues complicate the tx commitment (the integrity of the witness 
commitment is assured by that of the tx commitment).
> 
> So what does it mean to speak of a block hash derived from:
> 
> (1) a block message with an unparseable header?
> (2) a block message with parseable but invalid header?
> (3) a block message with valid header but unparseable tx data?
> (4) a block message with valid header but parseable invalid uncommitted 
tx data?
> (5) a block message with valid header but parseable invalid malleated 
committed tx data?
> (6) a block message with valid header but parseable invalid unmalleated 
committed tx data?
> (7) a block message with valid header but uncommitted valid tx data?
> (8) a block message with valid header but malleated committed valid tx 
data?
> (9) a block message with valid header but unmalleated committed valid tx 
data?
> 
> Note that only the #9 p2p block message contains an actual Bitcoin block, 
the others are bogus messages. In all cases the message can be sha256 
hashed to establish the identity of the *message*. And if one's objective 
is to reject repeating bogus messages, this might be a useful strategy. 
It's already part of the p2p protocol, is orders of magnitude cheaper to 
produce than a Merkle root, and has no identity issues.

I think I mostly agree with the identity issue as laid out so far, there is 
one caveat to add if you're considering identity caching as the problem 
solved.
A validation node might have to consider differently block messages 
processed if they connect on the longest most PoW valid chain for which all 
blocks have been validated. Or alternatively if they have to be added on a 
candidate longest most PoW valid chain.

> The concept of Bitcoin block hash as unique identifier for invalid p2p 
block messages is problematic. Apart from the malleation question, what is 
the Bitcoin block
> hash for a message with unparseable data (#1 and #3)? Such messages are 
trivial to produce and have no block hash.

For reasons, bitcoin core has the concept of outbound `BLOCK_RELAY` (in 
`src/node/connection_types.h`) where some preferential peering policy is 
applied in matters of block messages download.

> What is the useful identifier for a block with malleated commitments (#5 
and #8) or invalid commitments (#4 and #7) - valid txs or otherwise?

The block header, as it commits to the transaction identifier tree can be 
useful as much for #4 and #5. On the bitcoin core side, about #7 the 
uncommitted valid tx data can be already present in the validation cache 
from mempool acceptance. About #8, the malleaed committed valid 
transactions shall be also committed in the merkle root in headers.

> This seems reasonable at first glance, but given the list of scenarios 
above, which does it apply to?

> This seems reasonable at first glance, but given the list of scenarios 
above, which does it apply to? Presumably the invalid header (#2) doesn't 
get this far because of headers-first.
> That leaves just invalid blocks with useful block hash identifiers (#6). 
In all other cases the message is simply discarded. In this case the 
attempt is to move category #5 into category #6 by prohibiting 64 byte txs.

Yes, it's moving from the category #5 to the category #6. Note, transaction 
malleability can be a distinct issue than lack of domain separation.

> The requirement to "avoid re-downloading and re-validating it" is about 
performance, presumably minimizing initial block download/catch-up time. 
There is a > computational cost to producing 64 byte malleations and none 
for any of the other bogus block message categories above, including the 
other form of malleation. > Furthermore, 64 byte malleation has almost zero 
cost to preclude. No hashing and not even true header or tx parsing are 
required. Only a handful of bytes must be read > from the raw message 
before it can be discarded presently.

> That's actually far cheaper than any of the other scenarios that again, 
have no cost to produce. The other type of malleation requires parsing all 
of the txs in the block and > hashing and comparing some or all of them. In 
other words, if there is an attack scenario, that must be addressed before 
this can be meaningful. In fact all of the other
> bogus message scenarios (with tx data) will remain more expensive to 
discard than this one.

In practice on the bitcoin core side, the bogus block message categories 
from #4 to #6 are already mitigated by validation caching for transactions 
that have been received early. While libbitcoin has no mempool (at least in 
earlier versions) transactions buffering can be done by bip152's 
HeadersAndShortIds message.

About #7 and #8, introducing a domain separation where 64 bytes 
transactions are rejected and making it harder to exploit #7 and #8 
categories of bogus block messages.
This is correct that bitcoin core might accept valid transaction data 
before the merkle tree commitment has been verified.

> The problem arises from trying to optimize dismissal by storing an 
identifier. Just *producing* the identifier is orders of magnitude more 
costly than simply dismissing this > bogus message. I can't imagine why any 
implementation would want to compute and store and retrieve and recompute 
and compare hashes when the alterative is just
> dismissing the bogus messages with no hashing at all.

> Bogus messages will arrive, they do not even have to be requested. The 
simplest are dealt with by parse failure. What defines a parse is entirely 
subjective. Generally it's
> "structural" but nothing precludes incorporating a requirement for a 
necessary leading pattern in the stream, sort of like how the witness 
pattern is identified. If we were
> going to prioritize early dismissal this is where we would put it.

I don't think this is that simple - While producing an identifier comes 
with a computational cost (e.g fixed 64-byte structured coinbase 
transaction), if the full node have a hierarchy of validation cache like 
bitcoin core has already, the cost of bogus block messages can be slashed 
down. On the other hand, just dealing with parse failure on the spot by 
introducing a leading pattern in the stream just inflates the size of p2p 
messages, and the transaction-relay bandwidth cost.

> However, there is a tradeoff in terms of early dismissal. Looking up 
invalid hashes is a costly tradeoff, which becomes multiplied by every 
block validated. For example,
> expending 1 millisecond in hash/lookup to save 1 second of validation 
time in the failure case seems like a reasonable tradeoff, until you 
multiply across the whole chain. > 1 ms becomes 14 minutes across the 
chain, just to save a second for each mallied block encountered. That means 
you need to have encountered 840 such mallied blocks > just to break even. 
Early dismissing the block for non-null coinbase point (without hashing 
anything) would be on the order of 1000x faster than that (breakeven at 1 > 
encounter). So why the block hash cache requirement? It cannot be applied 
to many scenarios, and cannot be optimal in this one.

I think what you're describing is more a classic time-space tradeoff which 
is well-known in classic computer science litterature. In my reasonable 
opinion, one should more reason under what is the security paradigm we wish 
for bitcoin block-relay network and perduring decentralization, i.e one 
where it's easy to verify block messages proofs which could have been 
generated on specialized hardware with an asymmetric cost. Obviously 
encountering 840 such malliead blocks to make it break even doesn't make 
the math up to save on hash lookup, unless you can reduce the attack 
scenario in terms of adversaries capabilities.

Best,
Antoine 
Le samedi 29 juin 2024 à 21:42:23 UTC+1, Eric Voskuil a écrit :

> Caching identity in the case of invalidity is more interesting question 
> than it might seem.
>
> Background: A fully-validated block has established identity in its block 
> hash. However an invalid block message may include the same block header, 
> producing the same hash, but with any kind of nonsense following the 
> header. The purpose of the transaction and witness commitments is of course 
> to establish this identity, so these two checks are therefore necessary 
> even under checkpoint/milestone. And then of course the two Merkle tree 
> issues complicate the tx commitment (the integrity of the witness 
> commitment is assured by that of the tx commitment).
>
> So what does it mean to speak of a block hash derived from:
>
> (1) a block message with an unparseable header?
> (2) a block message with parseable but invalid header?
> (3) a block message with valid header but unparseable tx data?
> (4) a block message with valid header but parseable invalid uncommitted tx 
> data?
> (5) a block message with valid header but parseable invalid malleated 
> committed tx data?
> (6) a block message with valid header but parseable invalid unmalleated 
> committed tx data?
> (7) a block message with valid header but uncommitted valid tx data?
> (8) a block message with valid header but malleated committed valid tx 
> data?
> (9) a block message with valid header but unmalleated committed valid tx 
> data?
>
> Note that only the #9 p2p block message contains an actual Bitcoin block, 
> the others are bogus messages. In all cases the message can be sha256 
> hashed to establish the identity of the *message*. And if one's objective 
> is to reject repeating bogus messages, this might be a useful strategy. 
> It's already part of the p2p protocol, is orders of magnitude cheaper to 
> produce than a Merkle root, and has no identity issues.
>
> The concept of Bitcoin block hash as unique identifier for invalid p2p 
> block messages is problematic. Apart from the malleation question, what is 
> the Bitcoin block hash for a message with unparseable data (#1 and #3)? 
> Such messages are trivial to produce and have no block hash. What is the 
> useful identifier for a block with malleated commitments (#5 and #8) or 
> invalid commitments (#4 and #7) - valid txs or otherwise?
>
> The stated objective for a consensus rule to invalidate all 64 byte txs is:
>
> > being able to cache the hash of a (non-malleated) invalid block as 
> permanently invalid to avoid re-downloading and re-validating it.
>
> This seems reasonable at first glance, but given the list of scenarios 
> above, which does it apply to? Presumably the invalid header (#2) doesn't 
> get this far because of headers-first. That leaves just invalid blocks with 
> useful block hash identifiers (#6). In all other cases the message is 
> simply discarded. In this case the attempt is to move category #5 into 
> category #6 by prohibiting 64 byte txs.
>
> The requirement to "avoid re-downloading and re-validating it" is about 
> performance, presumably minimizing initial block download/catch-up time. 
> There is a computational cost to producing 64 byte malleations and none for 
> any of the other bogus block message categories above, including the other 
> form of malleation. Furthermore, 64 byte malleation has almost zero cost to 
> preclude. No hashing and not even true header or tx parsing are required. 
> Only a handful of bytes must be read from the raw message before it can be 
> discarded presently.
>
> That's actually far cheaper than any of the other scenarios that again, 
> have no cost to produce. The other type of malleation requires parsing all 
> of the txs in the block and hashing and comparing some or all of them. In 
> other words, if there is an attack scenario, that must be addressed before 
> this can be meaningful. In fact all of the other bogus message scenarios 
> (with tx data) will remain more expensive to discard than this one.
>
> The problem arises from trying to optimize dismissal by storing an 
> identifier. Just *producing* the identifier is orders of magnitude more 
> costly than simply dismissing this bogus message. I can't imagine why any 
> implementation would want to compute and store and retrieve and recompute 
> and compare hashes when the alterative is just dismissing the bogus 
> messages with no hashing at all.
>
> Bogus messages will arrive, they do not even have to be requested. The 
> simplest are dealt with by parse failure. What defines a parse is entirely 
> subjective. Generally it's "structural" but nothing precludes incorporating 
> a requirement for a necessary leading pattern in the stream, sort of like 
> how the witness pattern is identified. If we were going to prioritize early 
> dismissal this is where we would put it.
>
> However, there is a tradeoff in terms of early dismissal. Looking up 
> invalid hashes is a costly tradeoff, which becomes multiplied by every 
> block validated. For example, expending 1 millisecond in hash/lookup to 
> save 1 second of validation time in the failure case seems like a 
> reasonable tradeoff, until you multiply across the whole chain. 1 ms 
> becomes 14 minutes across the chain, just to save a second for each mallied 
> block encountered. That means you need to have encountered 840 such mallied 
> blocks just to break even. Early dismissing the block for non-null coinbase 
> point (without hashing anything) would be on the order of 1000x faster than 
> that (breakeven at 1 encounter). So why the block hash cache requirement? 
> It cannot be applied to many scenarios, and cannot be optimal in this one.
>
> Eric
>

-- 
You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups•com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/301c64c7-0f0f-476a-90c4-913659477276n%40googlegroups.com.

[-- Attachment #1.2: Type: text/html, Size: 17684 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [bitcoindev] Re: Great Consensus Cleanup Revival
  2024-07-02  2:36                         ` Antoine Riard
@ 2024-07-03  1:07                           ` Larry Ruane
  2024-07-03 23:29                             ` Eric Voskuil
  2024-07-03  1:13                           ` Eric Voskuil
  1 sibling, 1 reply; 33+ messages in thread
From: Larry Ruane @ 2024-07-03  1:07 UTC (permalink / raw)
  To: Bitcoin Development Mailing List

[-- Attachment #1.1: Type: text/plain, Size: 1566 bytes --]

On Monday, July 1, 2024 at 11:03:33 PM UTC-6 Antoine Riard wrote:

Here a snippet example of linked list code checking the pointer 
(`*begin_list`) is non null before the comparison operation to find the 
target element list.

```
pointer_t       ft_list_find(pointer_t **start_list, void *data_ref, int 
(*cmp)())
{
        while (*start_list)
        {
                if (cmp((*start_list)->data, data_ref) == 0)
                        return (*start_list);
                *start_list = (*start_list)->next;
        }
        return (0);
}
```

I assume this function lets you search for an element starting in the 
middle of a single-linked list (the middle because you could call 
`ft_list_find(&p-next, data_ref)` where `p` points to any element in the 
middle of the list, including possibly the last item in the list, in which 
case the loop body wouldn't run). If so, I don't think this does what's 
intended. This actually unlinks (and memory-leaks) elements up to where the 
match is found. I think you want to advance `start_list` this way (I didn't 
test this):

```
    start_list = &(*start_list)->next;
```

Larry

-- 
You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups•com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/e2c61ee5-68c4-461e-a132-bb86a4c3e2ccn%40googlegroups.com.

[-- Attachment #1.2: Type: text/html, Size: 2175 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [bitcoindev] Re: Great Consensus Cleanup Revival
  2024-07-03  1:07                           ` Larry Ruane
@ 2024-07-03 23:29                             ` Eric Voskuil
  2024-07-04 13:20                               ` Antoine Riard
  0 siblings, 1 reply; 33+ messages in thread
From: Eric Voskuil @ 2024-07-03 23:29 UTC (permalink / raw)
  To: Bitcoin Development Mailing List


[-- Attachment #1.1: Type: text/plain, Size: 451 bytes --]

This is why we don't use C - unsafe, unclear, unnecessary.

e

-- 
You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups•com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/33dfd007-ac28-44a5-acee-cec4b381e854n%40googlegroups.com.

[-- Attachment #1.2: Type: text/html, Size: 740 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [bitcoindev] Re: Great Consensus Cleanup Revival
  2024-07-03 23:29                             ` Eric Voskuil
@ 2024-07-04 13:20                               ` Antoine Riard
  2024-07-04 14:45                                 ` Eric Voskuil
  0 siblings, 1 reply; 33+ messages in thread
From: Antoine Riard @ 2024-07-04 13:20 UTC (permalink / raw)
  To: Eric Voskuil; +Cc: Bitcoin Development Mailing List

[-- Attachment #1: Type: text/plain, Size: 3253 bytes --]

> I assume this function lets you search for an element starting in the
middle of a single-linked list (the middle because you could call
`ft_list_find(&p-next, data_ref)` where `p` points to any element in the
> middle of the list, including possibly the last item in the list, in
which case the loop body wouldn't run). If so, I don't think this does
what's intended. This actually unlinks (and memory-leaks) elements up to >
where the match is found. I think you want to advance `start_list` this way
(I didn't test this):

Note the usage of a pointer to pointer so the correct way to call the code
is : `pointer_t * list_ptr ; list_ptr = first_list_element ;
ft_list_find(list_ptr, data_rf, cmp);`.
This is correct that if you point to the last item in the list, the loop
body wouldn't run (which is the expected behavior). When there is a match,
the pointer `*start_list` takes as value the memory address of the next
element in the list, the contained structure pointer is not changed. The
code has been tested a while back, though it's indeed clearer if a typedef
`pointer_t` for list is fully given:

```
typedef struct                 s_list
{
                         void     *content;
                         size_t  content_size;
                         struct  s_list. *next;
}                                      pointer_t
```

> This is why we don't use C - unsafe, unclear, unnecessary.

Actually, I think libbitcoin is using its own maintained fork of secp256k1,
which is written in C.
For sure, I wouldn't recommend using C across a whole codebase as it's not
memory-safe (euphemism) though it's still un-match if you wish to
understand low-level memory management in hot paths.
It can be easier to use C++ or Rust, though it doesn't mean it will be as
(a) perf optimal and (b) hardened against side-channels.

I have not read in detail the last Eric's email on the whole caching
identity in case of invalidity discussion, though I'll do so.

Best,
Antoine


Le jeu. 4 juil. 2024 à 00:57, Eric Voskuil <eric@voskuil•org> a écrit :

> This is why we don't use C - unsafe, unclear, unnecessary.
>
> e
>
> --
> You received this message because you are subscribed to a topic in the
> Google Groups "Bitcoin Development Mailing List" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/bitcoindev/CAfm7D5ppjo/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to
> bitcoindev+unsubscribe@googlegroups•com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/bitcoindev/33dfd007-ac28-44a5-acee-cec4b381e854n%40googlegroups.com
> <https://groups.google.com/d/msgid/bitcoindev/33dfd007-ac28-44a5-acee-cec4b381e854n%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>

-- 
You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups•com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/CALZpt%2BFs1U5f3S6_tR7AFfEMEkgBPSp3OaNEq%2BeqYoCSSYXD7g%40mail.gmail.com.

[-- Attachment #2: Type: text/html, Size: 4460 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [bitcoindev] Re: Great Consensus Cleanup Revival
  2024-07-04 13:20                               ` Antoine Riard
@ 2024-07-04 14:45                                 ` Eric Voskuil
  2024-07-18 17:39                                   ` Antoine Riard
  0 siblings, 1 reply; 33+ messages in thread
From: Eric Voskuil @ 2024-07-04 14:45 UTC (permalink / raw)
  To: Bitcoin Development Mailing List

[-- Attachment #1.1: Type: text/plain, Size: 1842 bytes --]

> This is why we don't use C - unsafe, unclear, unnecessary.

Actually, I think libbitcoin is using its own maintained fork of secp256k1, 
which is written in C.

We do not maintain secp256k1 code. For years that library carried the same 
version, despite regular breaking changes to its API. This compelled us to 
place these different versions on distinct git branches. When it finally 
became versioned we started phasing this unfortunate practice out.

Out of the 10 repositories and at least half million lines of code, apart 
from an embedded copy of qrencode that we don’t independently maintain, I 
believe there is only one .c file in use in the entire project - the 
database mmap.c implementation for msvc builds. This includes hash 
functions, with vectorization optimizations, etc.

For sure, I wouldn't recommend using C across a whole codebase as it's not 
memory-safe (euphemism) though it's still un-match if you wish to 
understand low-level memory management in hot paths.

This is a commonly held misperception.

It can be easier to use C++ or Rust, though it doesn't mean it will be as 
(a) perf optimal and (b) hardened against side-channels.

Rust has its own set of problems. No need to get into a language Jihad 
here. My point was to clarify that the particular question was not about a 
C (or C++) null pointer value, either on the surface or underneath an 
abstraction.

e 

-- 
You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups•com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/a76b8dc5-d37f-4059-882b-207004874887n%40googlegroups.com.

[-- Attachment #1.2: Type: text/html, Size: 2912 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [bitcoindev] Re: Great Consensus Cleanup Revival
  2024-07-04 14:45                                 ` Eric Voskuil
@ 2024-07-18 17:39                                   ` Antoine Riard
  2024-07-20 20:29                                     ` Eric Voskuil
  0 siblings, 1 reply; 33+ messages in thread
From: Antoine Riard @ 2024-07-18 17:39 UTC (permalink / raw)
  To: Bitcoin Development Mailing List

[-- Attachment #1.1: Type: text/plain, Size: 17169 bytes --]

Hi Eric,

> While at some level the block message buffer would generally be 
referenced by one or more C pointers, the difference between a valid 
coinbase input (i.e. with a "null point") and any other input, is not 
nullptr vs. !nullptr. A "null point" is a 36 byte value, 32 0x00 byes 
followed by 4 0xff bytes. In his infinite wisdom Satoshi decided it was 
better (or easier) to serialize a first block tx (coinbase) with an input 
containing an unusable script and pointing to an invalid [tx:index] tuple 
(input point) as opposed to just not having any input. That invalid input 
point is called a "null point", and of course cannot be pointed to by a 
"null pointer". The coinbase must be identified by comparing those 36 bytes 
to the well-known null point value (and if this does not match the Merkle 
hash cannot have been type64 malleated).

Good for the clarification here, I had in mind the core's `CheckBlock` path 
where the first block transaction pointer is dereferenced to verify if the 
transaction is a coinbase (i.e a "null point" where the prevout is null). 
Zooming out and back to my remark, I think this is correct that adding a 
new 64 byte size check on all block transactions to detect block hash 
invalidity could be a low memory overhead (implementation dependant), 
rather than making that 64 byte check alone on the coinbase transaction as 
in my understanding you're proposing.

> We call this type64 malleability (or malleation where it is not only 
possible but occurs).

Yes, the problem which has been described as the lack of "domain 
separation".

> The second one is the bip141 wtxid commitment in one of the coinbase 
transaction `scriptpubkey` output, which is itself covered by a txid in the 
merkle tree.

> While symmetry seems to imply that the witness commitment would be 
malleable, just as the txs commitment, this is not the case. If the tx 
commitment is correct it is computationally infeasible for the witness 
commitment to be malleated, as the witness commitment incorporates each 
full tx (with witness, sentinel, and marker). As such the block identifier, 
which relies only on the header and tx commitment, is a sufficient 
identifier. Yet it remains necessary to validate the witness commitment to 
ensure that the correct witness data has been provided in the block message.
> 
> The second type of malleability, in addition to type64, is what we call 
type32. This is the consequence of duplicated trailing sets of txs (and 
therefore tx hashes) in a block message. This is applicable to some but not 
all blocks, as a function of the number of txs contained.

To precise more your statement in describing source of malleability. The 
witness stack can be malleated altering the wtxid and yet still valid. I 
think you can still have the case where you're feeded a block header with a 
merkle root commitment deserializing to a valid coinbase transaction with 
an invalid witness commitment. This is the case of a "block message with 
valid header but malleatead committed valid tx data". Validation of the 
witness commitment to ensure the correct witness data has been provided in 
the block message is indeed necessary.

>> Background: A fully-validated block has established identity in its 
block hash. However an invalid block message may include the same block 
header, producing the same hash, but with any kind of nonsense following 
the header. The purpose of the transaction and witness commitments is of 
course to establish this identity, so these two checks are therefore 
necessary even under checkpoint/milestone. And then of course the two 
Merkle tree issues complicate the tx commitment (the integrity of the 
witness commitment is assured by that of the tx commitment).
>>
>> So what does it mean to speak of a block hash derived from:
>> (1) a block message with an unparseable header?
>> (2) a block message with parseable but invalid header?
>> (3) a block message with valid header but unparseable tx data?
>> (4) a block message with valid header but parseable invalid uncommitted 
tx data?
>> (5) a block message with valid header but parseable invalid malleated 
committed tx data?
>> (6) a block message with valid header but parseable invalid unmalleated 
committed tx data?
>> (7) a block message with valid header but uncommitted valid tx data?
>> (8) a block message with valid header but malleated committed valid tx 
data?
>> (9) a block message with valid header but unmalleated committed valid tx 
data?
>>
>> Note that only the #9 p2p block message contains an actual Bitcoin 
block, the others are bogus messages. In all cases the message can be 
sha256 hashed to establish the identity of the *message*. And if one's 
objective is to reject repeating bogus messages, this might be a useful 
strategy. It's already part of the p2p protocol, is orders of magnitude 
cheaper to produce than a Merkle root, and has no identity issues.

> I think I mostly agree with the identity issue as laid out so far, there 
is one caveat to add if you're considering identity caching as the problem 
solved. A validation node might have to consider differently block messages 
processed if they connect on the longest most PoW valid chain for which all 
blocks have been validated. Or alternatively if they have to be added on a 
candidate longest most PoW valid chain.

> Certainly an important consideration. We store both types. Once there is 
a stronger candidate header chain we store the headers and proceed to 
obtaining the blocks (if we don't already have them). The blocks are stored 
in the same table; the confirmed vs. candidate indexes simply point to them 
as applicable. It is feasible (and has happened twice) for two blocks to 
share the very same coinbase tx, even with either/all bip30/34/90 active 
(and setting aside future issues here for the sake of simplicity). This 
remains only because two competing branches can have blocks at the same 
height, and bip34 requires only height in the coinbase input script. This 
therefore implies the same transaction but distinct blocks. It is however 
infeasible for one block to exist in multiple distinct chains. In order for 
this to happen two blocks at the same height must have the same coinbase 
(ok), and also the same parent (ok). But this then means that they either 
(1) have distinct identity due to another header property deviation, or (2) 
are the same block with the same parent and are therefore in just one 
chain. So I don't see an actual caveat. I'm not certain if this is the 
ambiguity that you were referring to. If not please feel free to clarify.

If you assume no network partition and the no blocks more than 2h in the 
future consensus rule, I cannot see how one block with no header property 
deviation can exist in multiple distinct chains. The ambiguity I was 
referring was about a different angle, if the design goal of introducing a 
64 byte size check is to "it was about being able to cache the hash of a 
(non-malleated) invalid block as permanently invalid to avoid 
re-downloading and re-validating it", in my thinking we shall consider the 
whole block headers caching strategy and be sure we don't get situations 
where an attacker can attach a chain of low-pow block headers with 
malleated committed valid tx data yielding a block invalidity at the end, 
provoking as a side-effect a network-wide data download blowup. So I think 
any implementation of the validation of a block validity, of which identity 
is a sub-problem, should be strictly ordered by adequate proof-of-work 
checks.

> We don't do this and I don't see how it would be relevant. If a peer 
provides any invalid message or otherwise violates the protocol it is 
simply dropped.
> 
> The "problematic" that I'm referring to is the reliance on the block hash 
as a message identifier, because it does not identify the message and 
cannot be useful in an effectively unlimited number of zero-cost cases.

Historically, it was to isolate transaction-relay from block-relay to 
optimistically harden in face of network partition, as this is easy to 
infer transaction-relay topology with a lot of heuristics.

I think this is correct that block hash message cannot be relied on as it 
cannot be useful in an unlimited number of zero-cost cases, as I was 
pointing that bitcoin core partially mitigate that with discouraging 
connections to block-relay peers servicing block messages 
(`MaybePunishNodeForBlocks`).

> #4 and #5 refer to "uncommitted" and "malleated committed". It may not be 
clear, but "uncommitted" means that the tx commitment is not valid (Merkle 
root doesn't match the header's value) and "malleated committed" means that 
the (matching) commitment cannot be relied upon because the txs represent 
malleation, invalidating the identifier. So neither of these are usable 
identifiers.
> 
> It seems you may be referring to "unconfirmed" txs as opposed to 
"uncommitted" txs. This doesn't pertain to tx storage or identifiers. 
Neither #7 nor #8 are usable for the same reasons.
> 
> I'm making no reference to tx malleability. This concerns only Merkle 
tree (block hash) malleability, the two types described in detail in the 
paper I referenced earlier, here again:
> 
> 
https://lists.linuxfoundation.org/pipermail/bitcoin-dev/attachments/20190225/a27d8837/attachment-0001.pdf

I believe somehow the bottleneck we're circling around is computationally 
definining what are the "usable" identifiers for block messages.
The most straightforward answer to this question is the full block in one 
single peer message, at least in my perspective.
Reality since headers first synchronization (`getheaders`), block 
validation has been dissociated in steps for performance reasons, among 
others.

> Again, this has no relation to tx hashes/identifiers. Libbitcoin has a tx 
pool, we just don't store them in RAM (memory).

> I don't follow this. An invalid 64 byte tx consensus rule would 
definitely not make it harder to exploit block message invalidity. In fact 
it would just slow down validation by adding a redundant rule. Furthermore, 
as I have detailed in a previous message, caching invalidity does 
absolutely nothing to increase protection. In fact it makes the situation 
materially worse.

Just to recall, in my understanding the proposal we're discussing is about 
outlawing 64 bytes size transactions at the consensus-level to minimize 
denial-of-service vectors during block validation. I think we're talking 
about each other because the mempool already introduce a layer of caching 
in bitcoin core, of which the result are re-used at block validation, such 
as signature verification results. I'm not sure we can fully waive apart 
performance considerations, though I agree implementation architecture 
subsystems like mempool should only be a sideline considerations.

> No, this is not the case. As I detailed in my previous message, there is 
no possible scenario where invalidation caching does anything but make the 
situation materially worse.

I think this can be correct that invalidation caching make the situation 
materially worse, or is denial-of-service neutral, as I believe a full node 
is only trading space for time resources in matters of block messages 
validation. I still believe such analysis, as detailed in your previous 
message, would benefit to be more detailed.

> On the other hand, just dealing with parse failure on the spot by 
introducing a leading pattern in the stream just inflates the size of p2p 
messages, and the transaction-relay bandwidth cost.

> I think you misunderstood me. I am suggesting no change to serialization. 
I can see how it might be unclear, but I said, "nothing precludes 
incorporating a requirement for a necessary leading pattern in the stream." 
I meant that the parser can simply incorporate the *requirement* that the 
byte stream starts with a null input point. That identifies the malleation 
or invalidity without a single hash operation and while only reading a 
handful of bytes. No change to any messages.

Indeed, this is clearer with the re-explanation above about what you meant 
by the "null point". In my understanding, you're suggesting the following 
algorithm:
- receive transaction p2p messages
- deserialize transaction p2p messages
- if the transaction is a coinbase candidate, verify null input point
- if null input point pattern invalid, reject the transaction

If I'm understanding correctly, the last rule has for effect to constraint 
the transaction space that can be used to brute-force and mount a Merkle 
root forgery with a 64-byte coinbase transaction.

As described in the 3.1.1 of the paper: 
https://lists.linuxfoundation.org/pipermail/bitcoin-dev/attachments/20190225/a27d8837/attachment-0001.pdf

> I'm referring to DoS mitigation (the only relevant security consideration 
here). I'm pointing out that invalidity caching is pointless in all cases, 
and in this case is the most pointless as type64 malleation is the cheapest 
of all invalidity to detect. I would prefer that all bogus blocks sent to 
my node are of this type. The worst types of invalidity detection have no 
mitigation and from a security standpoint are counterproductive to cache. 
I'm describing what overall is actually not a tradeoff. It's all negative 
and no positive.

I think we're both discussing the same issue about DoS mitigation for sure. 
Again, I think that saying the "invalidity caching" is pointless in all 
cases cannot be fully grounded as a statement without precising (a) what is 
the internal cache(s) layout of the full node processing block messages and 
(b) the sha256 mining resources available during N difficulty period and if 
any miner engage in self-fish mining like strategy.

About (a), I'll maintain my point I think it's a classic time-space 
trade-off to ponder in function of the internal cache layouts. About (b) I 
think we''ll be back to the headers synchronization strategy as implemented
by a full node to discuss if they're exploitable asymmetries for self-fish 
mining like strategies.

If you can give a pseudo-code example of the "null point" validation 
implementation in libbitcoin code (?) I think this can make the 
conversation more concrete on the caching aspect.

> Rust has its own set of problems. No need to get into a language Jihad 
here. My point was to clarify that the particular question was not about a 
C (or C++) null pointer value, either on the surface or underneath an 
abstraction.

Thanks for the additional comments on libbitcoin usage of dependencies, yes 
I don't think there is a need to get into a language jihad here. It's just 
like all languages have their memory model (stack, dynamic alloc, smart 
pointers, etc) and when you're talking about performance it's useful to 
have their minds, imho.

Best,
Antoine
ots hash: 058d7b3adb154a3e64d5f8ccf1944903bcd0c49dbb525f7212adf4f7ac7f8c55
Le mardi 9 juillet 2024 à 02:16:20 UTC+1, Eric Voskuil a écrit :

> > This is why we don't use C - unsafe, unclear, unnecessary.
>
> Actually, I think libbitcoin is using its own maintained fork of 
> secp256k1, which is written in C.
>
>
> We do not maintain secp256k1 code. For years that library carried the same 
> version, despite regular breaking changes to its API. This compelled us to 
> place these different versions on distinct git branches. When it finally 
> became versioned we started phasing this unfortunate practice out.
>
> Out of the 10 repositories and at least half million lines of code, apart 
> from an embedded copy of qrencode that we don’t independently maintain, I 
> believe there is only one .c file in use in the entire project - the 
> database mmap.c implementation for msvc builds. This includes hash 
> functions, with vectorization optimizations, etc.
>  
>
> For sure, I wouldn't recommend using C across a whole codebase as it's not 
> memory-safe (euphemism) though it's still un-match if you wish to 
> understand low-level memory management in hot paths.
>
>
> This is a commonly held misperception.
>
> It can be easier to use C++ or Rust, though it doesn't mean it will be as 
> (a) perf optimal and (b) hardened against side-channels.
>
>
> Rust has its own set of problems. No need to get into a language Jihad 
> here. My point was to clarify that the particular question was not about a 
> C (or C++) null pointer value, either on the surface or underneath an 
> abstraction.
>
> e 
>

-- 
You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups•com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/ac6cc3b8-43e5-4cd6-aabe-f5ffc4672812n%40googlegroups.com.

[-- Attachment #1.2: Type: text/html, Size: 18672 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [bitcoindev] Re: Great Consensus Cleanup Revival
  2024-07-18 17:39                                   ` Antoine Riard
@ 2024-07-20 20:29                                     ` Eric Voskuil
  2024-11-28  5:18                                       ` Antoine Riard
  0 siblings, 1 reply; 33+ messages in thread
From: Eric Voskuil @ 2024-07-20 20:29 UTC (permalink / raw)
  To: Bitcoin Development Mailing List


[-- Attachment #1.1: Type: text/plain, Size: 18647 bytes --]

Hi Antoine R,

>> While at some level the block message buffer would generally be 
referenced by one or more C pointers, the difference between a valid 
coinbase input (i.e. with a "null point") and any other input, is not 
nullptr vs. !nullptr. A "null point" is a 36 byte value, 32 0x00 byes 
followed by 4 0xff bytes. In his infinite wisdom Satoshi decided it was 
better (or easier) to serialize a first block tx (coinbase) with an input 
containing an unusable script and pointing to an invalid [tx:index] tuple 
(input point) as opposed to just not having any input. That invalid input 
point is called a "null point", and of course cannot be pointed to by a 
"null pointer". The coinbase must be identified by comparing those 36 bytes 
to the well-known null point value (and if this does not match the Merkle 
hash cannot have been type64 malleated).

> Good for the clarification here, I had in mind the core's `CheckBlock` 
path where the first block transaction pointer is dereferenced to verify if 
the transaction is a coinbase (i.e a "null point" where the prevout is 
null). Zooming out and back to my remark, I think this is correct that 
adding a new 64 byte size check on all block transactions to detect block 
hash invalidity could be a low memory overhead (implementation dependant), 
rather than making that 64 byte check alone on the coinbase transaction as 
in my understanding you're proposing.

I'm not sure what you mean by stating that a new consensus rule, "could be 
a low memory overhead". Checking all tx sizes is far more overhead than 
validating the coinbase for a null point. As AntoineP agreed, it cannot be 
done earlier, and I have shown that it is *significantly* more 
computationally intensive. It makes the determination much more costly and 
in all other cases by adding an additional check that serves no purpose.

>>> The second one is the bip141 wtxid commitment in one of the coinbase 
transaction `scriptpubkey` output, which is itself covered by a txid in the 
merkle tree.

>> While symmetry seems to imply that the witness commitment would be 
malleable, just as the txs commitment, this is not the case. If the tx 
commitment is correct it is computationally infeasible for the witness 
commitment to be malleated, as the witness commitment incorporates each 
full tx (with witness, sentinel, and marker). As such the block identifier, 
which relies only on the header and tx commitment, is a sufficient 
identifier. Yet it remains necessary to validate the witness commitment to 
ensure that the correct witness data has been provided in the block message.
>>
>> The second type of malleability, in addition to type64, is what we call 
type32. This is the consequence of duplicated trailing sets of txs (and 
therefore tx hashes) in a block message. This is applicable to some but not 
all blocks, as a function of the number of txs contained.

> To precise more your statement in describing source of malleability. The 
witness stack can be malleated altering the wtxid and yet still valid. I 
think you can still have the case where you're feeded a block header with a 
merkle root commitment deserializing to a valid coinbase transaction with 
an invalid witness commitment. This is the case of a "block message with 
valid header but malleatead committed valid tx data". Validation of the 
witness commitment to ensure the correct witness data has been provided in 
the block message is indeed necessary.

I think you misunderstood me. Of course the witness commitment must be 
validated (as I said, "Yet it remains necessary to validate the witness 
commitment..."), as otherwise the witnesses within a block can be anything 
without affecting the block hash. And of course the witness commitment is 
computed in the same manner as the tx commitment and is therefore subject 
to the same malleations. However, because the coinbase tx is committed to 
the block hash, there is no need to guard the witness commitment for 
malleation. And to my knowledge nobody has proposed doing so.

>>> I think I mostly agree with the identity issue as laid out so far, 
there is one caveat to add if you're considering identity caching as the 
problem solved. A validation node might have to consider differently block 
messages processed if they connect on the longest most PoW valid chain for 
which all blocks have been validated. Or alternatively if they have to be 
added on a candidate longest most PoW valid chain.

>> Certainly an important consideration. We store both types. Once there is 
a stronger candidate header chain we store the headers and proceed to 
obtaining the blocks (if we don't already have them). The blocks are stored 
in the same table; the confirmed vs. candidate indexes simply point to them 
as applicable. It is feasible (and has happened twice) for two blocks to 
share the very same coinbase tx, even with either/all bip30/34/90 active 
(and setting aside future issues here for the sake of simplicity). This 
remains only because two competing branches can have blocks at the same 
height, and bip34 requires only height in the coinbase input script. This 
therefore implies the same transaction but distinct blocks. It is however 
infeasible for one block to exist in multiple distinct chains. In order for 
this to happen two blocks at the same height must have the same coinbase 
(ok), and also the same parent (ok). But this then means that they either 
(1) have distinct identity due to another header property deviation, or (2) 
are the same block with the same parent and are therefore in just one 
chain. So I don't see an actual caveat. I'm not certain if this is the 
ambiguity that you were referring to. If not please feel free to clarify.

> If you assume no network partition and the no blocks more than 2h in the 
future consensus rule, I cannot see how one block with no header property 
deviation can exist in multiple distinct chains.

It cannot, that was my point: "(1) have distinct identity due to another 
header property deviation, or (2) are the same block..."

> The ambiguity I was referring was about a different angle, if the design 
goal of introducing a 64 byte size check is to "it was about being able to 
cache the hash of a (non-malleated) invalid block as permanently invalid to 
avoid re-downloading and re-validating it", in my thinking we shall 
consider the whole block headers caching strategy and be sure we don't get 
situations where an attacker can attach a chain of low-pow block headers 
with malleated committed valid tx data yielding a block invalidity at the 
end, provoking as a side-effect a network-wide data download blowup. So I 
think any implementation of the validation of a block validity, of which 
identity is a sub-problem, should be strictly ordered by adequate 
proof-of-work checks.

This was already the presumption.

>> We don't do this and I don't see how it would be relevant. If a peer 
provides any invalid message or otherwise violates the protocol it is 
simply dropped.
>>
>> The "problematic" that I'm referring to is the reliance on the block 
hash as a message identifier, because it does not identify the message and 
cannot be useful in an effectively unlimited number of zero-cost cases.

> Historically, it was to isolate transaction-relay from block-relay to 
optimistically harden in face of network partition, as this is easy to 
infer transaction-relay topology with a lot of heuristics.

I'm not seeing the connection here. Are you suggesting that tx and block 
hashes may collide with each other? Or that that a block message may be 
confused with a transaction message?

> I think this is correct that block hash message cannot be relied on as it 
cannot be useful in an unlimited number of zero-cost cases, as I was 
pointing that bitcoin core partially mitigate that with discouraging 
connections to block-relay peers servicing block messages 
(`MaybePunishNodeForBlocks`).

This does not mitigate the issue. It's essentially dead code. It's exactly 
like saying, "there's an arbitrary number of holes in the bucket, but we 
can plug a subset of those holes." Infinite minus any number is still 
infinite.

> I believe somehow the bottleneck we're circling around is computationally 
definining what are the "usable" identifiers for block messages. The most 
straightforward answer to this question is the full block in one single 
peer message, at least in my perspective.

I don't follow this statement. The term "usable" was specifically 
addressing the proposal - that a header hash must uniquely identify a block 
(a header and committed set of txs) as valid or otherwise. As I have 
pointed out, this will still not be the case if 64 byte blocks are 
invalidated. It is also not the case that detection of type64 malleated 
blocks can be made more performant if 64 byte txs are globally invalid. In 
fact the opposite is true, it becomes more costly (and complex) and is 
therefore just dead code.

> Reality since headers first synchronization (`getheaders`), block 
validation has been dissociated in steps for performance reasons, among 
others.

Headers first only defers malleation checks. The same checks are necessary 
whether you perform blocks first or headers first sync (we support both 
protocol levels). The only difference is that for headers first, a stored 
header might later become invalidated. However, this is the case with and 
without the possibility of malleation.

>> Again, this has no relation to tx hashes/identifiers. Libbitcoin has a 
tx pool, we just don't store them in RAM (memory).
>>
>> I don't follow this. An invalid 64 byte tx consensus rule would 
definitely not make it harder to exploit block message invalidity. In fact 
it would just slow down validation by adding a redundant rule. Furthermore, 
as I have detailed in a previous message, caching invalidity does 
absolutely nothing to increase protection. In fact it makes the situation 
materially worse.

> Just to recall, in my understanding the proposal we're discussing is 
about outlawing 64 bytes size transactions at the consensus-level to 
minimize denial-of-service vectors during block validation. I think we're 
talking about each other because the mempool already introduce a layer of 
caching in bitcoin core, of which the result are re-used at block 
validation, such as signature verification results. I'm not sure we can 
fully waive apart performance considerations, though I agree implementation 
architecture subsystems like mempool should only be a sideline 
considerations.

I have not suggested that anything is waived or ignored here. I'm stating 
that there is no "mempool" performance benefit whatsoever to invalidating 
64 byte txs. Mempool caching could only rely on tx identifiers, not block 
identifiers. Tx identifiers are not at issue.

>> No, this is not the case. As I detailed in my previous message, there is 
no possible scenario where invalidation caching does anything but make the 
situation materially worse.

> I think this can be correct that invalidation caching make the situation 
materially worse, or is denial-of-service neutral, as I believe a full node 
is only trading space for time resources in matters of block messages 
validation. I still believe such analysis, as detailed in your previous 
message, would benefit to be more detailed.

I don't know how to add any more detail than I already have. There are 
three relevant considerations:

(1) block hashes will not become unique identifiers for block messages.
(2) the earliest point at which type64 malleation can be detected will not 
be reduced.
(3) the necessary cost of type64 malleated determination will not be 
reduced.
(4) the additional consensus rule will increase validation cost and code 
complexity.
(5) invalid blocks can still be produced at no cost that require full 
double tx hashing/Merkle root computations.

Which of these statements are not evident at this point?

>> On the other hand, just dealing with parse failure on the spot by 
introducing a leading pattern in the stream just inflates the size of p2p 
messages, and the transaction-relay bandwidth cost.
>>
>> I think you misunderstood me. I am suggesting no change to 
serialization. I can see how it might be unclear, but I said, "nothing 
precludes incorporating a requirement for a necessary leading pattern in 
the stream." I meant that the parser can simply incorporate the 
*requirement* that the byte stream starts with a null input point. That 
identifies the malleation or invalidity without a single hash operation and 
while only reading a handful of bytes. No change to any messages.

> Indeed, this is clearer with the re-explanation above about what you 
meant by the "null point".

Ok

> In my understanding, you're suggesting the following algorithm:
> - receive transaction p2p messages
> - deserialize transaction p2p messages
> - if the transaction is a coinbase candidate, verify null input point
> - if null input point pattern invalid, reject the transaction

No, no part of this thread has any bearing on p2p transaction messages - 
nor are coinbase transactions relayed as transaction messages. You could 
restate it as:

- receive block p2p messages
- if the first tx's first input does not have a null point, reject the block

> If I'm understanding correctly, the last rule has for effect to 
constraint the transaction space that can be used to brute-force and mount 
a Merkle root forgery with a 64-byte coinbase transaction.
>
> As described in the 3.1.1 of the paper: 
https://lists.linuxfoundation.org/pipermail/bitcoin-dev/attachments/20190225/a27d8837/attachment-0001.pdf

The above approach makes this malleation computationally infeasible.

>> I'm referring to DoS mitigation (the only relevant security 
consideration here). I'm pointing out that invalidity caching is pointless 
in all cases, and in this case is the most pointless as type64 malleation 
is the cheapest of all invalidity to detect. I would prefer that all bogus 
blocks sent to my node are of this type. The worst types of invalidity 
detection have no mitigation and from a security standpoint are 
counterproductive to cache. I'm describing what overall is actually not a 
tradeoff. It's all negative and no positive.

> I think we're both discussing the same issue about DoS mitigation for 
sure. Again, I think that saying the "invalidity caching" is pointless in 
all cases cannot be fully grounded as a statement without precising (a) 
what is the internal cache(s) layout of the full node processing block 
messages and (b) the sha256 mining resources available during N difficulty 
period and if any miner engage in self-fish mining like strategy.

It has nothing to do with internal cache layout and nothing to do with 
mining resources. Not having a cache is clearly more efficient than having 
a cache that provides no advantage, regardless of how the cache is laid 
out. There is no cost to forcing a node to perform far more block 
validation computations than can be precluded by invalidity caching. The 
caching simply increases the overall computational cost (as would another 
redundant rule to try and make it more efficient). Discarding invalid 
blocks after the minimal amount of work is the most efficient resolution. 
What one does with the peer at that point is orthogonal (e.g. drop, ban).

> About (a), I'll maintain my point I think it's a classic time-space 
trade-off to ponder in function of the internal cache layouts.

An attacker can throw a nearly infinite number of distinct invalid blocks 
at your node (and all will connect to the chain and show proper PoW). As 
such you will encounter zero cache hits and therefore nothing but overhead 
from the cache. Please explain to me in detail how "cache layout" is going 
to make any difference at all.

> About (b) I think we''ll be back to the headers synchronization strategy 
as implemented by a full node to discuss if they're exploitable asymmetries 
for self-fish mining like strategies.

I don't see this as a related/relevant topic. There are zero mining 
resources required to overflow the invalidity cache. Just as Core recently 
published regarding overflowing to its "ban" store, resulting in process 
termination, this then introduces another attack vector that must be 
mitigated.

> If you can give a pseudo-code example of the "null point" validation 
implementation in libbitcoin code (?) I think this can make the 
conversation more concrete on the caching aspect.

pseudo-code , not from libbitcoin...

```
bool malleated64(block)
{
    segregated = ((block[80 + 4] == 0) and (block[80 + 4 + 1] == 1))
    return block[segregated ? 86 : 85] != 
0xffffffff0000000000000000000000000000000000000000000000000000000000000000
}
```

Obviously there is no error handling (e.g. block too small, too many 
inputs, etc.) but that is not relevant to the particular question. The 
block.header is fixed size, always 80 bytes. The tx.version is also fixed, 
always 4 bytes. A following 0 implies a segregated witness (otherwise it's 
the input count), assuming there is a following 1. The first and only input 
for the coinbase tx, which must be the first block tx, follows. If it does 
not match 
0xffffffff0000000000000000000000000000000000000000000000000000000000000000 
then the block is invalid. If it does match, it is computationally 
infeasible that the merkle root is type64 malleated. That's it, absolutely 
trivial and with no prerequisites. The only thing that even makes it 
interesting is the segwit bifurcation.

>> Rust has its own set of problems. No need to get into a language Jihad 
here. My point was to clarify that the particular question was not about a 
C (or C++) null pointer value, either on the surface or underneath an 
abstraction.

> Thanks for the additional comments on libbitcoin usage of dependencies, 
yes I don't think there is a need to get into a language jihad here. It's 
just like all languages have their memory model (stack, dynamic alloc, 
smart pointers, etc) and when you're talking about performance it's useful 
to have their minds, imho.

Sure, but no language difference that I'm aware of could have any bearing 
on this particular question.

Best,
Eric

-- 
You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups•com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/926fdd12-4e50-433d-bd62-9cc41c7b22a0n%40googlegroups.com.

[-- Attachment #1.2: Type: text/html, Size: 19566 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [bitcoindev] Re: Great Consensus Cleanup Revival
  2024-07-20 20:29                                     ` Eric Voskuil
@ 2024-11-28  5:18                                       ` Antoine Riard
  0 siblings, 0 replies; 33+ messages in thread
From: Antoine Riard @ 2024-11-28  5:18 UTC (permalink / raw)
  To: Bitcoin Development Mailing List


[-- Attachment #1.1: Type: text/plain, Size: 33545 bytes --]

Hi Eric,

Going back to this thread with a bit of delay...

tl;dr: See specifically comment on the lack of proof that invalidating 
64-byte transactions are actually solving merkle root weaknesses that could 
lead to a fork or unravel SPV clients.

> I'm not sure what you mean by stating that a new consensus rule, "could 
be a low memory overhead". Checking all tx sizes is far more overhead than 
validating the coinbase for a null point. As AntoineP agreed, it cannot be 
done earlier, and I have shown that it is *significantly* more 
computationally intensive. It makes the determination much more costly and 
in all other cases by adding an additional check that serves no purpose.

I think for any (new) consensus rule, we shall be able to evalute its 
implication in term of at least 2 dimensions a) memory overhead (e.g do a 
full-node needs more memory to validate post-segwit blocks now that witness 
fields are discounted ?) and b) computational overhead (e.g do a full-node 
needs more CPU cycles to validate confidential transactions perdersen 
commitments ?). A same consensus rule can achieve the same effect e.g 
reducing headers merkle tree ambiguities, with completely different memory 
or computational cost. For the checking all tx sizes vs validating the 
coinbase for a null point, indeed I agree with you that the latter is 
intuitively better on the 2 dimensions.

> I think you misunderstood me. Of course the witness commitment must be 
validated (as I said, "Yet it remains necessary to validate the witness 
commitment..."), as otherwise the witnesses within a block can be anything 
without affecting the block hash. And of course the witness commitment is 
computed in the same manner as the tx commitment and is therefore subject 
to the same malleations. However, because the coinbase tx is committed to 
the block hash, there is no need to guard the witness commitment for 
malleation. And to my knowledge nobody has proposed doing so.

Yes, we misunderstood each other here.

> It cannot, that was my point: "(1) have distinct identity due to another 
header property deviation, or (2) are the same block..."

Ok.

> This was already the presumption.

Ok.

> I'm not seeing the connection here. Are you suggesting that tx and block 
hashes may collide with each other? Or that that a block message may be 
confused with a transaction message?

This was about how to deal with types of invalid block messages in bitcoin 
core that could sources of denial-of-service. E.g invalid bitcoin block 
hash for a message with unparsable data (#1 and #3 in your typology.
My point was bitcoin core is making some assumption at block download to 
fetch them from outbound peers rather than inbound. Outboud peers are 
assumed to be more reliable, as the connection is attempted from the 
outside.
The point in my point was that outbound-block-relay connections where 
initially introduce to alleviate those types of concerns i.e tx probe to 
infer the topology, see https://arxiv.org/pdf/1812.00942

> This does not mitigate the issue. It's essentially dead code. It's 
exactly like saying, "there's an arbitrary number of holes in the bucket, 
but we can plug a subset of those holes." Infinite minus any number is 
still infinite.

I disagree with you here - If the fundamental problem is efficiently 
caching identity in the case of block invalidity, one cannot ignore a 
robust peeering policy, i.e you pick up peers allowed a scarce connection 
slot.
This is indeed useless if you don't have first an efficient verification 
algorithm to determine the block invalidity, though it's part of the 
overall equation.
While infinite minus any number is of course still infinite, thinking 
security by layers it's the base you can have a secure signature 
verification algorithm still on a hot computing host.

> I don't follow this statement. The term "usable" was specifically 
addressing the proposal - that a header hash must uniquely identify a block 
(a header and committed set of txs) as valid or otherwise. As I have 
pointed out, this will still not be the case if 64 byte blocks are 
invalidated. It is also not the case that detection of type64 malleated 
blocks can be made more performant if 64 byte txs are globally invalid. In 
fact the opposite is true, it becomes more costly (and complex) and is 
therefore just dead code.

Okay, in my statement, the term "usable" was to be understood as any 
meaningful bit of information that can lead to 
computationally-hard-to-forge progress in the determination problem you 
laid out here:
https://groups.google.com/g/bitcoindev/c/CAfm7D5ppjo/m/T1-HKqSLAAAJ

> Headers first only defers malleation checks. The same checks are 
necessary whether you perform blocks first or headers first sync (we 
support both protocol levels). The only difference is that for headers 
first, a stored header might later become invalidated. However, this is the 
case with and without the possibility of malleation.

Yes, I agree with you here, a stored header might become invalidated, e.g a 
reorg-out tx committed in the header's merkle tree after the header 
reception.

> I have not suggested that anything is waived or ignored here. I'm stating 
that there is no "mempool" performance benefit whatsoever to invalidating 
64 byte txs. Mempool caching could only rely on tx identifiers, not block 
identifiers. Tx identifiers are not at issue.

Once again, if the goal is an efficient algorithm making progress to 
determinate a block invalidity, and as such reducing the denial-of-service 
surface, caching signatures which are committed in the wtixd tree or in the 
txid tree is a plus.
Though yes, I agree there is no "mempool" performance benefit to invalidate 
the 64 byte tx.

> I don't know how to add any more detail than I already have. There are 
three relevant considerations:
> 
> (1) block hashes will not become unique identifiers for block messages.
> (2) the earliest point at which type64 malleation can be detected will 
not be reduced.
> (3) the necessary cost of type64 malleated determination will not be 
reduced.
> (4) the additional consensus rule will increase validation cost and code 
complexity.
> (5) invalid blocks can still be produced at no cost that require full 
double tx hashing/Merkle root computations.
> 
> Which of these statements are not evident at this point?

That's five statements, not three. Minding implementation-dependent 
considerations, I'm leaning to agree with up to (4) included.
About (5), I don't see how it makes sense that invalid blocks can be still 
produced at not cost, at least pow should be the first thing first to be 
verified.
Like this statement could be clarified what you mean by this.

> No, no part of this thread has any bearing on p2p transaction messages - 
nor are coinbase transactions relayed as transaction messages. You could 
restate it as:
> 
> - receive block p2p messages
> - if the first tx's first input does not have a null point, reject the 
block

I don't believe we can fully dissociate bearing on p2p blocks / 
transactions messages, from the overall goal of reducing denial-of-service 
arising from invalid blocks. How can you be sure the block is invalid until 
you validate all txn ? Though let's waiwe this observation for the present 
point.

The idea of exploiting block malleability is to grind one transaction T0 
for a block B such that H(T0) == H(H(T1) || H(T2)) == B's Root. I.e to have 
T0 == H(T1) || H(T2). T0 can be consensus valid or invalid to provoke a 
consensus fork (it's the collision in the deserialization which is the 
source of the merkle tree root weakness). The first transaction in the 
block is necessarily the coinbase per current consensus rules. Checking 
that T0 is a valid coinbase transaction is sufficient to reject the block. 
Grinding 64-byte transactions that all deserialize as valid transactions, 
including the null point requirement is computationally infeasible.

I'm not sure that even if we get ride of 64-byte transactions, we would 
remove the merkle root weaknesses. Back to the previous example, one could 
find T3 and T4 such that H(H(T3) || H(T4)) is equivalent to H(H(T1) || 
H(T2)). Of course, it would consist in breaking SHA256, which is deemed 
computationally infeasible. I'm not even sure the header verifcation 
algorithms gains second-preimage resistance from forbidding 64-byte 
transaction.

So I think more that the minimal sufficient check to reject a block should 
be more carefully analyzed, rather than advocating that forbidding some 
magic value obviously fix an issue, in the present the bitcoin's merkle 
root weaknesses.

> The above approach makes this malleation computationally infeasible.

I'm intuitively leaning so, though see comments above that it should be 
more carefully thought about.

> It has nothing to do with internal cache layout and nothing to do with 
mining resources. Not having a cache is clearly more efficient than having 
a cache that provides no advantage, regardless of how the cache is laid 
out. There is no cost to forcing a node to perform far more block 
validation computations than can be precluded by invalidity caching. The 
caching simply increases the overall computational cost (as would another 
redundant rule to try and make it more efficient). Discarding invalid 
blocks after the minimal amount of work is the most efficient resolution. 
What one does with the peer at that point is orthogonal (e.g. drop, ban).

I disagree here - If the goal is an efficient algorithm making progress to 
determinate a block invalidity, and then being able to re-use a run of this 
algorithm when a blocks occurs again, having a cache widens the range of 
algorithmsone can design. Same with the mining ressources, if it's to 
consider denial-of-services and an attacker could be able to fully forge 
blocks. If such invalidity caching strategy was efficient it would actually 
minimize or erase the cost for a node to perform more block validation 
computations. Where yes I share you opinion is that an ill-designed caching 
could increase the overall computational cost, and that discarding invalid 
blocks after the minimal amount of work is the most efficient resolution 
for the 1st seen, though it doesn't say on the next N seen. Having the 
signatures already validated could be obviously a win, even with a blind, 
decaying cache, it's all about the memory space of an
average full-node.

> An attacker can throw a nearly infinite number of distinct invalid blocks 
at your node (and all will connect to the chain and show proper PoW). As 
such you will encounter zero cache hits and therefore nothing but overhead 
from the cache. Please explain to me in detail how "cache layout" is going 
to make any difference at all.

Going back to your typology from (1) to (9), e.g for the step 9 to 
determine if a block message with valid header but unmalleated committed 
valid tx data.
If you have seen the block message a first-time, though it wasn't yet on 
the longest pow-chain and you disregarded its validation.

> I don't see this as a related/relevant topic. There are zero mining 
resources required to overflow the invalidity cache. Just as Core recently 
published regarding overflowing to its "ban" store, resulting in process 
termination, this then introduces another attack vector that must be 
mitigated.

Depends if your invalidity cache is safeguarded by a minimal valid 
proof-of-work. I'm certaintly not going to defend that all bitcoin core 
internal caches and stores are well-designed for adversarials environments.

> pseudo-code , not from libbitcoin...
> 
> ```
> bool malleated64(block)
> {
>     segregated = ((block[80 + 4] == 0) and (block[80 + 4 + 1] == 1))
>     return block[segregated ? 86 : 85] != 
0xffffffff0000000000000000000000000000000000000000000000000000000000000000
> }
> ```
> 
> Obviously there is no error handling (e.g. block too small, too many 
inputs, etc.) but that is not relevant to the particular question. The 
block.header is fixed size, always 80 bytes. The tx.version is also fixed, 
always 4 bytes. A following 0 implies a segregated witness (otherwise it's 
the input count), assuming there is a following 1. The first and only input 
for the coinbase tx, which must be the first block tx, follows. If it does 
not match 
0xffffffff0000000000000000000000000000000000000000000000000000000000000000 
then the block is invalid. If it does match, it is computationally 
infeasible that the merkle root is type64 malleated. That's it, absolutely 
trivial and with no prerequisites. The only thing that even makes it 
interesting is the segwit bifurcation.

Thanks for the example with the segwit bifurcation for the marker. By the 
way, the segwit marker is documented in BIP144, which is incorrectly 
labeled as "Peer Services", though obviously misimplementing the 
transaction ser / deser algorithm for segwit blocks would lead to consensus 
divergence (what if you expect the "flag" to be 0xff and not 0x01 ?). 
Personally, I think it's a good example of how tedious consensus changes 
can be, when even documents for inter-compatibility about consensus changes 
are not drawing a clear line between what is consensus and what are p2p 
rules...

> Sure, but no language difference that I'm aware of could have any bearing 
on this particular question.

Same, I don't see language difference that could have bearing on this 
question, at that level of granularity.

Best,
Antoine R
ots hash: 3d5ed1718683ce1e864751a2eccf21908ed3b11079f183cdf863729d71ae3f36
Le samedi 20 juillet 2024 à 21:51:27 UTC+1, Eric Voskuil a écrit :

> Hi Antoine R,
>
> >> While at some level the block message buffer would generally be 
> referenced by one or more C pointers, the difference between a valid 
> coinbase input (i.e. with a "null point") and any other input, is not 
> nullptr vs. !nullptr. A "null point" is a 36 byte value, 32 0x00 byes 
> followed by 4 0xff bytes. In his infinite wisdom Satoshi decided it was 
> better (or easier) to serialize a first block tx (coinbase) with an input 
> containing an unusable script and pointing to an invalid [tx:index] tuple 
> (input point) as opposed to just not having any input. That invalid input 
> point is called a "null point", and of course cannot be pointed to by a 
> "null pointer". The coinbase must be identified by comparing those 36 bytes 
> to the well-known null point value (and if this does not match the Merkle 
> hash cannot have been type64 malleated).
>
> > Good for the clarification here, I had in mind the core's `CheckBlock` 
> path where the first block transaction pointer is dereferenced to verify if 
> the transaction is a coinbase (i.e a "null point" where the prevout is 
> null). Zooming out and back to my remark, I think this is correct that 
> adding a new 64 byte size check on all block transactions to detect block 
> hash invalidity could be a low memory overhead (implementation dependant), 
> rather than making that 64 byte check alone on the coinbase transaction as 
> in my understanding you're proposing.
>
> I'm not sure what you mean by stating that a new consensus rule, "could be 
> a low memory overhead". Checking all tx sizes is far more overhead than 
> validating the coinbase for a null point. As AntoineP agreed, it cannot be 
> done earlier, and I have shown that it is *significantly* more 
> computationally intensive. It makes the determination much more costly and 
> in all other cases by adding an additional check that serves no purpose.
>
> >>> The second one is the bip141 wtxid commitment in one of the coinbase 
> transaction `scriptpubkey` output, which is itself covered by a txid in the 
> merkle tree.
>
> >> While symmetry seems to imply that the witness commitment would be 
> malleable, just as the txs commitment, this is not the case. If the tx 
> commitment is correct it is computationally infeasible for the witness 
> commitment to be malleated, as the witness commitment incorporates each 
> full tx (with witness, sentinel, and marker). As such the block identifier, 
> which relies only on the header and tx commitment, is a sufficient 
> identifier. Yet it remains necessary to validate the witness commitment to 
> ensure that the correct witness data has been provided in the block message.
> >>
> >> The second type of malleability, in addition to type64, is what we call 
> type32. This is the consequence of duplicated trailing sets of txs (and 
> therefore tx hashes) in a block message. This is applicable to some but not 
> all blocks, as a function of the number of txs contained.
>
> > To precise more your statement in describing source of malleability. The 
> witness stack can be malleated altering the wtxid and yet still valid. I 
> think you can still have the case where you're feeded a block header with a 
> merkle root commitment deserializing to a valid coinbase transaction with 
> an invalid witness commitment. This is the case of a "block message with 
> valid header but malleatead committed valid tx data". Validation of the 
> witness commitment to ensure the correct witness data has been provided in 
> the block message is indeed necessary.
>
> I think you misunderstood me. Of course the witness commitment must be 
> validated (as I said, "Yet it remains necessary to validate the witness 
> commitment..."), as otherwise the witnesses within a block can be anything 
> without affecting the block hash. And of course the witness commitment is 
> computed in the same manner as the tx commitment and is therefore subject 
> to the same malleations. However, because the coinbase tx is committed to 
> the block hash, there is no need to guard the witness commitment for 
> malleation. And to my knowledge nobody has proposed doing so.
>
> >>> I think I mostly agree with the identity issue as laid out so far, 
> there is one caveat to add if you're considering identity caching as the 
> problem solved. A validation node might have to consider differently block 
> messages processed if they connect on the longest most PoW valid chain for 
> which all blocks have been validated. Or alternatively if they have to be 
> added on a candidate longest most PoW valid chain.
>
> >> Certainly an important consideration. We store both types. Once there 
> is a stronger candidate header chain we store the headers and proceed to 
> obtaining the blocks (if we don't already have them). The blocks are stored 
> in the same table; the confirmed vs. candidate indexes simply point to them 
> as applicable. It is feasible (and has happened twice) for two blocks to 
> share the very same coinbase tx, even with either/all bip30/34/90 active 
> (and setting aside future issues here for the sake of simplicity). This 
> remains only because two competing branches can have blocks at the same 
> height, and bip34 requires only height in the coinbase input script. This 
> therefore implies the same transaction but distinct blocks. It is however 
> infeasible for one block to exist in multiple distinct chains. In order for 
> this to happen two blocks at the same height must have the same coinbase 
> (ok), and also the same parent (ok). But this then means that they either 
> (1) have distinct identity due to another header property deviation, or (2) 
> are the same block with the same parent and are therefore in just one 
> chain. So I don't see an actual caveat. I'm not certain if this is the 
> ambiguity that you were referring to. If not please feel free to clarify.
>
> > If you assume no network partition and the no blocks more than 2h in the 
> future consensus rule, I cannot see how one block with no header property 
> deviation can exist in multiple distinct chains.
>
> It cannot, that was my point: "(1) have distinct identity due to another 
> header property deviation, or (2) are the same block..."
>
> > The ambiguity I was referring was about a different angle, if the design 
> goal of introducing a 64 byte size check is to "it was about being able to 
> cache the hash of a (non-malleated) invalid block as permanently invalid to 
> avoid re-downloading and re-validating it", in my thinking we shall 
> consider the whole block headers caching strategy and be sure we don't get 
> situations where an attacker can attach a chain of low-pow block headers 
> with malleated committed valid tx data yielding a block invalidity at the 
> end, provoking as a side-effect a network-wide data download blowup. So I 
> think any implementation of the validation of a block validity, of which 
> identity is a sub-problem, should be strictly ordered by adequate 
> proof-of-work checks.
>
> This was already the presumption.
>
> >> We don't do this and I don't see how it would be relevant. If a peer 
> provides any invalid message or otherwise violates the protocol it is 
> simply dropped.
> >>
> >> The "problematic" that I'm referring to is the reliance on the block 
> hash as a message identifier, because it does not identify the message and 
> cannot be useful in an effectively unlimited number of zero-cost cases.
>
> > Historically, it was to isolate transaction-relay from block-relay to 
> optimistically harden in face of network partition, as this is easy to 
> infer transaction-relay topology with a lot of heuristics.
>
> I'm not seeing the connection here. Are you suggesting that tx and block 
> hashes may collide with each other? Or that that a block message may be 
> confused with a transaction message?
>
> > I think this is correct that block hash message cannot be relied on as 
> it cannot be useful in an unlimited number of zero-cost cases, as I was 
> pointing that bitcoin core partially mitigate that with discouraging 
> connections to block-relay peers servicing block messages 
> (`MaybePunishNodeForBlocks`).
>
> This does not mitigate the issue. It's essentially dead code. It's exactly 
> like saying, "there's an arbitrary number of holes in the bucket, but we 
> can plug a subset of those holes." Infinite minus any number is still 
> infinite.
>
> > I believe somehow the bottleneck we're circling around is 
> computationally definining what are the "usable" identifiers for block 
> messages. The most straightforward answer to this question is the full 
> block in one single peer message, at least in my perspective.
>
> I don't follow this statement. The term "usable" was specifically 
> addressing the proposal - that a header hash must uniquely identify a block 
> (a header and committed set of txs) as valid or otherwise. As I have 
> pointed out, this will still not be the case if 64 byte blocks are 
> invalidated. It is also not the case that detection of type64 malleated 
> blocks can be made more performant if 64 byte txs are globally invalid. In 
> fact the opposite is true, it becomes more costly (and complex) and is 
> therefore just dead code.
>
> > Reality since headers first synchronization (`getheaders`), block 
> validation has been dissociated in steps for performance reasons, among 
> others.
>
> Headers first only defers malleation checks. The same checks are necessary 
> whether you perform blocks first or headers first sync (we support both 
> protocol levels). The only difference is that for headers first, a stored 
> header might later become invalidated. However, this is the case with and 
> without the possibility of malleation.
>
> >> Again, this has no relation to tx hashes/identifiers. Libbitcoin has a 
> tx pool, we just don't store them in RAM (memory).
> >>
> >> I don't follow this. An invalid 64 byte tx consensus rule would 
> definitely not make it harder to exploit block message invalidity. In fact 
> it would just slow down validation by adding a redundant rule. Furthermore, 
> as I have detailed in a previous message, caching invalidity does 
> absolutely nothing to increase protection. In fact it makes the situation 
> materially worse.
>
> > Just to recall, in my understanding the proposal we're discussing is 
> about outlawing 64 bytes size transactions at the consensus-level to 
> minimize denial-of-service vectors during block validation. I think we're 
> talking about each other because the mempool already introduce a layer of 
> caching in bitcoin core, of which the result are re-used at block 
> validation, such as signature verification results. I'm not sure we can 
> fully waive apart performance considerations, though I agree implementation 
> architecture subsystems like mempool should only be a sideline 
> considerations.
>
> I have not suggested that anything is waived or ignored here. I'm stating 
> that there is no "mempool" performance benefit whatsoever to invalidating 
> 64 byte txs. Mempool caching could only rely on tx identifiers, not block 
> identifiers. Tx identifiers are not at issue.
>
> >> No, this is not the case. As I detailed in my previous message, there 
> is no possible scenario where invalidation caching does anything but make 
> the situation materially worse.
>
> > I think this can be correct that invalidation caching make the situation 
> materially worse, or is denial-of-service neutral, as I believe a full node 
> is only trading space for time resources in matters of block messages 
> validation. I still believe such analysis, as detailed in your previous 
> message, would benefit to be more detailed.
>
> I don't know how to add any more detail than I already have. There are 
> three relevant considerations:
>
> (1) block hashes will not become unique identifiers for block messages.
> (2) the earliest point at which type64 malleation can be detected will not 
> be reduced.
> (3) the necessary cost of type64 malleated determination will not be 
> reduced.
> (4) the additional consensus rule will increase validation cost and code 
> complexity.
> (5) invalid blocks can still be produced at no cost that require full 
> double tx hashing/Merkle root computations.
>
> Which of these statements are not evident at this point?
>
> >> On the other hand, just dealing with parse failure on the spot by 
> introducing a leading pattern in the stream just inflates the size of p2p 
> messages, and the transaction-relay bandwidth cost.
> >>
> >> I think you misunderstood me. I am suggesting no change to 
> serialization. I can see how it might be unclear, but I said, "nothing 
> precludes incorporating a requirement for a necessary leading pattern in 
> the stream." I meant that the parser can simply incorporate the 
> *requirement* that the byte stream starts with a null input point. That 
> identifies the malleation or invalidity without a single hash operation and 
> while only reading a handful of bytes. No change to any messages.
>
> > Indeed, this is clearer with the re-explanation above about what you 
> meant by the "null point".
>
> Ok
>
> > In my understanding, you're suggesting the following algorithm:
> > - receive transaction p2p messages
> > - deserialize transaction p2p messages
> > - if the transaction is a coinbase candidate, verify null input point
> > - if null input point pattern invalid, reject the transaction
>
> No, no part of this thread has any bearing on p2p transaction messages - 
> nor are coinbase transactions relayed as transaction messages. You could 
> restate it as:
>
> - receive block p2p messages
> - if the first tx's first input does not have a null point, reject the 
> block
>
> > If I'm understanding correctly, the last rule has for effect to 
> constraint the transaction space that can be used to brute-force and mount 
> a Merkle root forgery with a 64-byte coinbase transaction.
> >
> > As described in the 3.1.1 of the paper: 
> https://lists.linuxfoundation.org/pipermail/bitcoin-dev/attachments/20190225/a27d8837/attachment-0001.pdf
>
> The above approach makes this malleation computationally infeasible.
>
> >> I'm referring to DoS mitigation (the only relevant security 
> consideration here). I'm pointing out that invalidity caching is pointless 
> in all cases, and in this case is the most pointless as type64 malleation 
> is the cheapest of all invalidity to detect. I would prefer that all bogus 
> blocks sent to my node are of this type. The worst types of invalidity 
> detection have no mitigation and from a security standpoint are 
> counterproductive to cache. I'm describing what overall is actually not a 
> tradeoff. It's all negative and no positive.
>
> > I think we're both discussing the same issue about DoS mitigation for 
> sure. Again, I think that saying the "invalidity caching" is pointless in 
> all cases cannot be fully grounded as a statement without precising (a) 
> what is the internal cache(s) layout of the full node processing block 
> messages and (b) the sha256 mining resources available during N difficulty 
> period and if any miner engage in self-fish mining like strategy.
>
> It has nothing to do with internal cache layout and nothing to do with 
> mining resources. Not having a cache is clearly more efficient than having 
> a cache that provides no advantage, regardless of how the cache is laid 
> out. There is no cost to forcing a node to perform far more block 
> validation computations than can be precluded by invalidity caching. The 
> caching simply increases the overall computational cost (as would another 
> redundant rule to try and make it more efficient). Discarding invalid 
> blocks after the minimal amount of work is the most efficient resolution. 
> What one does with the peer at that point is orthogonal (e.g. drop, ban).
>
> > About (a), I'll maintain my point I think it's a classic time-space 
> trade-off to ponder in function of the internal cache layouts.
>
> An attacker can throw a nearly infinite number of distinct invalid blocks 
> at your node (and all will connect to the chain and show proper PoW). As 
> such you will encounter zero cache hits and therefore nothing but overhead 
> from the cache. Please explain to me in detail how "cache layout" is going 
> to make any difference at all.
>
> > About (b) I think we''ll be back to the headers synchronization strategy 
> as implemented by a full node to discuss if they're exploitable asymmetries 
> for self-fish mining like strategies.
>
> I don't see this as a related/relevant topic. There are zero mining 
> resources required to overflow the invalidity cache. Just as Core recently 
> published regarding overflowing to its "ban" store, resulting in process 
> termination, this then introduces another attack vector that must be 
> mitigated.
>
> > If you can give a pseudo-code example of the "null point" validation 
> implementation in libbitcoin code (?) I think this can make the 
> conversation more concrete on the caching aspect.
>
> pseudo-code , not from libbitcoin...
>
> ```
> bool malleated64(block)
> {
>     segregated = ((block[80 + 4] == 0) and (block[80 + 4 + 1] == 1))
>     return block[segregated ? 86 : 85] != 
> 0xffffffff0000000000000000000000000000000000000000000000000000000000000000
> }
> ```
>
> Obviously there is no error handling (e.g. block too small, too many 
> inputs, etc.) but that is not relevant to the particular question. The 
> block.header is fixed size, always 80 bytes. The tx.version is also fixed, 
> always 4 bytes. A following 0 implies a segregated witness (otherwise it's 
> the input count), assuming there is a following 1. The first and only input 
> for the coinbase tx, which must be the first block tx, follows. If it does 
> not match 
> 0xffffffff0000000000000000000000000000000000000000000000000000000000000000 
> then the block is invalid. If it does match, it is computationally 
> infeasible that the merkle root is type64 malleated. That's it, absolutely 
> trivial and with no prerequisites. The only thing that even makes it 
> interesting is the segwit bifurcation.
>
> >> Rust has its own set of problems. No need to get into a language Jihad 
> here. My point was to clarify that the particular question was not about a 
> C (or C++) null pointer value, either on the surface or underneath an 
> abstraction.
>
> > Thanks for the additional comments on libbitcoin usage of dependencies, 
> yes I don't think there is a need to get into a language jihad here. It's 
> just like all languages have their memory model (stack, dynamic alloc, 
> smart pointers, etc) and when you're talking about performance it's useful 
> to have their minds, imho.
>
> Sure, but no language difference that I'm aware of could have any bearing 
> on this particular question.
>
> Best,
> Eric
>

-- 
You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups•com.
To view this discussion visit https://groups.google.com/d/msgid/bitcoindev/78e8248d-bc77-452f-ac7e-19c28cbc3280n%40googlegroups.com.

[-- Attachment #1.2: Type: text/html, Size: 34481 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [bitcoindev] Re: Great Consensus Cleanup Revival
  2024-07-02  2:36                         ` Antoine Riard
  2024-07-03  1:07                           ` Larry Ruane
@ 2024-07-03  1:13                           ` Eric Voskuil
  1 sibling, 0 replies; 33+ messages in thread
From: Eric Voskuil @ 2024-07-03  1:13 UTC (permalink / raw)
  To: Bitcoin Development Mailing List

[-- Attachment #1.1: Type: text/plain, Size: 15974 bytes --]

Hi Antoine R,

>> Ok, thanks for clarifying. I'm still not making the connection to 
"checking a non-null [C] pointer" but that's prob on me.

> A C pointer, which is a language idiome assigning to a memory address A 
the value o memory address B can be 0 (or NULL a standard macro defined in 
stddef.h).
> Here a snippet example of linked list code checking the pointer 
(`*begin_list`) is non null before the comparison operation to find the 
target element list.
> ...
> While both libbitcoin and bitcoin core are both written in c++, you still 
have underlying pointer derefencing playing out to access the coinbase 
transaction, and all underlying implications in terms of memory management.

I'm familiar with pointers ;).

While at some level the block message buffer would generally be referenced 
by one or more C pointers, the difference between a valid coinbase input 
(i.e. with a "null point") and any other input, is not nullptr vs. 
!nullptr. A "null point" is a 36 byte value, 32 0x00 byes followed by 4 
0xff bytes. In his infinite wisdom Satoshi decided it was better (or 
easier) to serialize a first block tx (coinbase) with an input containing 
an unusable script and pointing to an invalid [tx:index] tuple (input 
point) as opposed to just not having any input. That invalid input point is 
called a "null point", and of course cannot be pointed to by a "null 
pointer". The coinbase must be identified by comparing those 36 bytes to 
the well-known null point value (and if this does not match the Merkle hash 
cannot have been type64 malleated).

> I think it's interesting to point out the two types of malleation that a 
bitcoin consensus validation logic should respect w.r.t block validity 
checks. Like you said the first one on the merkle root committed in the 
headers's `hashMerkleRoot` due to the lack of domain separation between 
leaf and merkle tree nodes.

We call this type64 malleability (or malleation where it is not only 
possible but occurs).

> The second one is the bip141 wtxid commitment in one of the coinbase 
transaction `scriptpubkey` output, which is itself covered by a txid in the 
merkle tree.

While symmetry seems to imply that the witness commitment would be 
malleable, just as the txs commitment, this is not the case. If the tx 
commitment is correct it is computationally infeasible for the witness 
commitment to be malleated, as the witness commitment incorporates each 
full tx (with witness, sentinel, and marker). As such the block identifier, 
which relies only on the header and tx commitment, is a sufficient 
identifier. Yet it remains necessary to validate the witness commitment to 
ensure that the correct witness data has been provided in the block message.

The second type of malleability, in addition to type64, is what we call 
type32. This is the consequence of duplicated trailing sets of txs (and 
therefore tx hashes) in a block message. This is applicable to some but not 
all blocks, as a function of the number of txs contained.

>> Caching identity in the case of invalidity is more interesting question 
than it might seem.
>> Background: A fully-validated block has established identity in its 
block hash. However an invalid block message may include the same block 
header, producing the same hash, but with any kind of nonsense following 
the header. The purpose of the transaction and witness commitments is of 
course to establish this identity, so these two checks are therefore 
necessary even under checkpoint/milestone. And then of course the two 
Merkle tree issues complicate the tx commitment (the integrity of the 
witness commitment is assured by that of the tx commitment).
>>
>> So what does it mean to speak of a block hash derived from:
>> (1) a block message with an unparseable header?
>> (2) a block message with parseable but invalid header?
>> (3) a block message with valid header but unparseable tx data?
>> (4) a block message with valid header but parseable invalid uncommitted 
tx data?
>> (5) a block message with valid header but parseable invalid malleated 
committed tx data?
>> (6) a block message with valid header but parseable invalid unmalleated 
committed tx data?
>> (7) a block message with valid header but uncommitted valid tx data?
>> (8) a block message with valid header but malleated committed valid tx 
data?
>> (9) a block message with valid header but unmalleated committed valid tx 
data?
>>
>> Note that only the #9 p2p block message contains an actual Bitcoin 
block, the others are bogus messages. In all cases the message can be 
sha256 hashed to establish the identity of the *message*. And if one's 
objective is to reject repeating bogus messages, this might be a useful 
strategy. It's already part of the p2p protocol, is orders of magnitude 
cheaper to produce than a Merkle root, and has no identity issues.

> I think I mostly agree with the identity issue as laid out so far, there 
is one caveat to add if you're considering identity caching as the problem 
solved. A validation node might have to consider differently block messages 
processed if they connect on the longest most PoW valid chain for which all 
blocks have been validated. Or alternatively if they have to be added on a 
candidate longest most PoW valid chain.

Certainly an important consideration. We store both types. Once there is a 
stronger candidate header chain we store the headers and proceed to 
obtaining the blocks (if we don't already have them). The blocks are stored 
in the same table; the confirmed vs. candidate indexes simply point to them 
as applicable. It is feasible (and has happened twice) for two blocks to 
share the very same coinbase tx, even with either/all bip30/34/90 active 
(and setting aside future issues here for the sake of simplicity). This 
remains only because two competing branches can have blocks at the same 
height, and bip34 requires only height in the coinbase input script. This 
therefore implies the same transaction but distinct blocks. It is however 
infeasible for one block to exist in multiple distinct chains. In order for 
this to happen two blocks at the same height must have the same coinbase 
(ok), and also the same parent (ok). But this then means that they either 
(1) have distinct identity due to another header property deviation, or (2) 
are the same block with the same parent and are therefore in just one 
chain. So I don't see an actual caveat. I'm not certain if this is the 
ambiguity that you were referring to. If not please feel free to clarify.

>> The concept of Bitcoin block hash as unique identifier for invalid p2p 
block messages is problematic. Apart from the malleation question, what is 
the Bitcoin block hash for a message with unparseable data (#1 and #3)? 
Such messages are trivial to produce and have no block hash.

> For reasons, bitcoin core has the concept of outbound `BLOCK_RELAY` (in 
`src/node/connection_types.h`) where some preferential peering policy is 
applied in matters of block messages download.

We don't do this and I don't see how it would be relevant. If a peer 
provides any invalid message or otherwise violates the protocol it is 
simply dropped.

The "problematic" that I'm referring to is the reliance on the block hash 
as a message identifier, because it does not identify the message and 
cannot be useful in an effectively unlimited number of zero-cost cases.

>> What is the useful identifier for a block with malleated commitments (#5 
and #8) or invalid commitments (#4 and #7) - valid txs or otherwise?

> The block header, as it commits to the transaction identifier tree can be 
useful as much for #4 and #5.

#4 and #5 refer to "uncommitted" and "malleated committed". It may not be 
clear, but "uncommitted" means that the tx commitment is not valid (Merkle 
root doesn't match the header's value) and "malleated committed" means that 
the (matching) commitment cannot be relied upon because the txs represent 
malleation, invalidating the identifier. So neither of these are usable 
identifiers.

> On the bitcoin core side, about #7 the uncommitted valid tx data can be 
already present in the validation cache from mempool acceptance. About #8, 
the malleaed committed valid transactions shall be also committed in the 
merkle root in headers.

It seems you may be referring to "unconfirmed" txs as opposed to 
"uncommitted" txs. This doesn't pertain to tx storage or identifiers. 
Neither #7 nor #8 are usable for the same reasons.

>> This seems reasonable at first glance, but given the list of scenarios 
above, which does it apply to?

>> This seems reasonable at first glance, but given the list of scenarios 
above, which does it apply to? Presumably the invalid header (#2) doesn't 
get this far because of headers-first.
>> That leaves just invalid blocks with useful block hash identifiers (#6). 
In all other cases the message is simply discarded. In this case the 
attempt is to move category #5 into category #6 by prohibiting 64 byte txs.

> Yes, it's moving from the category #5 to the category #6. Note, 
transaction malleability can be a distinct issue than lack of domain 
separation.

I'm making no reference to tx malleability. This concerns only Merkle tree 
(block hash) malleability, the two types described in detail in the paper I 
referenced earlier, here again:

https://lists.linuxfoundation.org/pipermail/bitcoin-dev/attachments/20190225/a27d8837/attachment-0001.pdf

>> The requirement to "avoid re-downloading and re-validating it" is about 
performance, presumably minimizing initial block download/catch-up time. 
There is a > computational cost to producing 64 byte malleations and none 
for any of the other bogus block message categories above, including the 
other form of malleation. > Furthermore, 64 byte malleation has almost zero 
cost to preclude. No hashing and not even true header or tx parsing are 
required. Only a handful of bytes must be read > from the raw message 
before it can be discarded presently.

>> That's actually far cheaper than any of the other scenarios that again, 
have no cost to produce. The other type of malleation requires parsing all 
of the txs in the block and > hashing and comparing some or all of them. In 
other words, if there is an attack scenario, that must be addressed before 
this can be meaningful. In fact all of the other bogus message scenarios 
(with tx data) will remain more expensive to discard than this one.

> In practice on the bitcoin core side, the bogus block message categories 
from #4 to #6 are already mitigated by validation caching for transactions 
that have been received early. While libbitcoin has no mempool (at least in 
earlier versions) transactions buffering can be done by bip152's 
HeadersAndShortIds message.

Again, this has no relation to tx hashes/identifiers. Libbitcoin has a tx 
pool, we just don't store them in RAM (memory).

> About #7 and #8, introducing a domain separation where 64 bytes 
transactions are rejected and making it harder to exploit #7 and #8 
categories of bogus block messages. This is correct that bitcoin core might 
accept valid transaction data before the merkle tree commitment has been 
verified.

I don't follow this. An invalid 64 byte tx consensus rule would definitely 
not make it harder to exploit block message invalidity. In fact it would 
just slow down validation by adding a redundant rule. Furthermore, as I 
have detailed in a previous message, caching invalidity does absolutely 
nothing to increase protection. In fact it makes the situation materially 
worse.

>> The problem arises from trying to optimize dismissal by storing an 
identifier. Just *producing* the identifier is orders of magnitude more 
costly than simply dismissing this > bogus message. I can't imagine why any 
implementation would want to compute and store and retrieve and recompute 
and compare hashes when the alterative is just dismissing the bogus 
messages with no hashing at all.

>> Bogus messages will arrive, they do not even have to be requested. The 
simplest are dealt with by parse failure. What defines a parse is entirely 
subjective. Generally it's
>> "structural" but nothing precludes incorporating a requirement for a 
necessary leading pattern in the stream, sort of like how the witness 
pattern is identified. If we were
>> going to prioritize early dismissal this is where we would put it.

> I don't think this is that simple - While producing an identifier comes 
with a computational cost (e.g fixed 64-byte structured coinbase 
transaction), if the full node have a hierarchy of validation cache like 
bitcoin core has already, the cost of bogus block messages can be slashed 
down.

No, this is not the case. As I detailed in my previous message, there is no 
possible scenario where invalidation caching does anything but make the 
situation materially worse.

> On the other hand, just dealing with parse failure on the spot by 
introducing a leading pattern in the stream just inflates the size of p2p 
messages, and the transaction-relay bandwidth cost.

I think you misunderstood me. I am suggesting no change to serialization. I 
can see how it might be unclear, but I said, "nothing precludes 
incorporating a requirement for a necessary leading pattern in the stream." 
I meant that the parser can simply incorporate the *requirement* that the 
byte stream starts with a null input point. That identifies the malleation 
or invalidity without a single hash operation and while only reading a 
handful of bytes. No change to any messages.

>> However, there is a tradeoff in terms of early dismissal. Looking up 
invalid hashes is a costly tradeoff, which becomes multiplied by every 
block validated. For example, expending 1 millisecond in hash/lookup to 
save 1 second of validation time in the failure case seems like a 
reasonable tradeoff, until you multiply across the whole chain. > 1 ms 
becomes 14 minutes across the chain, just to save a second for each mallied 
block encountered. That means you need to have encountered 840 such mallied 
blocks > just to break even. Early dismissing the block for non-null 
coinbase point (without hashing anything) would be on the order of 1000x 
faster than that (breakeven at 1 > encounter). So why the block hash cache 
requirement? It cannot be applied to many scenarios, and cannot be optimal 
in this one.

> I think what you're describing is more a classic time-space tradeoff 
which is well-known in classic computer science litterature. In my 
reasonable opinion, one should more reason under what is the security 
paradigm we wish for bitcoin block-relay network and perduring 
decentralization, i.e one where it's easy to verify block messages proofs 
which could have been generated on specialized hardware with an asymmetric 
cost. Obviously encountering 840 such malliead blocks to make it break even 
doesn't make the math up to save on hash lookup, unless you can reduce the 
attack scenario in terms of adversaries capabilities.

I'm referring to DoS mitigation (the only relevant security consideration 
here). I'm pointing out that invalidity caching is pointless in all cases, 
and in this case is the most pointless as type64 malleation is the cheapest 
of all invalidity to detect. I would prefer that all bogus blocks sent to 
my node are of this type. The worst types of invalidity detection have no 
mitigation and from a security standpoint are counterproductive to cache. 
I'm describing what overall is actually not a tradeoff. It's all negative 
and no positive.

Best,
Eric

-- 
You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups•com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/d9834ad5-f803-4a39-a854-95b2439738f5n%40googlegroups.com.

[-- Attachment #1.2: Type: text/html, Size: 16846 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [bitcoindev] Re: Great Consensus Cleanup Revival
  2024-06-28 17:14             ` Eric Voskuil
  2024-06-29  1:06               ` Antoine Riard
@ 2024-07-02 10:23               ` 'Antoine Poinsot' via Bitcoin Development Mailing List
  2024-07-02 15:57                 ` Eric Voskuil
  1 sibling, 1 reply; 33+ messages in thread
From: 'Antoine Poinsot' via Bitcoin Development Mailing List @ 2024-07-02 10:23 UTC (permalink / raw)
  To: Eric Voskuil; +Cc: Bitcoin Development Mailing List

[-- Attachment #1: Type: text/plain, Size: 9915 bytes --]

>>> This does not produce unmalleable block hashes. Duplicate tx hash malleation remains in either case, to the same effect. Without a resolution to both issues this is an empty promise.
>> Duplicate txids have been invalid since 2012 (CVE-2012-2459).
>
> I think again here you may have misunderstood me. I was not making a point pertaining to BIP30.

No, in fact you did. CVE-2012-2459 is unrelated to BIP30, it's the duplicate txids malleability found by forrestv in 2012. It's the one you are talking about thereafter and the one relevant for the purpose of this discussion.

For future reference, the full disclosure of CVE-2012-2459 can be found there: https://bitcointalk.org/?topic=102395.

> The proposal does not enable that objective, it is already the case. No malleated block is a valid block.

You are right. The advantage i initially mentioned about how making 64-bytes transactions invalid could help caching block failures at an earlier stage is incorrect.

Best,
Antoine Poinsot
On Friday, June 28th, 2024 at 7:14 PM, Eric Voskuil <eric@voskuil•org> wrote:

>>> It is not clear to me how determining the coinbase size can be done at an earlier stage of validation than detection of the non-null coinbase.
>> My point wasn't about checking the coinbase size, it was about being able to cache the hash of a (non-malleated) invalid block as permanently invalid to avoid re-downloading and re-validating it.
>
> This I understood, but I think you misunderstood me. Your point was specifically that, "it would let node implementations cache block failures at an earlier stage of validation." Since you have not addressed that aspect I assume you agree with my assertion above that the proposed rule does not actually achieve this.
>
> Regarding the question of checking coinbase size, the issue is of detecting (or preventing) hashes mallied via the 64 byte tx technique. A rule against 64 byte txs would allow this determination by checking the coinbase alone. If the coinbase is 64 bytes the block is invalid, if it is not the block hash cannot have been mallied (all txs must have been 64 bytes, see previous reference).
>
> In that case if the block is invalid the invalidity can be cached. But block invalidity cannot actually be cached until the block is fully validated. A rule to prohibit *all* 64 byte txs is counterproductive as it only adds additional checks on typically thousands of txs per block, serving no purpose.
>
>>> It seems to me that introducing an arbitrary tx size validity may create more potential implementation bugs than it resolves.
>> The potential for implementation bugs is a fair point to raise, but in this case i don't think it's a big concern. Verifying no transaction in a block is 64 bytes is as simple a check as you can get.
>
> You appear to be making the assumption that the check is performed after the block is fully parsed (contrary to your "earlier" criterion above). The only way to determine the tx sizes is to parse each tx for witness marker, input count, output count, input script sizes, output script sizes, witness sizes, and skipping over the header, several constants, and associated buffers. Doing this "early" to detect malleation is an extraordinarily complex and costly process. On the other hand, as I pointed out, a rational implementation would only do this early check for the coinbase.
>
> Yet even determining the size of the coinbase is significantly more complex and costly than checking its first input point against null. That check (which is already necessary for validation) resolves the malleation question, can be performed on the raw unparsed block buffer by simply skipping header, version, reading input count and witness marker as necessary, offsetting to the 36 byte point buffer, and performing a byte comparison against [0000000000000000000000000000000000000000000000000000000000000000ffffffff].
>
> This is:
>
> (1) earlier
> (2) faster
> (3) simpler
> (4) already consensus
>
>>> And certainly anyone implementing such a verifier must know many intricacies of the protocol.
>> They need to know some, but i don't think it's reasonable to expect them to realize the merkle tree construction is such that an inner node may be confused with a 64 bytes transaction.
>
> A protocol developer needs to understand that the hash of an invalid block cannot be cached unless at least the coinbase has been restricted in size (under the proposal) -or- that the coinbase is a null point (presently or under the proposal). In the latter case the check is already performed in validation, so there is no way a block would presently be cached as invalid without checking it. The proposal adds a redundant check, even if limited to just the coinbase. [He must also understand the second type of malleability, discussed below.]
>
> If this proposed rule was to activate we would implement it in a late stage tx.check, after txs/blocks had been fully deserialized. We would not check it an all in the case where the block is under checkpoint or milestone ("assume valid"). In this case we would retain the early null point malleation check (along with the hash duplication malleation check) that we presently have, would validate tx commitments, and commit the block. In other words, the proposal adds unnecessary late stage checks only. Implementing it otherwise would just add complexity and hurt performance.
>
>>> I do not see this. I see a very ugly perpetual seam which will likely result in unexpected complexities over time.
>> What makes you think making 64 bytes transactions invalid could result in unexpected complexities? And why do you think it's likely?
>
> As described above, it's later, slower, more complex, unnecessarily broad, and a consensus change. Beyond that it creates an arbitrary size limit - not a lower or upper bound, but a slice out of the domain. Discontinuities are inherent complexities in computing. The "unexpected" part speaks for itself.
>
>>> This does not produce unmalleable block hashes. Duplicate tx hash malleation remains in either case, to the same effect. Without a resolution to both issues this is an empty promise.
>> Duplicate txids have been invalid since 2012 (CVE-2012-2459).
>
> I think again here you may have misunderstood me. I was not making a point pertaining to BIP30. I was referring to the other form of block hash malleability, which results from duplicating sets of trailing txs in a single block (see previous reference). This malleation vector remains, even with invalid 64 byte txs. As I pointed out, this has the "same effect" as the 64 byte tx issue. Merkle hashing the set of txs is insufficient to determine identity. In one case the coinbase must be checked (null point or size) and in the other case the set of tx hashes must be checked for trailing duplicated sets. [Core performs this second check within the Merkle hashing algorithm (with far more comparisons than necessary), though this can be performed earlier and independently to avoid any hashing in the malleation case.]
>
> I would also point out in the interest of correctness that Core reverted its BIP30 soft fork implementation as a consequence of the BIP90 hard fork, following and requiring the BIP34 soft fork that presumably precluded it but didn't, so it is no longer the case that duplicate tx hashes are invalid in implementation. As you have proposed in this rollup, this requires fixing again.
>
>> If 64 bytes transactions are also made invalid, this would make it impossible for two valid blocks to have the same hash.
>
> Aside from the BIP30/34/90 issue addressed above, it is already "impossible" (cannot be stronger than computationally infeasible) for two *valid* blocks to have the same hash. The proposal does not enable that objective, it is already the case. No malleated block is a valid block.
>
> The proposal aims only to make it earlier or easier or faster to check for block hash malleation. And as I've pointed out above, it doesn't achieve those objectives. Possibly the perception that this would be the case is a consequence of implementation details, but as I have shown above, it is not in fact the case.
>
> Given either type of malleation, the malleated block can be determined to be invalid by a context free check. But this knowledge cannot ever be cached against the block hash, since the same hash may be valid. Invalidity can only be cached once a non-mallied block is validated and determined to be invalid. Block hash malleations are and will remain invalid blocks with or without the proposal, and it will continue to be necessary to avoid caching invalid against the malleation. As you said:
>
>> it was about being able to cache the hash of a (non-malleated) invalid block as permanently invalid to avoid re-downloading and re-validating it.
>
> This is already the case, and requires validating the full non-malleated block. Adding a redundant invalidity check doesn't improve this in any way.
>
> Best,
> Eric
>
> --
> You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups•com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/9a4c4151-36ed-425a-a535-aa2837919a04n%40googlegroups.com.

-- 
You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups•com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/wg_er0zMhAF9ERoYXmxI6aB7rc97Cum6PQj4UOELapsHVBBVWktFeOZT7sHDlyrXwJ5o5s9iMb2LW2Od-qacywsh-86p5Q7dP3XjWASXcMw%3D%40protonmail.com.

[-- Attachment #2: Type: text/html, Size: 12119 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [bitcoindev] Re: Great Consensus Cleanup Revival
  2024-07-02 10:23               ` 'Antoine Poinsot' via Bitcoin Development Mailing List
@ 2024-07-02 15:57                 ` Eric Voskuil
  0 siblings, 0 replies; 33+ messages in thread
From: Eric Voskuil @ 2024-07-02 15:57 UTC (permalink / raw)
  To: Bitcoin Development Mailing List

[-- Attachment #1.1: Type: text/plain, Size: 5598 bytes --]

>>>> This does not produce unmalleable block hashes. Duplicate tx hash 
malleation remains in either case, to the same effect. Without a resolution 
to both issues this is an empty promise.

>>> Duplicate txids have been invalid since 2012 (CVE-2012-2459).

>> I think again here you may have misunderstood me. I was not making a 
point pertaining to BIP30.

> No, in fact you did. CVE-2012-2459 is unrelated to BIP30, it's the 
duplicate txids malleability found by forrestv in 2012. It's the one you 
are talking about thereafter and the one relevant for the purpose of this 
discussion.

Yes, my mistake. I didn't look up the CVE because malleability has no 
affect on consensus rules (validity). Without BIP30/34/90 a duplicated 
tx/txid (in a given chain) would still be valid (and under the caveats 
previously mentioned, still is). So I assumed you were referring to 
it/them. Malleability pertains strictly to validation implementation 
shortcuts (checkpoints, milestones, invalidity caching), not what is 
actually valid.

>> The proposal does not enable that objective, it is already the case. No 
malleated block is a valid block.

> You are right. The advantage i initially mentioned about how making 
64-bytes transactions invalid could help caching block failures at an 
earlier stage is incorrect.

Hopefully the discussion leads to simpler and more performant 
implementation. As I mentioned previously, the usefulness (i.e. performance 
improving outcome) of block hash invalidity caching is very limited.

Libbitcoin implements an append-only store. And we write a checkpointed, 
milestoned, or current/strong header chains before obtaining blocks. So in 
the case where an invalid block corresponds to a stored header we must 
store the header's invalidity. Obviously this is guarded by PoW and 
therefore extremely rare, but must be accounted for. Otherwise we do not 
under any circumstances store invalidity. This is far more effective than 
storing it, even under heavy/constant "attack".

Given the PoW guard, the worst case scenario is where the witness 
commitment is invalid (it is performed after tx commitment, because it 
relies on the coinbase tx commit). Next worse is where the tx commitment is 
invalid. Neither present any cost to the attacker and neither rely on 
Merkle tree malleability. The latter requires hashing every tx and 
performing the Merkle root calculation. The former requires doing this 
twice. For a block with 4096 txs, that's [2 * (4096 + 4095) = 16382] tx 
hashes.

While that's nothing to sneeze at, in our implementation this constitutes 
1-2% of total sync time on my 7 year old machine (no shani and no avx512). 
But what if we were to cache every invalid hash? Let's say we're under 
constant attack (despite dropping any peer that provides an 
invalid/unrequested block/message). The smart attacker doesn't use 
malleation, since he knows this is mitigated and cheaper in both cases to 
guard against. He just sends block messages with requested headers and a 
maximal set of valid txs (maybe from that actual block) and modifies one 
byte of any witness (or of any script for non-witness blocks). Every time 
sending a unique block, of which he can produce an effectively unlimited 
quantity. With or without caching this requires computation of all 16382 
hashes for each bogus block that includes a requested header (unrequested 
are dismissed at the cost of just one hash).

In this case there is never a cache hit. Each bogus block is unique, but 
"valid enough" to force full double Merkle root computations. Storing the 
cached invalid hash then absorbs additional time and 32 bytes of space plus 
indexation, and achieves nothing. It's as if the hope is that the attacker 
is dumb and just keeps sending the same invalid block. But what's actually 
happening as (1) deoptimization, (2) unnecessary complexity, and (3) 
exposure to a disk-full attack vector which must then also be mitigated.

The other scenarios where parse fails cannot rely on invalidity caching, 
since they don't produce valid commitments, and are dismissed cheaply. That 
leaves only malleability. This comes in two forms, the 64 byte form 
("type64") and what we call "type32" (hashes are 32 bytes and in this form 
they are duplicated). Type64 malleation is the cheapest form of dismissal, 
very early in parse (as discussed). Type32 malleation is far more 
expensive, but no more so than the worst case scenario above. In the Core 
implementation this detection adds a constant (and unnecessarily high) cost 
to the Merkle root computation. This makes it *more* expensive to detect 
than the worst case non-witness scenario above (and its discovery cannot be 
cached). It is possible to reduce this cost significantly by relying on 
some simple math operating over the tx count. So even this scenario is not 
inherently worst case.

So unless one is caching invalidity under PoW and due to an append-only 
store, I can see no reason to ever do it. Getting rid of it would improve 
both performance and security while reducing complexity. Optimally 
dismissing both types of malleation as described would improve performance, 
but is neutral regarding security.

e

-- 
You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups•com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bitcoindev/c8f285b3-bcc4-43f3-b9d8-06fe23ee8303n%40googlegroups.com.

[-- Attachment #1.2: Type: text/html, Size: 5982 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2024-12-05 21:35 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-03-24 18:10 [bitcoindev] Great Consensus Cleanup Revival 'Antoine Poinsot' via Bitcoin Development Mailing List
2024-03-26 19:11 ` [bitcoindev] " Antoine Riard
2024-03-27 10:35   ` 'Antoine Poinsot' via Bitcoin Development Mailing List
2024-03-27 18:57     ` Antoine Riard
2024-04-18  0:46     ` Mark F
2024-04-18 10:04       ` 'Antoine Poinsot' via Bitcoin Development Mailing List
2024-04-25  6:08         ` Antoine Riard
2024-04-30 22:20           ` Mark F
2024-05-06  1:10             ` Antoine Riard
2024-07-20 21:39     ` Murad Ali
2024-06-17 22:15 ` Eric Voskuil
2024-06-18  8:13   ` 'Antoine Poinsot' via Bitcoin Development Mailing List
2024-06-18 13:02     ` Eric Voskuil
2024-06-21 13:09       ` 'Antoine Poinsot' via Bitcoin Development Mailing List
2024-06-24  0:35         ` Eric Voskuil
2024-06-27  9:35           ` 'Antoine Poinsot' via Bitcoin Development Mailing List
2024-06-28 17:14             ` Eric Voskuil
2024-06-29  1:06               ` Antoine Riard
2024-06-29  1:31                 ` Eric Voskuil
2024-06-29  1:53                   ` Antoine Riard
2024-06-29 20:29                     ` Eric Voskuil
2024-06-29 20:40                       ` Eric Voskuil
2024-07-02  2:36                         ` Antoine Riard
2024-07-03  1:07                           ` Larry Ruane
2024-07-03 23:29                             ` Eric Voskuil
2024-07-04 13:20                               ` Antoine Riard
2024-07-04 14:45                                 ` Eric Voskuil
2024-07-18 17:39                                   ` Antoine Riard
2024-07-20 20:29                                     ` Eric Voskuil
2024-11-28  5:18                                       ` Antoine Riard
2024-07-03  1:13                           ` Eric Voskuil
2024-07-02 10:23               ` 'Antoine Poinsot' via Bitcoin Development Mailing List
2024-07-02 15:57                 ` Eric Voskuil

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox