public inbox for bitcoindev@googlegroups.com
From: Eric Voskuil <eric@voskuil•org>
To: Bitcoin Development Mailing List <bitcoindev@googlegroups.com>
Subject: Re: [bitcoindev] Re: Great Consensus Cleanup Revival
Date: Tue, 2 Jul 2024 08:57:39 -0700 (PDT)
Message-ID: <c8f285b3-bcc4-43f3-b9d8-06fe23ee8303n@googlegroups.com>
In-Reply-To: <wg_er0zMhAF9ERoYXmxI6aB7rc97Cum6PQj4UOELapsHVBBVWktFeOZT7sHDlyrXwJ5o5s9iMb2LW2Od-qacywsh-86p5Q7dP3XjWASXcMw=@protonmail.com>



>>>> This does not produce unmalleable block hashes. Duplicate tx hash 
malleation remains in either case, to the same effect. Without a resolution 
to both issues this is an empty promise.

>>> Duplicate txids have been invalid since 2012 (CVE-2012-2459).

>> I think again here you may have misunderstood me. I was not making a 
point pertaining to BIP30.

> No, in fact you did. CVE-2012-2459 is unrelated to BIP30, it's the 
duplicate txids malleability found by forrestv in 2012. It's the one you 
are talking about thereafter and the one relevant for the purpose of this 
discussion.

Yes, my mistake. I didn't look up the CVE because malleability has no 
effect on consensus rules (validity). Without BIP30/34/90 a duplicated 
tx/txid (in a given chain) would still be valid (and under the caveats 
previously mentioned, still is). So I assumed you were referring to 
it/them. Malleability pertains strictly to validation implementation 
shortcuts (checkpoints, milestones, invalidity caching), not what is 
actually valid.

>> The proposal does not enable that objective, it is already the case. No 
malleated block is a valid block.

> You are right. The advantage i initially mentioned about how making 
64-bytes transactions invalid could help caching block failures at an 
earlier stage is incorrect.

Hopefully the discussion leads to simpler and more performant 
implementation. As I mentioned previously, the usefulness (i.e. performance 
improving outcome) of block hash invalidity caching is very limited.

Libbitcoin implements an append-only store, and we write a checkpointed, 
milestoned, or current/strong header chain before obtaining blocks. So in 
the case where an invalid block corresponds to a stored header we must 
store the header's invalidity. Obviously this is guarded by PoW and 
therefore extremely rare, but must be accounted for. Otherwise we do not 
under any circumstances store invalidity. This is far more effective than 
storing it, even under heavy/constant "attack".

Given the PoW guard, the worst case scenario is where the witness 
commitment is invalid (it is checked after the tx commitment, because it 
relies on the coinbase tx commit). Next worst is where the tx commitment is 
invalid. Neither presents any cost to the attacker and neither relies on 
Merkle tree malleability. The latter requires hashing every tx and 
performing the Merkle root calculation. The former requires doing this 
twice. For a block with 4096 txs, that's [2 * (4096 + 4095) = 16382] 
hashes (4096 tx hashes plus 4095 Merkle node hashes, done twice).
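
The count above can be reproduced with a short sketch. This is only an 
illustration, not Libbitcoin or Core code: the function names and the 
stand-in transaction bytes are hypothetical, each double-SHA256 is counted 
as one hash, and details such as the coinbase's fixed witness hash are 
ignored.

```python
import hashlib

def sha256d(data: bytes) -> bytes:
    """Bitcoin-style double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_root_counted(leaves):
    """Return (root, node_hashes) for a non-empty list of 32-byte leaves."""
    level = list(leaves)
    node_hashes = 0
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # odd level: duplicate the last hash
        level = [sha256d(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
        node_hashes += len(level)
    return level[0], node_hashes

def commitment_hash_count(tx_count: int, witness: bool = True) -> int:
    """Hashes needed to check the tx commitment (and witness commitment)."""
    # Hypothetical stand-in transactions; only the count matters here.
    txs = [i.to_bytes(4, "little") for i in range(tx_count)]
    leaves = [sha256d(tx) for tx in txs]      # one hash per tx
    _, node_hashes = merkle_root_counted(leaves)
    per_commitment = tx_count + node_hashes   # 4096 + 4095 for 4096 txs
    return per_commitment * (2 if witness else 1)

print(commitment_hash_count(4096))                 # 16382
print(commitment_hash_count(4096, witness=False))  # 8191
```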

While that's nothing to sneeze at, in our implementation this constitutes 
1-2% of total sync time on my 7-year-old machine (no shani and no avx512). 
But what if we were to cache every invalid hash? Let's say we're under 
constant attack (despite dropping any peer that provides an 
invalid/unrequested block/message). The smart attacker doesn't use 
malleation, since he knows this is mitigated and cheaper in both cases to 
guard against. He just sends block messages with requested headers and a 
maximal set of valid txs (maybe from that actual block) and modifies one 
byte of any witness (or of any script for non-witness blocks). Every time 
sending a unique block, of which he can produce an effectively unlimited 
quantity. With or without caching this requires computation of all 16382 
hashes for each bogus block that includes a requested header (unrequested 
are dismissed at the cost of just one hash).

In this case there is never a cache hit. Each bogus block is unique, but 
"valid enough" to force full double Merkle root computations. Storing the 
cached invalid hash then absorbs additional time and 32 bytes of space plus 
indexation, and achieves nothing. It's as if the hope is that the attacker 
is dumb and just keeps sending the same invalid block. But what's actually 
happening is (1) deoptimization, (2) unnecessary complexity, and (3) 
exposure to a disk-full attack vector which must then also be mitigated.
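
The "never a cache hit" argument can be sketched as a toy simulation. 
Everything here is hypothetical: the cache is a plain set, its key is 
assumed to be a digest over the whole bogus block message (the message does 
not specify what would be keyed), and the block contents are stand-in bytes.

```python
import hashlib

def sha256d(data: bytes) -> bytes:
    """Bitcoin-style double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def simulate(attempts: int):
    """Return (cache_hits, full_validations, cache_bytes)."""
    invalid_cache = set()  # hypothetical invalidity cache of 32-byte keys
    hits = 0
    full_validations = 0
    # Stand-in for a requested header plus a maximal set of valid txs.
    base_block = b"requested-header" + b"valid-tx" * 8
    for n in range(attempts):
        # The attacker changes one field, so every block is unique.
        bogus = base_block + n.to_bytes(8, "little")
        key = sha256d(bogus)
        if key in invalid_cache:
            hits += 1      # would skip validation; never reached here
            continue
        full_validations += 1   # both commitments must be recomputed
        invalid_cache.add(key)  # costs time and 32 bytes, plus indexing
    return hits, full_validations, len(invalid_cache) * 32

print(simulate(1000))  # (0, 1000, 32000)
```

Under these assumptions the cache only grows: every attempt still pays for 
full validation, and the stored keys are never consulted successfully.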

The other scenarios where parse fails cannot rely on invalidity caching, 
since they don't produce valid commitments, and are dismissed cheaply. That 
leaves only malleability. This comes in two forms: the 64-byte form 
("type64") and what we call "type32" (hashes are 32 bytes and in this form 
they are duplicated). Type64 malleation is the cheapest form of dismissal, 
very early in parse (as discussed). Type32 malleation is far more 
expensive, but no more so than the worst case scenario above. In the Core 
implementation this detection adds a constant (and unnecessarily high) cost 
to the Merkle root computation. This makes it *more* expensive to detect 
than the worst case non-witness scenario above (and its discovery cannot be 
cached). It is possible to reduce this cost significantly by relying on 
some simple math operating over the tx count. So even this scenario is not 
inherently worst case.
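
A minimal sketch of the two forms follows, under stated assumptions. The 
type32 check shown is a pairwise-equality test at every tree level, which 
adds comparisons throughout the Merkle computation; it is not the reduced 
tx-count-based approach mentioned above, which this message does not spell 
out. The names `merkle_root` and `is_type64_candidate` are hypothetical, and 
the type64 check assumes the caller passes the non-witness serialization.

```python
import hashlib

def sha256d(data: bytes) -> bytes:
    """Bitcoin-style double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_root(leaves):
    """Return (root, mutated) for a non-empty list of 32-byte leaves."""
    level = list(leaves)
    mutated = False
    while len(level) > 1:
        # Type32 check: an identical adjacent pair means the same root
        # could also come from a shorter, non-duplicated list.
        for i in range(0, len(level) - 1, 2):
            if level[i] == level[i + 1]:
                mutated = True
        if len(level) % 2:
            level.append(level[-1])  # odd level: duplicate the last hash
        level = [sha256d(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0], mutated

def is_type64_candidate(tx_serialized: bytes) -> bool:
    """A 64-byte tx is the same size as a pair of concatenated hashes."""
    return len(tx_serialized) == 64

a, b, c = (sha256d(x) for x in (b"a", b"b", b"c"))
honest_root, honest_flag = merkle_root([a, b, c])
forged_root, forged_flag = merkle_root([a, b, c, c])

print(honest_root == forged_root)  # True: same root, different tx lists
print(honest_flag, forged_flag)    # False True
```

The example shows why the duplicated form must be detected: appending a 
copy of the last hash to an odd-length list leaves the root unchanged.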

So unless one is caching invalidity under PoW and due to an append-only 
store, I can see no reason to ever do it. Getting rid of it would improve 
both performance and security while reducing complexity. Optimally 
dismissing both types of malleation as described would improve performance, 
but is neutral regarding security.

e



Thread overview: 26+ messages
2024-03-24 18:10 [bitcoindev] " 'Antoine Poinsot' via Bitcoin Development Mailing List
2024-03-26 19:11 ` [bitcoindev] " Antoine Riard
2024-03-27 10:35   ` 'Antoine Poinsot' via Bitcoin Development Mailing List
2024-03-27 18:57     ` Antoine Riard
2024-04-18  0:46     ` Mark F
2024-04-18 10:04       ` 'Antoine Poinsot' via Bitcoin Development Mailing List
2024-04-25  6:08         ` Antoine Riard
2024-04-30 22:20           ` Mark F
2024-05-06  1:10             ` Antoine Riard
2024-06-17 22:15 ` Eric Voskuil
2024-06-18  8:13   ` 'Antoine Poinsot' via Bitcoin Development Mailing List
2024-06-18 13:02     ` Eric Voskuil
2024-06-21 13:09       ` 'Antoine Poinsot' via Bitcoin Development Mailing List
2024-06-24  0:35         ` Eric Voskuil
2024-06-27  9:35           ` 'Antoine Poinsot' via Bitcoin Development Mailing List
2024-06-28 17:14             ` Eric Voskuil
2024-06-29  1:06               ` Antoine Riard
2024-06-29  1:31                 ` Eric Voskuil
2024-06-29  1:53                   ` Antoine Riard
2024-06-29 20:29                     ` Eric Voskuil
2024-06-29 20:40                       ` Eric Voskuil
2024-07-02  2:36                         ` Antoine Riard
2024-07-03  1:07                           ` Larry Ruane
2024-07-03  1:13                           ` Eric Voskuil
2024-07-02 10:23               ` 'Antoine Poinsot' via Bitcoin Development Mailing List
2024-07-02 15:57                 ` Eric Voskuil [this message]
