From: Antoine Riard <antoine.riard@gmail•com>
To: Bitcoin Development Mailing List <bitcoindev@googlegroups.com>
Subject: Re: [bitcoindev] Re: Great Consensus Cleanup Revival
Date: Wed, 27 Nov 2024 21:18:37 -0800 (PST)
Message-ID: <78e8248d-bc77-452f-ac7e-19c28cbc3280n@googlegroups.com>
In-Reply-To: <926fdd12-4e50-433d-bd62-9cc41c7b22a0n@googlegroups.com>


Hi Eric,

Going back to this thread with a bit of delay...

tl;dr: See specifically the comment on the lack of proof that invalidating 
64-byte transactions actually solves the merkle root weaknesses that could 
lead to a fork or unravel SPV clients.

> I'm not sure what you mean by stating that a new consensus rule, "could 
be a low memory overhead". Checking all tx sizes is far more overhead than 
validating the coinbase for a null point. As AntoineP agreed, it cannot be 
done earlier, and I have shown that it is *significantly* more 
computationally intensive. It makes the determination much more costly and 
in all other cases by adding an additional check that serves no purpose.

I think for any (new) consensus rule, we should be able to evaluate its 
implications along at least 2 dimensions: a) memory overhead (e.g. does a 
full node need more memory to validate post-segwit blocks now that witness 
fields are discounted?) and b) computational overhead (e.g. does a full 
node need more CPU cycles to validate confidential transactions' Pedersen 
commitments?). Two consensus rules can achieve the same effect, e.g. 
reducing headers' merkle tree ambiguities, at completely different memory 
or computational costs. For checking all tx sizes vs validating the 
coinbase for a null point, I indeed agree with you that the latter is 
intuitively better on both dimensions.

> I think you misunderstood me. Of course the witness commitment must be 
validated (as I said, "Yet it remains necessary to validate the witness 
commitment..."), as otherwise the witnesses within a block can be anything 
without affecting the block hash. And of course the witness commitment is 
computed in the same manner as the tx commitment and is therefore subject 
to the same malleations. However, because the coinbase tx is committed to 
the block hash, there is no need to guard the witness commitment for 
malleation. And to my knowledge nobody has proposed doing so.

Yes, we misunderstood each other here.

> It cannot, that was my point: "(1) have distinct identity due to another 
header property deviation, or (2) are the same block..."

Ok.

> This was already the presumption.

Ok.

> I'm not seeing the connection here. Are you suggesting that tx and block 
hashes may collide with each other? Or that that a block message may be 
confused with a transaction message?

This was about how to deal with the types of invalid block messages in 
bitcoin core that could be sources of denial-of-service, e.g. an invalid 
bitcoin block hash for a message with unparsable data (#1 and #3 in your 
typology).
My point was that bitcoin core makes some assumptions at block download, 
fetching blocks from outbound peers rather than inbound ones. Outbound 
peers are assumed to be more reliable, as the connection is initiated from 
our side.
The deeper point was that outbound block-relay connections were initially 
introduced to alleviate those types of concerns, i.e. tx probing to infer 
the network topology, see https://arxiv.org/pdf/1812.00942

> This does not mitigate the issue. It's essentially dead code. It's 
exactly like saying, "there's an arbitrary number of holes in the bucket, 
but we can plug a subset of those holes." Infinite minus any number is 
still infinite.

I disagree with you here - if the fundamental problem is efficiently 
caching identity in the case of block invalidity, one cannot ignore a 
robust peering policy, i.e. how you pick the peers allotted a scarce 
connection slot.
This is indeed useless if you don't first have an efficient verification 
algorithm to determine block invalidity, though it's part of the overall 
equation.
While infinite minus any number is of course still infinite, security is 
built in layers: a robust peering policy is the base on top of which a 
secure signature verification algorithm can still run on a hot computing 
host.

> I don't follow this statement. The term "usable" was specifically 
addressing the proposal - that a header hash must uniquely identify a block 
(a header and committed set of txs) as valid or otherwise. As I have 
pointed out, this will still not be the case if 64 byte blocks are 
invalidated. It is also not the case that detection of type64 malleated 
blocks can be made more performant if 64 byte txs are globally invalid. In 
fact the opposite is true, it becomes more costly (and complex) and is 
therefore just dead code.

Okay, in my statement, the term "usable" was to be understood as any 
meaningful bit of information that can lead to 
computationally-hard-to-forge progress in the determination problem you 
laid out here:
https://groups.google.com/g/bitcoindev/c/CAfm7D5ppjo/m/T1-HKqSLAAAJ

> Headers first only defers malleation checks. The same checks are 
necessary whether you perform blocks first or headers first sync (we 
support both protocol levels). The only difference is that for headers 
first, a stored header might later become invalidated. However, this is the 
case with and without the possibility of malleation.

Yes, I agree with you here: a stored header might become invalidated, e.g. 
by a reorged-out tx committed in the header's merkle tree after the 
header's reception.

> I have not suggested that anything is waived or ignored here. I'm stating 
that there is no "mempool" performance benefit whatsoever to invalidating 
64 byte txs. Mempool caching could only rely on tx identifiers, not block 
identifiers. Tx identifiers are not at issue.

Once again, if the goal is an efficient algorithm making progress in 
determining a block's invalidity, and as such reducing the 
denial-of-service surface, caching signatures which are committed in the 
wtxid tree or in the txid tree is a plus.
Though yes, I agree there is no "mempool" performance benefit to 
invalidating 64-byte txs.

> I don't know how to add any more detail than I already have. There are 
three relevant considerations:
> 
> (1) block hashes will not become unique identifiers for block messages.
> (2) the earliest point at which type64 malleation can be detected will 
not be reduced.
> (3) the necessary cost of type64 malleated determination will not be 
reduced.
> (4) the additional consensus rule will increase validation cost and code 
complexity.
> (5) invalid blocks can still be produced at no cost that require full 
double tx hashing/Merkle root computations.
> 
> Which of these statements are not evident at this point?

That's five statements, not three. Minding implementation-dependent 
considerations, I'm leaning toward agreeing with them up to and including 
(4).
About (5), I don't see how it makes sense that invalid blocks can still be 
produced at no cost; at the very least proof-of-work should be the first 
thing verified. This statement could be clarified.

> No, no part of this thread has any bearing on p2p transaction messages - 
nor are coinbase transactions relayed as transaction messages. You could 
restate it as:
> 
> - receive block p2p messages
> - if the first tx's first input does not have a null point, reject the 
block

I don't believe we can fully dissociate the bearing on p2p block / 
transaction messages from the overall goal of reducing denial-of-service 
arising from invalid blocks. How can you be sure the block is invalid until 
you validate all txs? Though let's waive this observation for the present 
point.

The idea of exploiting block malleability is to grind one transaction T0 
for a block B such that H(T0) == H(H(T1) || H(T2)) == B's root, i.e. to 
have T0 == H(T1) || H(T2). T0 can be consensus valid or invalid to provoke 
a consensus fork (it's the collision in the deserialization which is the 
source of the merkle tree root weakness). The first transaction in the 
block is necessarily the coinbase per current consensus rules, so checking 
that T0 is a valid coinbase transaction is sufficient to reject the block. 
Grinding 64-byte transactions that all deserialize as valid transactions, 
including the null point requirement, is computationally infeasible.
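
To make the collision concrete, here is a minimal Python sketch (my own 
illustration, using placeholder byte strings rather than real 
transactions) showing that the double-SHA256 of a 64-byte "transaction" 
formed by concatenating two txids collides with the merkle root of the 
2-leaf tree over those txids:

```
import hashlib

def sha256d(b: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(b).digest()).digest()

# Two placeholder "transactions"; any byte strings work for the hash math.
t1 = b"\x01" * 100
t2 = b"\x02" * 100

# Merkle root of the honest 2-leaf tree.
root = sha256d(sha256d(t1) + sha256d(t2))

# A fake 64-byte "transaction" whose serialization is exactly the two leaf
# hashes. Its txid, taken as a single-leaf root, collides with the real root.
t0 = sha256d(t1) + sha256d(t2)
assert len(t0) == 64
assert sha256d(t0) == root
```

The hard part for an attacker is not this hash identity, but grinding T1 
and T2 so that t0 also deserializes as a transaction passing the targeted 
checks.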

I'm not sure that even if we get rid of 64-byte transactions, we would 
remove the merkle root weaknesses. Back to the previous example, one could 
find T3 and T4 such that H(H(T3) || H(T4)) is equivalent to H(H(T1) || 
H(T2)). Of course, that would amount to breaking SHA256, which is deemed 
computationally infeasible. I'm not even sure the header verification 
algorithm gains second-preimage resistance from forbidding 64-byte 
transactions.

So I think the minimal sufficient check to reject a block should be more 
carefully analyzed, rather than advocating that forbidding some magic value 
obviously fixes an issue, in the present case bitcoin's merkle root 
weaknesses.

> The above approach makes this malleation computationally infeasible.

I'm intuitively leaning that way, though see the comments above that it 
should be more carefully thought through.

> It has nothing to do with internal cache layout and nothing to do with 
mining resources. Not having a cache is clearly more efficient than having 
a cache that provides no advantage, regardless of how the cache is laid 
out. There is no cost to forcing a node to perform far more block 
validation computations than can be precluded by invalidity caching. The 
caching simply increases the overall computational cost (as would another 
redundant rule to try and make it more efficient). Discarding invalid 
blocks after the minimal amount of work is the most efficient resolution. 
What one does with the peer at that point is orthogonal (e.g. drop, ban).

I disagree here - if the goal is an efficient algorithm making progress in 
determining a block's invalidity, and then being able to re-use a run of 
this algorithm when the block occurs again, having a cache widens the range 
of algorithms one can design. Same with the mining resources, if we're 
considering denial-of-service where an attacker is able to fully forge 
blocks. If such an invalidity caching strategy were efficient, it would 
actually minimize or erase the cost for a node of performing more block 
validation computations. Where I do share your opinion is that an 
ill-designed cache could increase the overall computational cost, and that 
discarding invalid blocks after the minimal amount of work is the most 
efficient resolution for the first time a block is seen, though it says 
nothing about the next N times it is seen. Having the signatures already 
validated could obviously be a win, even with a blind, decaying cache; it's 
all a matter of the memory space of an average full node.

> An attacker can throw a nearly infinite number of distinct invalid blocks 
at your node (and all will connect to the chain and show proper PoW). As 
such you will encounter zero cache hits and therefore nothing but overhead 
from the cache. Please explain to me in detail how "cache layout" is going 
to make any difference at all.

Going back to your typology from (1) to (9): take e.g. step 9, determining 
whether a block message has a valid header with unmalleated, committed, 
valid tx data. A cache can make a difference if you have already seen the 
block message a first time, though it wasn't yet on the most-PoW chain and 
you disregarded its validation at that point.

> I don't see this as a related/relevant topic. There are zero mining 
resources required to overflow the invalidity cache. Just as Core recently 
published regarding overflowing to its "ban" store, resulting in process 
termination, this then introduces another attack vector that must be 
mitigated.

That depends on whether your invalidity cache is safeguarded by a minimal 
valid proof-of-work. I'm certainly not going to defend that all bitcoin 
core internal caches and stores are well-designed for adversarial 
environments.
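
As a toy illustration of what I mean by a PoW-safeguarded invalidity cache 
(my own sketch, not bitcoin core or libbitcoin code; a real implementation 
would also check the claimed nBits against the chain's actual difficulty, 
which this omits):

```
import hashlib
from collections import OrderedDict

def sha256d(b: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(b).digest()).digest()

def target_from_bits(bits: int) -> int:
    # Compact nBits encoding -> 256-bit target (simplified; assumes exponent >= 3).
    return (bits & 0xFFFFFF) << (8 * ((bits >> 24) - 3))

class InvalidityCache:
    """Bounded, decaying cache of known-invalid block hashes, gated on PoW."""

    def __init__(self, capacity: int = 1 << 16):
        self.capacity = capacity
        self.entries = OrderedDict()

    def admit(self, header: bytes) -> bool:
        if len(header) != 80:
            return False
        bits = int.from_bytes(header[72:76], "little")  # nBits field
        block_hash = sha256d(header)
        # Only cache entries whose header meets its own claimed target, so
        # that filling the cache costs the attacker real hash work.
        if int.from_bytes(block_hash, "little") > target_from_bits(bits):
            return False
        self.entries[block_hash] = True
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict oldest entry (decay)
        return True

    def __contains__(self, block_hash: bytes) -> bool:
        return block_hash in self.entries
```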

> pseudo-code , not from libbitcoin...
> 
> ```
> bool malleated64(block)
> {
>     segregated = ((block[80 + 4] == 0) and (block[80 + 4 + 1] == 1))
>     return block[segregated ? 86 : 85] != 
0xffffffff0000000000000000000000000000000000000000000000000000000000000000
> }
> ```
> 
> Obviously there is no error handling (e.g. block too small, too many 
inputs, etc.) but that is not relevant to the particular question. The 
block.header is fixed size, always 80 bytes. The tx.version is also fixed, 
always 4 bytes. A following 0 implies a segregated witness (otherwise it's 
the input count), assuming there is a following 1. The first and only input 
for the coinbase tx, which must be the first block tx, follows. If it does 
not match 
0xffffffff0000000000000000000000000000000000000000000000000000000000000000 
then the block is invalid. If it does match, it is computationally 
infeasible that the merkle root is type64 malleated. That's it, absolutely 
trivial and with no prerequisites. The only thing that even makes it 
interesting is the segwit bifurcation.

Thanks for the example with the segwit bifurcation for the marker. By the 
way, the segwit marker is documented in BIP144, which is incorrectly 
labeled as "Peer Services", though obviously misimplementing the 
transaction serialization / deserialization algorithm for segwit blocks 
would lead to consensus divergence (what if you expect the "flag" to be 
0xff and not 0x01?). Personally, I think it's a good example of how tedious 
consensus changes can be, when even documents for inter-compatibility about 
consensus changes do not draw a clear line between what is consensus and 
what are p2p rules...
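
For concreteness, here is my own rough Python rendering of that null-point 
check with minimal bounds handling (a sketch under the assumption of a 
single-byte input-count varint, not libbitcoin code; note that, if I'm 
reading the BIP144 serialization right, the coinbase's prevout begins at 
byte 87 in the segwit case (marker, flag, then input count), rather than 
86):

```
# Byte-stream form of the null point: 32-byte null hash, then 0xffffffff index.
NULL_POINT = b"\x00" * 32 + b"\xff" * 4

def malleated64(block: bytes) -> bool:
    cursor = 80 + 4  # fixed 80-byte header plus fixed 4-byte tx.version
    if len(block) < cursor + 2:
        return True  # too small to hold even the marker/count bytes
    # BIP144: 0x00 where the input count would sit, followed by a 0x01 flag,
    # signals segregated-witness serialization.
    segregated = block[cursor] == 0x00 and block[cursor + 1] == 0x01
    if segregated:
        cursor += 2  # skip marker and flag
    cursor += 1      # skip the input count (assumed single-byte varint)
    if len(block) < cursor + 36:
        return True
    # A first input that is not a null point means the first tx cannot be a
    # valid coinbase, hence the merkle root may be type64 malleated.
    return block[cursor:cursor + 36] != NULL_POINT
```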

> Sure, but no language difference that I'm aware of could have any bearing 
on this particular question.

Same, I don't see a language difference that could have bearing on this 
question, at that level of granularity.

Best,
Antoine R
ots hash: 3d5ed1718683ce1e864751a2eccf21908ed3b11079f183cdf863729d71ae3f36
Le samedi 20 juillet 2024 à 21:51:27 UTC+1, Eric Voskuil a écrit :

> Hi Antoine R,
>
> >> While at some level the block message buffer would generally be 
> referenced by one or more C pointers, the difference between a valid 
> coinbase input (i.e. with a "null point") and any other input, is not 
> nullptr vs. !nullptr. A "null point" is a 36 byte value, 32 0x00 byes 
> followed by 4 0xff bytes. In his infinite wisdom Satoshi decided it was 
> better (or easier) to serialize a first block tx (coinbase) with an input 
> containing an unusable script and pointing to an invalid [tx:index] tuple 
> (input point) as opposed to just not having any input. That invalid input 
> point is called a "null point", and of course cannot be pointed to by a 
> "null pointer". The coinbase must be identified by comparing those 36 bytes 
> to the well-known null point value (and if this does not match the Merkle 
> hash cannot have been type64 malleated).
>
> > Good for the clarification here, I had in mind the core's `CheckBlock` 
> path where the first block transaction pointer is dereferenced to verify if 
> the transaction is a coinbase (i.e a "null point" where the prevout is 
> null). Zooming out and back to my remark, I think this is correct that 
> adding a new 64 byte size check on all block transactions to detect block 
> hash invalidity could be a low memory overhead (implementation dependant), 
> rather than making that 64 byte check alone on the coinbase transaction as 
> in my understanding you're proposing.
>
> I'm not sure what you mean by stating that a new consensus rule, "could be 
> a low memory overhead". Checking all tx sizes is far more overhead than 
> validating the coinbase for a null point. As AntoineP agreed, it cannot be 
> done earlier, and I have shown that it is *significantly* more 
> computationally intensive. It makes the determination much more costly and 
> in all other cases by adding an additional check that serves no purpose.
>
> >>> The second one is the bip141 wtxid commitment in one of the coinbase 
> transaction `scriptpubkey` output, which is itself covered by a txid in the 
> merkle tree.
>
> >> While symmetry seems to imply that the witness commitment would be 
> malleable, just as the txs commitment, this is not the case. If the tx 
> commitment is correct it is computationally infeasible for the witness 
> commitment to be malleated, as the witness commitment incorporates each 
> full tx (with witness, sentinel, and marker). As such the block identifier, 
> which relies only on the header and tx commitment, is a sufficient 
> identifier. Yet it remains necessary to validate the witness commitment to 
> ensure that the correct witness data has been provided in the block message.
> >>
> >> The second type of malleability, in addition to type64, is what we call 
> type32. This is the consequence of duplicated trailing sets of txs (and 
> therefore tx hashes) in a block message. This is applicable to some but not 
> all blocks, as a function of the number of txs contained.
>
> > To precise more your statement in describing source of malleability. The 
> witness stack can be malleated altering the wtxid and yet still valid. I 
> think you can still have the case where you're feeded a block header with a 
> merkle root commitment deserializing to a valid coinbase transaction with 
> an invalid witness commitment. This is the case of a "block message with 
> valid header but malleatead committed valid tx data". Validation of the 
> witness commitment to ensure the correct witness data has been provided in 
> the block message is indeed necessary.
>
> I think you misunderstood me. Of course the witness commitment must be 
> validated (as I said, "Yet it remains necessary to validate the witness 
> commitment..."), as otherwise the witnesses within a block can be anything 
> without affecting the block hash. And of course the witness commitment is 
> computed in the same manner as the tx commitment and is therefore subject 
> to the same malleations. However, because the coinbase tx is committed to 
> the block hash, there is no need to guard the witness commitment for 
> malleation. And to my knowledge nobody has proposed doing so.
>
> >>> I think I mostly agree with the identity issue as laid out so far, 
> there is one caveat to add if you're considering identity caching as the 
> problem solved. A validation node might have to consider differently block 
> messages processed if they connect on the longest most PoW valid chain for 
> which all blocks have been validated. Or alternatively if they have to be 
> added on a candidate longest most PoW valid chain.
>
> >> Certainly an important consideration. We store both types. Once there 
> is a stronger candidate header chain we store the headers and proceed to 
> obtaining the blocks (if we don't already have them). The blocks are stored 
> in the same table; the confirmed vs. candidate indexes simply point to them 
> as applicable. It is feasible (and has happened twice) for two blocks to 
> share the very same coinbase tx, even with either/all bip30/34/90 active 
> (and setting aside future issues here for the sake of simplicity). This 
> remains only because two competing branches can have blocks at the same 
> height, and bip34 requires only height in the coinbase input script. This 
> therefore implies the same transaction but distinct blocks. It is however 
> infeasible for one block to exist in multiple distinct chains. In order for 
> this to happen two blocks at the same height must have the same coinbase 
> (ok), and also the same parent (ok). But this then means that they either 
> (1) have distinct identity due to another header property deviation, or (2) 
> are the same block with the same parent and are therefore in just one 
> chain. So I don't see an actual caveat. I'm not certain if this is the 
> ambiguity that you were referring to. If not please feel free to clarify.
>
> > If you assume no network partition and the no blocks more than 2h in the 
> future consensus rule, I cannot see how one block with no header property 
> deviation can exist in multiple distinct chains.
>
> It cannot, that was my point: "(1) have distinct identity due to another 
> header property deviation, or (2) are the same block..."
>
> > The ambiguity I was referring was about a different angle, if the design 
> goal of introducing a 64 byte size check is to "it was about being able to 
> cache the hash of a (non-malleated) invalid block as permanently invalid to 
> avoid re-downloading and re-validating it", in my thinking we shall 
> consider the whole block headers caching strategy and be sure we don't get 
> situations where an attacker can attach a chain of low-pow block headers 
> with malleated committed valid tx data yielding a block invalidity at the 
> end, provoking as a side-effect a network-wide data download blowup. So I 
> think any implementation of the validation of a block validity, of which 
> identity is a sub-problem, should be strictly ordered by adequate 
> proof-of-work checks.
>
> This was already the presumption.
>
> >> We don't do this and I don't see how it would be relevant. If a peer 
> provides any invalid message or otherwise violates the protocol it is 
> simply dropped.
> >>
> >> The "problematic" that I'm referring to is the reliance on the block 
> hash as a message identifier, because it does not identify the message and 
> cannot be useful in an effectively unlimited number of zero-cost cases.
>
> > Historically, it was to isolate transaction-relay from block-relay to 
> optimistically harden in face of network partition, as this is easy to 
> infer transaction-relay topology with a lot of heuristics.
>
> I'm not seeing the connection here. Are you suggesting that tx and block 
> hashes may collide with each other? Or that that a block message may be 
> confused with a transaction message?
>
> > I think this is correct that block hash message cannot be relied on as 
> it cannot be useful in an unlimited number of zero-cost cases, as I was 
> pointing that bitcoin core partially mitigate that with discouraging 
> connections to block-relay peers servicing block messages 
> (`MaybePunishNodeForBlocks`).
>
> This does not mitigate the issue. It's essentially dead code. It's exactly 
> like saying, "there's an arbitrary number of holes in the bucket, but we 
> can plug a subset of those holes." Infinite minus any number is still 
> infinite.
>
> > I believe somehow the bottleneck we're circling around is 
> computationally definining what are the "usable" identifiers for block 
> messages. The most straightforward answer to this question is the full 
> block in one single peer message, at least in my perspective.
>
> I don't follow this statement. The term "usable" was specifically 
> addressing the proposal - that a header hash must uniquely identify a block 
> (a header and committed set of txs) as valid or otherwise. As I have 
> pointed out, this will still not be the case if 64 byte blocks are 
> invalidated. It is also not the case that detection of type64 malleated 
> blocks can be made more performant if 64 byte txs are globally invalid. In 
> fact the opposite is true, it becomes more costly (and complex) and is 
> therefore just dead code.
>
> > Reality since headers first synchronization (`getheaders`), block 
> validation has been dissociated in steps for performance reasons, among 
> others.
>
> Headers first only defers malleation checks. The same checks are necessary 
> whether you perform blocks first or headers first sync (we support both 
> protocol levels). The only difference is that for headers first, a stored 
> header might later become invalidated. However, this is the case with and 
> without the possibility of malleation.
>
> >> Again, this has no relation to tx hashes/identifiers. Libbitcoin has a 
> tx pool, we just don't store them in RAM (memory).
> >>
> >> I don't follow this. An invalid 64 byte tx consensus rule would 
> definitely not make it harder to exploit block message invalidity. In fact 
> it would just slow down validation by adding a redundant rule. Furthermore, 
> as I have detailed in a previous message, caching invalidity does 
> absolutely nothing to increase protection. In fact it makes the situation 
> materially worse.
>
> > Just to recall, in my understanding the proposal we're discussing is 
> about outlawing 64 bytes size transactions at the consensus-level to 
> minimize denial-of-service vectors during block validation. I think we're 
> talking about each other because the mempool already introduce a layer of 
> caching in bitcoin core, of which the result are re-used at block 
> validation, such as signature verification results. I'm not sure we can 
> fully waive apart performance considerations, though I agree implementation 
> architecture subsystems like mempool should only be a sideline 
> considerations.
>
> I have not suggested that anything is waived or ignored here. I'm stating 
> that there is no "mempool" performance benefit whatsoever to invalidating 
> 64 byte txs. Mempool caching could only rely on tx identifiers, not block 
> identifiers. Tx identifiers are not at issue.
>
> >> No, this is not the case. As I detailed in my previous message, there 
> is no possible scenario where invalidation caching does anything but make 
> the situation materially worse.
>
> > I think this can be correct that invalidation caching make the situation 
> materially worse, or is denial-of-service neutral, as I believe a full node 
> is only trading space for time resources in matters of block messages 
> validation. I still believe such analysis, as detailed in your previous 
> message, would benefit to be more detailed.
>
> I don't know how to add any more detail than I already have. There are 
> three relevant considerations:
>
> (1) block hashes will not become unique identifiers for block messages.
> (2) the earliest point at which type64 malleation can be detected will not 
> be reduced.
> (3) the necessary cost of type64 malleated determination will not be 
> reduced.
> (4) the additional consensus rule will increase validation cost and code 
> complexity.
> (5) invalid blocks can still be produced at no cost that require full 
> double tx hashing/Merkle root computations.
>
> Which of these statements are not evident at this point?
>
> >> On the other hand, just dealing with parse failure on the spot by 
> introducing a leading pattern in the stream just inflates the size of p2p 
> messages, and the transaction-relay bandwidth cost.
> >>
> >> I think you misunderstood me. I am suggesting no change to 
> serialization. I can see how it might be unclear, but I said, "nothing 
> precludes incorporating a requirement for a necessary leading pattern in 
> the stream." I meant that the parser can simply incorporate the 
> *requirement* that the byte stream starts with a null input point. That 
> identifies the malleation or invalidity without a single hash operation and 
> while only reading a handful of bytes. No change to any messages.
>
> > Indeed, this is clearer with the re-explanation above about what you 
> meant by the "null point".
>
> Ok
>
> > In my understanding, you're suggesting the following algorithm:
> > - receive transaction p2p messages
> > - deserialize transaction p2p messages
> > - if the transaction is a coinbase candidate, verify null input point
> > - if null input point pattern invalid, reject the transaction
>
> No, no part of this thread has any bearing on p2p transaction messages - 
> nor are coinbase transactions relayed as transaction messages. You could 
> restate it as:
>
> - receive block p2p messages
> - if the first tx's first input does not have a null point, reject the 
> block
>
> > If I'm understanding correctly, the last rule has for effect to 
> constraint the transaction space that can be used to brute-force and mount 
> a Merkle root forgery with a 64-byte coinbase transaction.
> >
> > As described in the 3.1.1 of the paper: 
> https://lists.linuxfoundation.org/pipermail/bitcoin-dev/attachments/20190225/a27d8837/attachment-0001.pdf
>
> The above approach makes this malleation computationally infeasible.
>
> >> I'm referring to DoS mitigation (the only relevant security 
> consideration here). I'm pointing out that invalidity caching is pointless 
> in all cases, and in this case is the most pointless as type64 malleation 
> is the cheapest of all invalidity to detect. I would prefer that all bogus 
> blocks sent to my node are of this type. The worst types of invalidity 
> detection have no mitigation and from a security standpoint are 
> counterproductive to cache. I'm describing what overall is actually not a 
> tradeoff. It's all negative and no positive.
>
> > I think we're both discussing the same issue about DoS mitigation for 
> sure. Again, I think that saying the "invalidity caching" is pointless in 
> all cases cannot be fully grounded as a statement without precising (a) 
> what is the internal cache(s) layout of the full node processing block 
> messages and (b) the sha256 mining resources available during N difficulty 
> period and if any miner engage in self-fish mining like strategy.
>
> It has nothing to do with internal cache layout and nothing to do with 
> mining resources. Not having a cache is clearly more efficient than having 
> a cache that provides no advantage, regardless of how the cache is laid 
> out. There is no cost to forcing a node to perform far more block 
> validation computations than can be precluded by invalidity caching. The 
> caching simply increases the overall computational cost (as would another 
> redundant rule to try and make it more efficient). Discarding invalid 
> blocks after the minimal amount of work is the most efficient resolution. 
> What one does with the peer at that point is orthogonal (e.g. drop, ban).
>
> > About (a), I'll maintain my point I think it's a classic time-space 
> trade-off to ponder in function of the internal cache layouts.
>
> An attacker can throw a nearly infinite number of distinct invalid blocks 
> at your node (and all will connect to the chain and show proper PoW). As 
> such you will encounter zero cache hits and therefore nothing but overhead 
> from the cache. Please explain to me in detail how "cache layout" is going 
> to make any difference at all.
>
> > About (b) I think we''ll be back to the headers synchronization strategy 
> as implemented by a full node to discuss if they're exploitable asymmetries 
> for self-fish mining like strategies.
>
> I don't see this as a related/relevant topic. There are zero mining 
> resources required to overflow the invalidity cache. Just as Core recently 
> published regarding overflowing to its "ban" store, resulting in process 
> termination, this then introduces another attack vector that must be 
> mitigated.
>
> > If you can give a pseudo-code example of the "null point" validation 
> implementation in libbitcoin code (?) I think this can make the 
> conversation more concrete on the caching aspect.
>
> pseudo-code , not from libbitcoin...
>
> ```
> bool malleated64(block)
> {
>     segregated = ((block[80 + 4] == 0) and (block[80 + 4 + 1] == 1))
>     return block[segregated ? 86 : 85] != 
> 0xffffffff0000000000000000000000000000000000000000000000000000000000000000
> }
> ```
>
> Obviously there is no error handling (e.g. block too small, too many 
> inputs, etc.) but that is not relevant to the particular question. The 
> block.header is fixed size, always 80 bytes. The tx.version is also fixed, 
> always 4 bytes. A following 0 implies a segregated witness (otherwise it's 
> the input count), assuming there is a following 1. The first and only input 
> for the coinbase tx, which must be the first block tx, follows. If it does 
> not match 
> 0xffffffff0000000000000000000000000000000000000000000000000000000000000000 
> then the block is invalid. If it does match, it is computationally 
> infeasible that the merkle root is type64 malleated. That's it, absolutely 
> trivial and with no prerequisites. The only thing that even makes it 
> interesting is the segwit bifurcation.
>
> >> Rust has its own set of problems. No need to get into a language Jihad 
> here. My point was to clarify that the particular question was not about a 
> C (or C++) null pointer value, either on the surface or underneath an 
> abstraction.
>
> > Thanks for the additional comments on libbitcoin usage of dependencies, 
> yes I don't think there is a need to get into a language jihad here. It's 
> just like all languages have their memory model (stack, dynamic alloc, 
> smart pointers, etc) and when you're talking about performance it's useful 
> to have their minds, imho.
>
> Sure, but no language difference that I'm aware of could have any bearing 
> on this particular question.
>
> Best,
> Eric
>
