>> It is not clear to me how determining the coinbase size can be done at an earlier stage of validation than detection of the non-null coinbase.

> My point wasn't about checking the coinbase size, it was about being able to cache the hash of a (non-malleated) invalid block as permanently invalid to avoid re-downloading and re-validating it.

This I understood, but I think you misunderstood me. Your point was specifically that, "it would let node implementations cache block failures at an earlier stage of validation." Since you have not addressed that aspect, I assume you agree with my assertion above that the proposed rule does not actually achieve this.

Regarding the question of checking coinbase size, the issue is one of detecting (or preventing) hashes malleated via the 64 byte tx technique. A rule against 64 byte txs would allow this determination by checking the coinbase alone. If the coinbase is 64 bytes the block is invalid; if it is not, the block hash cannot have been malleated (all txs must have been 64 bytes, see previous reference). In that case, if the block is invalid, the invalidity can be cached. But block invalidity cannot actually be cached until the block is fully validated. A rule prohibiting *all* 64 byte txs is counterproductive, as it only adds additional checks on typically thousands of txs per block, serving no purpose.

>> It seems to me that introducing an arbitrary tx size validity may create more potential implementation bugs than it resolves.

> The potential for implementation bugs is a fair point to raise, but in this case i don't think it's a big concern. Verifying no transaction in a block is 64 bytes is as simple a check as you can get.

You appear to be making the assumption that the check is performed after the block is fully parsed (contrary to your "earlier" criterion above). The only way to determine the tx sizes is to parse each tx: reading the witness marker, input count, output count, input script sizes, output script sizes, and witness sizes, while skipping over the header, several constants, and the associated buffers. Doing this "early" to detect malleation is an extraordinarily complex and costly process. On the other hand, as I pointed out, a rational implementation would only perform this early check against the coinbase.

Yet even determining the size of the coinbase is significantly more complex and costly than checking its first input point against null. That check (which is already necessary for validation) resolves the malleation question. It can be performed on the raw unparsed block buffer by simply skipping the header and version, reading the input count (and witness marker, as necessary), offsetting to the 36 byte point buffer, and performing a byte comparison against:

[0000000000000000000000000000000000000000000000000000000000000000ffffffff]

This is:

(1) earlier
(2) faster
(3) simpler
(4) already consensus

A sketch of this check appears below.

>> And certainly anyone implementing such a verifier must know many intricacies of the protocol.

> They need to know some, but i don't think it's reasonable to expect them to realize the merkle tree construction is such that an inner node may be confused with a 64 bytes transaction.

A protocol developer needs to understand that the hash of an invalid block cannot be cached unless at least the coinbase has been restricted in size (under the proposal) -or- the coinbase is a null point (presently or under the proposal). In the latter case the check is already performed in validation, so there is no way a block would presently be cached as invalid without checking it.
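For concreteness, here is a minimal sketch of that null point check over the raw block buffer. It is illustrative only - the function name and bounds handling are mine, not any implementation's - and it assumes standard serialization with the optional BIP144 witness marker and flag:

    #include <cstddef>
    #include <cstdint>
    #include <cstring>

    // Illustrative sketch: determine whether the first input point of the
    // coinbase is the null point, directly on the raw unparsed block buffer.
    // Returns false on a truncated buffer.
    bool coinbase_point_is_null(const uint8_t* block, size_t size)
    {
        // 32 zero bytes (null hash) followed by 0xffffffff (null index).
        static const uint8_t null_point[36] =
        {
            0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
            0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
            0xff, 0xff, 0xff, 0xff
        };

        size_t at = 80;                    // Skip the block header.

        // Skip a compact size (varint); false if the buffer is exhausted.
        const auto skip_varint = [&]() -> bool
        {
            if (at >= size) return false;
            const uint8_t prefix = block[at++];
            if (prefix == 0xfd) at += 2;
            else if (prefix == 0xfe) at += 4;
            else if (prefix == 0xff) at += 8;
            return at <= size;
        };

        if (!skip_varint()) return false;  // Skip the tx count.
        at += 4;                           // Skip the coinbase version.

        // Skip the witness marker and flag if present (BIP144). The marker
        // (0x00) cannot be confused with an input count, which is nonzero.
        if (at + 2 <= size && block[at] == 0x00 && block[at + 1] == 0x01)
            at += 2;

        if (!skip_varint()) return false;  // Skip the input count.

        // Compare the 36 byte point of the first input against null.
        return at + 36 <= size &&
            std::memcmp(block + at, null_point, 36) == 0;
    }

Note that this touches a handful of bytes at a small set of offsets, independent of block size, whereas any tx size check requires walking the full variable length structure.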
The proposal adds a redundant check, even if limited to just the coinbase. [A developer must also understand the second type of malleability, discussed below.]

If this proposed rule were to activate we would implement it in a late stage tx.check, after txs/blocks had been fully deserialized. We would not check it at all in the case where the block is under checkpoint or milestone ("assume valid"). In that case we would retain the early null point malleation check (along with the hash duplication malleation check) that we presently have, would validate tx commitments, and commit the block. In other words, the proposal adds unnecessary late stage checks only. Implementing it otherwise would just add complexity and hurt performance.

>> I do not see this. I see a very ugly perpetual seam which will likely result in unexpected complexities over time.

> What makes you think making 64 bytes transactions invalid could result in unexpected complexities? And why do you think it's likely?

As described above, it's later, slower, more complex, unnecessarily broad, and a consensus change. Beyond that, it creates an arbitrary size limit - not a lower or upper bound, but a slice out of the domain. Discontinuities are inherent complexities in computing. The "unexpected" part speaks for itself.

>> This does not produce unmalleable block hashes. Duplicate tx hash malleation remains in either case, to the same effect. Without a resolution to both issues this is an empty promise.

> Duplicate txids have been invalid since 2012 (CVE-2012-2459).

I think again here you may have misunderstood me. I was not making a point pertaining to BIP30. I was referring to the other form of block hash malleability, which results from duplicating sets of trailing txs in a single block (see previous reference). This malleation vector remains, even with 64 byte txs invalid. As I pointed out, it has the "same effect" as the 64 byte tx issue. Merkle hashing the set of txs is insufficient to determine identity. In one case the coinbase must be checked (null point or size), and in the other the set of tx hashes must be checked for trailing duplicated sets. [Core performs this second check within the Merkle hashing algorithm (with far more comparisons than necessary), though it can be performed earlier and independently, avoiding any hashing at all in the malleation case - see the sketch below.]

I would also point out, in the interest of correctness, that Core reverted its BIP30 soft fork implementation as a consequence of the BIP90 hard fork, following (and requiring) the BIP34 soft fork that presumably precluded it but didn't, so it is no longer the case that duplicate tx hashes are invalid in implementation. As you have proposed in this rollup, this requires fixing again.

> If 64 bytes transactions are also made invalid, this would make it impossible for two valid blocks to have the same hash.

Aside from the BIP30/34/90 issue addressed above, it is already "impossible" (which can be no stronger than computationally infeasible) for two *valid* blocks to have the same hash. The proposal does not enable that objective; it is already the case. No malleated block is a valid block. The proposal aims only to make it earlier, easier, or faster to check for block hash malleation, and as I've pointed out above, it doesn't achieve those objectives. Possibly the perception that it would is a consequence of implementation details, but as I have shown above, it is not in fact the case.
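To illustrate that second check, here is a minimal sketch of detecting trailing duplicated sets over the tx hash list alone, with no hashing. Again illustrative only (the name and container are mine), and written under my reading of the duplication pattern: a duplicated merkle node at depth k expands to the block's last 2^k tx hashes repeated, which requires the total count to be a multiple of 2^(k+1):

    #include <algorithm>
    #include <array>
    #include <cstdint>
    #include <vector>

    using hash_digest = std::array<uint8_t, 32>;

    // Illustrative sketch: detect the trailing duplicated set form of block
    // hash malleation (CVE-2012-2459) without computing any hashes.
    bool has_duplicated_trailing_set(const std::vector<hash_digest>& hashes)
    {
        const size_t count = hashes.size();

        for (size_t width = 1; 2 * width <= count; width *= 2)
        {
            // A duplicated node at this depth requires the count to be a
            // multiple of 2 * width; greater depths require larger multiples.
            if (count % (2 * width) != 0)
                break;

            // The trailing set duplicates the set immediately preceding it.
            if (std::equal(hashes.end() - width, hashes.end(),
                hashes.end() - 2 * width))
                return true;
        }

        return false;
    }

A match can arise only from this malleation or from duplicated tx hashes within the block, and the block is invalid in either case, so nothing need be hashed before rejecting (without caching, per below).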
Given either type of malleation, the malleated block can be determined to be invalid by a context free check. But this knowledge cannot ever be cached against the block hash, since the same hash may be valid. Invalidity can only be cached once a non-malleated block is validated and determined to be invalid. Block hash malleations are and will remain invalid blocks with or without the proposal, and it will continue to be necessary to avoid caching invalidity against the malleation. As you said:

> it was about being able to cache the hash of a (non-malleated) invalid block as permanently invalid to avoid re-downloading and re-validating it.

This is already the case, and it requires validating the full non-malleated block. Adding a redundant invalidity check doesn't improve this in any way.

Best,
Eric