Good morning all,
It seems to me that adding the length for checksumming purposes need not require the length to be *actually* added in the address format.
Indeed!
This has the following properties:
* The bech32 address format is retained, and no explicit length is added.
* There are now two checksum formats: one with just the witness program, the other which validates with the witness program length.
* Readers that do not understand the new checksum format will simply reject them without mis-sending to the wrong witness program.
That's very close to what I was suggesting: create an improved bech32 algorithm and use that for future addresses, rather than working around the problem in the address encoding while keeping the existing bech32 checksum. Sorry if that wasn't clear from my previous email.
In this case, there is no need to even implicitly include the length in the checksum algorithm. Replacing the "xor 1" at the end of the algorithm with "xor (2^30 - 1)" would reduce the occurrence of this weakness from 1/32 to 1/2^30, and have no downsides otherwise. I'd like to do some analysis to ascertain that it will actually catch other kinds of insertion/deletion errors with high probability as well before actually proposing it, though.
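As a rough sketch of what that constant change amounts to, here is the checksum in Python with the final XOR value made a parameter. The polymod and generator values follow the BIP173 reference code; `CANDIDATE_CONST`, `payload_ending_in_p` and `insert_q_before_last` are names made up for this illustration, and the "bc" prefix and data symbols are arbitrary. It only exercises the one known mutation (a "q" inserted right before a final "p") and says nothing about other insertion/deletion errors, which is the analysis still outstanding.

```python
# Sketch only: bech32 checksum with the final XOR constant as a parameter.
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"
GEN = [0x3B6A57B2, 0x26508E6D, 0x1EA119FA, 0x3D4233DD, 0x2A1462B3]

BECH32_CONST = 1                 # the existing "xor 1"
CANDIDATE_CONST = (1 << 30) - 1  # the "xor (2^30 - 1)" idea; not yet analysed


def polymod(values):
    """BCH checksum state after absorbing a list of 5-bit symbols."""
    chk = 1
    for v in values:
        top = chk >> 25
        chk = ((chk & 0x1FFFFFF) << 5) ^ v
        for i in range(5):
            if (top >> i) & 1:
                chk ^= GEN[i]
    return chk


def hrp_expand(hrp):
    """Expand the human-readable part into symbols for the checksum."""
    return [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]


def create_checksum(hrp, data, const):
    """Six checksum symbols for hrp + data under the given final constant."""
    mod = polymod(hrp_expand(hrp) + data + [0] * 6) ^ const
    return [(mod >> 5 * (5 - i)) & 31 for i in range(6)]


def verify(hrp, symbols, const):
    """True if symbols (data followed by checksum) are valid under const."""
    return polymod(hrp_expand(hrp) + symbols) == const


def payload_ending_in_p(hrp, prefix, const):
    """Hypothetical helper: extend prefix by two symbols so that the
    checksummed result ends in 'p' (symbol value 1)."""
    for a in range(32):
        for b in range(32):
            data = prefix + [a, b]
            full = data + create_checksum(hrp, data, const)
            if full[-1] == 1:
                return full
    raise ValueError("no payload ending in 'p' found")


def insert_q_before_last(symbols):
    """Insert a 'q' (symbol value 0) immediately before the final symbol."""
    return symbols[:-1] + [0] + symbols[-1:]


old = payload_ending_in_p("bc", [0, 14, 20], BECH32_CONST)
new = payload_ending_in_p("bc", [0, 14, 20], CANDIDATE_CONST)

# With "xor 1" the mutated string still passes; with the candidate it does not.
print(verify("bc", insert_q_before_last(old), BECH32_CONST))     # True
print(verify("bc", insert_q_before_last(new), CANDIDATE_CONST))  # False
```

The reason the first check passes is that a valid string ending in "p" leaves the checksum state at zero just before that last symbol, so any number of extra "q" symbols there go unnoticed. With a constant that does not fit in a single symbol, that situation cannot arise for this particular mutation.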
There are other solutions which do include the length in some fashion directly in the checksum calculation, which may be preferable (I need to analyse things...). It's also possible to do this in such a way that for 33-symbol and 53-symbol data parts (corresponding to P2WPKH and P2WSH lengths) the new algorithm is defined as identical to the old one. That would simplify things for upstream users of a bech32 library (who would then effectively need no changes at all, apart from updating the checksum/decoder code).
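For reference, the 33 and 53 figures count the witness version symbol plus the 5-bit groups of the program, excluding the six checksum symbols. A small Python sketch of that arithmetic follows; `data_symbols` and `uses_old_checksum` are hypothetical names, and the length-based switch is just one conceivable way to keep the old behaviour for those two sizes, not a worked-out design.

```python
def data_symbols(program_len_bytes):
    """Symbols in the data part before the checksum: one witness version
    symbol plus the program split into 5-bit groups (rounded up)."""
    return 1 + (program_len_bytes * 8 + 4) // 5


# Lengths for which a new algorithm could be defined as identical to the old.
OLD_CHECKSUM_LENGTHS = {data_symbols(20), data_symbols(32)}


def uses_old_checksum(n_symbols):
    """Hypothetical selector: True for P2WPKH- and P2WSH-sized data parts."""
    return n_symbols in OLD_CHECKSUM_LENGTHS


print(data_symbols(20))  # 33 (20-byte program, P2WPKH)
print(data_symbols(32))  # 53 (32-byte program, P2WSH)
```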
That brings me to Matt's point: there is no need to do this right now. We can simply amend BIP173 to only permit length 20 and length 32 (and only length 20 for v0, if you like; but they're so far apart that permitting both shouldn't hurt), for now. The "new" address format (the one using an improved checksum algorithm) only needs to be introduced in time for when a non-32-byte witness program comes into sight.
Of course, I should update BIP173 to indicate the issue, and include a suggested improvement for other users of bech32, if they feel this issue is significant enough.
Cheers,
--
Pieter