public inbox for bitcoindev@googlegroups.com
 help / color / mirror / Atom feed
From: Mark Friedenbach <mark@friedenbach•org>
To: Bitcoin Protocol Discussion <bitcoin-dev@lists•linuxfoundation.org>
Subject: [bitcoin-dev] Batch validation of CHECKMULTISIG using an extra hint field
Date: Tue, 18 Oct 2022 20:51:42 -0700	[thread overview]
Message-ID: <239D23FC-267F-4198-988D-35152E7E5AC8@friedenbach.org> (raw)

[-- Attachment #1: Type: text/plain, Size: 6450 bytes --]

When Satoshi wrote the first version of bitcoin, s/he made what was almost certainly an unintentional mistake. A bug in the original CHECKMULTISIG implementation caused an extra item to be popped off the stack upon completion. This extra value is not used in any way, and has no consensus meaning. Since this value is often provided in the witness, it unfortunately provides a malleability vector as anybody can change the extra/dummy value in the signature without invalidating a transaction. In legacy scripts NULLDUMMY is a policy rule that states this value must be zero, and this was made a consensus rule for segwit scripts.

This isn’t the only problem with CHECKMULTISIG. For both ECDSA and Schnorr signatures, batch validation could enable an approximate 2x speedup, especially during the initial block download phase. However the CHECKMULTISIG algorithm, as written, seemingly precludes batch validation for threshold signatures as it attempts to validate the list of signatures with the list of pubkeys, in order, dropping an unused pubkey only when a signature validation fails. As an example, the following script

    [2 C B A 3 CHECKMULTISIG]

Could be satisfied by the following witness:

    [0 c a]

Where “a” is a signature for pubkey A, and “c” a signature for pubkey C. During validation, the signature a is checked using pubkey A, which is successful, so the internal algorithm increments the signature pointer AND the pubkey pointer to the next elements in the respective lists, removing both from future consideration. Next the signature c is checked with pubkey B, which fails, so only the pubkey pointer is incremented. Finally signature c is checked with pubkey C, which passes. Since 2 signatures passed and this is equal to the specified threshold, the opcode evaluates as true. All inputs (including the dummy 0 value) are popped from the stack.

The algorithm cannot batch validate these signatures because for any partial threshold it doesn’t know which signatures map to which pubkeys.

Not long after segwit was released for activation, making the NULLDUMMY rule consensus for segwit scripts, the observation was made by Luke-Jr on IRC[1] that this new rule was actually suboptimal. Satoshi’s mistake gave us an extra parameter to CHECKMULTISIG, and it was entirely within our means to use this parameter to convey extra information to the CHECKMULTISIG algorithm, and thereby enable batch validation of threshold signatures using this opcode.

The idea is simple: instead of requiring that the final parameter on the stack be zero, require instead that it be a minimally-encoded bitmap specifying which keys are used, or alternatively, which are not used and must therefore be skipped. Before attempting validation, ensure for a k-of-n threshold only k bits are set in the bitfield indicating the used pubkeys (or n-k bits set indicating the keys to skip). The updated CHECKMULTISIG algorithm is as follows: when attempting to validate a signature with a pubkey, first check the associated bit in the bitfield to see if the pubkey is used. If the bitfield indicates that the pubkey is NOT used, then skip it without even attempting validation. The only signature validations which are attempted are those which the bitfield indicates ought to pass. This is a soft-fork as any validator operating under the original rules (which ignore the “dummy” bitfield) would still arrive at the correct pubkey-signature mapping through trial and error.

Aside: If you wanted to hyper-optimize, you could use a binomial encoding of the bitmask hint field, given that the n-choose-k threshold is already known. Or you could forego encoding the k threshold entirely and infer it from the number of set bits. However in either case the number of bytes saved is negligible compared to the overall size of a multisig script and witness, and there’d be a significant tradeoff in usability. Such optimization is probably not worth it.

If you’d rather see this in terms of code, there’s an implementation of this that I coded up in 2019 and deployed to a non-bitcoin platform:

https://github.com/tradecraftio/tradecraft/commit/339dafc0be37ae5465290b22d204da4f37c6e261

Unfortunately this observation was made too late to be incorporated into segwit, but future versions of script could absolutely use the hint-field trick to enable batch validation of CHECKMULTISIG scripts. So you can imagine my surprise when reviewing the Taproot/Tapscript BIPs I saw that CHECKMULTISIG was disabled for Tapscript, and the justification given in the footnotes is that CHECKMULTISIG is not compatible with batch validation! Talking with a few other developers including Luke-Jr, it has become clear that this solution to the CHECKMULTISIG batch validation problem had been completely forgotten and did not come up during Tapscript review. I’m posting this now because I don’t want the trick to be lost again.

Kind regards,
Mark Friedenbach

PS: One could make the argument that threshold signatures are implementable with the new CHECKSIGADD opcode, so why bother? For example, the above 2-of-3 threshold could be implemented in Tapscript as:

    [OP_0 A CHECKSIGADD B CHECKSIGADD C CHECKSIGADD 2 EQUAL]

However (1) this requires six opcodes in addition to the pubkey pushes, instead of just 3, and the number of wasted opcodes scales linearly with the size of the threshold; and (2) the intent is less clear. Because the intent is not encoded directly in the program in the form of an explicit threshold but rather inferred from the program structure, there is a higher likelihood of making a mistake. Particularly for more advanced scripts than this.

One could also argue that there is no need for explicit k-of-n thresholds now that we have Schnorr signatures, as MuSig-like key aggregation schemes can be used instead. In many cases this is true, and doing key aggregation can result in smaller scripts with more private spend pathways. However there remain many use cases where for whatever reason interactive signing is not possible, due to signatures being pre-calculated and shared with third parties, for example, and therefore explicit thresholds must be used instead. For such applications a batch-validation friendly CHECKMULTISIG would be useful.

[1] I believe it was Luke-Jr, and in 2016 or 2017, probably in #bitcoin-wizards. I don’t have logs to check, however.

[-- Attachment #2: Type: text/html, Size: 15731 bytes --]

             reply	other threads:[~2022-10-19  3:51 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-19  3:51 Mark Friedenbach [this message]
2022-10-20 22:02 ` ZmnSCPxj

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=239D23FC-267F-4198-988D-35152E7E5AC8@friedenbach.org \
    --to=mark@friedenbach$(echo .)org \
    --cc=bitcoin-dev@lists$(echo .)linuxfoundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox