[bitcoin-dev] Signing CHECKSIG position in Tapscript

* [bitcoin-dev] Signing CHECKSIG position in Tapscript
@ 2019-11-27 21:29 Russell O'Connor
  2019-11-28  8:06 ` Anthony Towns
  0 siblings, 1 reply; 6+ messages in thread
From: Russell O'Connor @ 2019-11-27 21:29 UTC (permalink / raw)
  To: Pieter Wuille, Bitcoin Protocol Discussion

Hi all,

I'd like to revisit an old topic from last year about the data signed in
tapscript signatures <
https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2018-November/016508.html
>.

The current tapscript proposal requires a signature on the last executed
CODESEPARATOR position.  I'd like to propose an amendment whereby instead of
signing the last executed CODESEPARATOR position, we simply always sign the
position of the CHECKSIG (or other signing opcode) being executed. Then we
can deprecate CODESEPARATOR (either by making it OP_SUCCESS, or a nop, or
making it always fail when executed, or whatever).

The main motivation for this proposal is to increase robustness against
various signature-copying attacks in Scripts that have multiple spending
conditions.  Bitcoin is already robust against attacks where the attacker
attempts to peddle a victim's UTXO as their own and try to copy the
victim's signature from one transaction input to another input.  Because
Bitcoin signatures specify which input within a transaction is being signed
for, such attacks fail (see https://bitcoin.stackexchange.com/a/85665/49364
).

However, unless CODESEPARATOR is explicitly used, there is no protection
against these sorts of attacks when there are multiple participants that
have signing conditions within a single UTXO (or rather within a single
tapleaf in the tapscript case).  As it stands, Bitcoin's signed data only
covers which input is being signed, and not the specific conditions that are
being signed for.  So for example, if Alice and Bob are engaged in some
kind of multi-party protocol, and Alice wants to pre-sign a transaction
redeeming a UTXO but subject to the condition that a certain hash-preimage
is revealed, she might verify the Script template shows that the code path
to her public key enforces that the hash pre-image is revealed (using a
toolkit like miniscript can aid in this), and she might make her signature
feeling secure that, if her signature is used, the required preimage
must be revealed on the blockchain.  But perhaps Bob has passed off Alice's
pubkey as his own, and maybe he has inserted a copy of Alice's pubkey into
a different path of the Script template.  Now Alice's signature can be
copied and used in this alternate path, allowing the UTXO to be redeemed
under circumstances that Alice didn't believe she was authorizing.  In
general, to protect herself, Alice needs to inspect the Script to see if
her pubkey occurs in any other branch.  Given that her pubkey, in
principle, could be derived from a computation rather than pushed directly
onto the stack, it is arguably infeasible for Alice to perform the required
check in general.
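
A minimal sketch of the copying attack above, under loud assumptions:
`toy_sighash`, its field layout, and the two opcode positions are
hypothetical stand-ins chosen for illustration and are not the real taproot
signature digest.  It only shows that a digest covering the input and the
tapleaf is identical for every CHECKSIG in that leaf, while a digest that
also covers the CHECKSIG position is not.

```python
import hashlib

def toy_sighash(input_index, tapleaf_hash, checksig_pos, commit_position):
    """Toy stand-in for the signed digest; NOT the real taproot sighash."""
    data = input_index.to_bytes(4, "little") + tapleaf_hash
    if commit_position:
        # Proposed amendment: also commit to the signing opcode's position.
        data += checksig_pos.to_bytes(4, "little")
    return hashlib.sha256(data).digest()

leaf = hashlib.sha256(b"example tapleaf script").digest()

# Hypothetical positions of two CHECKSIGs that both use Alice's pubkey:
# one behind the hash-preimage check, one in a branch Bob added.
positions = [4, 9]

without_pos = {toy_sighash(0, leaf, p, False) for p in positions}
with_pos = {toy_sighash(0, leaf, p, True) for p in positions}

print(len(without_pos))  # 1: one signature satisfies both CHECKSIGs
print(len(with_pos))     # 2: each CHECKSIG needs its own signature
```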

I believe that it would be safer, and less surprising to users, to always
sign the CHECKSIG position by default.  This will automatically enforce
conditions with the signature in most cases, rather than requiring users to
proactively try to reason if CODESEPARATOR is required for protection
within their protocol or not, and risk having them leave it out for cost
savings when it ends up being required for security after all.

I do not believe signing the CHECKSIG position is an undue burden on those
signers who have no conditions they require enforcement for.  As it stands,
the tapscript proposal already requires the tapleaf_hash value under the
signature; this CHECKSIG position value is simply more of the same kind of
data.  In simple Script templates (e.g. those with only one CHECKSIG
operation) the signed position will be a fixed known value.  Complex Script
templates are precisely the situations where you want to be careful about
enforcement of conditions with your signature.

As a side benefit, we get to eliminate CODESEPARATOR, removing a fairly
awkward opcode from this script version.

* Re: [bitcoin-dev] Signing CHECKSIG position in Tapscript
  2019-11-27 21:29 [bitcoin-dev] Signing CHECKSIG position in Tapscript Russell O'Connor
@ 2019-11-28  8:06 ` Anthony Towns
  2019-12-01 16:09   ` Russell O'Connor
  0 siblings, 1 reply; 6+ messages in thread
From: Anthony Towns @ 2019-11-28  8:06 UTC (permalink / raw)
  To: Russell O'Connor, Bitcoin Protocol Discussion; +Cc: Pieter Wuille

On Wed, Nov 27, 2019 at 04:29:32PM -0500, Russell O'Connor via bitcoin-dev wrote:
> The current tapscript proposal requires a signature on the last executed
> CODESEPARATOR position.  I'd like to propose an amendment whereby instead of
> signing the last executed CODESEPARATOR position, we simply always sign the
> position of the CHECKSIG (or other signing opcode) being executed.

FWIW, there's discussion of this at
http://www.erisian.com.au/taproot-bip-review/log-2019-11-28.html#l-65

> However, unless CODESEPARATOR is explicitly used, there is no protection
> against these sorts of attacks when there are multiple participants that have
> signing conditions within a single UTXO (or rather within a single tapleaf in
> the tapscript case).

(You already know this, but:)

With taproot key path spending, the only other conditions that can be
placed on a transaction are nSequence, nLockTime, and the annex, all of
which are committed to via the signature; so I think this concern only
applies to taproot script path spending.

The proposed sighashes for taproot script path spending all commit to
the script being used, so you can't reuse the signature in a different
leaf of the merkle tree of scripts for the UTXO, only in a separate
execution path within the script you're already looking at.

> So for example, if Alice and Bob are engaged in some kind of multi-party
> protocol, and Alice wants to pre-sign a transaction redeeming a UTXO but
> subject to the condition that a certain hash-preimage is revealed, she might
> verify the Script template shows that the code path to her public key enforces
> that the hash pre-image is revealed (using a toolkit like miniscript can aid in
> this), and she might make her signature feeling secure that, if her
> signature is used, the required preimage must be revealed on the blockchain.
> But perhaps Bob has passed off Alice's pubkey as his own, and maybe he has
> inserted a copy of Alice's pubkey into a different path of the Script
> template.
>
> Now Alice's signature can be copied and used in this alternate path,
> allowing the UTXO to be redeemed under circumstances that Alice didn't believe
> she was authorizing.  In general, to protect herself, Alice needs to inspect
> the Script to see if her pubkey occurs in any other branch.  Given that her
> pubkey, in principle, could be derived from a computation rather than pushed
> directly onto the stack, it is arguably infeasible for Alice to perform the
> required check in general.

First, it seems like a bad idea for Alice to have put funds behind a
script she doesn't understand in the first place. There's plenty of
scripts that are analysable, so just not using ones that are too hard to
analyse sure seems like an option.

Second, if there are many branches in the script, it's probably more
efficient to do them via different branches in the merkle tree, which
at least for this purpose would make them easier to analyse as well
(since you can analyse them independently).

Third, if you are doing something crazy complex where a particular key
could appear in different CHECKSIG operators and they should have
independent signatures, that seems like you're at the level of
complexity where learning about CODESEPARATOR is a reasonable thing to
do.

I think CODESEPARATOR is a better solution to this problem anyway. In
particular, consider a "leaf path root OP_MERKLEPATHVERIFY" opcode,
and a script that says "anyone in group A can spend if the preimage for
X is revealed, anyone in group B can spend unconditionally":

 IF HASH160 x EQUALVERIFY groupa ELSE groupb ENDIF
 MERKLEPATHVERIFY CHECKSIG

spendable by

 siga keya path preimagex 1

or

 sigb keyb path 0

With your proposed semantics, if my pubkey is in both groups, my signature
will sign for position 10, and still be valid on either path, even if
the signature commits to the CHECKSIG position.

I could fix my script either by having two CHECKSIG opcodes (one for
each branch) and also duplicating the MERKLEPATHVERIFY; or I could
add a CODESEPARATOR in either IF branch.

(Or I could just not reuse the exact same pubkey across groups; or I could
have two separate scripts: "HASH160 x EQUALVERIFY groupa MERKLEPATHVERIFY
CHECKSIG" and "groupb MERKLEPATHVERIFY CHECKSIG")
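
A rough sketch of the example above.  The `trace` helper is hypothetical, it
handles only a single non-nested IF/ELSE/ENDIF, and it counts opcodes from 1
purely so that the shared CHECKSIG lands on "position 10" as stated; none of
this is taken from an actual interpreter.

```python
def trace(script, take_if):
    """Return (checksig_pos, last_executed_codesep_pos), 1-based positions."""
    executing = True
    codesep = None
    for pos, op in enumerate(script, start=1):
        if op == "IF":
            executing = take_if
        elif op == "ELSE":
            executing = not take_if
        elif op == "ENDIF":
            executing = True
        elif executing and op == "CODESEPARATOR":
            codesep = pos
        elif executing and op == "CHECKSIG":
            return pos, codesep
    raise ValueError("no CHECKSIG executed")

shared = ("IF HASH160 x EQUALVERIFY groupa ELSE groupb ENDIF "
          "MERKLEPATHVERIFY CHECKSIG").split()
# Same script with a CODESEPARATOR added inside the IF branch.
fixed = ("IF CODESEPARATOR HASH160 x EQUALVERIFY groupa ELSE groupb ENDIF "
         "MERKLEPATHVERIFY CHECKSIG").split()

print(trace(shared, True), trace(shared, False))  # (10, None) (10, None)
print(trace(fixed, True), trace(fixed, False))    # (11, 2) (11, None)
```

In the first script both paths reach the same CHECKSIG with the same signed
values, so committing to the CHECKSIG position does not separate them.  In
the second, the last executed CODESEPARATOR position differs between paths.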

> I believe that it would be safer, and less surprising to users, to always sign
> the CHECKSIG position by default.

> As a side benefit, we get to eliminate CODESEPARATOR, removing a fairly awkward
> opcode from this script version.

As it stands, ANYPREVOUTANYSCRIPT proposes to not sign the script code
(allowing the signature to be reused in different scripts) but does
continue signing the CODESEPARATOR position, allowing you to optionally
restrict how flexibly you can reuse signatures. That seems like a better
tradeoff than having ANYPREVOUTANYSCRIPT signatures commit to the CHECKSIG
position which would make it a fair bit harder to design scripts that
can share signatures, or not having any way to restrict which scripts
the signature could apply to other than changing the pubkey.

A hypothetical alternate "codeseparator" design: when script execution
starts, initialise an empty byte string "trace"; each time an opcode
is executed append "0xFF"; each time an opcode is skipped append
"0x00". When a CODESEPARATOR is seen, calculate sha256(trace) and store
it; every time a CHECKSIG is executed, include the sha256(trace) from the
last CODESEPARATOR in the digest [0]. That should make each checksig
commit to the exact path the script took up to the last CODESEPARATOR
seen. I think it's probably more complex than is really useful though,
so I'm not proposing it seriously.

[0] If there's not been a CODESEPARATOR, then sha256(trace)=sha256("");
    if there's been one CODESEPARATOR and it was the first opcode seen,
    sha256(trace)=sha256("\xff").
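
The hypothetical trace design can be sketched as follows.  The function name
and the token-list script format are made up for illustration, only one
non-nested IF/ELSE/ENDIF is handled, and treating the flow-control opcodes
themselves as "executed" is an assumption the description does not settle.

```python
import hashlib

def trace_commitments(script, take_if):
    """Return the stored sha256(trace) seen by each executed CHECKSIG."""
    trace = bytearray()
    stored = hashlib.sha256(b"").digest()  # footnote [0]: no CODESEPARATOR yet
    executing = True
    commitments = []
    for op in script:
        if op == "IF":
            trace.append(0xFF)  # assumption: flow control counts as executed
            executing = take_if
        elif op == "ELSE":
            trace.append(0xFF)
            executing = not take_if
        elif op == "ENDIF":
            trace.append(0xFF)
            executing = True
        elif not executing:
            trace.append(0x00)  # skipped opcode
        else:
            trace.append(0xFF)  # executed opcode
            if op == "CODESEPARATOR":
                stored = hashlib.sha256(bytes(trace)).digest()
            elif op == "CHECKSIG":
                commitments.append(stored)
    return commitments

script = ("IF HASH160 x EQUALVERIFY groupa ELSE groupb ENDIF "
          "CODESEPARATOR MERKLEPATHVERIFY CHECKSIG").split()
if_path = trace_commitments(script, True)
else_path = trace_commitments(script, False)
print(if_path != else_path)  # True: the commitment depends on the path taken
```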

Cheers,
aj

* Re: [bitcoin-dev] Signing CHECKSIG position in Tapscript
  2019-11-28  8:06 ` Anthony Towns
@ 2019-12-01 16:09   ` Russell O'Connor
  2019-12-03  8:35     ` Anthony Towns
  0 siblings, 1 reply; 6+ messages in thread
From: Russell O'Connor @ 2019-12-01 16:09 UTC (permalink / raw)
  To: Anthony Towns; +Cc: Bitcoin Protocol Discussion, Pieter Wuille

On Thu, Nov 28, 2019 at 3:07 AM Anthony Towns <aj@erisian•com.au> wrote:

> FWIW, there's discussion of this at
> http://www.erisian.com.au/taproot-bip-review/log-2019-11-28.html#l-65
>

I think variants like signing the position of the OP_IF/OP_NOTIF/OP_ELSE
that opens the block the checksig is within, or signing the byte offset
instead of the opcode number offset, are all fine.  In particular, signing
the enclosing OP_IF... would allow sharing of the hashed signed data in a
normal multisig sequence of operations.  Below I'll continue to refer to my
proposal as signing the CHECKSIG position, but please take it to mean any of
these proposed, semantically equivalent, realizations of this idea.

I also think that it is quite reasonable to have a sighash flag control
whether or not the signature covers the CHECKSIG position, with
SIGHASH_ALL including the CHECKSIG position.

> First, it seems like a bad idea for Alice to have put funds behind a
> script she doesn't understand in the first place. There's plenty of
> scripts that are analysable, so just not using ones that are too hard to
> analyse sure seems like an option.
>

I don't think this is true in general.  When constructing a script it seems
quite reasonable for one party to come to the table with their own custom
script that they want to use because they have some sort of 7-of-11 scheme
in which one of the 11 is really a 2-of-3 and another is a 5-of-6.  The
point is that you shouldn't need to decode their exact policy in order to
collaborate with them.  This notion is captured quite clearly in the MAST
aspect of taproot.  In many circumstances, it is sufficient for you to know
that there exists a branch that contains a particular script without need
to know what every branch contains.  Because we include the tapleaf in the
signature, we already prevent this signature copying attack against
attempts to transplant one's signature from one tapleaf to another.  My
proposal is to simply extend this same protection to branches within a
single tapscript.

> Second, if there are many branches in the script, it's probably more
> efficient to do them via different branches in the merkle tree, which
> at least for this purpose would make them easier to analyse as well
> (since you can analyse them independently).
>

Of course this should be done when practical.  This point isn't under
dispute.

> Third, if you are doing something crazy complex where a particular key
> could appear in different CHECKSIG operators and they should have
> independent signatures, that seems like you're at the level of
> complexity where learning about CODESEPARATOR is a reasonable thing to
> do.
>

So while I agree that learning about CODESEPARATOR is a reasonable thing to
do, given that I haven't heard the CODESEPARATOR being proposed as
protection against this sort of signature-copying attack before and given
the subtle nature of the issue, I'm not sure people will know to use it to
protect themselves.  We should aim for a Script design that makes the
cheaper default Script programming choices the safer ones.

On the other hand, in a previous thread a while ago I was also arguing that
sophisticated people are plausibly using CODESEPARATOR today, hidden away
in unredeemed P2SH UTXOs.  So perhaps I'm right about at least one of these
two points. :)

> I think CODESEPARATOR is a better solution to this problem anyway. In
> particular, consider a "leaf path root OP_MERKLEPATHVERIFY" opcode,
> and a script that says "anyone in group A can spend if the preimage for
> X is revealed, anyone in group B can spend unconditionally":
>
>  IF HASH160 x EQUALVERIFY groupa ELSE groupb ENDIF
>  MERKLEPATHVERIFY CHECKSIG
>
> spendable by
>
>  siga keya path preimagex 1
>
> or
>
>  sigb keyb path 0
>
> With your proposed semantics, if my pubkey is in both groups, my signature
> will sign for position 10, and still be valid on either path, even if
> the signature commits to the CHECKSIG position.
>
> I could fix my script either by having two CHECKSIG opcodes (one for
> each branch) and also duplicating the MERKLEPATHVERIFY; or I could
> add a CODESEPARATOR in either IF branch.
>

> (Or I could just not reuse the exact same pubkey across groups; or I could
> have two separate scripts: "HASH160 x EQUALVERIFY groupa MERKLEPATHVERIFY
> CHECKSIG" and "groupb MERKLEPATHVERIFY CHECKSIG")
>

I admit my proposal doesn't automatically prevent this signature-copying
attack against every Script template.  To be fully effective you need to be
aware of this signature-copying attack vector to ensure your scripts are
designed so that your CHECKSIG operations are protected by being within the
IF block that does the verification of the hash-preimage.  My thinking is
that my proposal is effective enough to save most people most of the time,
even if it doesn't save everyone all the time, all while having no
significant burden otherwise.  Therefore, I don't think your point that
there still exists a Script where a signature copying attack can be
performed is adequate by itself to dismiss my proposal.  However if you
believe that if we don't save everyone all the time then there is no point
in trying, or if you believe that signing the CHECKSIG position probably
will not protect most users most of the time, or if you believe the burden
on all the other cases is too great, then maybe it is better to rely on
people using CODESEPARATOR.

Given that the MAST design of taproot greatly reduces this problem compared to
legacy script, I suppose you could argue that "the burden on all the other
cases is too great" simply because you believe the problematic situation is
now extremely rare.

I still think we ought to choose designs that are safer by default and
include as much user intention within the signed data as we can reasonably
get away with, and use other sighash flags for those cases when we need to
exclude data from the signature.

In particular, imagine a world where CODESEPARATOR never existed.  We have
this signature copying attack to deal with, and we are designing a new
Segwit version in which we can now address the problem.  One proposal that
someone comes up with is to sign the CHECKSIG position (or sign the
enclosing OP_IF/OP_ELSE... position), maybe using a SIGHASH flag to
optionally disable it.  Someone else comes up with a proposal to add a new
"CODESEPARATOR" opcode which requires adding a new piece of state to the
Script interpreter (the only non-stack based piece of state) to track the
last executed CODESEPARATOR position and include that in the signature.
Would you really prefer the CODESEPARATOR proposal?

> > I believe that it would be safer, and less surprising to users, to
> always sign
> > the CHECKSIG position by default.
>
> > As a side benefit, we get to eliminate CODESEPARATOR, removing a fairly
> awkward
> > opcode from this script version.
>
> As it stands, ANYPREVOUTANYSCRIPT proposes to not sign the script code
> (allowing the signature to be reused in different scripts) but does
> continue signing the CODESEPARATOR position, allowing you to optionally
> restrict how flexibly you can reuse signatures. That seems like a better
> tradeoff than having ANYPREVOUTANYSCRIPT signatures commit to the CHECKSIG
> position which would make it a fair bit harder to design scripts that
> can share signatures, or not having any way to restrict which scripts
> the signature could apply to other than changing the pubkey.
>

Um, I believe that signing the CODESEPARATOR position without signing the
script code is nonsensical.  You are talking about signing a piece of data
without an interpretation of its meaning.

Recall that originally CODESEPARATOR would let you sign a suffix of the
Script program.  In the context of signing the whole script (which is
always signed indirectly as part of the txid in legacy signatures) signing
the offset into that script contains just as much information as signing a
script suffix, while being constant sized.  When you remove the Script from
the data being signed, signing an offset is no longer equivalent to signing
a Script suffix, and an offset into an unknown data structure is a
meaningless value by itself.  There is no way that you should be signing
the CODESEPARATOR position without also covering the Script with the signature.
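
To make the offset-versus-suffix point concrete, here is a small sketch.  The
two scripts, the key names, and the `signed_digest` helper are all invented
for illustration, and the digest is a toy, not any real sighash.

```python
import hashlib

def signed_digest(script, codesep_pos, cover_script):
    """Toy digest: always covers the offset, optionally covers the script."""
    data = codesep_pos.to_bytes(4, "little")
    if cover_script:
        data += " ".join(script).encode()
    return hashlib.sha256(data).digest()

# Two hypothetical scripts with CODESEPARATOR at the same offset
# but with different code after it.
script_a = ["keyA", "CHECKSIGVERIFY", "CODESEPARATOR", "keyB", "CHECKSIG"]
script_b = ["keyA", "CHECKSIGVERIFY", "CODESEPARATOR", "keyM", "CHECKSIG"]
pos = script_a.index("CODESEPARATOR")
suffix_a, suffix_b = script_a[pos + 1:], script_b[pos + 1:]

# Script covered: (script, offset) pins down the suffix, digests differ.
print(signed_digest(script_a, pos, True) != signed_digest(script_b, pos, True))
# Script not covered: same digest, even though the suffixes differ.
print(signed_digest(script_a, pos, False) == signed_digest(script_b, pos, False))
print(suffix_a != suffix_b)
```

All three lines print True: once the script is dropped from the signed data,
the offset alone no longer says which suffix is being authorized.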

* Re: [bitcoin-dev] Signing CHECKSIG position in Tapscript
  2019-12-01 16:09   ` Russell O'Connor
@ 2019-12-03  8:35     ` Anthony Towns
  2019-12-05 20:24       ` Russell O'Connor
  0 siblings, 1 reply; 6+ messages in thread
From: Anthony Towns @ 2019-12-03  8:35 UTC (permalink / raw)
  To: Russell O'Connor; +Cc: Bitcoin Protocol Discussion

On Sun, Dec 01, 2019 at 11:09:54AM -0500, Russell O'Connor wrote:
> On Thu, Nov 28, 2019 at 3:07 AM Anthony Towns <aj@erisian•com.au> wrote:
>     First, it seems like a bad idea for Alice to have put funds behind a
>     script she doesn't understand in the first place. There's plenty of
>     scripts that are analysable, so just not using ones that are too hard to
>     analyse sure seems like an option.
> I don't think this is true in general.  When constructing a script it seems
> quite reasonable for one party to come to the table with their own custom
> script that they want to use because they have some sort of 7-of-11 scheme but
> in one of those cases is really a 2-of-3 and another is 5-of-6.  The point is
> that you shouldn't need to decode their exact policy in order to collaborate
> with them.

Hmm, I take the opposite lesson from your scenario -- it's only fine for
people to bring their own 2-of-3 or 5-of-6 or whatever and replace a
simple key if you've got something like miniscript where you understand
the script completely enough that you can be sure those changes are
fine. 

For contrast, with ECDSA and pre-miniscript, the above scenario might
have gone like someone proposing to change:

  7 A B C1 C2 C3 C4 C5 C6 C7 C8 C9 11 CHECKMULTISIG

for something like

  7
  SWAP IF TOALT 2 A1 A2 A3 3 CHECKMULTISIGVERIFY FROMALT 1SUB ENDIF
  SWAP IF TOALT 5 B1 B2 B3 B4 B5 B6 6 CHECKMULTISIGVERIFY FROMALT 1SUB ENDIF
  C1 C2 C3 C4 C5 C6 C7 C8 C9 11 CHECKMULTISIG

but I think you'd want to be pretty sure you can decode those added
policies rather than just accepting it because your "C4" key is still
there. (In particular, any script fragment that uses an opcode that used
to be OP_SUCCESS could have arbitrary effects on the script)

[0]

> This notion is captured quite clearly in the MAST aspect of
> taproot. In many circumstances, it is sufficient for you to know that there
> exists a branch that contains a particular script without need to know what
> every branch contains.

(I'm trying to avoid using MAST in the context of taproot, despite the
backronym, so please excuse the rephrasing--)

I think if you're going to start using a taproot address with multiple
tapscripts, either as a participant in a multiparty smart contract,
or just to have different ways of spending your funds, then you do have
to analyse all the branches to make sure there's no hidden "all the
money goes to the Lizard People" script.

Once you've done that, you can then simplify things -- maybe some
scripts are only useful for other participants in the contract, or maybe
you've got a few different hardware wallets and one only needs to know
about one branch, while the other only needs to know about some other
branch, but you still need to have done the analysis in the first place.

Of course, probably most of the time that "analysis" is just making sure
the scripts match some well known, hardcoded template, as filled out
with various (tweaked) keys that you've checked elsewhere, but that
still ensures you know all the scripts do what you need them to.

>     Third, if you are doing something crazy complex where a particular key
>     could appear in different CHECKSIG operators and they should have
>     independent signatures, that seems like you're at the level of
>     complexity where learning about CODESEPARATOR is a reasonable thing to
>     do.
> So while I agree that learning about CODESEPARATOR is a reasonable thing to do,
> given that I haven't heard the CODESEPARATOR being proposed as protection
> against this sort of signature-copying attack before

Err? The current behaviour of CODESEP with taproot was first discussed in
[1], which summarised it as "CODESEP -- lets you require different sigs
for different parts of a single script" which seems to me like just a
different way of saying the same thing.

[1] https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2018-November/016500.html

I don't think tapscript's CODESEP or the current CODESEP can be used
for anything other than preventing a signature from being reused for a
different CHECKSIG operation on the same pubkey within the same script.

> and given the subtle
> nature of the issue, I'm not sure people will know to use it to protect
> themselves.  We should aim for a Script design that makes the cheaper default
> Script programming choices the safer one.

I think techniques like miniscript and having fixed templates specified
in BIPs and BOLTs and the like are better approaches -- both let you
easily allow a limited set of changes that can be safely made to a policy
(maybe just substituting keys, hashes and times, maybe allowing more
general changes).

> On the other hand, in a previous thread a while ago I was also arguing that
> sophisticated people are plausibly using CODESEPARATOR today, hidden away in
> unredeemed P2SH UTXOs.  So perhaps I'm right about at least one of these two
> points. :)

Sounds like an economics argument :)

>      IF HASH160 x EQUALVERIFY groupa ELSE groupb ENDIF
>      MERKLEPATHVERIFY CHECKSIG
>     spendable by
>      siga keya path preimagex 1
>     or
>      sigb keyb path 0
> I admit my proposal doesn't automatically prevent this signature-copying attack
> against every Script template.

Right -- so if you're worried about this sort of attack, you need to
analyse your script to at least be sure that it's not one of these cases
that aren't covered. And if you've got to analyse the script anyway
(which I think you do no matter what), then there's no benefit -- you're
either doing something simple and you're using templates or miniscript
to make the analysis easy; or you're doing something novel and complex,
and you can probably cope with using CODESEP.

(Ultimately I think there's only really two cases where you're
contributing a signature for a tx: either you're a party to the contract,
and you should have fully analysed all the possible ways the utxo could
be spent to make sure the smart contract stuff is correctly implemented
and you can't be cheated; or you're acting as an oracle or similar and
don't really care how the contract goes because you're not a party to
it, in which case people reusing your signature as much as they like is
fine. Hardware wallets don't need to analyse scripts they sign for, eg,
but that's only because their owners have already done that analysis
first)

> To be fully effective you need to be aware of
> this signature-copying attack vector to ensure your scripts are designed so
> that your CHECKSIG operations are protected by being within the IF block that
> does the verification of the hash-preimage.  My thinking is that my proposal is
> effective enough to save most people most of the time, even if it doesn't save
> everyone all the time, all while having no significant burden otherwise.

I agree the burden's pretty minor; but I think having a single value
for the tx digest for each input for SIGHASH_ALL is kind-of nice for
validation; and I think having to pass through a CHECKSIG position
every time you do a signature is likely to be annoying for implementors
for pretty much zero actual benefit.

> Therefore, I don't think your point that there still exists a Script where a
> signature copying attack can be performed is adequate by itself to dismiss my
> proposal.

I'm making two points with that example: (1) it's a case where if
you don't analyse the scripts somehow, you can still be vulnerable to
the attack with your change -- so your change doesn't let you avoid
knowing what scripts do; but also (2) that CODESEP is a marginally more
efficient/general fix for the problem. Maybe (1) isn't too important,
because even if it weren't true, I still think you need to know what all
the scripts do, but I think (2)'s still relevant.

> Given that MAST design of taproot greatly reduces this problem compared to
> legacy script, I suppose you could argue that "the burden on all the other
> cases is too great" simply because you believe the problematic situation is now
> extremely rare.

As you alluded to in the previous mail, I think the problem's currently
extremely rare and trivially avoidable because we don't really have any
way of manipulating pubkeys -- there's no CAT, EC_ADD/EC_MUL/EC_TWEAK
or MERKLEPATHVERIFY opcode (or actual Merkle Abstract Syntax Trees or
OP_EXEC etc) to make it a dynamic concern rather than a static one.

> In particular, imagine a world where CODESEPARATOR never existed.  We have this
> signature copying attack to deal with, and we are designing a new Segwit
> version in which we can now address the problem.  One proposal that someone
> comes up with is to sign the CHECKSIG position (or sign the enclosing OP_IF/
> OP_ELSE... position), maybe using a SIGHASH flag to optionally disable it. 
> Someone else comes up with a proposal to add new "CODESEPARATOR" opcode which
> requires adding a new piece of state to the Script interpreter (the only
> non-stack based piece of state) to track the last executed CODESEPARATOR
> position and include that in the signature.  Would you really prefer the
> CODESEPARATOR proposal?

If CODESEP had never existed, I think my first response would be to say
"well, just make sure you don't reuse pubkeys, and because each
bip-schnorr sig commits to the pubkey, problem solved."

There's only two use cases I'm aware of: one is the ridiculous
reveal-a-secret-key-by-forced-nonce-reuse script that's never actually
been implemented [2], and the other is NTumbleBit's escrow script [3]. The first of
those requires pubkey recovery so doesn't work with bip-schnorr anyway;
and it's not clear to me whether the second is really reason enough to
justify a dedicated opcode/sighash/etc.

[2] https://lists.linuxfoundation.org/pipermail/lightning-dev/2015-November/000363.html
[3] https://github.com/NTumbleBit/NTumbleBit/blob/master/NTumbleBit/EscrowScriptBuilder.cs

An option would be to remove CODESEP and treat it as OP_SUCCESS -- that
way it could be introduced later with pretty much the exact semantics
that are currently proposed; or with some more useful semantics. That
way we could bring in whatever functionality was actually needed at the
same time as introducing CAT/EC_MUL/etc.

But my default position is to think that the way things currently work is
mostly fine, and we should default to just keeping the same functionality
-- so SIGHASH_ALL doesn't do anything fancy, but CODESEP can be used to
prevent sig reuse.

>     > As a side benefit, we get to eliminate CODESEPARATOR, removing a fairly
>     awkward
>     > opcode from this script version.
> 
>     As it stands, ANYPREVOUTANYSCRIPT proposes to not sign the script code
>     (allowing the signature to be reused in different scripts) but does
>     continue signing the CODESEPARATOR position, allowing you to optionally
>     restrict how flexibly you can reuse signatures. That seems like a better
>     tradeoff than having ANYPREVOUTANYSCRIPT signatures commit to the CHECKSIG
>     position which would make it a fair bit harder to design scripts that
>     can share signatures, or not having any way to restrict which scripts
>     the signature could apply to other than changing the pubkey.

> Recall that originally CODESEPARATOR would let you sign a suffix of the Script
> program.  In the context of signing the whole script (which is always signed
> indirectly as part of the txid in legacy signatures) signing the offset into
> that scripts contains just as much information as signing a script suffix,
> while being constant sized.  When you remove the Script from the data being
> signed, signing an offset is no longer equivalent to signing a Script suffix,
> and an offset into an unknown data structure is a meaningless value by itself.

The tapscript implementation isn't intended to be equivalent to signing
a script suffix; all it does is add an index to the digest being signed
so that signatures at different indexes are distinct. That it's
equivalent to the current behaviour is definitely a feature, but I think
that's more a surprising coincidence than a useful way of thinking about the
actual usefulness of CODESEP in tapscript...

[4]

> Um, I believe that signing the CODESEPARATOR position without signing the
> script code is nonsensical.  You are talking about signing a piece of data
> without an interpretation of its meaning.

With ANYPREVOUTANYSCRIPT, you're still differentiating signatures by
index, you just no longer also commit to any of the other details of
the script. That means you can't prevent your signature being reused in
random other scripts someone else designs -- hence the "ANYSCRIPT" part --
but you can prevent any of your funds from going to those addresses, so
that's not really your problem anyway. What it does mean is that you can
prevent your signature from being reused in different scripts you do know
about; eg you might have a UTXO with four different tapscript branches:

     1) OP_1 CHECKSIG
     2) CODESEP OP_1 CHECKSIGVERIFY HASH160 x EQUAL
     3) n CLTV DROP CODESEP OP_1 CHECKSIGVERIFY
     4) k CSV DROP CODESEP OP_1 CHECKSIGVERIFY

(where OP_1 means using the taproot internal pubkey with support for
ANYPREVOUT*) -- that way a signature for either path (1) or (2) is only
valid for that path, but a signature for (3) can be reused for (4)
(or vice-versa), but not (1) or (2); and all those signatures could
be reused for other corresponding scripts, for instance with different
values for x,n,k if desired.
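
To make the index argument concrete, here is a minimal Python sketch of
the four branches above. It is only an illustration under stated
assumptions: scripts are modelled as lists of opcode tokens, the signed
"position" is taken to be the 0-based opcode index of the last executed
CODESEP, 0xFFFFFFFF stands for "no CODESEP executed", and the names
(BRANCHES, codesep_pos, reusable) are made up for this example. It also
assumes ANYPREVOUTANYSCRIPT signatures by the same key over the same
transaction, so the index is the only script-dependent value signed.

```python
# Toy model, not consensus code: each tapscript branch is a list of tokens.
NO_CODESEP = 0xFFFFFFFF  # assumed sentinel for "no CODESEP executed yet"

BRANCHES = {
    1: ["OP_1", "CHECKSIG"],
    2: ["CODESEP", "OP_1", "CHECKSIGVERIFY", "HASH160", "x", "EQUAL"],
    3: ["n", "CLTV", "DROP", "CODESEP", "OP_1", "CHECKSIGVERIFY"],
    4: ["k", "CSV", "DROP", "CODESEP", "OP_1", "CHECKSIGVERIFY"],
}


def codesep_pos(script):
    """Index of the last CODESEP executed before the first signature check.

    The branches here are straight-line scripts, so every opcode before the
    signature check is executed.
    """
    last = NO_CODESEP
    for index, op in enumerate(script):
        if op == "CODESEP":
            last = index
        elif op.startswith("CHECKSIG"):
            return last
    raise ValueError("script has no signature check")


def reusable(a, b):
    """True if a signature made for branch a would also verify in branch b."""
    return codesep_pos(BRANCHES[a]) == codesep_pos(BRANCHES[b])


for number, script in BRANCHES.items():
    print(number, hex(codesep_pos(script)))
```

In this model branch (1) gets 0xffffffff, branch (2) gets 0, and branches
(3) and (4) both get 3, which is why only (3) and (4) can share a
signature.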

> There is no way that you should be signing CODESEPARATOR position without also
> covering the Script with the signature.

So I think it's more sensible than it seems; and still plausible enough
to leave in. If you don't want to separate your ANYPREVOUT scripts,
you can just not put a CODESEP in -- or at least only put CODESEP after
all your ANYPREVOUT CHECKSIGs; so it doesn't seem like it's creating
any added complexity.

Cheers,
aj

[0] For what it's worth, there's another reason not to allow replacing
    keys in a threshold sig with different policies: if you've got say
    30 people with a majority threshold of 16, then you could have two groups
    of 9 people form parties and each agree to all vote along party lines;
    but if you let them replace their keys with multisig policies along
    those lines, you're now enforcing a 10-of-30 policy instead (as long
    as the 10 are 5 from the first party and 5 from the second party)
    and allowing minority control instead of majority rule.

[4] I wonder if it would be worth exploring whether we could do
    something more like the original (presumed) intent of CODESEP. Given
    use of NOINPUT/ANYPREVOUT so as not to commit to the full script,
    you could potentially have a SIGHASH that committed to a hash of
    the script that's been executed so far, and also the witness data
    that's been consumed so far. That would ensure the first part of
    the script behaved exactly as you expected, while allowing the rest
    of the script to be arbitrarily weird, and (I think) it would be
    efficiently implementable. That doesn't give you delegation without
    the ability to also have executable witness data of some sort, but
    maybe something like it is interesting anyway?

* Re: [bitcoin-dev] Signing CHECKSIG position in Tapscript
  2019-12-03  8:35     ` Anthony Towns
@ 2019-12-05 20:24       ` Russell O'Connor
  2019-12-06  4:51         ` Anthony Towns
  0 siblings, 1 reply; 6+ messages in thread
From: Russell O'Connor @ 2019-12-05 20:24 UTC (permalink / raw)
  To: Anthony Towns; +Cc: Bitcoin Protocol Discussion

After chatting with andytoshi and others, and some more thinking, I've been
convinced that my specific concern about other users masquerading other
people's pubkeys as their own in complex scripts is actually a non-issue.

No matter what you write in Script (today), you are limited to expressing
some policy that is logically equivalent to a set of conditions and
signatures on pubkeys that can be expressed in disjunctive normal form.  We
can write such a policy as

(C[1] && PK[1,1] && ... && PK[1,m[1]]) || ... || (C[n] && PK[n,1] && ... &&
PK[n,m[n]])

where C[i] is some conjunction of conditions such as timelock constraints,
or hash-lock constraints or any other kind of proof of publication, and
where PK[i,j] is a requirement of a signature against a specific public key.

From Alice's point of view, she can divide the set of clauses under the
disjunction into those that contain a pubkey that she considers (partially)
under her control and those clauses that she does not control (even though
as we shall see those other keys might actually be under Alice's control,
unbeknownst to her). To that end, let us consider a specific representative
policy.

    (C[1] && APK[1]) || (C[2] && APK[2] && BPK[2]) || (C[3] && BPK[3])

where Alice considers herself in control of APK[1] and APK[2], and where
she considers Bob in control of BPK[2] and BPK[3] and where C[1], C[2], and
C[3] are different conditions, let's say three different hash-locks.  We
will also say that Alice has ensured that her pubkeys in different clauses
are different (i.e. APK[1] != APK[2]), but she doesn't make any such
assumption for Bob's keys and neither will we.

When Alice funded this Script, or otherwise accepted it for consideration,
she agreed that she wouldn't control the redemption of the UTXO as long as
the preimage for C[3] is published.  In particular, Alice doesn't even need
to fully decode the Script semantics for that clause beyond determining
that it enforces the C[3] requirement that she cares about. Even if Bob was
masquerading Alice's pubkey as his own (as in BPK[3] = APK[1] or BPK[3] =
APK[2]), and he ends up copying her signature into that clause, Alice ends
up with C[3] published as she originally accepted as a possibility.  Bob
masquerading Alice's pubkey as his own only serves to hamper his own
ability to sign for his clauses (I mean, Bob might be trying to convince
some third party that Alice signed for something she didn't actually sign
for, but such misrepresentations of the meaning of digital signatures are
outside our scope and this just serves as a reminder not to be deceived by
Bob's tricks here).

And the same argument holds for BPK[2].  Even if BPK[2] = APK[1] and Bob
tries to copy Alice's signature into the C[2] condition, he still needs a
countersignature with her APK[2], so Alice remains in control of that
clause.  And if BPK[2] = APK[2] then Bob can only copy Alice's signature on
the C[2] condition, but in that case she has already authorised that
condition.  Again, Bob masquerading Alice's pubkey as his own only serves
to hamper his own ability to sign for his clauses.

So what makes our potential issue here safe, versus the dangers that would
happen in <https://bitcoin.stackexchange.com/a/85665/49364> where Bob
masquerades Alice's UTXO as his own?  The key problem in the UTXO case isn't
so much Bob masquerading Alice's pubkey as his own, as it is an issue with
Alice reusing her pubkeys and Bob taking advantage of that.  We do, in
fact, have exactly the same issue in Script.  If Alice were to reuse
pubkeys such that APK[1] = APK[2], then Bob could take her signature for
C[1] and transplant it to redeem under condition C[2].  We see that it is
imperative that Alice ensures that she doesn't reuse pubkeys that she
considers under her control for different conditions when she wants her
signature to distinguish between them.
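
The difference between masquerading and reuse can be sketched with a toy
model in Python.  Everything here is an assumption made for illustration:
ToySig is a stand-in record rather than real cryptography, the hash-lock
conditions are ignored (Bob is assumed to know the preimages), and the
function names are invented.  The one property the model keeps is that a
signature commits to a pubkey and a transaction, but not to the clause it
was meant for.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToySig:
    """Stand-in for a signature: which pubkey signed which message."""
    pubkey: str
    message: str


def clause_satisfied(required_pubkeys, sigs, message):
    """A clause passes if every required pubkey has a signature on message.

    The clause identity is deliberately NOT part of what is signed.
    """
    return all(ToySig(pk, message) in sigs for pk in required_pubkeys)


TX = "spend-the-utxo"


def bob_can_redeem_clause_2(apk1, apk2, bpk2):
    """Alice signs only for clause 1; Bob tries to redeem clause 2 with it."""
    alice_sig = ToySig(apk1, TX)  # Alice authorises (C[1] && APK[1])
    # Bob supplies a signature for BPK[2]: his own, or a copy of Alice's
    # signature if he has set BPK[2] to one of her keys.
    bob_sig = ToySig(bpk2, TX)
    return clause_satisfied([apk2, bpk2], {alice_sig, bob_sig}, TX)


print(bob_can_redeem_clause_2("A1", "A2", "B2"))  # distinct keys, honest BPK[2]
print(bob_can_redeem_clause_2("A1", "A2", "A1"))  # Bob masquerades APK[1]
print(bob_can_redeem_clause_2("A", "A", "B2"))    # Alice reuses her key
```

Only the last call succeeds in this model: masquerading gains Bob nothing
because clause 2 still needs a signature from APK[2], whereas reuse lets
him transplant Alice's clause 1 signature.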

For various reasons, some historical, it is much harder to avoid pubkey
reuse for different UTXOs than it is to avoid pubkey reuse within a single
script.  We often use Bitcoin addresses in non-interactive ways, such as
putting them on flyers or painting them on walls and such.  Without a
standard for tweaking such pubkeys in a per-transaction way, we end up with
a lot of pubkey reuse between various UTXOs.  However, within a Script,
avoiding pubkey reuse ought to be easier.  Alice must communicate different
pubkeys intended for different clauses, or if Bob is constructing a whole
complex script on Alice's behalf, he may need to add CODESEPARATORs if
tweaking Alice's pubkey isn't an option.

The conversion of a policy to disjunctive normal form can involve an
exponential blowup (see <
https://en.wikipedia.org/wiki/Disjunctive_normal_form#Conversion_to_DNF>).
For instance, if Alice's policy (not in disjunctive normal form) is of the
form

    (C[1] || D[1]) && ... && (C[n] || D[n]) && APK

where C[i] and D[i] are all distinct hashlocks, we require O(2^n) clauses
to put it in disjunctive normal form.  If Alice wants to create signatures
that are restricted to a specific combination of C[i]'s and D[i]'s, she
needs to use an exponential number of pubkeys, which isn't tractable to do
in Script.  But neither my original proposal nor CODESEPARATOR helps in
this case either because CODESEPARATOR can mark only the last executed
position.  Taproot's MAST (Merklized Alternative Script Tree per aj's
suggestion), can maybe provide a tractable solution to this in cases where
it is applicable.  The MAST is always a disjunction and because the tapleaf
is signed, it is safe to reuse pubkeys between alternative branches.
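
The blowup is easy to see by expanding the policy mechanically.  The
following Python sketch (dnf_clauses is a name made up for this example,
and the clauses are plain tuples of labels) distributes the n binary
choices and counts the resulting clauses.

```python
from itertools import product


def dnf_clauses(n):
    """Expand (C[1] || D[1]) && ... && (C[n] || D[n]) && APK into DNF.

    Each clause picks exactly one of C[i] or D[i] for every i, then adds APK.
    """
    choices = [(f"C[{i}]", f"D[{i}]") for i in range(1, n + 1)]
    return [pick + ("APK",) for pick in product(*choices)]


for n in (1, 2, 3, 10):
    print(n, len(dnf_clauses(n)))
```

Each clause would need its own pubkey for Alice's signature to pin down
one specific combination, so the number of keys doubles with every
additional (C[i] || D[i]) term.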

This analysis suggests that we should amend CODESEPARATOR's behaviour to
update an accumulator (presumably a running hash value), so that all
executed CODESEPARATOR positions end up covered by the signature.  That
would provide a solution to the above problem for those cases where
taproot's MAST cannot be used.  I'm not sure if it is better to propose
such an amendment to CODESEPARATOR's behaviour now, or to propose to
soft-fork in such optional behaviour at a later time.
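
As a rough sketch of what such an accumulator could look like (the
encoding, the all-zero initial state, and the function names are all
assumptions for illustration, not a proposed specification):

```python
import hashlib

INITIAL_STATE = bytes(32)  # assumed starting value: 32 zero bytes


def update(acc: bytes, position: int) -> bytes:
    """Fold one executed CODESEPARATOR position into the running hash."""
    return hashlib.sha256(acc + position.to_bytes(4, "little")).digest()


def accumulate(executed_positions):
    """Running hash over every executed CODESEPARATOR position, in order."""
    acc = INITIAL_STATE
    for position in executed_positions:
        acc = update(acc, position)
    return acc


# Two execution paths that end at the same final CODESEPARATOR position.
path_a = [1, 5]
path_b = [2, 5]
print(path_a[-1] == path_b[-1])                  # last position alone matches
print(accumulate(path_a) == accumulate(path_b))  # accumulator tells them apart
```

Signing only the last position cannot distinguish the two paths, while
the running hash covers the whole sequence at a constant cost per
executed CODESEPARATOR.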

However, what I said above was even too simplified.  In general, a policy
is of the form

    (Exists w[1]. C[1](w[1]) && PK[1,1](w[1]) && ... && PK[1,m[1]](w[1])) ||
    ... || (Exists w[n]. C[n](w[n]) && PK[n,1](w[n]) && ... && PK[n,m[n]](w[n]))

where each term could possibly be parameterized by some witness value
(though at the moment there isn't enough functionality in Script to
parameterize the pubkeys in any reasonable way and it maybe isn't even
possible to parameterise the conditions in any reasonable way).  In
general, you might want your signature to cover (some function of) this
witness value.  This suggests that we would actually want a CODESEPARATOR
variant that pushes a stack item into the accumulator that gets covered by
the signature rather than pushing the CODESEPARATOR position.  Though at
this point the name CODESEPARATOR is probably not suitable, even if it
subsumes the functionality.  Again, I'm not sure if it is better to propose
such a replacement for CODESEPARATOR's behaviour now, or to propose to
soft-fork in such optional behaviour at a later time.

* Re: [bitcoin-dev] Signing CHECKSIG position in Tapscript
  2019-12-05 20:24       ` Russell O'Connor
@ 2019-12-06  4:51         ` Anthony Towns
  0 siblings, 0 replies; 6+ messages in thread
From: Anthony Towns @ 2019-12-06  4:51 UTC (permalink / raw)
  To: Russell O'Connor; +Cc: Bitcoin Protocol Discussion

On Thu, Dec 05, 2019 at 03:24:46PM -0500, Russell O'Connor wrote:

Thanks for the careful write up! That matches what I was thinking.

> This analysis suggests that we should amend CODESEPARATOR's behaviour to update
> an accumulator (presumably a running hash value), so that all executed
> CODESEPARATOR positions end up covered by the signature.

On IRC, gmaxwell suggests "OP_BREADCRUMB" as a name for (something like)
this functionality.

(I think it's a barely plausible stretch to use the name "CODESEPARATOR"
for marking a position in the script -- that separates what was before
and after, at least; anything more general seems like it warrants a
better name though)

> That would provide a
> solution to the above problem for those cases where taproot's MAST cannot be
> used.  I'm not sure if it is better to propose such an amendment to
> CODESEPARATOR's behaviour now, or to propose to soft-fork in such optional
> behaviour at a later time.
> However, what I said above was even too simplified.  

FWIW, I think it's too soon to propose this because (a) it's not clear
there's a practical need for it, (b) it's not clear the functionality is
quite right (opcode vs more automatic sighash flag?), and (c) as you say,
it's not clear it's powerful enough.

> In general, a policy is of the form
>     (Exists w[1]. C[1](w[1]) && PK[1,1](w[1]) && ... && PK[1,m[1]](w[1])) || ...
> || (Exists w[n]. C[n](w[n]) && PK[n,1](w[n]) && ... && PK[n,m[n]](w[n]))
> where each term could possibly be parameterized by some witness value (though
> at the moment there isn't enough functionality in Script to parameterize the
> pubkeys in any reasonable way and it maybe isn't even possible to parameterise
> the conditions in any reasonable way).  In general, you might want your
> signature to cover (some function of) this witness value.  This suggests that
> we would actually want a CODESEPARATOR variant that pushes a stack item into
> the accumulator that gets covered by the signature rather than pushing the
> CODESEPARATOR position.  Though at this point the name CODESEPARATOR is
> probably not suitable, even if it subsumes the functionality.

> Again, I'm not
> sure if it is better to propose such a replacement for CODESEPARATOR's
> behaviour now, or to propose to soft-fork in such optional behaviour at a later
> time.

Last bit first, it seems pretty clear to me that this is too novel an
idea to propose it immediately -- we should explore the problem space
more first to see what's the best way of doing it before coding it into
consensus. And (guessing) I think the tapscript upgrade methods should
be fine for handling this later.

I think the annex is also not general enough for what you're thinking
here, in that it wouldn't allow for one signature to constrain the witness
data more than some other signature -- so you'd need to determine all
the constraints for all signatures to finish filling out the annex,
and could only then start signing.

I think you could conceivably do any/all of:

 * commit to a hash of all the witness data that hasn't been popped off
   the stack ("suffix" commitment -- the data will be used by later script
   opcodes)
 * commit to a hash of all the witness data that has been popped off the
   stack ("prefix" commitment -- this is the data that's been used by
   earlier script opcodes)
 * commit to the hash of the current stack

That would be expensive, but still doable as O(1) per opcode / stack
element. I think any other masking would mean you'd have potentially
O(size of witness data) or O(size of stack) runtime per signature which
I think would be unacceptable...
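
A minimal Python sketch of why the prefix and suffix commitments stay
cheap. It assumes (purely for illustration) that witness items are
consumed in list order, that each item is hashed before being chained so
item boundaries are unambiguous, and that 32 zero bytes represent the
empty commitment; the function names are invented here.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def prefix_commitments(witness):
    """prefix[k] commits to the first k items (those already consumed)."""
    prefix = [bytes(32)]
    for item in witness:
        prefix.append(_h(prefix[-1] + _h(item)))  # one hash step per item
    return prefix


def suffix_commitments(witness):
    """suffix[k] commits to witness[k:] (items not yet consumed).

    Built once in a single backwards pass, then looked up per signature.
    """
    suffix = [bytes(32)] * (len(witness) + 1)
    for k in range(len(witness) - 1, -1, -1):
        suffix[k] = _h(_h(witness[k]) + suffix[k + 1])
    return suffix


w1 = [b"a", b"b", b"c"]
w2 = [b"a", b"b", b"X"]  # differs only in the last, not yet consumed, item
k = 2                    # two items consumed so far
print(prefix_commitments(w1)[k] == prefix_commitments(w2)[k])
print(suffix_commitments(w1)[k] == suffix_commitments(w2)[k])
```

After two items have been consumed, a signature committing to the prefix
is indifferent to the change in the third item, while one committing to
the suffix is not. Either value is available with a bounded amount of
hashing per element rather than a rescan of the whole witness for every
signature.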

I guess a general implementation to at least think about the possibilities
might be an "OP_DATACOMMIT" opcode that pops an element from the stack,
does hash_"DataCommit"(element), and then any later signatures commit
to that value (maybe with OP_0 OP_DATACOMMIT allowing you to get back to
the default state). You'd either need to write your script carefully to
commit to witness data you're using elsewhere, or have some other new
opcodes to do that more conveniently...

CODESEP at position "x" in the script is equivalent to "<x> DATACOMMIT"
here, I think. "BREADCRUMB .. BREADCRUMB" could be something like:

   OP_0 TOALT [at start of script]
   ..
   FROMALT x CAT SHA256 DUP TOALT DATACOMMIT   
   ..
   FROMALT y CAT SHA256 DUP TOALT DATACOMMIT   

if the altstack was otherwise unused, I guess; so the accumulator
behaviour probably warrants something better.
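
Here's a toy Python model of that, just to check the idea hangs together.
Nothing here is a real opcode or a real interpreter: ToyInterpreter and
its methods are invented, hash_"DataCommit" is assumed to be a
BIP340-style tagged hash, and treating an empty element as "back to the
default state" is one possible reading of the OP_0 OP_DATACOMMIT idea.

```python
import hashlib


def tagged_hash(tag: str, msg: bytes) -> bytes:
    """BIP340-style tagged hash: SHA256(SHA256(tag) || SHA256(tag) || msg)."""
    tag_digest = hashlib.sha256(tag.encode()).digest()
    return hashlib.sha256(tag_digest + tag_digest + msg).digest()


class ToyInterpreter:
    def __init__(self):
        self.altstack = [b""]   # "OP_0 TOALT" at the start of the script
        self.commitment = None  # what later signatures would commit to

    def datacommit(self, element: bytes) -> None:
        """Hypothetical OP_DATACOMMIT: pop an element and commit to it."""
        if element == b"":
            self.commitment = None  # assumed reset to the default state
        else:
            self.commitment = tagged_hash("DataCommit", element)

    def breadcrumb(self, marker: bytes) -> None:
        """FROMALT <marker> CAT SHA256 DUP TOALT DATACOMMIT"""
        acc = self.altstack.pop()                    # FROMALT
        acc = hashlib.sha256(acc + marker).digest()  # <marker> CAT SHA256
        self.altstack.append(acc)                    # DUP TOALT
        self.datacommit(acc)                         # DATACOMMIT


first, second = ToyInterpreter(), ToyInterpreter()
first.breadcrumb(b"x")
first.breadcrumb(b"y")
second.breadcrumb(b"z")
second.breadcrumb(b"y")
print(first.commitment == second.commitment)
```

Both runs finish on the same marker, but the commitments differ because
each breadcrumb folds in the accumulator left on the altstack by the
previous one.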

It also more or less gives you CHECKSIGFROMSTACK behaviour by doing
"SWAP OP_DATACOMMIT OP_CHECKSIG" and a SIGHASH_NONE|ANYPREVOUTANYSCRIPT
signature.

But that seems like a plausible generalisation to think about?

Cheers,
aj

end of thread, other threads:[~2019-12-06  4:52 UTC | newest]

Thread overview: 6+ messages
2019-11-27 21:29 [bitcoin-dev] Signing CHECKSIG position in Tapscript Russell O'Connor
2019-11-28  8:06 ` Anthony Towns
2019-12-01 16:09   ` Russell O'Connor
2019-12-03  8:35     ` Anthony Towns
2019-12-05 20:24       ` Russell O'Connor
2019-12-06  4:51         ` Anthony Towns
