On Thu, May 27, 2021 at 04:14:13PM -0400, Antoine Riard via bitcoin-dev wrote:
> This overhead could be smoothed even further in the future with more advanced
> sighash malleability flags like SIGHASH_IOMAP, allowing transaction signers to
> commit to a map of inputs/outputs [2]. In the context of input-based, the
> overflowed fee value could be redirected to an outgoing output.

> Input-based (SIGHASH_ANYPREVOUT+SIGHASH_IOMAP): Multiple chains of transactions
> might be aggregated together *non-interactively*. One bumping input and
> outgoing output can be attached to the aggregated root.

> [2] https://bitcointalk.org/index.php?topic=252960.0

> I haven't seen any recent specs for "IOMAP", but there are a few things
> that have bugged me about them in the past:

TBH, I don't think Darosior and I have gone further than comparing the compression schemes relevant for the bitfield :)

Thanks for starting the hard grinding work!

>  (1) allowing partially overlapping sets of outputs could allow "theft",
>      eg if I give you a signature "you can spend A+B as long as I get X"
>      and "you can spend A+C as long as I get X", you could combine them
>      to spend A+B+C instead but still only give me 1 X.

Yes, I think there is an even more unsafe case than the one described. A third party knowledgeable about the partial sets could combine them in a single transaction, then attach an additional siphoning output Y. E.g., if {A=50, B=50, C=50} and X=100, the third party could attach an output Y=50?
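
To make the arithmetic concrete (a toy sketch, values as in the example above, fees ignored):

    # The signer authorized "A+B as long as I get X" and "A+C as long as I get X".
    inputs = {"A": 50, "B": 50, "C": 50}
    X = 100                                  # the single output the signer expects
    combined_value = sum(inputs.values())    # 150 if a combiner spends A, B and C together
    Y = combined_value - X                   # 50 left over for the combiner's siphoning output
    assert Y == 50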

Though I believe the validity of those thefts is a function of the further specification of the transaction digest coverage, as you might have a malleability scheme where B's and C's signature hashes implicitly commit to the input subset and its order. If you have `H_prevouts(A || B)` and `H_prevouts(A || C)`, an attacker wouldn't be able to satisfy both B's and C's scripts in the same transaction?
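
Very roughly, what I have in mind (the digest layout here is invented, purely to illustrate the point):

    import hashlib

    def H_prevouts(prevouts):
        # Illustrative only: hash over the exact ordered list of (txid, vout) this signature covers.
        h = hashlib.sha256()
        for txid, vout in prevouts:
            h.update(bytes.fromhex(txid)[::-1] + vout.to_bytes(4, "little"))
        return h.digest()

    A, B, C = ("aa" * 32, 0), ("bb" * 32, 1), ("cc" * 32, 0)

    # B's signature commits to H_prevouts(A || B), C's to H_prevouts(A || C). Neither matches
    # H_prevouts(A || B || C), so under a rule where the committed hash must match the spending
    # transaction's actual prevout list, both signatures can't be satisfied in one transaction.
    assert H_prevouts([A, B]) != H_prevouts([A, B, C])
    assert H_prevouts([A, C]) != H_prevouts([A, B, C])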

One mitigation mentioned in previous pinning discussions was to add a per-participant finalizing key to A's script and thus lock down the transaction template at broadcast. I don't think it works here, as you can't assume that your counterparties, from different protocol sessions, won't collude and combine their finalizing signatures to achieve a spend replay across sessions?

That said, I'm not even sure we should disallow partially overlapping sets of outputs at the consensus level; one could imagine a crowdfunding application where you delegate A+B and A+C to different parties and implicitly allow them to cooperate, as long as they fulfill X's output value?

>  (2) a range specification or a whole bitfield is a lot heavier than an
>      extra bit to add to the sighash

Yes, one quick optimization when far-away outputs are committed in the bitfield could be to have a few initial bits serving as vectors to blank out the unused bitfield space. Though I concede that new sighash bit arithmetic might be too fancy for consensus code.
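
E.g., one possible (entirely made-up) encoding: split the output bitfield into fixed-size chunks and prepend a small presence bitmap saying which chunks are non-empty, so the blank space before a far-away output costs one bit per chunk instead of one bit per output:

    CHUNK = 8  # bits per chunk, arbitrary for illustration

    def compress(bitfield):
        # Keep only the non-empty chunks; the leading presence bitmap records which ones were kept.
        chunks = [bitfield[i:i + CHUNK] for i in range(0, len(bitfield), CHUNK)]
        presence = [1 if any(c) else 0 for c in chunks]
        kept = [c for c in chunks if any(c)]
        return presence, kept

    # Selecting only outputs 62 and 63 out of 64 compresses to an 8-bit presence map
    # plus one 8-bit chunk, instead of a 64-bit bitfield.
    presence, kept = compress([0] * 62 + [1, 1])
    assert presence == [0, 0, 0, 0, 0, 0, 0, 1] and len(kept) == 1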


>  (3) this lets you specify lots of different ways of hashing the
>      outputs, which then can't be cached, so you get kind-of quadratic
>      behaviour -- O(n^2/8) where n/2 is the size of the inputs, which
>      gives you the number of signatures, and n/2 is also the size of the
>      outputs, so n/4 is a different half of the output selected for each
>      signature in the input.

If you assume n is the size of the transaction data, and that each signature hash commits to the inputs plus half of the outputs, yes I think it's an even worse kind-of quadratic, like O(3n^2/8) (n/2 signatures, each hashing n/2 bytes of inputs plus n/4 bytes of outputs)? And you might worsen the hashing even further depending on the flexibility allowed, e.g. still committing to the whole transaction size but with a different combination order of outputs selected for each signature.

But under the "don't bring me problems, bring me solutions" banner, here's an idea.

> The easy way to avoid O(n^2) behaviour in (3) is to disallow partial
> overlaps. So let's treat the tx as being distinct bundles of x-inputs
> and y-outputs, and we'll use the annex for grouping, since that is
> committed to by signatures. Call the annex field "sig_group_count".

> When processing inputs, setup a new state pair, (start, end), initially
> (0,0).
>
> When evaluating an input, lookup sig_group_count. If it's not present,
> then set start := end. If it's present and 0, leave start and end
> unchanged. Otherwise, if it's present and greater than 0, set
> start := end, and then set end := start + sig_group_count.

IIUC the design rationale, the "sig_group_count" locks down the hashing of outputs for a given input, thus allowing midstate reuse across input signatures.
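
To check I'm parsing the rules right, a rough python-style sketch of the grouping (annex parsing hand-waved):

    def sig_group_ranges(annex_counts, num_outputs):
        # annex_counts[i] is input i's "sig_group_count" annex field, or None if absent.
        # Returns the (start, end) output range a SIGHASH_GROUP signature on that input
        # commits to, plus whether such a signature would be valid.
        start, end = 0, 0
        ranges = []
        for count in annex_counts:
            if count is None:        # not present: start := end
                start = end
            elif count == 0:         # present and 0: stay in the previous group
                pass
            else:                    # present and > 0: open a new group
                start = end
                end = start + count
            valid = start != end and end <= num_outputs
            ranges.append((start, end, valid))
        return ranges

    # An x=3/y=3 bundle followed by an x=2/y=2 bundle: the first input of each bundle
    # commits to "y" in the annex, the remaining x-1 inputs commit to 0.
    assert sig_group_ranges([3, 0, 0, 2, 0], num_outputs=5) == [
        (0, 3, True), (0, 3, True), (0, 3, True), (3, 5, True), (3, 5, True)]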

> Introduce a new SIGHASH_GROUP flag, as an alternative to ALL/SINGLE/NONE,
> that commits to each output i, start <= i < end. If start==end or end >
> num_outputs, signature is invalid.
>
> That means each output in a tx could be hashed three times instead of
> twice (once for its particular group, as well as once for SIGHASH_ALL
> and once for SIGHASH_SINGLE), and I think would let you combine x-input
> and y-outputs fairly safely, by having the first input commit to "y"
> in the annex, and the remaining x-1 commit to "0".
>
> That does mean if you have two different sets of inputs (x1 and x2)
> each spending to the exact same set of y outputs, you could claim all
> but one of them while only paying a single set of y outputs. But you
> could include an "OP_RETURN hash(x1)" tapleaf branch in one of the y
> outputs to ensure the outputs aren't precisely the same to avoid that
> problem, so maybe that's fine?

If the index i is absolute w.r.t. the transaction output index, I think this design might have a shortcoming.

Let's say you want to combine {x_1, y_1} and {x_2, y_2} where {x, y} denotes bundles of Lightning commitment transactions.

x_1 is dual-signed by Alice and Bob under the SIGHASH_GROUP flag with `sig_group_count`=3.
x_2 is dual-signed by Alice and Caroll under the SIGHASH_GROUP flag, with `sig_group_count`=2.
y_1 and y_2 are disjoint.

At broadcast, Alice is not able to combine {x_1, y_1} and {x_2, y_2}, for the reason that x_1 and x_2 collide on the absolute output positions.

One fix could be to relax the "end > num_outputs" semantic, and thus have Alice negotiate (start, end) ranges encompassing all her channels' output indexes, then strictly order her i indexes across all her counterparties. But I don't think you can assume index order to be transitive across Lightning nodes, at least not without bundle combination gaps in your local set of channels.
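
I.e., assuming each pre-signed bundle committed to absolute output positions starting at index 0 when it was signed in isolation (hypothetical layout, just to illustrate the collision):

    # x_1 was signed with sig_group_count=3 covering y_1, x_2 with sig_group_count=2 covering y_2,
    # each as the first (and only) group of its own transaction.
    x1_range = (0, 3)   # Alice/Bob bundle
    x2_range = (0, 2)   # Alice/Caroll bundle

    def overlaps(a, b):
        return a[0] < b[1] and b[0] < a[1]

    # In a combined transaction the two groups would need disjoint output positions,
    # but both signatures committed to a range starting at output 0.
    assert overlaps(x1_range, x2_range)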

I think this SIGHASH_GROUP proposal might solve other use-cases, but if I understand the semantics correctly, it doesn't seem to achieve the batch fee-bumping of multiple Lightning commitments with O(1) onchain footprint that I was thinking of for IOMAP...

One counter-proposal to solve this "pre-signed y-outputs ordinate" issue could be instead to envision SIGHASH_GROUP as vector coordinates and the annex field as the projection.

Let's say the annex field := (hashOutputsGroup) and SIGHASH_GROUP(i,j), where j is a non-null integer.
Call i the starting index of the output group committed to by this input.
Call j the size of the output group committed to by this input.

At validation, compute `hashOutputsGroup` = H(outputs[i...i+j-1]).
If the computed `hashOutputsGroup` isn't equal to the input's annex field `hashOutputsGroup`, fail validation.
Otherwise, substitute `hashOutputsGroup` for the bip-143 `hashOutputs` while conserving the rest of the transaction digest, and proceed to signature verification.
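
A rough sketch of that validation flow (serialization and digest details hand-waved, names are mine):

    import hashlib

    def sha256d(data):
        return hashlib.sha256(hashlib.sha256(data).digest()).digest()

    def output_group_hash(outputs, annex_hash_outputs_group, i, j):
        # outputs: list of serialized outputs; (i, j) come from the witness and are chosen
        # by the combiner at broadcast; the signature only commits to the annex hashOutputsGroup.
        if j < 1 or i < 0 or i + j > len(outputs):
            return None                                   # malformed coordinates
        computed = sha256d(b"".join(outputs[i:i + j]))    # H(outputs[i...i+j-1])
        if computed != annex_hash_outputs_group:
            return None                                   # combiner pointed at the wrong group
        # Substitute this for the bip-143 hashOutputs field, conserve the rest of the
        # digest, and proceed to signature verification.
        return computed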

As (i,j) are not included in the annex and are only part of the witness data, they can be selected by the bundle combiner at broadcast to construct a valid transaction.

If the combiner is malicious and (i,j) points to another output group, the computed hash won't match the annex `hashOutputsGroup` field, and validation fails.

If you want to disallow partial overlaps for your bundle, we could even have a bit k. If k=1, verify that the output groups committed to across the transaction's inputs do not overlap.

Hmmm, sounds more flexible, but you might still have a bit of hashing complexity to deal with?

> Okay, now that I've written and re-written that a couple of times,
> it looks like I'm just reinventing Rusty's signature bundles from 2018:
>
> https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2018-April/015862.html
>
> (though at least I think using the annex is probably an improvement on
> having values that affect other inputs being buried deeper in an input's
> witness data)
>
> Without something like this, I think it will be very hard to incorporate
> fees into eltoo with layered commitments [0]. As a new sighash mode it
> would make sense to include it as part of ANYPREVOUT to avoid introducing
> many new "unknown key types".
>
> [0] https://lists.linuxfoundation.org/pipermail/lightning-dev/2020-January/002448.html
>     also, https://www.erisian.com.au/lightning-dev/log-2021-07-08.html

Well, I agree with the overall direction, though maybe we should detail the primitive's requirements a bit more; otherwise it might not fit the second-layer use-cases we're interested in.

Cheers,
Antoine
