public inbox for bitcoindev@googlegroups.com
 help / color / mirror / Atom feed
* [bitcoin-dev] BIP 174 thoughts
@ 2018-06-15 23:34 Pieter Wuille
  2018-06-16 15:00 ` Peter D. Gray
                   ` (4 more replies)
  0 siblings, 5 replies; 54+ messages in thread
From: Pieter Wuille @ 2018-06-15 23:34 UTC (permalink / raw)
  To: Bitcoin Dev

Hello all,

given some recent work and discussions around BIP 174 (Partially
Signed Bitcoin Transaction Format) I'd like to bring up a few ideas.

First of all, it's unclear to me to what extent projects have already
worked on implementations, and thus to what extent the specification
is still subject to change. A response of "this is way too late" is
perfectly fine.

So here goes:

* Key-value map model or set model.

This was suggested in this thread:
https://twitter.com/matejcik/status/1002618633472892929

The motivation behind using a key-value model rather than a simple
list of records was that PSBTs can be duplicated (given to multiple
people for signing, for example), and merged back together after
signing. With a generic key-value model, any implementation can remove
the duplication even if they don't understand fields that may have
been added in future extensions.

However, almost the same can be accomplished by using the simpler set
model (the file consists of a set of records, with no duplication
allowed). This would mean that it would technically be legal to have
two partial signatures with the same key for the same input, if a
non-deterministic signer is used.

On the other hand, this means that certain data currently encoded
inside keys can be dropped, reducing the PSBT size. This is
particularly true for redeemscripts and witnessscripts, as they can
just be computed by the client when deserializing. The two types could
even be merged into just "scripts" records - as they don't need to be
separated based on the way they're looked up (Hash160 for P2SH, SHA256
for P2WSH). The same could be done for the BIP32 derivation paths,
though this may be expensive, as the client would need to derive all
keys before being able to figure out which one(s) it needs.

One exception is the "transaction" record, which needs to be unique.
That can either be done by adding an exception ("there can only be one
transaction record"), or by encoding it separately outside the normal
records (that may also be useful to make it clear that it is always
required).

* Ability for Combiners to verify two PSBT are for the same transaction

Clearly two PSBTs for incompatible transactions cannot be combined,
and this should not be allowed.

It may be easier to enforce this if the "transaction" record inside a
PSBT was required to be in a canonical form, meaning with empty
scriptSigs and witnesses. In order to do so, there could be per-input
records for "finalized scriptSig" and "finalized witness". Actually
placing those inside the transaction itself would only be allowed when
all inputs are finalized.

* Optional signing

I think all operation for the Signer responsibility should be
optional. This will inevitably lead to incompatibilities, but with the
intent of being forward compatible with future developments, I don't
think it is possible to require every implementation to support the
same set of scripts or contracts. For example, some signers may only
implement single-key P2PKH, or may only support signing SegWit inputs.
It's the user's responsibility to find compatible signers (which is
generally not an issue, as the different participants in a setup
necessarily have this figured out before being able to create an
address). This does mean that there can't be an unconditional test
vector that specifies the produced signature in certain circumstances,
but there could be "For implementations that choose to implement
signing for P2PKH inputs using RFC6979, the expected output given
input X and access to key Y is Z".

On the other hand, the Combiner and Finalizer roles can probably be
specified much more accurately than they are now.

* Derivation from xpub or fingerprint

For BIP32 derivation paths, the spec currently only encodes the 32-bit
fingerprint of the parent or master xpub. When the Signer only has a
single xprv from which everything is derived, this is obviously
sufficient. When there are many xprv, or when they're not available
indexed by fingerprint, this may be less convenient for the signer.
Furthermore, it violates the "PSBT contains all information necessary
for signing, excluding private keys" idea - at least if we don't treat
the chaincode as part of the private key.

For that reason I would suggest that the derivation paths include the
full public key and chaincode of the parent or master things are
derived from. This does mean that the Creator needs to know the full
xpub which things are derived from, rather than just its fingerprint.

* Generic key offset derivation

Whenever a BIP32 derivation path does not include any hardened steps,
the entirety of the derivation can be conveyed as "The private key for
P is equal to the private key for Q plus x", with P and Q points and x
a scalar. This representation is more flexible (it also supports
pay-to-contract derivations), more efficient, and more compact. The
downside is that it requires the Signer to support such derivation,
which I don't believe any current hardware devices do.

Would it make sense to add this as an additional derivation method?

* Hex encoding?

This is a very minor thing. But presumably there will be some standard
way to store PSBTs as text for copy-pasting - either prescribed by the
spec, or de facto. These structures may become pretty large, so
perhaps it's worth choosing something more compact than hexadecimal -
for example Base64 or even Z85 (https://rfc.zeromq.org/spec:32/Z85/).

Cheers,

-- 
Pieter


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-15 23:34 [bitcoin-dev] BIP 174 thoughts Pieter Wuille
@ 2018-06-16 15:00 ` Peter D. Gray
  2018-06-19  9:38 ` Jonas Schnelli
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 54+ messages in thread
From: Peter D. Gray @ 2018-06-16 15:00 UTC (permalink / raw)
  To: Pieter Wuille; +Cc: Bitcoin Dev

[-- Attachment #1: Type: text/plain, Size: 931 bytes --]

On Fri, Jun 15, 2018 at 04:34:40PM -0700, Pieter Wuille wrote:
...
> First of all, it's unclear to me to what extent projects have already
> worked on implementations, and thus to what extent the specification
> is still subject to change. A response of "this is way too late" is
> perfectly fine.
...

The new Coldcard hardware wallet is based on PSBT (ie. BIP 174 as
published), and we consider it "PSBT Native". It can add signatures
to PSBT files delivered on MicroSD card and/or over USB, and is
able to finalize PSBT files for lots of simple cases. It already
works well against the existing BIP174 pull request.

I think the BIP174 spec is reasonable as it is, and should only be
changed in a forwards-compatible way from this point... but obviously
I'm biased.

As for your specific comments, I don't have strong feelings really.

---
Peter D. Gray  ||  Founder, Coinkite  ||  Twitter: @dochex  ||  GPG: A3A31BAD 5A2A5B10


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 496 bytes --]

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-15 23:34 [bitcoin-dev] BIP 174 thoughts Pieter Wuille
  2018-06-16 15:00 ` Peter D. Gray
@ 2018-06-19  9:38 ` Jonas Schnelli
  2018-06-19 14:20 ` matejcik
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 54+ messages in thread
From: Jonas Schnelli @ 2018-06-19  9:38 UTC (permalink / raw)
  To: Pieter Wuille, Bitcoin Protocol Discussion

[-- Attachment #1: Type: text/plain, Size: 2164 bytes --]

> * Key-value map model or set model.
> * Ability for Combiners to verify two PSBT are for the same transaction
> * Optional signing
> * Derivation from xpub or fingerprint
> * Generic key offset derivation
> * Hex encoding?

I think all of Pieters points are valid and reasonable thought, though I’m unsure if it would be worth changing the existing-implementation-breaking things like the k/v set model.
AFAIK things like non-hex-encoding or generic key offset derivation are extensions and would not break existing implementations.

Further thoughts on BIP174 from my side.

Key derivation in multisig:
From my understanding, the signers and the creator must have agreed – in advance to the PSBT use case – on a key derivation scheme.
BIP32 derivation is assumed, but may not always be the case.
Sharing xpubs (the chaincode) may be a concern in non-trust-relationships between signer(s) and the creator (regarding Pieters xpub/fingerprint concerns).
Providing the type 0x03, the bip32 derivation path is one form of a support to faster (or computational possible) derivation of the required keys for signing a particular input.
From my point of view, it is a support of additional metadata shared between creator and signer and provided from the creator to the signer for faster (or computation possible) key deviation.

I think it could be more flexible (generic) in BIP174.
It could be just a single child key {32-bit int}, or just a keypath ({32-bit int}]{32-bit int}…) which is very likely sufficient for a HWW to derive the relevant key without the creation of a lookup-window or other „maps".
It could even be an enciphered payload which was shared during address/redeem-script generation and „loops“ back during a signing request.

Maybe I’m overcomplicating things, but for practical multisig with HWWs, a simple BIP32-child-key-index or BIP32-keypath derivation support field should be sufficient.
A generic „derivation support field“, provided from the signer to the creator during address-generation that just „loops“ back during the PSBT use-cases is probably a overkill.


Thanks
—
/jonas


[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-15 23:34 [bitcoin-dev] BIP 174 thoughts Pieter Wuille
  2018-06-16 15:00 ` Peter D. Gray
  2018-06-19  9:38 ` Jonas Schnelli
@ 2018-06-19 14:20 ` matejcik
  2018-06-19 15:20   ` Jonas Schnelli
  2018-06-19 17:16   ` Pieter Wuille
  2018-06-19 14:22 ` matejcik
  2018-06-21  0:39 ` Achow101
  4 siblings, 2 replies; 54+ messages in thread
From: matejcik @ 2018-06-19 14:20 UTC (permalink / raw)
  To: bitcoin-dev


[-- Attachment #1.1: Type: text/plain, Size: 3783 bytes --]

Hello,

First of all, we'd like to apologize for such a late feedback, since
there is a PR for this already. We've come up with a few more notes on
this, so we are introducing those in this message and replying on
Pieter's points in another one.


1) Why isn't the global type 0x03 (BIP-32 path) per-input? How do we
know, which BIP-32 path goes to which input? The only idea that comes to
my mind is that we should match the input's scriptPubKey's pubkey to
this 0x03's key (the public key).

If our understanding is correct, the BIP-32 path is global to save space
in case two inputs share the same BIP-32 path? How often does that
happen? And in case it does, doesn't it mean an address reuse which is
discouraged?

Also, we believe that if the public key is to be used as "spent to by an
output" it should be in an output section. If the public key is to be
used to sign an input, it should be in the input section. Again, how
often are those the same? We understand creating another section might
be cumbersome, but it'd significantly increase clarity to have global,
input and output section.

Alternately, we could keep “spend to” public keys in the global section,
and put the input public keys to the per-input sections. This is less
clear, but doesn’t introduce another section. A question to consider is,
will there be more per-output data? If yes, it might make sense to have
an output section.


2) The global items 0x01 (redeem script) and 0x02 (witness script) are
somewhat confusing. Let's consider only the redeem script (0x01) to make
it simple. The value description says: "A redeem script that will be
needed to sign a Pay-To-Script-Hash input or is spent to by an output.".
Does this mean that the record includes both input's redeem script
(because we need to sign it), but also a redeem script for the output
(to verify we are sending to a correct P2SH)? To mix those two seems
really confusing.

Yet again, adding a new output section would make this more readable. We
would include the input’s redeem script in the input section and the
output’s redeem script again in the output section, because they’ll most
likely differ anyway.

The rationale says that the scripts are global to avoid duplication.
However, how often is this the case? The scripts include a hash of some
OP codes and the recipient's public key for example. So a) how often are
two scripts equal to justify this? b) if they're the same, doesn't it
yet again signalize address reuse?


3) The sighash type 0x03 says the sighash is only a recommendation. That
seems rather ambiguous. If the field is specified shouldn't it be binding?


4) Is it a good idea to skip records which types we are unaware of? We
can't come up with a reasonable example, but intuitively this seems as a
potential security issue. We think we should consider  introducing a
flag, which would define if the record is "optional". In case the signer
encounters a record it doesn't recognize and such flag is not set, it
aborts the procedure. If we assume the set model we could change the
structure to <type><optional flag><length>{data}. We are not keen on
this, but we wanted to include this idea to see what you think.

-----------

In general, the standard is trying to be very space-conservative,
however is that really necessary? We would argue for clarity and ease of
use over space constraints. We think more straightforward approach is
desired, although more space demanding. What are the arguments to make
this as small as possible? If we understand correctly, this format is
not intended for blockchain nor for persistent storage, so size doesn’t
matter nearly as much.


Thank you,

Tomas Susanka
Jan Matejek


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-15 23:34 [bitcoin-dev] BIP 174 thoughts Pieter Wuille
                   ` (2 preceding siblings ...)
  2018-06-19 14:20 ` matejcik
@ 2018-06-19 14:22 ` matejcik
  2018-06-21  0:39 ` Achow101
  4 siblings, 0 replies; 54+ messages in thread
From: matejcik @ 2018-06-19 14:22 UTC (permalink / raw)
  To: bitcoin-dev


[-- Attachment #1.1: Type: text/plain, Size: 4937 bytes --]

hello,
this is our second e-mail with replies to Pieter's suggestions.

On 16.6.2018 01:34, pieter.wuille at gmail.com (Pieter Wuille) wrote:
> * Key-value map model or set model.
> 
> This was suggested in this thread:
> https://twitter.com/matejcik/status/1002618633472892929
> 
> The motivation behind using a key-value model rather than a simple
> list of records was that PSBTs can be duplicated (given to multiple
> people for signing, for example), and merged back together after
> signing. With a generic key-value model, any implementation can remove
> the duplication even if they don't understand fields that may have
> been added in future extensions.
> 
> However, almost the same can be accomplished by using the simpler set
> model (the file consists of a set of records, with no duplication
> allowed). This would mean that it would technically be legal to have
> two partial signatures with the same key for the same input, if a
> non-deterministic signer is used.

Strongly agree with this.

Just to note, we should probably use varint for the <type> field - this
allows us, e.g., to create “namespaces” for future extensions by using
one byte as namespace identifier and one as field identifier.

> 
> On the other hand, this means that certain data currently encoded
> inside keys can be dropped, reducing the PSBT size. This is
> particularly true for redeemscripts and witnessscripts, as they can
> just be computed by the client when deserializing. The two types could
> even be merged into just "scripts" records - as they don't need to be
> separated based on the way they're looked up (Hash160 for P2SH, SHA256
> for P2WSH). The same could be done for the BIP32 derivation paths,
> though this may be expensive, as the client would need to derive all
> keys before being able to figure out which one(s) it needs.

It could be nice if the output scripts records would be ordered the same
as their corresponding outputs. But what if the Creator doesn’t want to
include a script for an output? Perhaps the Script record should have a
<vout> field to match it to the appropriate output.

As for input scripts, we suggest that they are per-input and not
included in the global record, see the other thread.

> 
> One exception is the "transaction" record, which needs to be unique.
> That can either be done by adding an exception ("there can only be one
> transaction record"), or by encoding it separately outside the normal
> records (that may also be useful to make it clear that it is always
> required).

This seems to be the case for some fields already - i.e., an input field
must have exactly one of Non-witness UTXO or Witness Output. So “adding
an exception” is probably just a matter of language?


We’d also like to note that the “number of inputs” field should be
mandatory - and as such, possibly also a candidate for outside-record field.

> 
> * Ability for Combiners to verify two PSBT are for the same transaction
> 
> Clearly two PSBTs for incompatible transactions cannot be combined,
> and this should not be allowed.
> 
> It may be easier to enforce this if the "transaction" record inside a
> PSBT was required to be in a canonical form, meaning with empty
> scriptSigs and witnesses. In order to do so, there could be per-input
> records for "finalized scriptSig" and "finalized witness". Actually
> placing those inside the transaction itself would only be allowed when
> all inputs are finalized.

Agreed! Also increases clarity, which is desired.

> * Derivation from xpub or fingerprint
> 
> For BIP32 derivation paths, the spec currently only encodes the 32-bit
> fingerprint of the parent or master xpub. When the Signer only has a
> single xprv from which everything is derived, this is obviously
> sufficient. When there are many xprv, or when they're not available
> indexed by fingerprint, this may be less convenient for the signer.
> Furthermore, it violates the "PSBT contains all information necessary
> for signing, excluding private keys" idea - at least if we don't treat
> the chaincode as part of the private key.
> 
> For that reason I would suggest that the derivation paths include the
> full public key and chaincode of the parent or master things are
> derived from. This does mean that the Creator needs to know the full
> xpub which things are derived from, rather than just its fingerprint.


We don’t understand the rationale for this idea. Do you see a scenario
where an index on master fingerprint is not available but index by xpubs
is? In our envisioned use cases at least, indexing private keys by xpubs
(as opposed to deriving from a BIP32 path) makes no sense.

Maybe this folds into the proposal for generic derivation below, or
something like implementation-specific derivation methods?

best regards

Jan Matejek
Tomas Susanka


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-19 14:20 ` matejcik
@ 2018-06-19 15:20   ` Jonas Schnelli
  2018-06-21 20:28     ` Peter D. Gray
  2018-06-19 17:16   ` Pieter Wuille
  1 sibling, 1 reply; 54+ messages in thread
From: Jonas Schnelli @ 2018-06-19 15:20 UTC (permalink / raw)
  To: matejcik, Bitcoin Protocol Discussion

[-- Attachment #1: Type: text/plain, Size: 1202 bytes --]

I agree with matejcik’s point 1 to 3 and especially with point 4.
The mandatory flag (or optional-flag) makes much sense to me.

> -----------
> 
> In general, the standard is trying to be very space-conservative,
> however is that really necessary? We would argue for clarity and ease of
> use over space constraints. We think more straightforward approach is
> desired, although more space demanding. What are the arguments to make
> this as small as possible? If we understand correctly, this format is
> not intended for blockchain nor for persistent storage, so size doesn’t
> matter nearly as much.

I don’t see any reasons why space would be an issue.

HWWs probably can’t handle PBST natively since it is not optimised for
presenting various informations in a signing-verification.

A single stream-in of a PSBT through USB (or similar channel) will not work in
many cases since HWW come often with very restrictive RAM constraints.

Furthermore, I forget to mention in my last mail, that registering (or defining)
a mime-type for PSBT would probably a great usability feature.
(Send PSBT by email/messanger and with dbl-click to open feature, etc.)


/jonas

[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-19 14:20 ` matejcik
  2018-06-19 15:20   ` Jonas Schnelli
@ 2018-06-19 17:16   ` Pieter Wuille
  2018-06-21 11:29     ` matejcik
  2018-06-21 11:44     ` Tomas Susanka
  1 sibling, 2 replies; 54+ messages in thread
From: Pieter Wuille @ 2018-06-19 17:16 UTC (permalink / raw)
  To: matejcik, Bitcoin Protocol Discussion

On Tue, Jun 19, 2018 at 7:20 AM, matejcik via bitcoin-dev
<bitcoin-dev@lists•linuxfoundation.org> wrote:

Thanks for your comments so far. I'm very happy to see people dig into
the details, and consider alternative approaches.

> 1) Why isn't the global type 0x03 (BIP-32 path) per-input? How do we
> know, which BIP-32 path goes to which input? The only idea that comes to
> my mind is that we should match the input's scriptPubKey's pubkey to
> this 0x03's key (the public key).

> If our understanding is correct, the BIP-32 path is global to save space
> in case two inputs share the same BIP-32 path? How often does that
> happen? And in case it does, doesn't it mean an address reuse which is
> discouraged?

Yes, the reason is address reuse. It may be discouraged, but it still
happens in practice (and unfortunately it's very hard to prevent
people from sending to the same address twice).

It's certainly possible to make them per-input (and even per-output as
suggested below), but I don't think it gains you much. At least when a
signer supports any kind of multisig, it needs to match up public keys
with derivation paths. If several can be provided, looking them up
from a global table or a per-input table shouldn't fundamentally
change anything.

However, perhaps it makes sense to get rid of the global section
entirely, and make the whole format a transaction plus per-input and
per-output extra fields. This would result in duplication in case of
key reuse, but perhaps that's worth the complexity reduction.

> 2) The global items 0x01 (redeem script) and 0x02 (witness script) are
> somewhat confusing. Let's consider only the redeem script (0x01) to make
> it simple. The value description says: "A redeem script that will be
> needed to sign a Pay-To-Script-Hash input or is spent to by an output.".
> Does this mean that the record includes both input's redeem script
> (because we need to sign it), but also a redeem script for the output
> (to verify we are sending to a correct P2SH)? To mix those two seems
> really confusing.
>
> Yet again, adding a new output section would make this more readable. We
> would include the input’s redeem script in the input section and the
> output’s redeem script again in the output section, because they’ll most
> likely differ anyway.

I think here it makes sense because there can actually only be (up to)
one redeemscript and (up to) one witnessscript. So if we made those
per-input and per-output, it may simplify signers as they don't need a
table lookup to find the correct one. That would also mean we can drop
their hashes, even if we keep a key-value model.

> 3) The sighash type 0x03 says the sighash is only a recommendation. That
> seems rather ambiguous. If the field is specified shouldn't it be binding?

Perhaps, yes.

> 4) Is it a good idea to skip records which types we are unaware of? We
> can't come up with a reasonable example, but intuitively this seems as a
> potential security issue. We think we should consider  introducing a
> flag, which would define if the record is "optional". In case the signer
> encounters a record it doesn't recognize and such flag is not set, it
> aborts the procedure. If we assume the set model we could change the
> structure to <type><optional flag><length>{data}. We are not keen on
> this, but we wanted to include this idea to see what you think.

Originally there was at least this intuition for why it shouldn't be
necessary: the resulting signature for an input is either valid or
invalid. Adding information to a PSBT (which is what signers do)
either helps with that or not. The worst case is that they simply
don't have enough information to produce a signature together. But an
ignored unknown field being present should never result in signing the
wrong thing (they can always see the transaction being signed), or
failing to sign if signing was possible in the first place. Another
way of looking at it, the operation of a signer is driven by queries:
it looks at the scriptPubKey of the output being spent, sees it is
P2SH, looks for the redeemscript, sees it is P2WSH, looks for the
witnessscript, sees it is multisig, looks for other signers'
signatures, finds enough for the threshold, and proceeds to sign and
create a full transaction. If at any point one of those things is
missing or not comprehensible to the signer, he simply fails and
doesn't modify the PSBT.

However, if the sighash request type becomes mandatory, perhaps this
is not the case anymore, as misinterpreting something like this could
indeed result in an incorrect signature.

If we go down this route, if a field is marked as mandatory, can you
still act as a combiner for it? Future extensions should always
maintain the invariant that a simple combiner which just merges all
the fields and deduplicates should always be correct, I think. So such
a mandatory field should only apply to signers?

> In general, the standard is trying to be very space-conservative,
> however is that really necessary? We would argue for clarity and ease of
> use over space constraints. We think more straightforward approach is
> desired, although more space demanding. What are the arguments to make
> this as small as possible? If we understand correctly, this format is
> not intended for blockchain nor for persistent storage, so size doesn’t
> matter nearly as much.

I wouldn't say it's trying very hard to be space-conservative. The
design train of thought started from "what information does a signer
need", and found a signer would need information on the transaction to
sign, and on scripts to descend into, information on keys to derive,
and information on signatures provided by other participants. Given
that some of this information was global (at least the transaction),
and some of this information was per-input (at least the signatures),
separate scopes were needed for those. Once you have a global scope,
and you envision a signer which looks up scripts and keys in a map of
known ones (like the signing code in Bitcoin Core), there is basically
no downside to make the keys and scripts global - while giving space
savings for free to deduplication.

However, perhaps that's not the right way to think about things, and
the result is simpler if we only keep the transaction itself global,
and everything else per-input (and per-output).

I think there are good reasons to not be gratuitously large (I expect
that at least while experimenting, people will copy-paste these things
a lot and page-long copy pastes become unwieldy quickly), but maybe
not at the cost of structural complexity.

On Tue, Jun 19, 2018 at 7:22 AM, matejcik via bitcoin-dev
<bitcoin-dev@lists•linuxfoundation.org> wrote:
> hello,
> this is our second e-mail with replies to Pieter's suggestions.
>
> On 16.6.2018 01:34, pieter.wuille at gmail.com (Pieter Wuille) wrote:
>> * Key-value map model or set model.

> Just to note, we should probably use varint for the <type> field - this
> allows us, e.g., to create “namespaces” for future extensions by using
> one byte as namespace identifier and one as field identifier.

Agree, but this doesn't actually need to be specified right now. As
the key's (and or value's) interpretation (including the type) is
completely unspecified, an extension can just start using 2-byte keys
(as long as the first byte of those 2 isn't used by another field
already).

>> One exception is the "transaction" record, which needs to be unique.
>> That can either be done by adding an exception ("there can only be one
>> transaction record"), or by encoding it separately outside the normal
>> records (that may also be useful to make it clear that it is always
>> required).
>
> This seems to be the case for some fields already - i.e., an input field
> must have exactly one of Non-witness UTXO or Witness Output. So “adding
> an exception” is probably just a matter of language?

Hmm, I wouldn't say so. Perhaps the transaction's inputs and outputs
are chosen by one entity, and then sent to another entity which has
access to the UTXOs or previous transactions. So while the UTXOs must
be present before signing, I wouldn't say the file format itself must
enforce that the UTXOs are present.

However, perhaps we do want to enforce at-most one UTXO per input. If
there are more potential extensions like this, perhaps a key-value
model is better, as it's much easier to enforce no duplicate keys than
it is to add field-specific logic to combiners (especially for
extensions they don't know about yet).

> We’d also like to note that the “number of inputs” field should be
> mandatory - and as such, possibly also a candidate for outside-record field.

If we go with the "not put signatures/witnesses inside the transaction
until all of them are finalized" suggestion, perhaps the number of
inputs field can be dropped. There would be always one exactly for
each input (but some may have the "final script/witness" field and
others won't).

>> * Derivation from xpub or fingerprint
>>
>> For BIP32 derivation paths, the spec currently only encodes the 32-bit
>> fingerprint of the parent or master xpub. When the Signer only has a
>> single xprv from which everything is derived, this is obviously
>> sufficient. When there are many xprv, or when they're not available
>> indexed by fingerprint, this may be less convenient for the signer.
>> Furthermore, it violates the "PSBT contains all information necessary
>> for signing, excluding private keys" idea - at least if we don't treat
>> the chaincode as part of the private key.
>>
>> For that reason I would suggest that the derivation paths include the
>> full public key and chaincode of the parent or master things are
>> derived from. This does mean that the Creator needs to know the full
>> xpub which things are derived from, rather than just its fingerprint.
>
>
> We don’t understand the rationale for this idea. Do you see a scenario
> where an index on master fingerprint is not available but index by xpubs
> is? In our envisioned use cases at least, indexing private keys by xpubs
> (as opposed to deriving from a BIP32 path) makes no sense.

Let me elaborate.

Right now, the BIP32 fields are of the form <master
fingerprint><childidx><childidx><childidx>...

Instead, I suggest fields of the form <master pubkey><master
chaincode><childidx><childidx><childidx>...

The fingerprint in this case is identical to the first 32 bit of the
Hash160 of <master pubkey>, so certainly no information is lost by
making this change.

This may be advantageous for three reasons:
* It permits signers to have ~thousands of master keys (at which point
32-bit fingerprints would start having reasonable chance for
collisions, meaning multiple derivation attempts would be needed to
figure out which one to use).
* It permits signers to index their master keys by whatever they like
(for example, SHA256 rather than Hash160 or prefix thereof).
* It permits signers who don't store a chaincode at all, and just
protect a single private key.

Cheers,

-- 
Pieter


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-15 23:34 [bitcoin-dev] BIP 174 thoughts Pieter Wuille
                   ` (3 preceding siblings ...)
  2018-06-19 14:22 ` matejcik
@ 2018-06-21  0:39 ` Achow101
  2018-06-21 14:32   ` Tomas Susanka
  4 siblings, 1 reply; 54+ messages in thread
From: Achow101 @ 2018-06-21  0:39 UTC (permalink / raw)
  To: Bitcoin Protocol Discussion

Hi,

On June 15, 2018 4:34 PM, Pieter Wuille via bitcoin-dev <bitcoin-dev@lists•linuxfoundation.org> wrote:

> 
> Hello all,
> 
> given some recent work and discussions around BIP 174 (Partially
> Signed Bitcoin Transaction Format) I'd like to bring up a few ideas.
> First of all, it's unclear to me to what extent projects have already
> worked on implementations, and thus to what extent the specification
> is still subject to change. A response of "this is way too late" is
> perfectly fine.

While I agree that the BIP itself should be revised to reflect these suggestions, I fear that it may be too late. I know of a few other developers who have implemented BIP 174 already but have not yet responded to this email.

>     
> -   Generic key offset derivation
>
>     Whenever a BIP32 derivation path does not include any hardened steps,
>     the entirety of the derivation can be conveyed as "The private key for
>     P is equal to the private key for Q plus x", with P and Q points and x
>     a scalar. This representation is more flexible (it also supports
>     pay-to-contract derivations), more efficient, and more compact. The
>     downside is that it requires the Signer to support such derivation,
>     which I don't believe any current hardware devices do.
>     Would it make sense to add this as an additional derivation method?

While this is a good idea, I'm not sure that implementers would understand this as it requires knowing the cryptography which makes this possible. As an optional feature, not all wallets would understand it, and those that do could create PSBTs which other wallets do not understand and thus cannot sign even if they have the private keys and actually can sign.

>     
> -   Hex encoding?
>     
>     This is a very minor thing. But presumably there will be some standard
>     way to store PSBTs as text for copy-pasting - either prescribed by the
>     spec, or de facto. These structures may become pretty large, so
>     perhaps it's worth choosing something more compact than hexadecimal -
>     for example Base64 or even Z85 (https://rfc.zeromq.org/spec:32/Z85/).

Agreed. Are there any encodings that do not have double click breaking characters?


On June 19, 2018 2:38 AM, Jonas Schnelli via bitcoin-dev <bitcoin-dev@lists•linuxfoundation.org> wrote:

> I think it could be more flexible (generic) in BIP174.
> It could be just a single child key {32-bit int}, or just a keypath ({32-bit int}]{32-bit int}…) which is very likely sufficient for a HWW to derive the relevant key without the creation of a lookup-window or other „maps".

This ignores all of the other times that a BIP32 keypath needs to be provided. It is not only used for multisig, there may be other times that there are multiple derivation paths and master keys due to multiple inputs and such. Adding a field specific to multisig and HWW only seems pointless and redundant to me.

On June 19, 2018 2:38 AM, Jonas Schnelli via bitcoin-dev <bitcoin-dev@lists•linuxfoundation.org> wrote:

>
> A question to consider is,
> will there be more per-output data? If yes, it might make sense to have
> an output section.

I think it is unlikely that there would be anymore per-output data.

> 3) The sighash type 0x03 says the sighash is only a recommendation. That
>seems rather ambiguous. If the field is specified shouldn't it be binding?

I disagree. It is up to the signer to decide what they wish to sign, not for the creator to specify what to sign. The creator can ask the signer to sign something in a particular way, but it is ultimately up to the signer to decide.

> 4) Is it a good idea to skip records which types we are unaware of? We
> can't come up with a reasonable example, but intuitively this seems as a
> potential security issue. We think we should consider  introducing a
> flag, which would define if the record is "optional". In case the signer
> encounters a record it doesn't recognize and such flag is not set, it
> aborts the procedure. If we assume the set model we could change the
> structure to <type><optional flag><length>{data}. We are not keen on
> this, but we wanted to include this idea to see what you think.

The idea behind skipping unknown types is to allow forward compatibility. A combiner written now should be able to combine transactions created in the future with new types as combining is really only just merging the maps together.

> In general, the standard is trying to be very space-conservative,
> however is that really necessary? We would argue for clarity and ease of
> use over space constraints. We think more straightforward approach is
> desired, although more space demanding. What are the arguments to make
> this as small as possible? If we understand correctly, this format is
> not intended for blockchain nor for persistent storage, so size doesn’t
> matter nearly as much.

Size is not really a constraint, but we do not want to be unnecessarily large. The PSBT still has to be transmitted to other people. It will likely be used by copy and pasting the string into a text box. Copying and pasting very long strings of text can be annoying and cumbersome. So the goal is to keep the format still relatively clear while avoiding the duplication of data.


Andrew




^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-19 17:16   ` Pieter Wuille
@ 2018-06-21 11:29     ` matejcik
  2018-06-21 17:39       ` Pieter Wuille
  2018-06-21 11:44     ` Tomas Susanka
  1 sibling, 1 reply; 54+ messages in thread
From: matejcik @ 2018-06-21 11:29 UTC (permalink / raw)
  To: Pieter Wuille, Bitcoin Protocol Discussion


[-- Attachment #1.1: Type: text/plain, Size: 7741 bytes --]

On 19.6.2018 19:16, Pieter Wuille wrote:
>> 1) Why isn't the global type 0x03 (BIP-32 path) per-input? How do we
>> know, which BIP-32 path goes to which input? The only idea that comes to
>> my mind is that we should match the input's scriptPubKey's pubkey to
>> this 0x03's key (the public key).
> 
>> If our understanding is correct, the BIP-32 path is global to save space
>> in case two inputs share the same BIP-32 path? How often does that
>> happen? And in case it does, doesn't it mean an address reuse which is
>> discouraged?
> 
> Yes, the reason is address reuse. It may be discouraged, but it still
> happens in practice (and unfortunately it's very hard to prevent
> people from sending to the same address twice).
> 
> It's certainly possible to make them per-input (and even per-output as
> suggested below), but I don't think it gains you much. At least when a
> signer supports any kind of multisig, it needs to match up public keys
> with derivation paths. If several can be provided, looking them up
> from a global table or a per-input table shouldn't fundamentally
> change anything.

So here’s a thing I’m still confused about.

Imagine two cases, for a naive Signer:
- either all data is global
- or most data is per input.

Now, the general signing flow is this:
1. Pre-serialize the transaction
2. Prepare the current input - fill out scriptPubKey (or equivalent for
segwit)
3. find a secret key
4. output public key + signature

Step (3) is the main issue here.

In the case of everything per-input, the naive Signer can do this:
1. (in the global section) pre-serialize the transaction
2. (in each input) find and fill out scriptPubKey from the provided UTXO
3. (for a given BIP32 path) check if the master fingerprint matches
mine, if yes, derive secret key, output pubkey, signature
4. goto 3 (more keys per input), goto 2 (next input)

Note that this flow works perfectly for multisig; it’s going to be the
job of a Finalizer to build the final scriptSig, but each input can have
multiple partial signatures -- and, interestingly, the naive Signer
doesn’t even need to know about multisig.

A less naive Signer will want to check things, maybe derive a scriptSig
itself and check if it matches the given hash, etc., but it can do this
all in place. You go linearly through the signing flow and place a
couple strategic assertions along the way.

However, if the data is global, as is now, it gets more complicated:
1. (in the global section) pre-serialize the transaction, prefill lookup
tables
2. (for a given BIP32 path) check if mine, then derive public key and
store in a dictionary
3. (for each input) find _and parse_ scriptPubKey, extract (PK or)
script hash
4. lookup redeem script based on script-hash; if not found, goto 2; if
found, parse out public key
5. lookup public key in the BIP32 dictionary; if not found, goto 2
6. output pubkey, signature

In addition to being more steps and lookups, it requires the Signer to
understand the redeem script. A strict Signer will want that anyway, but
in the first case, the Signer can regenerate the scripts and compare
specificaly the ones it's working with; here, you need to parse them
even before you know what you're comparing to.

Is there something I’m missing? Because as I see it, there is literally
no advantage to the more complicated flow; that’s why we assumed that
the format is space-saving, because saving space was the only reason we
could imagine.

> If we go down this route, if a field is marked as mandatory, can you
> still act as a combiner for it? Future extensions should always
> maintain the invariant that a simple combiner which just merges all
> the fields and deduplicates should always be correct, I think. So such
> a mandatory field should only apply to signers?

(...)

> However, perhaps we do want to enforce at-most one UTXO per input. If
> there are more potential extensions like this, perhaps a key-value
> model is better, as it's much easier to enforce no duplicate keys than
> it is to add field-specific logic to combiners (especially for
> extensions they don't know about yet).

In general, you seem to focus a lot on the role of Combiners, esp.
simple Combiners. To me, that doesn’t look like a significant role. As I
envision it, a Combiner really doesn’t need to do anything more
complicated than merge and deduplicate records, simply based on the
uniqueness of the whole record.
It’s the Finalizer’s job to reconstruct and validate the result. Also
ISTM if something messes up the PSBT (such as including multiple
conflicting fields anywhere), it’s OK to leave it to Finalizer to fail.

Are the Combiners supposed to be separate from Finalizers? (Is there a
risk of a Combiner passing along a bad PSBT, Finalizer rejecting it, and
the other parties not finding out?)


> If we go with the "not put signatures/witnesses inside the transaction
> until all of them are finalized" suggestion, perhaps the number of
> inputs field can be dropped. There would be always one exactly for
> each input (but some may have the "final script/witness" field and
> others won't).

Strongly agree with this. A guarantee that number of inputs in the
transaction corresponds to number of input fields for PBST looks cleaner
than specifying it separately. This way we can also drop the "input index".


> Right now, the BIP32 fields are of the form <master
> fingerprint><childidx><childidx><childidx>...
> 
> Instead, I suggest fields of the form <master pubkey><master
> chaincode><childidx><childidx><childidx>...
> 
> The fingerprint in this case is identical to the first 32 bit of the
> Hash160 of <master pubkey>, so certainly no information is lost by
> making this change.
> 
> This may be advantageous for three reasons:
> * It permits signers to have ~thousands of master keys (at which point
> 32-bit fingerprints would start having reasonable chance for
> collisions, meaning multiple derivation attempts would be needed to
> figure out which one to use).
> * It permits signers to index their master keys by whatever they like
> (for example, SHA256 rather than Hash160 or prefix thereof)> * It permits signers who don't store a chaincode at all, and just
> protect a single private key.

I like this last usecase a lot, but perhaps that's a role for a
"sub-Creator"? see below.

Also, is there a reason to publish the chain code, wouldn't just the
public key be sufficient to accomplish all three usecases you list?
I sort of dislike the notion that you need to give all this information
to a possibly untrusted Creator.


An aside to this in particular, I’ve been thinking about the requirement
to share derivation paths and public keys with the Creator. The spec
assumes that this will happen; you’re talking about providing full
xpub+chaincode too. At least, the Creator must prefill BIP32 paths and
master key fingerprints. Possibly also prefill public keys in the redeem
scripts?

This might not be an improvement proposal, but a point worth being
raised and maybe explained in the spec. Perhaps the original Creator
doesn’t have access to this data, and delegates this to some
“sub-Creators”  - I imagine a coordinator sending a PSBT to signing
parties, each of which acts as a sub-Creator (fills out derivation paths
and public keys) and a Signer (forwarding to a HWW). Some of the
discussion even suggests some sort of generic “key derivation field”
with arbitrary contents - fingerprint + bip32 path? xpub + chain code?
derivation points? encrypted xprv?

thank you for your comments

regards
m.


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-19 17:16   ` Pieter Wuille
  2018-06-21 11:29     ` matejcik
@ 2018-06-21 11:44     ` Tomas Susanka
  1 sibling, 0 replies; 54+ messages in thread
From: Tomas Susanka @ 2018-06-21 11:44 UTC (permalink / raw)
  To: Pieter Wuille via bitcoin-dev

Hi,

On 19.6.2018 19:16, Pieter Wuille via bitcoin-dev wrote:
> Yes, the reason is address reuse. It may be discouraged, but it still
> happens in practice (and unfortunately it's very hard to prevent
> people from sending to the same address twice).
>
> It's certainly possible to make them per-input (and even per-output as
> suggested below), but I don't think it gains you much. At least when a
> signer supports any kind of multisig, it needs to match up public keys
> with derivation paths. If several can be provided, looking them up
> from a global table or a per-input table shouldn't fundamentally
> change anything.
>
> However, perhaps it makes sense to get rid of the global section
> entirely, and make the whole format a transaction plus per-input and
> per-output extra fields. This would result in duplication in case of
> key reuse, but perhaps that's worth the complexity reduction.
I think having a global section with just one record (the transaction)
is just fine, in case we come up with some other fields later on which
would fit the global section. Otherwise I totally agree.
>> 2) The global items 0x01 (redeem script) and 0x02 (witness script) are
>> somewhat confusing. Let's consider only the redeem script (0x01) to make
>> it simple. The value description says: "A redeem script that will be
>> needed to sign a Pay-To-Script-Hash input or is spent to by an output.".
>> Does this mean that the record includes both input's redeem script
>> (because we need to sign it), but also a redeem script for the output
>> (to verify we are sending to a correct P2SH)? To mix those two seems
>> really confusing.
>>
>> Yet again, adding a new output section would make this more readable. We
>> would include the input’s redeem script in the input section and the
>> output’s redeem script again in the output section, because they’ll most
>> likely differ anyway.
> I think here it makes sense because there can actually only be (up to)
> one redeemscript and (up to) one witnessscript. So if we made those
> per-input and per-output, it may simplify signers as they don't need a
> table lookup to find the correct one. That would also mean we can drop
> their hashes, even if we keep a key-value model.
Yes, indeed. Just to clarify: in the first sentence you mean "per
output", right? There can actually only be (up to) one redeemscript and
(up to) one witnessscript *per output*.
>> 4) Is it a good idea to skip records which types we are unaware of? We
>> can't come up with a reasonable example, but intuitively this seems as a
>> potential security issue. We think we should consider  introducing a
>> flag, which would define if the record is "optional". In case the signer
>> encounters a record it doesn't recognize and such flag is not set, it
>> aborts the procedure. If we assume the set model we could change the
>> structure to <type><optional flag><length>{data}. We are not keen on
>> this, but we wanted to include this idea to see what you think.
> Originally there was at least this intuition for why it shouldn't be
> necessary: the resulting signature for an input is either valid or
> invalid. Adding information to a PSBT (which is what signers do)
> either helps with that or not. The worst case is that they simply
> don't have enough information to produce a signature together. But an
> ignored unknown field being present should never result in signing the
> wrong thing (they can always see the transaction being signed), or
> failing to sign if signing was possible in the first place. Another
> way of looking at it, the operation of a signer is driven by queries:
> it looks at the scriptPubKey of the output being spent, sees it is
> P2SH, looks for the redeemscript, sees it is P2WSH, looks for the
> witnessscript, sees it is multisig, looks for other signers'
> signatures, finds enough for the threshold, and proceeds to sign and
> create a full transaction. If at any point one of those things is
> missing or not comprehensible to the signer, he simply fails and
> doesn't modify the PSBT.
The rationale behind this was, what if at some point we come up with a
PSBT record, which forbids some kind of operation or alters some
behaviour. In another words, by omitting such record the signer would
create a signature, which is valid, but actually signed something
different than the Creator intended.

> However, if the sighash request type becomes mandatory, perhaps this
> is not the case anymore, as misinterpreting something like this could
> indeed result in an incorrect signature.
I believe this use case illustrates it quite well. Let’s suppose the
sighash record is binding and the Signer does not know it. The Creator
creates a PSBT with sighash set SIGHASH_SINGLE. The Signer sings the
transaction with SIGHASH_ALL, because they are not aware of such field.
This results in a valid signature, however not what the Creator intended
it to be.

>> We’d also like to note that the “number of inputs” field should be
>> mandatory - and as such, possibly also a candidate for outside-record field.
> If we go with the "not put signatures/witnesses inside the transaction
> until all of them are finalized" suggestion, perhaps the number of
> inputs field can be dropped. There would be always one exactly for
> each input (but some may have the "final script/witness" field and
> others won't).
Agree. I'm be fine with dropping the field completely in that case.


Thanks,
Tomas




^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-21  0:39 ` Achow101
@ 2018-06-21 14:32   ` Tomas Susanka
  2018-06-21 15:40     ` Greg Sanders
  2018-06-21 19:56     ` Peter D. Gray
  0 siblings, 2 replies; 54+ messages in thread
From: Tomas Susanka @ 2018-06-21 14:32 UTC (permalink / raw)
  To: Achow101 via bitcoin-dev

Hello,

First of all, let me thank you for all the hard work you and others have
put into this.


On 21.6.2018 02:39, Achow101 via bitcoin-dev wrote:
> While I agree that the BIP itself should be revised to reflect these suggestions, I fear that it may be too late. I know of a few other developers who have implemented BIP 174 already but have not yet responded to this email.

We do realize that this discussion should have happened earlier, however
agreeing on a good standard should be the number one priority for all
the parties involved.

The fact that someone already implemented this is indeed unfortunate,
but I don't think we should lower our demands on the standard just
because of a bad timing.

>> A question to consider is,
>> will there be more per-output data? If yes, it might make sense to have
>> an output section.
> I think it is unlikely that there would be anymore per-output data.

Hmm, upon further reflection, maybe it's not even worth including *any*
per-output data, aside from what the original transaction contains.

The output redeem script is either:
- unknown, because we have received only an address from the receiver
- or it is known, because it is ours and in that case it doesn’t make
sense to include it in PSBT

We got stuck on the idea of the Creator providing future (output)
redeem/witness scripts. But that seems to be a minority use case and can
be solved efficiently via the same channels that coordinate the PSBT
creation. Sorry to change opinions so quickly on this one.

>
>> 3) The sighash type 0x03 says the sighash is only a recommendation. That
>> seems rather ambiguous. If the field is specified shouldn't it be binding?
> I disagree. It is up to the signer to decide what they wish to sign, not for the creator to specify what to sign. The creator can ask the signer to sign something in a particular way, but it is ultimately up to the signer to decide.

This seems very ambiguous. The Signer always has the option of not
signing. *What* to sign is a matter of coordination between the parties;
otherwise, you could make all the fields advisory and let anyone sign
anything they like?

We don't understand the usecase for a field that is advisory but not
binding. On what basis would you choose to respect or disregard the
advisory field? Either one party has a preference, in which case they
have to coordinate with the other anyway - or they don't, in which case
they simply leave the field out.

> Size is not really a constraint, but we do not want to be unnecessarily large. The PSBT still has to be transmitted to other people. It will likely be used by copy and pasting the string into a text box. Copying and pasting very long strings of text can be annoying and cumbersome. So the goal is to keep the format still relatively clear while avoiding the duplication of data.

I agree. Just to put some numbers on this: if we expect a 5-part
derivation path, and add the master key fingerprint, that is 4 + 5*4 =
24 bytes (~32 base64 letters) per input and signer. I'd argue this is
not significant.
If we used full xpub, per Pieter's suggestion, that would grow to 32 +
32 + 5*4 = 84 bytes (~112 letters) per input/signer, which is quite a lot.

On the other hand, keeping the BIP32 paths per-input means that we don't
need to include the public key (as in the lookup key), so that's 32
bytes down per path. In general, all the keys can be fully reconstructed
from their values:

redeem script key = hash160(value)
witness script key = sha256(value)
bip32 key = derive(value)

The one exception is a partial signature. But even in that case we
expect that a given public key will always correspond to the same
signature, so we can act as if the public key is not part of the "key".
In other words, we can move the public key to the value part of the record.

This holds true unless there's some non-deterministic signing scheme,
*and* multiple Signers sign with the same public key, which is what
Pieter was alluding to on Twitter
(https://twitter.com/pwuille/status/1002627925110185984). Still, I would
argue (as he also suggested) that keeping the format more complex to
support this particular use case is probably not worth it.

Also, we can mostly ignore deduplication of witness/redeem scripts.
These still need to be included in the resulting transaction, duplicated
if necessary, so I think counting their repetition against the size of
PSBT isn't worth it.


Best,
Tomas





^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-21 14:32   ` Tomas Susanka
@ 2018-06-21 15:40     ` Greg Sanders
  2018-06-21 19:56     ` Peter D. Gray
  1 sibling, 0 replies; 54+ messages in thread
From: Greg Sanders @ 2018-06-21 15:40 UTC (permalink / raw)
  To: tomas.susanka, Bitcoin Dev

[-- Attachment #1: Type: text/plain, Size: 5757 bytes --]

>Hmm, upon further reflection, maybe it's not even worth including *any*
per-output data, aside from what the original transaction contains.

>The output redeem script is either:
- unknown, because we have received only an address from the receiver
- or it is known, because it is ours and in that case it doesn’t make
sense to include it in PSBT

Signers are an extremely heterogeneous bunch. A signer may need to
introspect on the script, such as "this is a 2-of-3,
and I'm one of the keys". Even in basic p2pkh settings not adding any
output information rules out things like change
detection on any conceivable hardware wallet, or even simple software
wallets that don't carry significant state.

On Thu, Jun 21, 2018 at 10:35 AM Tomas Susanka via bitcoin-dev <
bitcoin-dev@lists•linuxfoundation.org> wrote:

> Hello,
>
> First of all, let me thank you for all the hard work you and others have
> put into this.
>
>
> On 21.6.2018 02:39, Achow101 via bitcoin-dev wrote:
> > While I agree that the BIP itself should be revised to reflect these
> suggestions, I fear that it may be too late. I know of a few other
> developers who have implemented BIP 174 already but have not yet responded
> to this email.
>
> We do realize that this discussion should have happened earlier, however
> agreeing on a good standard should be the number one priority for all
> the parties involved.
>
> The fact that someone already implemented this is indeed unfortunate,
> but I don't think we should lower our demands on the standard just
> because of a bad timing.
>
> >> A question to consider is,
> >> will there be more per-output data? If yes, it might make sense to have
> >> an output section.
> > I think it is unlikely that there would be anymore per-output data.
>
> Hmm, upon further reflection, maybe it's not even worth including *any*
> per-output data, aside from what the original transaction contains.
>
> The output redeem script is either:
> - unknown, because we have received only an address from the receiver
> - or it is known, because it is ours and in that case it doesn’t make
> sense to include it in PSBT
>
> We got stuck on the idea of the Creator providing future (output)
> redeem/witness scripts. But that seems to be a minority use case and can
> be solved efficiently via the same channels that coordinate the PSBT
> creation. Sorry to change opinions so quickly on this one.
>
> >
> >> 3) The sighash type 0x03 says the sighash is only a recommendation. That
> >> seems rather ambiguous. If the field is specified shouldn't it be
> binding?
> > I disagree. It is up to the signer to decide what they wish to sign, not
> for the creator to specify what to sign. The creator can ask the signer to
> sign something in a particular way, but it is ultimately up to the signer
> to decide.
>
> This seems very ambiguous. The Signer always has the option of not
> signing. *What* to sign is a matter of coordination between the parties;
> otherwise, you could make all the fields advisory and let anyone sign
> anything they like?
>
> We don't understand the usecase for a field that is advisory but not
> binding. On what basis would you choose to respect or disregard the
> advisory field? Either one party has a preference, in which case they
> have to coordinate with the other anyway - or they don't, in which case
> they simply leave the field out.
>
> > Size is not really a constraint, but we do not want to be unnecessarily
> large. The PSBT still has to be transmitted to other people. It will likely
> be used by copy and pasting the string into a text box. Copying and pasting
> very long strings of text can be annoying and cumbersome. So the goal is to
> keep the format still relatively clear while avoiding the duplication of
> data.
>
> I agree. Just to put some numbers on this: if we expect a 5-part
> derivation path, and add the master key fingerprint, that is 4 + 5*4 =
> 24 bytes (~32 base64 letters) per input and signer. I'd argue this is
> not significant.
> If we used full xpub, per Pieter's suggestion, that would grow to 32 +
> 32 + 5*4 = 84 bytes (~112 letters) per input/signer, which is quite a lot.
>
> On the other hand, keeping the BIP32 paths per-input means that we don't
> need to include the public key (as in the lookup key), so that's 32
> bytes down per path. In general, all the keys can be fully reconstructed
> from their values:
>
> redeem script key = hash160(value)
> witness script key = sha256(value)
> bip32 key = derive(value)
>
> The one exception is a partial signature. But even in that case we
> expect that a given public key will always correspond to the same
> signature, so we can act as if the public key is not part of the "key".
> In other words, we can move the public key to the value part of the record.
>
> This holds true unless there's some non-deterministic signing scheme,
> *and* multiple Signers sign with the same public key, which is what
> Pieter was alluding to on Twitter
> (https://twitter.com/pwuille/status/1002627925110185984). Still, I would
> argue (as he also suggested) that keeping the format more complex to
> support this particular use case is probably not worth it.
>
> Also, we can mostly ignore deduplication of witness/redeem scripts.
> These still need to be included in the resulting transaction, duplicated
> if necessary, so I think counting their repetition against the size of
> PSBT isn't worth it.
>
>
> Best,
> Tomas
>
>
>
> _______________________________________________
> bitcoin-dev mailing list
> bitcoin-dev@lists•linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
>

[-- Attachment #2: Type: text/html, Size: 7809 bytes --]

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-21 11:29     ` matejcik
@ 2018-06-21 17:39       ` Pieter Wuille
  0 siblings, 0 replies; 54+ messages in thread
From: Pieter Wuille @ 2018-06-21 17:39 UTC (permalink / raw)
  To: matejcik; +Cc: Bitcoin Protocol Discussion

On Thu, Jun 21, 2018 at 4:29 AM, matejcik <jan.matejek@satoshilabs•com> wrote:
> In the case of everything per-input, the naive Signer can do this:
> 1. (in the global section) pre-serialize the transaction
> 2. (in each input) find and fill out scriptPubKey from the provided UTXO
> 3. (for a given BIP32 path) check if the master fingerprint matches
> mine, if yes, derive secret key, output pubkey, signature
> 4. goto 3 (more keys per input), goto 2 (next input)
>
> Note that this flow works perfectly for multisig; it’s going to be the
> job of a Finalizer to build the final scriptSig, but each input can have
> multiple partial signatures -- and, interestingly, the naive Signer
> doesn’t even need to know about multisig.

Ah, you're thinking of an even simpler signer than I was imagining. I
don't think this works in general, because the hash being signed
depends on the structure of the script. For example, if it is P2SH, it
is the redeemscript that goes into the scriptCode serialization rather
than the scriptPubKey. If it is segwit, BIP143 serialization needs to
be used, etc. It may work if your signing is restricted to just one of
those structures, though.

> A less naive Signer will want to check things, maybe derive a scriptSig
> itself and check if it matches the given hash, etc., but it can do this
> all in place. You go linearly through the signing flow and place a
> couple strategic assertions along the way.

Right - but I think anything but the simplest signer must do this,
just to be able to distinguish between different kinds of signature
hashing.

But you're right, having per-input redeemscript/witnessscript
simplifies things still - instead of needing to look a script hash in
a map, you can just compare it with *the* redeemscript/witnessscript.

> However, if the data is global, as is now, it gets more complicated:
> 1. (in the global section) pre-serialize the transaction, prefill lookup
> tables
> 2. (for a given BIP32 path) check if mine, then derive public key and
> store in a dictionary
> 3. (for each input) find _and parse_ scriptPubKey, extract (PK or)
> script hash
> 4. lookup redeem script based on script-hash; if not found, goto 2; if
> found, parse out public key
> 5. lookup public key in the BIP32 dictionary; if not found, goto 2
> 6. output pubkey, signature

I understand your point now. I hadn't considered the possibility of
just signing with all BIP32 derivation paths given for which the
master matches, instead of extracting pubkeys/pkhs from the script.
That's a major simplification for signers indeed. I do think you need
some conditions before to determine the script structure (see above),
but this is a good point in favour of making the derivation paths
per-input.

> In general, you seem to focus a lot on the role of Combiners, esp.
> simple Combiners. To me, that doesn’t look like a significant role. As I
> envision it, a Combiner really doesn’t need to do anything more
> complicated than merge and deduplicate records, simply based on the
> uniqueness of the whole record.

It's more a side-effect of focusing on forward compatibility. I expect
that we will have transactions with inputs spending different kinds of
outputs, and some signers may not be able to understand all of them.
However, as long as the design goal of having Combiners function
correctly for things they don't understand, everything should be able
to work together fine.

> It’s the Finalizer’s job to reconstruct and validate the result. Also
> ISTM if something messes up the PSBT (such as including multiple
> conflicting fields anywhere), it’s OK to leave it to Finalizer to fail.

Agree.

> An aside to this in particular, I’ve been thinking about the requirement
> to share derivation paths and public keys with the Creator. The spec
> assumes that this will happen; you’re talking about providing full
> xpub+chaincode too. At least, the Creator must prefill BIP32 paths and
> master key fingerprints. Possibly also prefill public keys in the redeem
> scripts?
>
> This might not be an improvement proposal, but a point worth being
> raised and maybe explained in the spec. Perhaps the original Creator
> doesn’t have access to this data, and delegates this to some
> “sub-Creators”  - I imagine a coordinator sending a PSBT to signing
> parties, each of which acts as a sub-Creator (fills out derivation paths
> and public keys) and a Signer (forwarding to a HWW). Some of the
> discussion even suggests some sort of generic “key derivation field”
> with arbitrary contents - fingerprint + bip32 path? xpub + chain code?
> derivation points? encrypted xprv?

That makes sense - I think we've already touched this when discussing
the requirement for UTXOs to be added. Perhaps those aren't added by
the Creator, but by some index server. The same could be true for the
scripts or derivations paths.

And indeed, most of the information in the derivation paths is
effectively opaque to the Creator - it's just some data given out by
the Signer about its keys that gets passed back to it so it can
identify the key. There is benefit in keeping it in a fixed structure
(like xpub/chaincode, or fingerprint + derivation indexes), to
guarantee compatibility between multiple signer implementations with
access to the same key.

On Tue, Jun 19, 2018 at 5:39 PM, Jason Les via bitcoin-dev
<bitcoin-dev@lists•linuxfoundation.org> wrote:
>>Hex encoding?
>
> I was hoping for some standard here was well and I agree using something
> more compact than hex is important. My understanding is Z85 is better for
> use with JSON than Base64, which is probably a good benefit to have here.

Both Base64 and Z85 can be stored in JSON strings without quoting
(neither uses quotation characters or backslashes), but Z85 is
slightly more compact (Z85 is 5 characters for 4 bytes, Base64 is 4
characters for 3 bytes). Both use non-alphanumeric characters, so I
don't think there is much difference w.r.t. copy-pastability either.
Z85 is far less common though.

On Thu, Jun 21, 2018 at 4:44 AM, Tomas Susanka via bitcoin-dev
<bitcoin-dev@lists•linuxfoundation.org> wrote:
>> I think here it makes sense because there can actually only be (up to)
>> one redeemscript and (up to) one witnessscript. So if we made those
>> per-input and per-output, it may simplify signers as they don't need a
>> table lookup to find the correct one. That would also mean we can drop
>> their hashes, even if we keep a key-value model.
> Yes, indeed. Just to clarify: in the first sentence you mean "per
> output", right? There can actually only be (up to) one redeemscript and
> (up to) one witnessscript *per output*.

Up to one per output, and up to one per input - indeed.

On Thu, Jun 21, 2018 at 7:32 AM, Tomas Susanka via bitcoin-dev
<bitcoin-dev@lists•linuxfoundation.org> wrote:
>>> A question to consider is,
>>> will there be more per-output data? If yes, it might make sense to have
>>> an output section.
>> I think it is unlikely that there would be anymore per-output data.
>
> Hmm, upon further reflection, maybe it's not even worth including *any*
> per-output data, aside from what the original transaction contains.
>
> The output redeem script is either:
> - unknown, because we have received only an address from the receiver
> - or it is known, because it is ours and in that case it doesn’t make
> sense to include it in PSBT
>
> We got stuck on the idea of the Creator providing future (output)
> redeem/witness scripts. But that seems to be a minority use case and can
> be solved efficiently via the same channels that coordinate the PSBT
> creation. Sorry to change opinions so quickly on this one.

Perhaps you're missing the reason for having output scripts? It is so
that signers that wish to known the amounts transferred can be told
which outputs of the to-be transaction are change, and thus shouldn't
be counted towards the balance. By providing the scripts and
derivation paths in a PSBT, the Creator can prove to the Signer that
certain outputs do not actually move funds to some other entity.


Based on the points before, my preference is having everything
per-input and per-output except the transaction (with empty
scriptSig/witness) itself, and having exactly one set/map per input
and output (which may include a "finalized scriptSig/witness field"
for finalized inputs). The overhead of having at least one separator
byte for every input and output in the transaction is at most a few
percent compared to the data in the transaction itself. If size is
really an issue (but I think we've already established that small size
gains aren't worth much extra complexity), we could also serialize the
transaction without scriptSigs/witnesses (which are at least one byte
each, and guaranteed to be empty).

I'm unsure about typed record vs. key-value model. If we'd go with a
per-input script approach, the key would just be a single byte ("the
redeemscript" and "the witnessscript"), so the advantage of being able
to drop the script hashes applies equally to both models. After that,
it seems the only difference seems to be that a well-defined prefix of
the records is enforced to be unique as opposed to the entire record
being enforced to be unique. I don't think there is much difference in
complexity, as Combiners and Signers still need to enforce some kind
of uniqueness even in a typed records model.

Cheers,

-- 
Pieter


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-21 14:32   ` Tomas Susanka
  2018-06-21 15:40     ` Greg Sanders
@ 2018-06-21 19:56     ` Peter D. Gray
  2018-06-21 21:39       ` Gregory Maxwell
  2018-06-22 19:10       ` Pieter Wuille
  1 sibling, 2 replies; 54+ messages in thread
From: Peter D. Gray @ 2018-06-21 19:56 UTC (permalink / raw)
  To: Tomas Susanka; +Cc: Achow101 via bitcoin-dev

[-- Attachment #1: Type: text/plain, Size: 1822 bytes --]

On Thu, Jun 21, 2018 at 04:32:07PM +0200, Tomas Susanka wrote:
...
> First of all, let me thank you for all the hard work you and others have
> put into this.
> 
> On 21.6.2018 02:39, Achow101 via bitcoin-dev wrote:
> > While I agree that the BIP itself should be revised to reflect these suggestions, I fear that it may be too late. I know of a few other developers who have implemented BIP 174 already but have not yet responded to this email.
> 
> We do realize that this discussion should have happened earlier, however
> agreeing on a good standard should be the number one priority for all
> the parties involved.
> 
> The fact that someone already implemented this is indeed unfortunate,
> but I don't think we should lower our demands on the standard just
> because of a bad timing.

We all want a "good" standard but we have that already, IMHO.

What you are really saying is you want a "better" standard, and I
would argue that's our enemy right now. It's just too easy to propose a
few tweaks, with "wouldn't it be better if..." 

I feel strongly we are entering the "design by committee" territory with BIP174.

I have personally implemented this spec on an embedded micro, as
the signer and finalizer roles, and written multiple parsers for
it as well. There is nothing wrong with it, and it perfectly meets
my needs as a hardware wallet.

So, there is a good proposal already spec'ed and implemented by
multiple parties. Andrew has been very patiently shepherding the PR
for over six months already.

PSBT is something we need, and has been missing from the ecosystem
for a long time. Let's push this out and start talking about future
versions after we learn from this one.

---
Peter D. Gray  ||  Founder, Coinkite  ||  Twitter: @dochex  ||  GPG: A3A31BAD 5A2A5B10


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 496 bytes --]

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-19 15:20   ` Jonas Schnelli
@ 2018-06-21 20:28     ` Peter D. Gray
  0 siblings, 0 replies; 54+ messages in thread
From: Peter D. Gray @ 2018-06-21 20:28 UTC (permalink / raw)
  To: Jonas Schnelli; +Cc: Bitcoin Protocol Discussion

[-- Attachment #1: Type: text/plain, Size: 1355 bytes --]

On Tue, Jun 19, 2018 at 05:20:34PM +0200, Jonas Schnelli wrote:
...
> 
> I don’t see any reasons why space would be an issue.
> 
> HWWs probably can’t handle PBST natively since it is not optimised for
> presenting various informations in a signing-verification.

The Coldcard hardware wallet is PSBT native and does work directly from PSBT.

> A single stream-in of a PSBT through USB (or similar channel) will not work in
> many cases since HWW come often with very restrictive RAM constraints.

For the Coldcard, we expect a PSBT to be 'uploaded' over USB (can
also be provided on MicroSD card) and we work in-place with it,
scanning over it a few different times. If the user approves the
transaction, we produce a signed PSBT or final transaction and that
gets downloaded.

We support 256k byte PSBT files with hundreds of inputs/outputs
(IIRC, and exact limits still TBD) and are operating in a system
with only 25k free RAM after startup.

> Furthermore, I forget to mention in my last mail, that registering (or defining)
> a mime-type for PSBT would probably a great usability feature.
> (Send PSBT by email/messanger and with dbl-click to open feature, etc.)

+1 for mimetype, especially since it's a binary format.

---
Peter D. Gray  ||  Founder, Coinkite  ||  Twitter: @dochex  ||  GPG: A3A31BAD 5A2A5B10


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 496 bytes --]

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-21 19:56     ` Peter D. Gray
@ 2018-06-21 21:39       ` Gregory Maxwell
  2018-06-22 19:10       ` Pieter Wuille
  1 sibling, 0 replies; 54+ messages in thread
From: Gregory Maxwell @ 2018-06-21 21:39 UTC (permalink / raw)
  To: Peter Gray, Bitcoin Protocol Discussion

On Thu, Jun 21, 2018 at 7:56 PM, Peter D. Gray via bitcoin-dev
<bitcoin-dev@lists•linuxfoundation.org> wrote:
> PSBT is something we need, and has been missing from the ecosystem
> for a long time. Let's push this out and start talking about future
> versions after we learn from this one.

When you implement proposals that have little to no public discussion
about them you take the risk that your work needs to be changed when
other people do actually begin reviewing the work.  It is incredibly
demoralizing as a designer and a reviewer to have proposals that were
put out for discussion show up implemented in things with these vested
interests then insisting that they not be refined further.  I think
kind of handling is toxic to performing development in public.

Although it's silly enough that it won't happen, I think our industry
would be better off if there was a social norm that anytime someone
insists an unfinished proposal shouldn't be changed because they
already implemented it that the spec should _always_ be changed, in
order to discourage further instances of that conduct.


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-21 19:56     ` Peter D. Gray
  2018-06-21 21:39       ` Gregory Maxwell
@ 2018-06-22 19:10       ` Pieter Wuille
  2018-06-22 22:28         ` Achow101
  1 sibling, 1 reply; 54+ messages in thread
From: Pieter Wuille @ 2018-06-22 19:10 UTC (permalink / raw)
  To: Peter Gray, Bitcoin Protocol Discussion

On Thu, Jun 21, 2018 at 12:56 PM, Peter D. Gray via bitcoin-dev
<bitcoin-dev@lists•linuxfoundation.org> wrote:
> I have personally implemented this spec on an embedded micro, as
> the signer and finalizer roles, and written multiple parsers for
> it as well. There is nothing wrong with it, and it perfectly meets
> my needs as a hardware wallet.

This is awesome to hear. We need to hear from people who have comments
or issues they encounter while implementing, but also cases where
things are fine as is.

> So, there is a good proposal already spec'ed and implemented by
> multiple parties. Andrew has been very patiently shepherding the PR
> for over six months already.
>
> PSBT is something we need, and has been missing from the ecosystem
> for a long time. Let's push this out and start talking about future
> versions after we learn from this one.

I understand you find the suggestions being brought up in this thread
to be bikeshedding over details, and I certainly agree that "changing
X will gratuitously cause us more work" is a good reason not to make
breaking changes to minutiae. However, at least abstractly speaking,
it would be highly unfortunate if the fact that someone implemented a
draft specification results in a vested interest against changes which
may materially improve the standard.

In practice, the process surrounding BIPs' production readiness is not
nearly as clear as it could be, and there are plenty of BIPs actually
deployed in production which are still marked as draft. So in reality,
truth is that this thread is "late", and also why I started the
discussion by asking what the state of implementations was. As a
result, the discussion should be "which changes are worth the hassle",
and not "what other ideas can we throw in" - and some of the things
brought up are certainly the latter.

So to get back to the question what changes are worth the hassle - I
believe the per-input derivation paths suggested by matejcik may be
one. As is written right now, I believe BIP174 requires Signers to
pretty much always parse or template match the scripts involved. This
means it is relatively hard to implement a Signer which is compatible
with many types of scripts - including ones that haven't been
considered yet. However, if derivation paths are per-input, a signer
can just produce partial signatures for all keys it has the master
for. As long as the Finalizer understands the script type, this would
mean that Signers will work with any script. My guess is that this
would be especially relevant to devices where the Signer
implementation is hard to change, like when it is implemented in a
hardware signer directly.

What do you think?

Cheers,

-- 
Pieter


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-22 19:10       ` Pieter Wuille
@ 2018-06-22 22:28         ` Achow101
  2018-06-23 17:00           ` William Casarin
                             ` (2 more replies)
  0 siblings, 3 replies; 54+ messages in thread
From: Achow101 @ 2018-06-22 22:28 UTC (permalink / raw)
  To: Bitcoin Protocol Discussion

Hi all,

After reading the comments here about BIP 174, I would like to propose the following changes:

- Moving redeemScripts, witnessScripts, and BIP 32 derivation paths to per-input and per-output data

I think that by moving these three fields into input and output specific maps, the format will be
easier to read and simpler for signers to parse. Instead of having to be able to parse entire
scripts and extract pubkeys, the signer can simply look at which pubkeys are provided in the inputs
and sign the input based upon the presence of a pubkey for which the signer has a privkey.

A neat trick that fits well with this model is that a plain pubkey (one that is not part of a BIP 32
derivation) can still be put in a BIP 32 derivation path field where the value is just the fingerprint
of the pubkey itself. This would indicate that no derivation needs to be done from the master key, and
the master key is just the specified key itself.

Additionally, by having the redeemScript and witnessScript readily available in the input, signers
do not need to construct a map to find a redeemScript or witnessScript and can instead just look
directly in the input data. There is also no need to include the hashes of these scripts, so the key
is just the type. This also allows us to enforce the requirement for only one redeemScript and one
witnessScript per input easily by continuing to follow the generic rule of unique keys.

By using input specific and output specific fields, there is no need for the input index and the input
count types as all inputs will be accounted for.

- Finalized scriptSig and scriptWitness fields

To determine whether two PSBTs are the same, we can compare the unsigned transaction. To ensure that the
unsigned transactions are the same for two PSBTs with data for the same tx, we cannot put scriptSigs or
scriptWitnesses into it. Thus for each input, two new fields have been added to store the finalized scriptSig
and finalized scriptWitness.

- Mandatory sighash

The sighash type field will be changed from a recommendation to a requirement. Signatures will need to 
use the specified sighash type for that input. If a Signer cannot sign for a particular sighash type, it
must not add a partial signature.

- Encoding

I have decided that PSBTs should either be in binary or encoded as a Base64 string. For the latter, several
Bitcoin clients already support Base64 encoding of data (for signed messages) so this will not add any extra
dependencies like Z85 would.


A draft of the revised BIP can be found here: https://github.com/achow101/bips/blob/bip174-rev/bip-0174.mediawiki
If these changes are satisfactory, I will open a PR to the BIPs repo to update the BIP tomorrow. I will also
create test vectors and update the implementation PR'ed to Core.

Andrew


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-22 22:28         ` Achow101
@ 2018-06-23 17:00           ` William Casarin
  2018-06-23 20:33             ` Andrew Chow
  2018-06-23 18:27           ` Peter D. Gray
  2018-06-25 19:47           ` Tomas Susanka
  2 siblings, 1 reply; 54+ messages in thread
From: William Casarin @ 2018-06-23 17:00 UTC (permalink / raw)
  To: Achow101, Bitcoin Protocol Discussion

Achow101 via bitcoin-dev <bitcoin-dev@lists•linuxfoundation.org> writes:

> I have decided that PSBTs should either be in binary or encoded as a
> Base64 string. For the latter, several Bitcoin clients already support
> Base64 encoding of data (for signed messages) so this will not add any
> extra dependencies like Z85 would.

Since we're still considering the encoding, I wonder if it would be a
good idea to have a human-readible part like lightning invoices[1]?

lightning invoice 

  vvvvvv
  lnbc1pvjluezpp5qqqsyqcyq5rqwzqfqqqsyqcyq5rqwzqfqqqsyqcyq5rqwzqfqypqdpl2pkx2ctnv5sxxmmwwd5kgetjypeh2ursdae8g6twvus8g6rfwvs8qun0dfjkxaq8rkx3yf5tcsyz3d73gafnh3cax9rn449d9p5uxz9ezhhypd0elx87sjle52x86fux2ypatgddc6k63n7erqz25le42c4u4ecky03ylcqca784w

psbt?

  vvvv
  psbtcHNidP8BAHwCAAAAAi6MfY03xCfgYOwALsHCvDAZb8L3XWqIRMvANlHAgUMKAQAAAAD/////lqBODMY283eTPj2TrMxif6rNvNtaliTfG0kL0EXyTSwAAAAAAP////8B4CvlDgAAAAAXqRS1O7DcHbjI2APj4594TULkc3/6DYcAAAAAFQEgNzbDwGBTiW1wQc6PW64992zEkUdSIQPIcnzjXxyT6wviFAbumpI8iSGf6cnoUEyDFKaiLRKVwCEDx03HEMQH19tuBB7iEtmFzSgm2T+AbtRJErmh2mkcl3NSrhUB87qKEg2WCuB9Hb5vDDf7TJJtdtUiACCo9ERnvxcdUUmRU+AcC9YpEQn8OL0hs8MiTJ3GtXWQ3yECqPREZ78XHVFJkVPgHAvWKREJ/Di9IbPDIkydxrV1kN9HUiEC6A3sMdFnhlwWhenXqSkeZqTqIsZc/uMkKJoWZ8zaO4chAljLvDyylai+usIzqtx3c5eIBJk3mL5TkKtET23UxTJ+Uq4AAQD9/wACAAAAAYst0vc10KkzivlkAqipHkhBzT/tiCNi5zKfsE8f9lMlAAAAAGpHMEQCIHe+3+qZEMm6TgDeyUHazpdPi0c0mZLF1DEsHPV5bM5VAiBhZOa//3rBFZAGTKVxWDcJM3yKOJc9sucPTp2Ts7zOHQEhAy1kRHRZeE43yy3aNmxpetu9yKrirW23TtLa3jnXWIL6/v///wOCtCoEAAAAABl2qRTaUzZI/TOdV5d5DmuxZn2ehv37aIisgPD6AgAAAAAXqRQgNzbDwGBTiW1wQc6PW64992zEkYcAtMQEAAAAABepFLU7sNwduMjYA+Pjn3hNQuRzf/oNh54vEwAAAQEgAMLrCwAAAAAXqRTzuooSDZYK4H0dvm8MN/tMkm121YcA

Then perhaps you could drop the magic code as well?

Also we could do a base encoding that excludes + and / characters, such
as base62 (gmp-style). It's easier to copy/paste (double clicking a
string stops at / or + in base64 encodings).

example human readible part + base62

  psbtWzecLNK5WdwZyceXUYVo0E1TqgLWF0jRXsrLnsuGifjOJl5QHEaaeQbfHfzvbYH85uhUUvpfNc2RYnBqM9E4UqOjzRzDg4QGypL2bxoEsUmmAqOC7nRoN8SuftftuFBI9YabFjVZC9ykOIuJaMzanmKHxnhuh55Hh4mxpwDsvkGFHEHzYHJfkIAiaCEmpdxVBD3qvXNlspDwLKkssUlmDjH7X9zCGyTBE90XvwNdrwM63Q4T45GQbe3c4oQlzCnJuHf5FLnH2oR70hgxIoM01af35iJpZRZAGITtdnKvm9PbH3huEf7TXTzXuNLB9XFh50UlGvnPKcIfFHvgzTSqeN3NmXdzPzsNSRY83BnfHFtTIZnczIyDi5oWsi0sL8f5ABUqGHD61GXDXJGcsqWOjiW6zjhz1L2IKN6OdSVGBFf7C7gH2EYvkWJcKYcJ34gBGsLuXYCU8vzauxEYXXlOXohQ1qKj6Eb0DqOyroRD57uw9fG1e3ueCGlBKmyTI4z4Q1JQXSuLYzBGPlBpVuSZmDBUe28b1EVetJbP9rQ5r6aKsuNX1GToXq1KY5Xh5hsMixJ2o8kG8IBKQSZBRaxjiVEQDWoN3FED869vNHiQtgSLjbqQFZRJuDK0UTMfQCtcg7NdYulPxbUYFNF5Ug6wCvWrTpX1SdbDgGOqZel4ibM18fk9uSIIVDFK9XbenLH3NBOKj0hkxgvrbICZMWBc8GW78TLV4acO75tFBt4a4ziH0wztWGbEEGIAZTDaGmJ51omiRNUVfIX6fO9CeN3Nx3c7Ja2hAjMqQcYcKHEK8tFtLuUdR2jqLuGXOPV4gsqJb8TdkKGEZaA0RRqwHm6HG86OCOEGYqptt43iljv52qkh4znyekJI2mYPItcaw11tsxHaRQcs8Us9Ehlbf6ngmIW6tlo

base64: 920 bytes
base62: 927 bytes

Cheers,


[1] https://github.com/lightningnetwork/lightning-rfc/blob/master/11-payment-encoding.md#human-readable-part


--
https://jb55.com


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-22 22:28         ` Achow101
  2018-06-23 17:00           ` William Casarin
@ 2018-06-23 18:27           ` Peter D. Gray
  2018-06-25 19:47           ` Tomas Susanka
  2 siblings, 0 replies; 54+ messages in thread
From: Peter D. Gray @ 2018-06-23 18:27 UTC (permalink / raw)
  To: Achow101, Bitcoin Protocol Discussion

[-- Attachment #1: Type: text/plain, Size: 2824 bytes --]

On Fri, Jun 22, 2018 at 06:28:33PM -0400, Achow101 wrote:
> After reading the comments here about BIP 174, I would like to propose the following changes:
> 
> - Moving redeemScripts, witnessScripts, and BIP 32 derivation paths to per-input and per-output data
...

I like this. I agree it's making things easier and it's more flexible.

> - Finalized scriptSig and scriptWitness fields
> 
> To determine whether two PSBTs are the same, we can compare the unsigned transaction. To ensure that the
> unsigned transactions are the same for two PSBTs with data for the same tx, we cannot put scriptSigs or
> scriptWitnesses into it. Thus for each input, two new fields have been added to store the finalized scriptSig
> and finalized scriptWitness.
...

To be honest, I don't understand the reasons/implications of this change, but it seems harmless.

> - Mandatory sighash
...

Good improvement.

> - Encoding
> 
> I have decided that PSBTs should either be in binary or encoded as a Base64 string. For the latter, several
> Bitcoin clients already support Base64 encoding of data (for signed messages) so this will not add any extra
> dependencies like Z85 would.
...

Personally, I don't think you should spec an encoding. It should be binary only and hex for developers and JSON interfaces. My thinking is that PSBT's are not user-visible things. They have a short lifetime and are nothing something that is "stored" or referenced much later. Hex is good enough and has no downsides as an excoding except for density.

On the other hand, suggesting a filename extension (probably .PSBT?) and a mime-type to match, are helpful since wallets and such will want to register with their operating systems to become handlers of those mimetypes. Really that's a lot more important for interoperability at this point, than an encoding.

> A draft of the revised BIP can be found here: https://github.com/achow101/bips/blob/bip174-rev/bip-0174.mediawiki
> If these changes are satisfactory, I will open a PR to the BIPs repo to update the BIP tomorrow. I will also
> create test vectors and update the implementation PR'ed to Core.
...

Looking forward to test vectors, and I might have more to say once my code can handle them (again).

Feedback on the BIP as it stands now: 

- Appendix A needs an update, and I suggest defining symbols (PK_PARTIAL_SIG == 0x02) for the numeric key values. This helps implementers as we don't all define our own symbols and/or use numeric constants in our code.

- Those tables are just not working. Might want to reformat as descriptive lists, point form, or generally anything else... sorry.

> Andrew
> _______________________________________________

---
Peter D. Gray  ||  Founder, Coinkite  ||  Twitter: @dochex  ||  GPG: A3A31BAD 5A2A5B10

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 496 bytes --]

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-23 17:00           ` William Casarin
@ 2018-06-23 20:33             ` Andrew Chow
  2018-06-24  8:19               ` Andrea
  0 siblings, 1 reply; 54+ messages in thread
From: Andrew Chow @ 2018-06-23 20:33 UTC (permalink / raw)
  To: William Casarin, Bitcoin Protocol Discussion, Peter Gray



On 06/23/2018 10:00 AM, William Casarin wrote:
> Since we're still considering the encoding, I wonder if it would be a
> good idea to have a human-readible part like lightning invoices[1]?
I don't think that is necessary.
> Then perhaps you could drop the magic code as well?
The magic is still necessary for the binary format in order to prevent
normal transaction deserializers from accidentally deserializing a psbt.
> Also we could do a base encoding that excludes + and / characters, such
> as base62 (gmp-style). It's easier to copy/paste (double clicking a
> string stops at / or + in base64 encodings).
While that would be ideal, I think it is better to use an encoding that
most wallets already support. Most wallets already have Base64 decoding
available so that they can decode signed messages which also use Base64
encoding. I think it is unnecessary to introduce another encoding.


On 06/23/2018 11:27 AM, Peter D. Gray wrote:
> Personally, I don't think you should spec an encoding. It should be binary only and hex for developers and JSON interfaces. My thinking is that PSBT's are not user-visible things. They have a short lifetime and are nothing something that is "stored" or referenced much later. Hex is good enough and has no downsides as an excoding except for density.
I think what will end up happening though is that, at least in the
beginning, PSBTs will primarily be strings that people end up copy and
pasting. Since a PSBT can get pretty large, the strings are rather
cumbersome to move around, especially as hex. At least with Base64 the
strings will be smaller.
> On the other hand, suggesting a filename extension (probably .PSBT?) and a mime-type to match, are helpful since wallets and such will want to register with their operating systems to become handlers of those mimetypes. Really that's a lot more important for interoperability at this point, than an encoding.
Agreed. I will include those in the BIP.
> Looking forward to test vectors, and I might have more to say once my code can handle them (again).
>
> Feedback on the BIP as it stands now: 
>
> - Appendix A needs an update, and I suggest defining symbols (PK_PARTIAL_SIG == 0x02) for the numeric key values. This helps implementers as we don't all define our own symbols and/or use numeric constants in our code.
Okay.
> - Those tables are just not working. Might want to reformat as descriptive lists, point form, or generally anything else... sorry.
I will try my best to fix that. Mediawiki sucks...

Andrew



^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-23 20:33             ` Andrew Chow
@ 2018-06-24  8:19               ` Andrea
  2018-06-24  8:28                 ` Andrew Chow
  0 siblings, 1 reply; 54+ messages in thread
From: Andrea @ 2018-06-24  8:19 UTC (permalink / raw)
  To: Andrew Chow, Bitcoin Protocol Discussion

Hi, 

I think in the revised spec we can remove completely the "global types" as a map or even as typed record. Since there is only one type (the transaction) and it's compulsory to have one (and only one) we could just drop the definition of global type and the key associated with it, simply after the header + separator there must be a transaction.​​ Having read all the discussion i also agree having per-input key derivation and per-output data is a lot more handy for signers, no special feeling regarding the encoding.Looking forward for the test vectors and the new spec.

Cheers, Andrea.

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐

On June 23, 2018 10:33 PM, Andrew Chow via bitcoin-dev <bitcoin-dev@lists•linuxfoundation.org> wrote:

> ​​
> 
> On 06/23/2018 10:00 AM, William Casarin wrote:
> 
> > Since we're still considering the encoding, I wonder if it would be a
> > 
> > good idea to have a human-readible part like lightning invoices[1]?
> 
> I don't think that is necessary.
> 
> > Then perhaps you could drop the magic code as well?
> 
> The magic is still necessary for the binary format in order to prevent
> 
> normal transaction deserializers from accidentally deserializing a psbt.
> 
> > Also we could do a base encoding that excludes + and / characters, such
> > 
> > as base62 (gmp-style). It's easier to copy/paste (double clicking a
> > 
> > string stops at / or + in base64 encodings).
> 
> While that would be ideal, I think it is better to use an encoding that
> 
> most wallets already support. Most wallets already have Base64 decoding
> 
> available so that they can decode signed messages which also use Base64
> 
> encoding. I think it is unnecessary to introduce another encoding.
> 
> On 06/23/2018 11:27 AM, Peter D. Gray wrote:
> 
> > Personally, I don't think you should spec an encoding. It should be binary only and hex for developers and JSON interfaces. My thinking is that PSBT's are not user-visible things. They have a short lifetime and are nothing something that is "stored" or referenced much later. Hex is good enough and has no downsides as an excoding except for density.
> 
> I think what will end up happening though is that, at least in the
> 
> beginning, PSBTs will primarily be strings that people end up copy and
> 
> pasting. Since a PSBT can get pretty large, the strings are rather
> 
> cumbersome to move around, especially as hex. At least with Base64 the
> 
> strings will be smaller.
> 
> > On the other hand, suggesting a filename extension (probably .PSBT?) and a mime-type to match, are helpful since wallets and such will want to register with their operating systems to become handlers of those mimetypes. Really that's a lot more important for interoperability at this point, than an encoding.
> 
> Agreed. I will include those in the BIP.
> 
> > Looking forward to test vectors, and I might have more to say once my code can handle them (again).
> > 
> > Feedback on the BIP as it stands now:
> > 
> > -   Appendix A needs an update, and I suggest defining symbols (PK_PARTIAL_SIG == 0x02) for the numeric key values. This helps implementers as we don't all define our own symbols and/or use numeric constants in our code.
> 
> Okay.
> 
> > -   Those tables are just not working. Might want to reformat as descriptive lists, point form, or generally anything else... sorry.
> 
> I will try my best to fix that. Mediawiki sucks...
> 
> Andrew
> 
> bitcoin-dev mailing list
> 
> bitcoin-dev@lists•linuxfoundation.org
> 
> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev




^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-24  8:19               ` Andrea
@ 2018-06-24  8:28                 ` Andrew Chow
  2018-06-24  9:00                   ` Andrea
  0 siblings, 1 reply; 54+ messages in thread
From: Andrew Chow @ 2018-06-24  8:28 UTC (permalink / raw)
  To: Andrea, Bitcoin Protocol Discussion

I disagree with the idea that global types can be removed. Firstly, it
is less of a breaking change to leave it there than to remove it
entirely. Secondly, there may be a point in the future where global
types would be useful/necessary. By having it still be there, we allow
for future extensibility.

Andrew


On 06/24/2018 01:19 AM, Andrea wrote:
> Hi, 
>
> I think in the revised spec we can remove completely the "global types" as a map or even as typed record. Since there is only one type (the transaction) and it's compulsory to have one (and only one) we could just drop the definition of global type and the key associated with it, simply after the header + separator there must be a transaction.​​ Having read all the discussion i also agree having per-input key derivation and per-output data is a lot more handy for signers, no special feeling regarding the encoding.Looking forward for the test vectors and the new spec.
>
> Cheers, Andrea.
>
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
>
> On June 23, 2018 10:33 PM, Andrew Chow via bitcoin-dev <bitcoin-dev@lists•linuxfoundation.org> wrote:
>
>> ​​
>>
>> On 06/23/2018 10:00 AM, William Casarin wrote:
>>
>>> Since we're still considering the encoding, I wonder if it would be a
>>>
>>> good idea to have a human-readible part like lightning invoices[1]?
>> I don't think that is necessary.
>>
>>> Then perhaps you could drop the magic code as well?
>> The magic is still necessary for the binary format in order to prevent
>>
>> normal transaction deserializers from accidentally deserializing a psbt.
>>
>>> Also we could do a base encoding that excludes + and / characters, such
>>>
>>> as base62 (gmp-style). It's easier to copy/paste (double clicking a
>>>
>>> string stops at / or + in base64 encodings).
>> While that would be ideal, I think it is better to use an encoding that
>>
>> most wallets already support. Most wallets already have Base64 decoding
>>
>> available so that they can decode signed messages which also use Base64
>>
>> encoding. I think it is unnecessary to introduce another encoding.
>>
>> On 06/23/2018 11:27 AM, Peter D. Gray wrote:
>>
>>> Personally, I don't think you should spec an encoding. It should be binary only and hex for developers and JSON interfaces. My thinking is that PSBT's are not user-visible things. They have a short lifetime and are nothing something that is "stored" or referenced much later. Hex is good enough and has no downsides as an excoding except for density.
>> I think what will end up happening though is that, at least in the
>>
>> beginning, PSBTs will primarily be strings that people end up copy and
>>
>> pasting. Since a PSBT can get pretty large, the strings are rather
>>
>> cumbersome to move around, especially as hex. At least with Base64 the
>>
>> strings will be smaller.
>>
>>> On the other hand, suggesting a filename extension (probably .PSBT?) and a mime-type to match, are helpful since wallets and such will want to register with their operating systems to become handlers of those mimetypes. Really that's a lot more important for interoperability at this point, than an encoding.
>> Agreed. I will include those in the BIP.
>>
>>> Looking forward to test vectors, and I might have more to say once my code can handle them (again).
>>>
>>> Feedback on the BIP as it stands now:
>>>
>>> -   Appendix A needs an update, and I suggest defining symbols (PK_PARTIAL_SIG == 0x02) for the numeric key values. This helps implementers as we don't all define our own symbols and/or use numeric constants in our code.
>> Okay.
>>
>>> -   Those tables are just not working. Might want to reformat as descriptive lists, point form, or generally anything else... sorry.
>> I will try my best to fix that. Mediawiki sucks...
>>
>> Andrew
>>
>> bitcoin-dev mailing list
>>
>> bitcoin-dev@lists•linuxfoundation.org
>>
>> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev




^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-24  8:28                 ` Andrew Chow
@ 2018-06-24  9:00                   ` Andrea
  0 siblings, 0 replies; 54+ messages in thread
From: Andrea @ 2018-06-24  9:00 UTC (permalink / raw)
  To: Andrew Chow; +Cc: Bitcoin Protocol Discussion

Keeping it for future extensions is a good point, my understanding was that since we always need exactly one transaction it could be part of the definition of PSBT instead of being a key-value (although it is more of a breaking change). 


Cheers, Andrea.

​​

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐

On June 24, 2018 10:28 AM, Andrew Chow <achow101-lists@achow101•com> wrote:

> ​​
> 
> I disagree with the idea that global types can be removed. Firstly, it
> 
> is less of a breaking change to leave it there than to remove it
> 
> entirely. Secondly, there may be a point in the future where global
> 
> types would be useful/necessary. By having it still be there, we allow
> 
> for future extensibility.
> 
> Andrew
> 
> On 06/24/2018 01:19 AM, Andrea wrote:
> 
> > Hi,
> > 
> > I think in the revised spec we can remove completely the "global types" as a map or even as typed record. Since there is only one type (the transaction) and it's compulsory to have one (and only one) we could just drop the definition of global type and the key associated with it, simply after the header + separator there must be a transaction. Having read all the discussion i also agree having per-input key derivation and per-output data is a lot more handy for signers, no special feeling regarding the encoding.Looking forward for the test vectors and the new spec.
> > 
> > Cheers, Andrea.
> > 
> > ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> > 
> > On June 23, 2018 10:33 PM, Andrew Chow via bitcoin-dev bitcoin-dev@lists•linuxfoundation.org wrote:
> > 
> > > On 06/23/2018 10:00 AM, William Casarin wrote:
> > > 
> > > > Since we're still considering the encoding, I wonder if it would be a
> > > > 
> > > > good idea to have a human-readible part like lightning invoices[1]?
> > > > 
> > > > I don't think that is necessary.
> > > 
> > > > Then perhaps you could drop the magic code as well?
> > > > 
> > > > The magic is still necessary for the binary format in order to prevent
> > > 
> > > normal transaction deserializers from accidentally deserializing a psbt.
> > > 
> > > > Also we could do a base encoding that excludes + and / characters, such
> > > > 
> > > > as base62 (gmp-style). It's easier to copy/paste (double clicking a
> > > > 
> > > > string stops at / or + in base64 encodings).
> > > > 
> > > > While that would be ideal, I think it is better to use an encoding that
> > > 
> > > most wallets already support. Most wallets already have Base64 decoding
> > > 
> > > available so that they can decode signed messages which also use Base64
> > > 
> > > encoding. I think it is unnecessary to introduce another encoding.
> > > 
> > > On 06/23/2018 11:27 AM, Peter D. Gray wrote:
> > > 
> > > > Personally, I don't think you should spec an encoding. It should be binary only and hex for developers and JSON interfaces. My thinking is that PSBT's are not user-visible things. They have a short lifetime and are nothing something that is "stored" or referenced much later. Hex is good enough and has no downsides as an excoding except for density.
> > > > 
> > > > I think what will end up happening though is that, at least in the
> > > 
> > > beginning, PSBTs will primarily be strings that people end up copy and
> > > 
> > > pasting. Since a PSBT can get pretty large, the strings are rather
> > > 
> > > cumbersome to move around, especially as hex. At least with Base64 the
> > > 
> > > strings will be smaller.
> > > 
> > > > On the other hand, suggesting a filename extension (probably .PSBT?) and a mime-type to match, are helpful since wallets and such will want to register with their operating systems to become handlers of those mimetypes. Really that's a lot more important for interoperability at this point, than an encoding.
> > > > 
> > > > Agreed. I will include those in the BIP.
> > > 
> > > > Looking forward to test vectors, and I might have more to say once my code can handle them (again).
> > > > 
> > > > Feedback on the BIP as it stands now:
> > > > 
> > > > -   Appendix A needs an update, and I suggest defining symbols (PK_PARTIAL_SIG == 0x02) for the numeric key values. This helps implementers as we don't all define our own symbols and/or use numeric constants in our code.
> > > >     
> > > >     Okay.
> > > >     
> > > 
> > > > -   Those tables are just not working. Might want to reformat as descriptive lists, point form, or generally anything else... sorry.
> > > >     
> > > >     I will try my best to fix that. Mediawiki sucks...
> > > >     
> > > 
> > > Andrew
> > > 
> > > bitcoin-dev mailing list
> > > 
> > > bitcoin-dev@lists•linuxfoundation.org
> > > 
> > > https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev




^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-22 22:28         ` Achow101
  2018-06-23 17:00           ` William Casarin
  2018-06-23 18:27           ` Peter D. Gray
@ 2018-06-25 19:47           ` Tomas Susanka
  2018-06-25 20:10             ` Jonas Schnelli
  2018-06-25 20:30             ` Achow101
  2 siblings, 2 replies; 54+ messages in thread
From: Tomas Susanka @ 2018-06-25 19:47 UTC (permalink / raw)
  To: Achow101 via bitcoin-dev

[-- Attachment #1: Type: text/plain, Size: 1982 bytes --]

Hi,

this is great.

On 23.6.2018 00:28, Achow101 via bitcoin-dev wrote:

> Hi all,
>
> After reading the comments here about BIP 174, I would like to propose the following changes:

From my perspective those are exactly the points I have felt strongly
about. I still think "typed records" would be a better choice, but it's
something I'm willing to compromise on. As I'm looking at the draft, we
currently have 13 records and only 3 of them have keys... Matejcik was a
bit keener on this, so we'll try to discuss this more during the week
and we also look at the draft more carefully to see if we can come up
with some nit-picks.

> - Encoding
>
> I have decided that PSBTs should either be in binary or encoded as a Base64 string. For the latter, several
> Bitcoin clients already support Base64 encoding of data (for signed messages) so this will not add any extra
> dependencies like Z85 would.

I agree. If we're arguing for not using protobuf, because it is a
dependency, we shouldn't add dependency for some lesser-known encoding
format.

As was partially brought up by William, shouldn't we consider using
bech32? It doesn't break on double-click and it is a dependency for
native Segwit addresses anyway, so wallets might already support it or
they will at some point. But we should probably run some numbers on this
first, since bech32 will obviously be larger than base64.


On 24.6.2018 10:28, Andrew Chow via bitcoin-dev wrote:

> I disagree with the idea that global types can be removed. Firstly, it
> is less of a breaking change to leave it there than to remove it
> entirely. Secondly, there may be a point in the future where global
> types would be useful/necessary. By having it still be there, we allow
> for future extensibility.

I agree. It doesn't hurt if the global section stays and it is more
forward-looking.


Best,
Tomas

PS: This email didn't get through at first, so I hope this isn't a repost.


[-- Attachment #2: Type: text/html, Size: 2601 bytes --]

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-25 19:47           ` Tomas Susanka
@ 2018-06-25 20:10             ` Jonas Schnelli
  2018-06-25 20:30             ` Achow101
  1 sibling, 0 replies; 54+ messages in thread
From: Jonas Schnelli @ 2018-06-25 20:10 UTC (permalink / raw)
  To: Tomas Susanka, Bitcoin Protocol Discussion


[-- Attachment #1.1: Type: text/plain, Size: 609 bytes --]

Hi
> As was partially brought up by William, shouldn't we consider using
> bech32? It doesn't break on double-click and it is a dependency for
> native Segwit addresses anyway, so wallets might already support it or
> they will at some point. But we should probably run some numbers on this
> first, since bech32 will obviously be larger than base64.
I don’t think bech32 is a fit here.
Bech32 is a BCH where the error detecting properties are optimised for 1023 chars max and in the special case of the Bech32 BCH, error detection of 4 chars are guaranteed with a max length of 90 chars.

/jonas

[-- Attachment #1.2: Type: text/html, Size: 1156 bytes --]

[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-25 19:47           ` Tomas Susanka
  2018-06-25 20:10             ` Jonas Schnelli
@ 2018-06-25 20:30             ` Achow101
  2018-06-26 15:33               ` matejcik
  1 sibling, 1 reply; 54+ messages in thread
From: Achow101 @ 2018-06-25 20:30 UTC (permalink / raw)
  To: Tomas Susanka, Bitcoin Protocol Discussion

[-- Attachment #1: Type: text/plain, Size: 1781 bytes --]

Hi,

On June 25, 2018 12:47 PM, Tomas Susanka via bitcoin-dev <bitcoin-dev@lists•linuxfoundation.org> wrote:

> From my perspective those are exactly the points I have felt strongly
> about. I still think "typed records" would be a better choice, but it's
> something I'm willing to compromise on. As I'm looking at the draft, we
> currently have 13 records and only 3 of them have keys... Matejcik was a
> bit keener on this, so we'll try to discuss this more during the week
> and we also look at the draft more carefully to see if we can come up
> with some nit-picks.

So there are a few reasons for not using typed records. Firstly, it is less of a breaking change to retain the key-value map model.

Secondly, it is easier to enforce uniqueness for certain things. For example, in each input, we only want to have one redeemScript and one witnessScript. With a typed records set, we would have to say that only on record of each type is allowed, which means that combiners need to understand types and be able to partially parse the records. However with a key-value model, we can more generically say that every key-value pair must have a unique key which means that combiners do not need to know anything about types and just needs to enforce key uniqueness. Since the type is the only thing in the key for redeemScripts and witnessScripts, this uniqueness automatically applies to this, as well as for other key-value pairs.

Lastly, the typed records model does not save a lot of space in a transaction. Each record has at most one extra byte in the key-value model, with records that must also have keys having no space savings. The data inside each key-value pair far exceeds one byte, so on additional byte per key-value pair isn't all that big of a deal, IMO.

Andrew

[-- Attachment #2: Type: text/html, Size: 2393 bytes --]

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-25 20:30             ` Achow101
@ 2018-06-26 15:33               ` matejcik
  2018-06-26 16:58                 ` William Casarin
                                   ` (2 more replies)
  0 siblings, 3 replies; 54+ messages in thread
From: matejcik @ 2018-06-26 15:33 UTC (permalink / raw)
  To: bitcoin-dev


[-- Attachment #1.1: Type: text/plain, Size: 5261 bytes --]

hello,

in general, I agree with my colleague Tomas, the proposed changes are
good and achieve the most important things that we wanted. We'll review
the proposal in more detail later.

For now a couple minor things I have run into:

- valid test vector 2 ("one P2PKH input and one P2SH-P2WPKH input")
seems broken, at least its hex version; a delimiter seems to be missing,
misplaced or corrupted
- at least the first signing vector is not updated, but you probably
know that
- BIP32 derivation fields don't specify size of the "master public key",
which would make it un-parsable :)
- "Transaction format is specified as follows" and its table need to be
updated.


I'm still going to argue against the key-value model though.

It's true that this is not significant in terms of space. But I'm more
concerned about human readability, i.e., confusing future implementers.
At this point, the key-value model is there "for historical reasons",
except these aren't valid even before finalizing the format. The
original rationale for using key-values seems to be gone (no key-based
lookups are necessary). As for combining and deduplication, whether key
data is present or not is now purely a stand-in for a "repeatable" flag.
We could just as easily say, e.g., that the high bit of "type" specifies
whether this record can be repeated.

(Moreover, as I wrote previously, the Combiner seems like a weirdly
placed role. I still don't see its significance and why is it important
to correctly combine PSBTs by agents that don't understand them. If you
have a usecase in mind, please explain.
ISTM a Combiner could just as well combine based on whole-record
uniqueness, and leave the duplicate detection to the Finalizer. In case
the incoming PSBTs have incompatible unique fields, the Combiner would
have to fail anyway, so the Finalizer might as well do it. Perhaps it
would be good to leave out the Combiner role entirely?)

There's two remaining types where key data is used: BIP32 derivations
and partial signatures. In case of BIP32 derivation, the key data is
redundant ( pubkey = derive(value) ), so I'd argue we should leave that
out and save space. In case of partial signatures, it's simple enough to
make the pubkey part of the value.

Re breaking change, we are proposing breaking changes anyway, existing
code *will* need to be touched, and given that this is a hand-parsed
format, changing `parse_keyvalue` to `parse_record` seems like a small
matter?

---

At this point I'm obliged to again argue for using protobuf.

Thing is: BIP174 *is basically protobuf* (v2) as it stands. If I'm
succesful in convincing you to switch to a record set model, it's going
to be "protobuf with different varint".

I mean this very seriously: (the relevant subset of) protobuf is a set
of records in the following format:
<record type><varint field length><field data>
Record types can repeat, the schema specifies whether a field is
repeatable or not - if it's not, the last parsed value is used.

BIP174 is a ad-hoc format, simple to parse by hand; but that results in
_having to_ parse it by hand. In contrast, protobuf has a huge
collection of implementations that will do the job of sorting record
types into relevant struct fields, proper delimiting of records, etc.
...while at the same time, implementing "protobuf-based-BIP174" by hand
is roughly equally difficult as implementing the current BIP174.

N.B., it's possible to write a parser for protobuf-BIP174 without
needing a general protobuf library. Protobuf formats are designed with
forwards- and backwards- compatibility in mind, so having a hand-written
implementation should not lead to incompatibilities.

I did an experiment with this and other variants of the BIP174 format.
You can see them here:
[1] https://github.com/matejcik/bip174-playground
see in particular:
[2] https://github.com/matejcik/bip174-playground/blob/master/bip174.proto

The tool at [1] does size comparisons. On the test vectors, protobuf is
slightly smaller than key-value, and roughly equal to record-set, losing
out a little when BIP32 fields are used.
(I'm also leaving out key-data for BIP32 fields.)

There's some technical points to consider about protobuf, too:

- I decided to structure the message as a single "PSBT" type, where
"InputType" and "OutputType" are repeatable embedded fields. This seemed
better in terms of extensibility - we could add more sections in the
future. But the drawback is that you need to know the size of
Input/OutputType record in advance.
The other option is sending the messages separated by NUL bytes, same as
now, in which case you don't need to specify size.

- in InputType, i'm using "uint32" for sighash. This type is not
length-delimited, so non-protobuf consumers would have to understand it
specially. We could also declare that all fields must be
length-delimited[1] and implement sighash as a separate message type
with one field.

- non-protobuf consumers will also need to understand both protobuf
varint and bitcoin compact uint, which is a little ugly

best regards
matejcik


[1] https://developers.google.com/protocol-buffers/docs/encoding#structure


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-26 15:33               ` matejcik
@ 2018-06-26 16:58                 ` William Casarin
  2018-06-26 17:11                   ` Marek Palatinus
  2018-06-27 14:11                   ` matejcik
  2018-06-26 20:30                 ` Pieter Wuille
  2018-06-26 21:56                 ` [bitcoin-dev] BIP 174 thoughts Achow101
  2 siblings, 2 replies; 54+ messages in thread
From: William Casarin @ 2018-06-26 16:58 UTC (permalink / raw)
  To: matejcik; +Cc: Bitcoin Protocol Discussion

matejcik via bitcoin-dev <bitcoin-dev@lists•linuxfoundation.org> writes:

> BIP174 is a ad-hoc format, simple to parse by hand; but that results
> in _having to_ parse it by hand. In contrast, protobuf has a huge
> collection of implementations that will do the job of sorting record
> types into relevant struct fields, proper delimiting of records, etc.

seems a bit overkill for how simple the format is, and pulling in a
large dependency just for this is a bit silly. Although making it
protobuf-compatible is an interesting idea, but I fear would be more
work than is worth? I haven't looked closed enough at the protobuf
encoding to be sure.

> ...while at the same time, implementing "protobuf-based-BIP174" by
> hand is roughly equally difficult as implementing the current BIP174.

as a data point, I was able to build a simple serializer[1] in about an
afternoon. I would much prefer to use this lib in say, clightning (my
original goal), without having to have the larger protobuf dependency.

Cheers,

[1] https://github.com/jb55/libpsbt


--
https://jb55.com



^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-26 16:58                 ` William Casarin
@ 2018-06-26 17:11                   ` Marek Palatinus
  2018-06-27 14:11                   ` matejcik
  1 sibling, 0 replies; 54+ messages in thread
From: Marek Palatinus @ 2018-06-26 17:11 UTC (permalink / raw)
  To: William Casarin, Bitcoin Protocol Discussion

[-- Attachment #1: Type: text/plain, Size: 1388 bytes --]

On Tue, Jun 26, 2018 at 6:58 PM, William Casarin via bitcoin-dev <
bitcoin-dev@lists•linuxfoundation.org> wrote:
>
> seems a bit overkill for how simple the format is, and pulling in a
> large dependency just for this is a bit silly. Although making it
> protobuf-compatible is an interesting idea, but I fear would be more
> work than is worth? I haven't looked closed enough at the protobuf
> encoding to be sure.
>
> > ...while at the same time, implementing "protobuf-based-BIP174" by
> > hand is roughly equally difficult as implementing the current BIP174.
>
> as a data point, I was able to build a simple serializer[1] in about an
> afternoon. I would much prefer to use this lib in say, clightning (my
> original goal), without having to have the larger protobuf dependency.
>
>
That was exactly matejcik's point; you can easily write protobuf-compatible
encoder/decoder for such simple structure in about an afternoon, if you
need. Or you can use existing protobuf parsers in matter of minute, if you
don't care about dependencies.

Also, many projects already have protobuf parsers, so it work in other way,
too; you need BIP174 parser as extra dependency/library, although you
already use protobuf library (like Trezor device does). For needs of
BIP174, the difference between ad-hoc format and protobuf is neglible, so
it is a mistake to introduce yet another format.

slush

[-- Attachment #2: Type: text/html, Size: 1857 bytes --]

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-26 15:33               ` matejcik
  2018-06-26 16:58                 ` William Casarin
@ 2018-06-26 20:30                 ` Pieter Wuille
  2018-06-27 14:04                   ` matejcik
  2018-06-26 21:56                 ` [bitcoin-dev] BIP 174 thoughts Achow101
  2 siblings, 1 reply; 54+ messages in thread
From: Pieter Wuille @ 2018-06-26 20:30 UTC (permalink / raw)
  To: matejcik, Bitcoin Protocol Discussion

On Tue, Jun 26, 2018 at 8:33 AM, matejcik via bitcoin-dev
<bitcoin-dev@lists•linuxfoundation.org> wrote:
> I'm still going to argue against the key-value model though.
>
> It's true that this is not significant in terms of space. But I'm more
> concerned about human readability, i.e., confusing future implementers.
> At this point, the key-value model is there "for historical reasons",
> except these aren't valid even before finalizing the format. The
> original rationale for using key-values seems to be gone (no key-based
> lookups are necessary). As for combining and deduplication, whether key
> data is present or not is now purely a stand-in for a "repeatable" flag.
> We could just as easily say, e.g., that the high bit of "type" specifies
> whether this record can be repeated.

I understand this is a philosophical point, but to me it's the
opposite. The file conveys "the script is X", "the signature for key X
is Y", "the derivation for key X is Y" - all extra metadata added to
inputs of the form "the X is Y". In a typed record model, you still
have Xes, but they are restricted to a single number (the record
type). In cases where that is insufficient, your solution is adding a
repeatable flag to switch from "the first byte needs to be unique" to
"the entire record needs to be unique". Why just those two? It seems
much more natural to have a length that directly tells you how many of
the first bytes need to be unique (which brings you back to the
key-value model).

Since the redundant script hashes were removed by making the scripts
per-input, I think the most compelling reason (size advantages) for a
record based model is gone.

> (Moreover, as I wrote previously, the Combiner seems like a weirdly
> placed role. I still don't see its significance and why is it important
> to correctly combine PSBTs by agents that don't understand them. If you
> have a usecase in mind, please explain.

Forward compatibility with new script types. A transaction may spend
inputs from different outputs, with different script types. Perhaps
some of these are highly specialized things only implemented by some
software (say HTLCs of a particular structure), in non-overlapping
ways where no piece of software can handle all scripts involved in a
single transaction. If Combiners cannot deal with unknown fields, they
won't be able to deal with unknown scripts. That would mean that
combining must be done independently by Combiner implementations for
each script type involved. As this is easily avoided by adding a
slight bit of structure (parts of the fields that need to be unique -
"keys"), this seems the preferable option.

> ISTM a Combiner could just as well combine based on whole-record
> uniqueness, and leave the duplicate detection to the Finalizer. In case
> the incoming PSBTs have incompatible unique fields, the Combiner would
> have to fail anyway, so the Finalizer might as well do it. Perhaps it
> would be good to leave out the Combiner role entirely?)

No, a Combiner can pick any of the values in case different PSBTs have
different values for the same key. That's the point: by having a
key-value structure the choice of fields can be made such that
Combiners don't need to care about the contents. Finalizers do need to
understand the contents, but they only operate once at the end.
Combiners may be involved in any PSBT passing from one entity to
another.

> There's two remaining types where key data is used: BIP32 derivations
> and partial signatures. In case of BIP32 derivation, the key data is
> redundant ( pubkey = derive(value) ), so I'd argue we should leave that
> out and save space. In case of partial signatures, it's simple enough to
> make the pubkey part of the value.

In case of BIP32 derivation, computing the pubkeys is possibly
expensive. A simple signer can choose to just sign with whatever keys
are present, but they're not the only way to implement a signer, and
even less the only software interacting with this format. Others may
want to use a matching approach to find keys that are relevant;
without pubkeys in the format, they're forced to perform derivations
for all keys present.

And yes, it's simple enough to make the key part of the value
everywhere, but in that case it becomes legal for a PSBT to contain
multiple signatures for a key, for example, and all software needs to
deal with that possibility. With a stronger uniqueness constraint,
only Combiners need to deal with repetitions.

> Thing is: BIP174 *is basically protobuf* (v2) as it stands. If I'm
> succesful in convincing you to switch to a record set model, it's going
> to be "protobuf with different varint".

If you take the records model, and then additionally drop the
whole-record uniqueness constraint, yes, though that seems pushing it
a bit by moving even more guarantees from the file format to
application level code. I'd like to hear opinions of other people who
have worked on implementations about changing this.

Cheers,

-- 
Pieter


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-26 15:33               ` matejcik
  2018-06-26 16:58                 ` William Casarin
  2018-06-26 20:30                 ` Pieter Wuille
@ 2018-06-26 21:56                 ` Achow101
  2018-06-27  6:09                   ` William Casarin
  2 siblings, 1 reply; 54+ messages in thread
From: Achow101 @ 2018-06-26 21:56 UTC (permalink / raw)
  To: matejcik, Bitcoin Protocol Discussion

Hi,

On June 26, 2018 8:33 AM, matejcik via bitcoin-dev <bitcoin-dev@lists•linuxfoundation.org> wrote:

> ​​
> 
> hello,
> 
> in general, I agree with my colleague Tomas, the proposed changes are
> 
> good and achieve the most important things that we wanted. We'll review
> 
> the proposal in more detail later.
> 
> For now a couple minor things I have run into:
> 
> -   valid test vector 2 ("one P2PKH input and one P2SH-P2WPKH input")
>     
>     seems broken, at least its hex version; a delimiter seems to be missing,
>     
>     misplaced or corrupted

Fixed

>     
> -   at least the first signing vector is not updated, but you probably
>     
>     know that

I updated all of the tests yesterday so they should be correct now. I will be adding more tests
this week.

>     
> -   BIP32 derivation fields don't specify size of the "master public key",
>     
>     which would make it un-parsable :)

Oops, that's actually supposed to be master key fingerprint, not master public key. I have updated
the BIP to reflect this.

>     
> -   "Transaction format is specified as follows" and its table need to be
>     
>     updated.

Done.

>     
>     I'm still going to argue against the key-value model though.
>     
>     It's true that this is not significant in terms of space. But I'm more
>     
>     concerned about human readability, i.e., confusing future implementers.
>     
>     At this point, the key-value model is there "for historical reasons",
>     
>     except these aren't valid even before finalizing the format. The
>     
>     original rationale for using key-values seems to be gone (no key-based
>     
>     lookups are necessary). As for combining and deduplication, whether key
>     
>     data is present or not is now purely a stand-in for a "repeatable" flag.
>     
>     We could just as easily say, e.g., that the high bit of "type" specifies
>     
>     whether this record can be repeated.
>     
>     (Moreover, as I wrote previously, the Combiner seems like a weirdly
>     
>     placed role. I still don't see its significance and why is it important
>     
>     to correctly combine PSBTs by agents that don't understand them. If you
>     
>     have a usecase in mind, please explain.

There is a diagram in the BIP that explains this. The combiner's job is to combine two PSBTs that
are for the same transaction but contain different data such as signatures. It is easier to implement
a combiner that does not need to understand the types at all, and such combiners are forwards compatible,
so new types do not require new combiner implementations.

>     
>     ISTM a Combiner could just as well combine based on whole-record
>     
>     uniqueness, and leave the duplicate detection to the Finalizer. In case
>     
>     the incoming PSBTs have incompatible unique fields, the Combiner would
>     
>     have to fail anyway, so the Finalizer might as well do it. Perhaps it
>     
>     would be good to leave out the Combiner role entirely?)

A transaction that contains duplicate keys would be completely invalid. Furthermore, in the set of typed
records model, having more than one redeemScript and witnessScript should be invalid, so a combiner
would still need to understand what types are in order to avoid this situation. Otherwise it would produce
an invalid PSBT.

I also dislike the idea of having type specific things like "only one redeemScript" where a more generic
thing would work.

>     
>     There's two remaining types where key data is used: BIP32 derivations
>     
>     and partial signatures. In case of BIP32 derivation, the key data is
>     
>     redundant ( pubkey = derive(value) ), so I'd argue we should leave that
>     
>     out and save space. 

I think it is still necessary to include the pubkey as not all signers who can sign for a given pubkey may
know the derivation path. For example, if a privkey is imported into a wallet, that wallet does not necessarily
know the master key fingerprint for the privkey nor would it necessarily have the master key itself in
order to derive the privkey. But it still has the privkey and can still sign for it.

>     
>     Re breaking change, we are proposing breaking changes anyway, existing
>     
>     code will need to be touched, and given that this is a hand-parsed
>     
>     format, changing `parse_keyvalue` to `parse_record` seems like a small
>     
>     matter?

The point is to not make it difficult for existing implementations to change. Mostly what has been done now is just
moving things around, not an entire format change itself. Changing to a set of typed records model require more
changes and is a complete restructuring of the format, not just moving things around.

Additionally, I think that the current model is fairly easy to hand parse. I don't think a record set model would make
it easier to follow. Furthermore, moving to Protobuf would make it harder to hand parse as varints are slightly more
confusing in protobuf than with Compact size uints. And with the key-value model, you don't need to know the types
to know whether something is valid. You don't need to interpret any data.



Andrew




^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-26 21:56                 ` [bitcoin-dev] BIP 174 thoughts Achow101
@ 2018-06-27  6:09                   ` William Casarin
  2018-06-27 13:39                     ` Andrea
  2018-06-27 17:55                     ` Achow101
  0 siblings, 2 replies; 54+ messages in thread
From: William Casarin @ 2018-06-27  6:09 UTC (permalink / raw)
  To: Achow101, Bitcoin Protocol Discussion


Hey Andrew,

If I'm reading the spec right: the way it is designed right now, you
could create hundreds of thousands of zero bytes in the input or output
key-value arrays. As far as I can tell this would be considered valid,
as it is simply a large array of empty dictionaries. Is this right? I'm
worried about buffer overflows in cases where someone sends a large blob
of zeros to an unsuspecting implementation.


Also, the extensibility section reads:

> Additional key-value maps with different types for the key-value pairs
> can be added on to the end of the format.

"different types for the key-value pairs", is this referring to new
types beyond the current global, input and output types?

> The number of each map that follows must be specified in the globals
> section

Is this out of date? Since there is only one type in the global section
now (tx).

> so that parsers will know when to use different definitions of the
> data types

I'm not sure what this means.


Thanks!

Will


--
https://jb55.com


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-27  6:09                   ` William Casarin
@ 2018-06-27 13:39                     ` Andrea
  2018-06-27 17:55                     ` Achow101
  1 sibling, 0 replies; 54+ messages in thread
From: Andrea @ 2018-06-27 13:39 UTC (permalink / raw)
  To: William Casarin, Bitcoin Protocol Discussion

Hi William, Andrew, list,

As noted by William there are some types missing in the global-types definition, because the number of each map for I/O must be known to the parser in order to use the correct definitions for the types. At the moment a parser reading a key-value record does not know whether it should read it as per-input type or per-output, a way to address this is to declare in advance the number of maps and ensure they are ordered (inputs first). If you haven't already worked out some types for that i propose using:

Number of inputs
- key (None, only the type): PSBT_GLOBAL_INPUT_NUMBER = 0x01  
- value: Varint 

Number of outputs
- key (none, only the type): PSBT_GLOBAL_OUTPUT_NUMBER = 0x02
- value: Varint

On another note I think we can set a hard limit on the size of the PSBT, currently is 'legal' to produce a very large PSBT with an excessive number of Inputs and Outputs. By excessive I mean that even in the best case scenario (using the smallest possible scriptPubKey as in P2WPKH) it is possible to create a PSBT that would certainly create an invalid transaction (because of its size) when finalized. I haven't found anything related to this in the previous discussions, please ignore this if it was already proposed/discussed.


Cheers, Andrea.

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐

On June 27, 2018 8:09 AM, William Casarin via bitcoin-dev <bitcoin-dev@lists•linuxfoundation.org> wrote:

> ​​
> 
> Hey Andrew,
> 
> If I'm reading the spec right: the way it is designed right now, you
> 
> could create hundreds of thousands of zero bytes in the input or output
> 
> key-value arrays. As far as I can tell this would be considered valid,
> 
> as it is simply a large array of empty dictionaries. Is this right? I'm
> 
> worried about buffer overflows in cases where someone sends a large blob
> 
> of zeros to an unsuspecting implementation.
> 
> Also, the extensibility section reads:
> 
> > Additional key-value maps with different types for the key-value pairs
> > 
> > can be added on to the end of the format.
> 
> "different types for the key-value pairs", is this referring to new
> 
> types beyond the current global, input and output types?
> 
> > The number of each map that follows must be specified in the globals
> > 
> > section
> 
> Is this out of date? Since there is only one type in the global section
> 
> now (tx).
> 
> > so that parsers will know when to use different definitions of the
> > 
> > data types
> 
> I'm not sure what this means.
> 
> Thanks!
> 
> Will
> 
> 
> ------------------------------------------------
> 
> https://jb55.com
> 
> bitcoin-dev mailing list
> 
> bitcoin-dev@lists•linuxfoundation.org
> 
> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev




^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-26 20:30                 ` Pieter Wuille
@ 2018-06-27 14:04                   ` matejcik
  2018-06-27 15:06                     ` Pieter Wuille
  0 siblings, 1 reply; 54+ messages in thread
From: matejcik @ 2018-06-27 14:04 UTC (permalink / raw)
  To: Pieter Wuille, Bitcoin Protocol Discussion


[-- Attachment #1.1: Type: text/plain, Size: 4814 bytes --]

hello,

On 26.6.2018 22:30, Pieter Wuille wrote:
>> (Moreover, as I wrote previously, the Combiner seems like a weirdly
>> placed role. I still don't see its significance and why is it important
>> to correctly combine PSBTs by agents that don't understand them. If you
>> have a usecase in mind, please explain.
> 
> Forward compatibility with new script types. A transaction may spend
> inputs from different outputs, with different script types. Perhaps
> some of these are highly specialized things only implemented by some
> software (say HTLCs of a particular structure), in non-overlapping
> ways where no piece of software can handle all scripts involved in a
> single transaction. If Combiners cannot deal with unknown fields, they
> won't be able to deal with unknown scripts.

Record-based Combiners *can* deal with unknown fields. Either by
including both versions, or by including one selected at random. This is
the same in k-v model.

> combining must be done independently by Combiner implementations for
> each script type involved. As this is easily avoided by adding a
> slight bit of structure (parts of the fields that need to be unique -
> "keys"), this seems the preferable option.

IIUC, you're proposing a "semi-smart Combiner" that understands and
processes some fields but not others? That doesn't seem to change
things. Either the "dumb" combiner throws data away before the "smart"
one sees it, or it needs to include all of it anyway.

> No, a Combiner can pick any of the values in case different PSBTs have
> different values for the same key. That's the point: by having a
> key-value structure the choice of fields can be made such that
> Combiners don't need to care about the contents. Finalizers do need to
> understand the contents, but they only operate once at the end.
> Combiners may be involved in any PSBT passing from one entity to
> another.

Yes. Combiners don't need to care about the contents.
So why is it important that a Combiner properly de-duplicates the case
where keys are the same but values are different? This is a job that,
AFAICT so far, can be safely left to someone along the chain who
understands that particular record.

Say we have field F(key,value), and several Signers produce F(1,1),
F(1,2), F(1,3).

A key-based Combiner will pick exactly one to pass along. A record-based
Combiner will pass all three.

It seems that you consider the latter PSBT "invalid". But it is well
formed and doesn't contain duplicate records. A Finalizer, or a
different Combiner that understands field F, can as well have the rule
"throw away all but one" for this case.

To repeat and restate my central question:
Why is it important, that an agent which doesn't understand a particular
field structure, can nevertheless make decisions about its inclusion or
omission from the result (based on a repeated prefix)?

Actually, I can imagine the opposite: having fields with same "key"
(identifying data), and wanting to combine their "values" intelligently
without losing any of the data. Say, two Signers producing separate
parts of a combined-signature under the same common public key?

> In case of BIP32 derivation, computing the pubkeys is possibly
> expensive. A simple signer can choose to just sign with whatever keys
> are present, but they're not the only way to implement a signer, and
> even less the only software interacting with this format. Others may
> want to use a matching approach to find keys that are relevant;
> without pubkeys in the format, they're forced to perform derivations
> for all keys present.

I'm going to search for relevant keys by comparing master fingerprint; I
would expect HWWs generally don't have index based on leaf pubkeys.
OTOH, Signers with lots of keys probably aren't resource-constrained and
can do the derivations in case of collisions.

Also, you need to do the derivation and checking anyway, because what if
there is a mismatch between the key and the value?

I liked @achow101's idea about supporting non-derived keys, but I
assumed that you would match them based on the master fingerprint too?

I wouldn't be against including the full master public key (probably
without chaincode) instead of the fingerprint, as you proposed earlier.
But including both the leaf pubkey and the fingerprint seems weird.

> If you take the records model, and then additionally drop the
> whole-record uniqueness constraint, yes, though that seems pushing it
> a bit by moving even more guarantees from the file format to
> application level code.

The "file format" makes no guarantees, because the parsing code and
application code is the same anyway. You could say I'm proposing to
separate these concerns ;)

regards
m.


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-26 16:58                 ` William Casarin
  2018-06-26 17:11                   ` Marek Palatinus
@ 2018-06-27 14:11                   ` matejcik
  1 sibling, 0 replies; 54+ messages in thread
From: matejcik @ 2018-06-27 14:11 UTC (permalink / raw)
  To: William Casarin; +Cc: Bitcoin Protocol Discussion


[-- Attachment #1.1: Type: text/plain, Size: 837 bytes --]

hello,

On 26.6.2018 18:58, William Casarin wrote:
> as a data point, I was able to build a simple serializer[1] in about an
> afternoon. I would much prefer to use this lib in say, clightning (my
> original goal), without having to have the larger protobuf dependency.

To drive my point home, here's a PR converting the `writer` part of your
code to a protobuf-compatible version. It took me less than an hour to
write, the bigger part of which was spent orienting myself in unfamiliar
code. I assume I could do `reader` in less than that, if your
deserialization code was complete.

https://github.com/jb55/libpsbt/pull/3/files

This code produces PSBTs that my "bip174 playground" can correctly parse.

regards
m.

> 
> Cheers,
> 
> [1] https://github.com/jb55/libpsbt
> 
> 
> --
> https://jb55.com
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-27 14:04                   ` matejcik
@ 2018-06-27 15:06                     ` Pieter Wuille
  2018-06-29  9:53                       ` matejcik
  0 siblings, 1 reply; 54+ messages in thread
From: Pieter Wuille @ 2018-06-27 15:06 UTC (permalink / raw)
  To: matejcik; +Cc: Bitcoin Dev

[-- Attachment #1: Type: text/plain, Size: 7375 bytes --]

On Wed, Jun 27, 2018, 07:04 matejcik <jan.matejek@satoshilabs•com> wrote:

> hello,
>
> On 26.6.2018 22:30, Pieter Wuille wrote:
> >> (Moreover, as I wrote previously, the Combiner seems like a weirdly
> >> placed role. I still don't see its significance and why is it important
> >> to correctly combine PSBTs by agents that don't understand them. If you
> >> have a usecase in mind, please explain.
> >
> > Forward compatibility with new script types. A transaction may spend
> > inputs from different outputs, with different script types. Perhaps
> > some of these are highly specialized things only implemented by some
> > software (say HTLCs of a particular structure), in non-overlapping
> > ways where no piece of software can handle all scripts involved in a
> > single transaction. If Combiners cannot deal with unknown fields, they
> > won't be able to deal with unknown scripts.
>
> Record-based Combiners *can* deal with unknown fields. Either by
> including both versions, or by including one selected at random. This is
> the same in k-v model.
>

Yes, I wasn't claiming otherwise. This was just a response to your question
why it is important that Combiners can process unknown fields. It is not an
argument in favor of one model or the other.

> combining must be done independently by Combiner implementations for
> > each script type involved. As this is easily avoided by adding a
> > slight bit of structure (parts of the fields that need to be unique -
> > "keys"), this seems the preferable option.
>
> IIUC, you're proposing a "semi-smart Combiner" that understands and
> processes some fields but not others? That doesn't seem to change
> things. Either the "dumb" combiner throws data away before the "smart"
> one sees it, or it needs to include all of it anyway.
>

No, I'm exactly arguing against smartness in the Combiner. It should always
be possible to implement a Combiner without any script specific logic.

> No, a Combiner can pick any of the values in case different PSBTs have
> > different values for the same key. That's the point: by having a
> > key-value structure the choice of fields can be made such that
> > Combiners don't need to care about the contents. Finalizers do need to
> > understand the contents, but they only operate once at the end.
> > Combiners may be involved in any PSBT passing from one entity to
> > another.
>
> Yes. Combiners don't need to care about the contents.
> So why is it important that a Combiner properly de-duplicates the case
> where keys are the same but values are different? This is a job that,
> AFAICT so far, can be safely left to someone along the chain who
> understands that particular record.
>

That's because PSBTs can be copied, signed, and combined back together. A
Combiner which does not deduplicate (at all) would end up having every
original record present N times, one for each copy, a possibly large blowup.

For all fields I can think of right now, that type of deduplication can be
done through whole-record uniqueness.

The question whether you need whole-record uniqueness or specified-length
uniqueness (=what is offered by a key-value model) is a philosophical one
(as I mentioned before). I have a preference for stronger invariants on the
file format, so that it becomes illegal for a PSBT to contain multiple
signatures for the same key for example, and implementations do not need to
deal with the case where multiple are present.

It seems that you consider the latter PSBT "invalid". But it is well
> formed and doesn't contain duplicate records. A Finalizer, or a
> different Combiner that understands field F, can as well have the rule
> "throw away all but one" for this case.
>

It's not about considering. We're writing a specification. Either it is
made invalid, or not.

In a key-value model you can have dumb combiners that must pick one of the
keys in case of duplication, and remove the necessity of dealing with
duplication from all other implementations (which I consider to be a good
thing). In a record-based model you cannot guarantee deduplication of
records that permit repetition per type, because a dumb combiner cannot
understand what part is supposed to be unique. As a result, a record-based
model forces you to let all implementations deal with e.g. multiple partial
signatures for a single key. This is a minor issue, but in my view shows
how records are a less than perfect match for the problem at hand.

To repeat and restate my central question:
> Why is it important, that an agent which doesn't understand a particular
> field structure, can nevertheless make decisions about its inclusion or
> omission from the result (based on a repeated prefix)?
>

Again, because otherwise you may need a separate Combiner for each type of
script involved. That would be unfortunate, and is very easily avoided.

Actually, I can imagine the opposite: having fields with same "key"
> (identifying data), and wanting to combine their "values" intelligently
> without losing any of the data. Say, two Signers producing separate
> parts of a combined-signature under the same common public key?
>

That can always be avoided by using different identifying information as
key for these fields. In your example, assuming you're talking about some
form of threshold signature scheme, every party has their own "shard" of
the key, which still uniquely identifies the participant. If they have no
data that is unique to the participant, they are clones, and don't need to
interact regardless.

> In case of BIP32 derivation, computing the pubkeys is possibly
> > expensive. A simple signer can choose to just sign with whatever keys
> > are present, but they're not the only way to implement a signer, and
> > even less the only software interacting with this format. Others may
> > want to use a matching approach to find keys that are relevant;
> > without pubkeys in the format, they're forced to perform derivations
> > for all keys present.
>
> I'm going to search for relevant keys by comparing master fingerprint; I
> would expect HWWs generally don't have index based on leaf pubkeys.
> OTOH, Signers with lots of keys probably aren't resource-constrained and
> can do the derivations in case of collisions.
>

Perhaps you want to avoid signing with keys that are already signed with?
If you need to derive all the keys before even knowing what was already
signed with, you've already performed 80% of the work.

> If you take the records model, and then additionally drop the
> > whole-record uniqueness constraint, yes, though that seems pushing it
> > a bit by moving even more guarantees from the file format to
> > application level code.
>
> The "file format" makes no guarantees, because the parsing code and
> application code is the same anyway. You could say I'm proposing to
> separate these concerns ;)
>

Of course a file format can make guarantees. If certain combinations of
data in it do not satsify the specification, the file is illegal, and
implementations do not need to deal with it. Stricter file formats are
easier to deal with, because there are less edge cases to consider.

To your point: proto v2 afaik has no way to declare "whole record
uniqueness", so either you drop that (which I think is unacceptable - see
the copy/sign/combine argument above), or you deal with it in your
application code.

Cheers,

-- 
Pieter

[-- Attachment #2: Type: text/html, Size: 10221 bytes --]

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-27  6:09                   ` William Casarin
  2018-06-27 13:39                     ` Andrea
@ 2018-06-27 17:55                     ` Achow101
  2018-06-28 20:42                       ` Rodolfo Novak
  2018-07-05 19:20                       ` William Casarin
  1 sibling, 2 replies; 54+ messages in thread
From: Achow101 @ 2018-06-27 17:55 UTC (permalink / raw)
  To: William Casarin; +Cc: Bitcoin Protocol Discussion

Hi,​

On June 26, 2018 11:09 PM, William Casarin <jb55@jb55•com> wrote:

> ​​
> 
> Hey Andrew,
> 
> If I'm reading the spec right: the way it is designed right now, you
> 
> could create hundreds of thousands of zero bytes in the input or output
> 
> key-value arrays. As far as I can tell this would be considered valid,
> 
> as it is simply a large array of empty dictionaries. Is this right? I'm
> 
> worried about buffer overflows in cases where someone sends a large blob
> 
> of zeros to an unsuspecting implementation.

No, that is incorrect. That whole paragraph is actually outdated, it was intended
for the possibility of adding output maps, which we have already done. I have 
removed it from the BIP.

However, it is possible for a PSBT to contain very large unknown key-value pairs 
which could potentially cause a problem. But I do not think that large PSBTs should 
really be a problem as they are really something that the user has to enter rather 
than something received remotely without user interaction.



On June 27, 2018 6:39 AM, Andrea via bitcoin-dev <bitcoin-dev@lists•linuxfoundation.org> wrote:

> ​​
> 
> Hi William, Andrew, list,
> 
> As noted by William there are some types missing in the global-types definition, because the number of each map for I/O must be known to the parser in order to use the correct definitions for the types. At the moment a parser reading a key-value record does not know whether it should read it as per-input type or per-output, a way to address this is to declare in advance the number of maps and ensure they are ordered (inputs first). If you haven't already worked out some types for that i propose using:
> 

Parsers actually do know because that information is present in the unsigned transaction 
at the beginning of each PSBT. Since each input and output must be accounted for,
a parser can just parse the unsigned transaction and from there it can know how
many inputs and outputs to expect. If it sees more or less, it should throw an error
as the transaction is invalid.

Of course this implies that implementations will need to parse the unsigned transaction,
but not all actually need to. Combiners do not need to, they just need to merge the
maps together and follow the key uniqueness rule. They don't really need to know
or care about the number of inputs and outputs, just that the PSBTs being merged
share the same unsigned transaction and have the same number of maps.

Other roles need to understand the unsigned transaction anyways, so they still need
to parse it thus this isn't really a problem for those roles.

>     
>     On another note I think we can set a hard limit on the size of the PSBT, currently is 'legal' to produce a very large PSBT with an excessive number of Inputs and Outputs. By excessive I mean that even in the best case scenario (using the smallest possible scriptPubKey as in P2WPKH) it is possible to create a PSBT that would certainly create an invalid transaction (because of its size) when finalized. I haven't found anything related to this in the previous discussions, please ignore this if it was already proposed/discussed.
>     

I don't think such a limitation is practical or useful. A transaction can currently have, at most,
~24000 inputs and ~111000 outputs (+/- a few hundred) which is well beyond any useful limit.
Additionally, such limits may not be as extensible for future work. It is hard to determine what
is a reasonable limit on transaction size, and I don't think it is useful to have a limit. At worst
we would simply create an invalid transaction if it were too large.


Andrew




^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-27 17:55                     ` Achow101
@ 2018-06-28 20:42                       ` Rodolfo Novak
  2018-07-05 19:20                       ` William Casarin
  1 sibling, 0 replies; 54+ messages in thread
From: Rodolfo Novak @ 2018-06-28 20:42 UTC (permalink / raw)
  To: Achow101, Bitcoin Protocol Discussion

[-- Attachment #1: Type: text/plain, Size: 4718 bytes --]

Hello Folks,

Thanks for expediting this debate, I understand some still disagree about how this final spec should look like.

1. Coldcard's plugin for Electrum using the original BIP spec is ready, https://github.com/spesmilo/electrum/pull/4470 <https://github.com/spesmilo/electrum/pull/4470>
2. Our hardware is ready with this spec (src will be public soon)
3. Samourai Wallet and Sentinel also are ready with the current spec.

We intend to upgrade it once the final spec is ready, but I need to ship Coldcard.

Cheers,

ℝ.

Rodolfo Novak  ||  Founder, Coinkite  ||  twitter @nvk  ||  GPG: B444CDDA

> On Jun 27, 2018, at 13:55, Achow101 via bitcoin-dev <bitcoin-dev@lists•linuxfoundation.org> wrote:
> 
> Hi,​
> 
> On June 26, 2018 11:09 PM, William Casarin <jb55@jb55•com> wrote:
> 
>> ​​
>> 
>> Hey Andrew,
>> 
>> If I'm reading the spec right: the way it is designed right now, you
>> 
>> could create hundreds of thousands of zero bytes in the input or output
>> 
>> key-value arrays. As far as I can tell this would be considered valid,
>> 
>> as it is simply a large array of empty dictionaries. Is this right? I'm
>> 
>> worried about buffer overflows in cases where someone sends a large blob
>> 
>> of zeros to an unsuspecting implementation.
> 
> No, that is incorrect. That whole paragraph is actually outdated, it was intended
> for the possibility of adding output maps, which we have already done. I have 
> removed it from the BIP.
> 
> However, it is possible for a PSBT to contain very large unknown key-value pairs 
> which could potentially cause a problem. But I do not think that large PSBTs should 
> really be a problem as they are really something that the user has to enter rather 
> than something received remotely without user interaction.
> 
> 
> 
> On June 27, 2018 6:39 AM, Andrea via bitcoin-dev <bitcoin-dev@lists•linuxfoundation.org> wrote:
> 
>> ​​
>> 
>> Hi William, Andrew, list,
>> 
>> As noted by William there are some types missing in the global-types definition, because the number of each map for I/O must be known to the parser in order to use the correct definitions for the types. At the moment a parser reading a key-value record does not know whether it should read it as per-input type or per-output, a way to address this is to declare in advance the number of maps and ensure they are ordered (inputs first). If you haven't already worked out some types for that i propose using:
>> 
> 
> Parsers actually do know because that information is present in the unsigned transaction 
> at the beginning of each PSBT. Since each input and output must be accounted for,
> a parser can just parse the unsigned transaction and from there it can know how
> many inputs and outputs to expect. If it sees more or less, it should throw an error
> as the transaction is invalid.
> 
> Of course this implies that implementations will need to parse the unsigned transaction,
> but not all actually need to. Combiners do not need to, they just need to merge the
> maps together and follow the key uniqueness rule. They don't really need to know
> or care about the number of inputs and outputs, just that the PSBTs being merged
> share the same unsigned transaction and have the same number of maps.
> 
> Other roles need to understand the unsigned transaction anyways, so they still need
> to parse it thus this isn't really a problem for those roles.
> 
>> 
>>    On another note I think we can set a hard limit on the size of the PSBT, currently is 'legal' to produce a very large PSBT with an excessive number of Inputs and Outputs. By excessive I mean that even in the best case scenario (using the smallest possible scriptPubKey as in P2WPKH) it is possible to create a PSBT that would certainly create an invalid transaction (because of its size) when finalized. I haven't found anything related to this in the previous discussions, please ignore this if it was already proposed/discussed.
>> 
> 
> I don't think such a limitation is practical or useful. A transaction can currently have, at most,
> ~24000 inputs and ~111000 outputs (+/- a few hundred) which is well beyond any useful limit.
> Additionally, such limits may not be as extensible for future work. It is hard to determine what
> is a reasonable limit on transaction size, and I don't think it is useful to have a limit. At worst
> we would simply create an invalid transaction if it were too large.
> 
> 
> Andrew
> 
> 
> _______________________________________________
> bitcoin-dev mailing list
> bitcoin-dev@lists•linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev


[-- Attachment #2: Type: text/html, Size: 6892 bytes --]

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [bitcoin-dev]  BIP 174 thoughts
  2018-06-27 15:06                     ` Pieter Wuille
@ 2018-06-29  9:53                       ` matejcik
  2018-06-29 19:12                         ` Achow101
  0 siblings, 1 reply; 54+ messages in thread
From: matejcik @ 2018-06-29  9:53 UTC (permalink / raw)
  To: Pieter Wuille; +Cc: Bitcoin Dev


[-- Attachment #1.1: Type: text/plain, Size: 9846 bytes --]

Short version:

- I propose that conflicting "values" for the same "key" are considered
invalid.
- Let's not optimize for invalid data.
- Given that, there's an open question on how to handle invalid data
when encountered

In general, I don't think it's possible to enforce correctness at the
format level. You still need application level checks - and that calls
into question what we gain by trying to do this on the format level.


Long version:


Let's look at this from a different angle.

There are roughly two possible "modes" for the format with regard to
possibly-conflicting data. Call them "permissive" and "restrictive".

The spec says:
"""
Keys within each scope should never be duplicated; all keys in the
format are unique. PSBTs containing duplicate keys are invalid. However
implementors will still need to handle events where keys are duplicated
when combining transactions with duplicated fields. In this event, the
software may choose whichever value it wishes.
"""
The last sentence of this paragraph sets the mode to permissive:
duplicate values are pretty much OK. If you see them, just pick one.

You seem to argue that Combiners, in particular simple ones that don't
understand field semantics, should merge _keys_ permissively, but
deduplicate _values_ restrictively.
IOW: if you receive two different values for the same key, just pick
whichever, but $deity forbid you include both!

This choice doesn't make sense to me.

What _would_ make sense is fully restrictive mode: receiving two
different values for the same key is a fatal condition with no recovery.
If you have a non-deterministic scheme, put a differentiator in the key.
Or all the data, for that matter.
(Incidentally, this puts key-aware and keyless Combiners on the same
footing. As long as all participants uphold the protocol, different
value = different key = different full record.)

Given that, it's nice to have the Combiner perform the task of detecting
this and failing. But not at all necessary. As the quoted paragraph
correctly notes, consumers *still need to handle* PSBTs with duplicate keys.
(In this context, your implied permissive/restrictive Combiner is
optimized for dealing with invalid data. That seems like a wrong
optimization.)

A reasonable point to decide is whether the handling at the consumer
should be permissive or restrictive. Personally I'm OK with either. I'd
go with the following change:
"""
In this event, the software MAY reject the transaction as invalid. If it
decides to accept it, it MUST choose the last value encountered.
"""
(deterministic way of choosing, instead of "whichever you like")

We could also drop the first part, explicitly allowing consumers to
pick, and simplifying the Combiner algorithm to `sort -u`.
Note that this sort of "picking" will probably be implicit. I'd expect
the consumer to look like this:
```
for key, value in parse(nextRecord()):
  data[key] = value
```

Or we could drop the second part and switch MAY to MUST, for a fully
restrictive mode - which, funnily enough, still lets the Combiner work
as `sort -u`.
To see why, remember that distinct values for the same key are not
allowed in fully restrictive mode. If a Combiner encounters two
conflicting values F(1) and F(2), it should fail -- but if it doesn't,
it includes both and the same failure WILL happen on the fully
restrictive consumer.

This was (or is) my point of confusion re Combiners: the permissive key
+ restrictive value mode of operation doesn't seem to help subsequent
consumers in any way.


Now, for the fully restrictive consumer, the key-value model is indeed
advantageous (and this is the only scenario that I can imagine in which
it is advantageous), because you can catch key duplication on the parser
level.

But as it turns out, it's not enough. Consider the following records:
key(<PSBT_IN_REDEEM_SCRIPT> + abcde), value(<some redeem script>)
and:
key(<PSBT_IN_REDEEM_SCRIPT> + fghij), value(<some other redeem script>)

A purely syntactic Combiner simply can't handle this case. The
restrictive consumer needs to know whether the key is supposed to be
repeating or not.
We could fix this, e.g., by saying that repeating types must have high
bit set and non-repeating must not. We also don't have to, because the
worst failure here is that a consumer passes an invalid record to a
subsequent one and the failure happens one step later.

At this point it seems weird to be concerned about the "unique key"
correctness, which is a very small subset of possibly invalid inputs. As
a strict safety measure, I'd instead propose that a consumer MUST NOT
operate on inputs or outputs, unless it understand ALL included fields -
IOW, if you're signing a particular input, all fields in said input are
mandatory. This prevents a situation where a simple Signer processes an
input incorrectly based on incomplete set of fields, while still
allowing Signers with different capabilities within the same PSBT.

(The question here is whether to have either a flag or a reserved range
for "optional fields" that can be safely ignored by consumers that don't
understand them, but provide data for consumers who do.)


>> To repeat and restate my central question: Why is it important, 
>> that an agent which doesn't understand a particular field 
>> structure, can nevertheless make decisions about its inclusion or 
>> omission from the result (based on a repeated prefix)?
>> 
> 
> Again, because otherwise you may need a separate Combiner for each 
> type of script involved. That would be unfortunate, and is very 
> easily avoided.

This is still confusing to me, and I would really like to get to the
same page on this particular thing, because a lot of the debate hinges
on it. I think I covered most of it above, but there are still pieces to
clarify.

As I understand it, the Combiner role (actually all the roles) is mostly
an algorithm, with the implication that it can be performed
independently by a separate agent, say a network node.

So there's two types of Combiners:

a) Combiner as a part of an intelligent consumer -- the usual scenario
is a Creator/Combiner/Finalizer/Extractor being one participant, and
Updater/Signers as other participants.

In this case, the discussion of "simple Combiners" is actually talking
about intelligent Combiners which don't understand new fields and must
correctly pass them on. I argue that this can safely be done without
loss of any important properties.

b) Combiner as a separate service, with no understanding of semantics.
Although parts of the debate seem to assume this scenario, I don't think
it's worth considering. Again, do you have an usecase in mind for it?

You also insist on enforcing a limited form of correctness on the
Combiner level, but that is not worth it IMHO, as discussed above.

Or am I missing something else?


> Perhaps you want to avoid signing with keys that are already signed 
> with? If you need to derive all the keys before even knowing what
> was already signed with, you've already performed 80% of the work.

This wouldn't concern me at all, honestly. If the user sends an already
signed PSBT to the same signer, IMHO it is OK to sign again; the
slowdown is a fault of the user/workflow. You could argue that signing
again is the valid response. Perhaps the Signer should even "consume"
its keys and not pass them on after producing a signature? That seems
like a sensible rule.


> To your point: proto v2 afaik has no way to declare "whole record 
> uniqueness", so either you drop that (which I think is unacceptable
> - see the copy/sign/combine argument above), or you deal with it in 
> your application code.

Yes. My argument is that "whole record uniqueness" isn't in fact an
important property, because you need application-level checks anyway.
Additionally, protobuf provides awareness of which fields are repeated
and which aren't, and implicitly implements the "pick last" resolution
strategy for duplicates.

The simplest possible protobuf-based Combiner will:
- assume all fields are repeating
- concatenate and parse
- deduplicate and reserialize.

More knowledgeable Combiner will intelligently handle non-repeating
fields, but still has to assume that unknown fields are repeating and
use the above algorithm.

For "pick last" strategy, a consumer can simply parse the message and
perform appropriate application-level checks.

For "hard-fail" strategy, it must parse all fields as repeating and
check that there's only one of those that are supposed to be unique.
This is admittedly more work, and yes, protobuf is not perfectly suited
for this task.

But:

One, this work must be done by hand anyway, if we go with a custom
hand-parsed format. There is a protobuf implementation for every
conceivable platform, we'll never have the same amount of BIP174 parsing
code.
(And if you're hand-writing a parser in order to avoid the dependency,
you can modify it to do the checks at parser level. Note that this is
not breaking the format! The modifed parser will consume well-formed
protobuf and reject that which is valid protobuf but invalid bip174 - a
correct behavior for a bip174 parser.)

Two, it is my opinion that this is worth it in order to have a standard,
well described, well studied and widely implemented format.


Aside: I ha that there is no advantage to a record-set based
custom format by itself, so IMHO the choice is between protobuf vs
a custom key-value format. Additionally, it's even possible to implement
a hand-parsable key-value format in terms of protobuf -- again, arguing
that "standardness" of protobuf is valuable in itself.

regards
m.




[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-29  9:53                       ` matejcik
@ 2018-06-29 19:12                         ` Achow101
  2018-06-29 20:31                           ` Peter D. Gray
  2018-07-04 13:19                           ` matejcik
  0 siblings, 2 replies; 54+ messages in thread
From: Achow101 @ 2018-06-29 19:12 UTC (permalink / raw)
  To: matejcik, Bitcoin Protocol Discussion

Hi,

I do not think that protobuf is the way to go for this. Not only is it another dependency
which many wallets do not want to add (e.g. Armory has not added BIP 70 support because
of its dependency on protobuf), but it is a more drastic change than the currently proposed
changes. The point of this email thread isn't to rewrite and design a new BIP (which is effectively
what is currently going on). The point is to modify and improve the current one. In particular,
we do not want such drastic changes that people who have already implemented the current
BIP would have to effectively rewrite everything from scratch again.

I believe that this discussion has become bikeshedding and is really no longer constructive. Neither
of us are going to convince the other to use or not use protobuf. ASeeing how no one else
has really participated in this discussion about protobuf and key uniqueness, I do not think
that these suggested changes are really necessary nor useful to others. It boils down to personal preference
rather than technical merit. As such, I have opened a PR to the BIPs repo (https://github.com/bitcoin/bips/pull/694)
which contains the changes that I proposed in an earlier email.

Additionally, because there have been no objections to the currently proposed changes, I propose
to move the BIP from Draft to Proposed status.

Andrew


​​

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐

On June 29, 2018 2:53 AM, matejcik via bitcoin-dev <bitcoin-dev@lists•linuxfoundation.org> wrote:

> ​​
> 
> Short version:
> 
> -   I propose that conflicting "values" for the same "key" are considered
>     
>     invalid.
>     
> -   Let's not optimize for invalid data.
> -   Given that, there's an open question on how to handle invalid data
>     
>     when encountered
>     
>     In general, I don't think it's possible to enforce correctness at the
>     
>     format level. You still need application level checks - and that calls
>     
>     into question what we gain by trying to do this on the format level.
>     
>     Long version:
>     
>     Let's look at this from a different angle.
>     
>     There are roughly two possible "modes" for the format with regard to
>     
>     possibly-conflicting data. Call them "permissive" and "restrictive".
>     
>     The spec says:
>     
>     """
>     
>     Keys within each scope should never be duplicated; all keys in the
>     
>     format are unique. PSBTs containing duplicate keys are invalid. However
>     
>     implementors will still need to handle events where keys are duplicated
>     
>     when combining transactions with duplicated fields. In this event, the
>     
>     software may choose whichever value it wishes.
>     
>     """
>     
>     The last sentence of this paragraph sets the mode to permissive:
>     
>     duplicate values are pretty much OK. If you see them, just pick one.
>     
>     You seem to argue that Combiners, in particular simple ones that don't
>     
>     understand field semantics, should merge keys permissively, but
>     
>     deduplicate values restrictively.
>     
>     IOW: if you receive two different values for the same key, just pick
>     
>     whichever, but $deity forbid you include both!
>     
>     This choice doesn't make sense to me.
>     
>     What would make sense is fully restrictive mode: receiving two
>     
>     different values for the same key is a fatal condition with no recovery.
>     
>     If you have a non-deterministic scheme, put a differentiator in the key.
>     
>     Or all the data, for that matter.
>     
>     (Incidentally, this puts key-aware and keyless Combiners on the same
>     
>     footing. As long as all participants uphold the protocol, different
>     
>     value = different key = different full record.)
>     
>     Given that, it's nice to have the Combiner perform the task of detecting
>     
>     this and failing. But not at all necessary. As the quoted paragraph
>     
>     correctly notes, consumers still need to handle PSBTs with duplicate keys.
>     
>     (In this context, your implied permissive/restrictive Combiner is
>     
>     optimized for dealing with invalid data. That seems like a wrong
>     
>     optimization.)
>     
>     A reasonable point to decide is whether the handling at the consumer
>     
>     should be permissive or restrictive. Personally I'm OK with either. I'd
>     
>     go with the following change:
>     
>     """
>     
>     In this event, the software MAY reject the transaction as invalid. If it
>     
>     decides to accept it, it MUST choose the last value encountered.
>     
>     """
>     
>     (deterministic way of choosing, instead of "whichever you like")
>     
>     We could also drop the first part, explicitly allowing consumers to
>     
>     pick, and simplifying the Combiner algorithm to `sort -u`.
>     
>     Note that this sort of "picking" will probably be implicit. I'd expect
>     
>     the consumer to look like this:
>     
> 
>     for key, value in parse(nextRecord()):
>       data[key] = value
>     
> 
> Or we could drop the second part and switch MAY to MUST, for a fully
> 
> restrictive mode - which, funnily enough, still lets the Combiner work
> 
> as `sort -u`.
> 
> To see why, remember that distinct values for the same key are not
> 
> allowed in fully restrictive mode. If a Combiner encounters two
> 
> conflicting values F(1) and F(2), it should fail -- but if it doesn't,
> 
> it includes both and the same failure WILL happen on the fully
> 
> restrictive consumer.
> 
> This was (or is) my point of confusion re Combiners: the permissive key
> 
> -   restrictive value mode of operation doesn't seem to help subsequent
>     
>     consumers in any way.
>     
>     Now, for the fully restrictive consumer, the key-value model is indeed
>     
>     advantageous (and this is the only scenario that I can imagine in which
>     
>     it is advantageous), because you can catch key duplication on the parser
>     
>     level.
>     
>     But as it turns out, it's not enough. Consider the following records:
>     
>     key(<PSBT_IN_REDEEM_SCRIPT> + abcde), value(<some redeem script>)
>     
> 
> and:
> 
> key(<PSBT_IN_REDEEM_SCRIPT> + fghij), value(<some other redeem script>)
> 
> A purely syntactic Combiner simply can't handle this case. The
> 
> restrictive consumer needs to know whether the key is supposed to be
> 
> repeating or not.
> 
> We could fix this, e.g., by saying that repeating types must have high
> 
> bit set and non-repeating must not. We also don't have to, because the
> 
> worst failure here is that a consumer passes an invalid record to a
> 
> subsequent one and the failure happens one step later.
> 
> At this point it seems weird to be concerned about the "unique key"
> 
> correctness, which is a very small subset of possibly invalid inputs. As
> 
> a strict safety measure, I'd instead propose that a consumer MUST NOT
> 
> operate on inputs or outputs, unless it understand ALL included fields -
> 
> IOW, if you're signing a particular input, all fields in said input are
> 
> mandatory. This prevents a situation where a simple Signer processes an
> 
> input incorrectly based on incomplete set of fields, while still
> 
> allowing Signers with different capabilities within the same PSBT.
> 
> (The question here is whether to have either a flag or a reserved range
> 
> for "optional fields" that can be safely ignored by consumers that don't
> 
> understand them, but provide data for consumers who do.)
> 
> > > To repeat and restate my central question: Why is it important,
> > > 
> > > that an agent which doesn't understand a particular field
> > > 
> > > structure, can nevertheless make decisions about its inclusion or
> > > 
> > > omission from the result (based on a repeated prefix)?
> > 
> > Again, because otherwise you may need a separate Combiner for each
> > 
> > type of script involved. That would be unfortunate, and is very
> > 
> > easily avoided.
> 
> This is still confusing to me, and I would really like to get to the
> 
> same page on this particular thing, because a lot of the debate hinges
> 
> on it. I think I covered most of it above, but there are still pieces to
> 
> clarify.
> 
> As I understand it, the Combiner role (actually all the roles) is mostly
> 
> an algorithm, with the implication that it can be performed
> 
> independently by a separate agent, say a network node.
> 
> So there's two types of Combiners:
> 
> a) Combiner as a part of an intelligent consumer -- the usual scenario
> 
> is a Creator/Combiner/Finalizer/Extractor being one participant, and
> 
> Updater/Signers as other participants.
> 
> In this case, the discussion of "simple Combiners" is actually talking
> 
> about intelligent Combiners which don't understand new fields and must
> 
> correctly pass them on. I argue that this can safely be done without
> 
> loss of any important properties.
> 
> b) Combiner as a separate service, with no understanding of semantics.
> 
> Although parts of the debate seem to assume this scenario, I don't think
> 
> it's worth considering. Again, do you have an usecase in mind for it?
> 
> You also insist on enforcing a limited form of correctness on the
> 
> Combiner level, but that is not worth it IMHO, as discussed above.
> 
> Or am I missing something else?
> 
> > Perhaps you want to avoid signing with keys that are already signed
> > 
> > with? If you need to derive all the keys before even knowing what
> > 
> > was already signed with, you've already performed 80% of the work.
> 
> This wouldn't concern me at all, honestly. If the user sends an already
> 
> signed PSBT to the same signer, IMHO it is OK to sign again; the
> 
> slowdown is a fault of the user/workflow. You could argue that signing
> 
> again is the valid response. Perhaps the Signer should even "consume"
> 
> its keys and not pass them on after producing a signature? That seems
> 
> like a sensible rule.
> 
> > To your point: proto v2 afaik has no way to declare "whole record
> > 
> > uniqueness", so either you drop that (which I think is unacceptable
> > 
> > -   see the copy/sign/combine argument above), or you deal with it in
> >     
> >     your application code.
> >     
> 
> Yes. My argument is that "whole record uniqueness" isn't in fact an
> 
> important property, because you need application-level checks anyway.
> 
> Additionally, protobuf provides awareness of which fields are repeated
> 
> and which aren't, and implicitly implements the "pick last" resolution
> 
> strategy for duplicates.
> 
> The simplest possible protobuf-based Combiner will:
> 
> -   assume all fields are repeating
> -   concatenate and parse
> -   deduplicate and reserialize.
>     
>     More knowledgeable Combiner will intelligently handle non-repeating
>     
>     fields, but still has to assume that unknown fields are repeating and
>     
>     use the above algorithm.
>     
>     For "pick last" strategy, a consumer can simply parse the message and
>     
>     perform appropriate application-level checks.
>     
>     For "hard-fail" strategy, it must parse all fields as repeating and
>     
>     check that there's only one of those that are supposed to be unique.
>     
>     This is admittedly more work, and yes, protobuf is not perfectly suited
>     
>     for this task.
>     
>     But:
>     
>     One, this work must be done by hand anyway, if we go with a custom
>     
>     hand-parsed format. There is a protobuf implementation for every
>     
>     conceivable platform, we'll never have the same amount of BIP174 parsing
>     
>     code.
>     
>     (And if you're hand-writing a parser in order to avoid the dependency,
>     
>     you can modify it to do the checks at parser level. Note that this is
>     
>     not breaking the format! The modifed parser will consume well-formed
>     
>     protobuf and reject that which is valid protobuf but invalid bip174 - a
>     
>     correct behavior for a bip174 parser.)
>     
>     Two, it is my opinion that this is worth it in order to have a standard,
>     
>     well described, well studied and widely implemented format.
>     
>     Aside: I ha that there is no advantage to a record-set based
>     
>     custom format by itself, so IMHO the choice is between protobuf vs
>     
>     a custom key-value format. Additionally, it's even possible to implement
>     
>     a hand-parsable key-value format in terms of protobuf -- again, arguing
>     
>     that "standardness" of protobuf is valuable in itself.
>     
>     regards
>     
>     m.
>     
> 
> bitcoin-dev mailing list
> 
> bitcoin-dev@lists•linuxfoundation.org
> 
> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev




^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-29 19:12                         ` Achow101
@ 2018-06-29 20:31                           ` Peter D. Gray
  2018-07-04 13:19                           ` matejcik
  1 sibling, 0 replies; 54+ messages in thread
From: Peter D. Gray @ 2018-06-29 20:31 UTC (permalink / raw)
  To: Achow101, Bitcoin Protocol Discussion

...
> I believe that this discussion has become bikeshedding and is really no longer constructive.
...

Thanks for saying this Andrew! I agree with your point of view, and personally I'm ready to lock this BIP down ... or at least move it to the next level of approval.

...
>  I propose to move the BIP from Draft to Proposed status.

Yes please, all praise the BIP gods!

---
Peter D. Gray  ||  Founder, Coinkite  ||  Twitter: @dochex  ||  GPG: A3A31BAD 5A2A5B10


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-29 19:12                         ` Achow101
  2018-06-29 20:31                           ` Peter D. Gray
@ 2018-07-04 13:19                           ` matejcik
  2018-07-04 18:35                             ` Achow101
  2018-07-04 19:09                             ` Pieter Wuille
  1 sibling, 2 replies; 54+ messages in thread
From: matejcik @ 2018-07-04 13:19 UTC (permalink / raw)
  To: Achow101, Bitcoin Protocol Discussion


[-- Attachment #1.1: Type: text/plain, Size: 15797 bytes --]

hello,

we still have some concerns about the BIP as currently proposed - not
about the format or data contents, but more about strictness and
security properties. I have raised some in the previous e-mails, but
they might have been lost in the overall talk about format.

* Choosing from duplicate keys when combining.
We believe that "choose whichever value it wishes" is not a good
resolution strategy. We propose to either change this to "in case of
conflicts, software MUST reject the conflicting PSBTs", or explain in
more detail why picking at random is a safe choice.

* Signing records with unknown keys.
There's been some talk about this at start, but there should be a clear
strategy for Signers when unknown fields are encountered. We intend to
implement the rule: "will not sign an input with any unknown fields
present".
Maybe it is worth codifying this behavior in the standard, or maybe
there should be a way to mark a field as "optional" so that strict
Signers know they can _safely_ ignore the unknown field.


And two minor points:

* Fields with empty keys.
This might be inferred from the definition, but is probably worth
spelling out explicitly: If a field definition states that the key data
is empty, an implementation MUST enforce this and reject PSBTs that
contain non-empty data.
We suggest adding something to the effect of:
"If a key or value data in a field doesn't match the specified format,
the PSBT is invalid. In particular, if key data is specified as "none"
but the key contains data beyond the type specifier, implementation MUST
reject the PSBT."
(not sure about the languge, this should of course allow processing
unknown fields)

* "Combiner can detect inconsistencies"
Added in response to this comment [1], the current wording looks like
it's describing what the Combiner is _capable of_, as opposed to
prescribing what the combiner is _allowed to_ do.
We suggest changing to something like:
"For every field type that the Combiner understands, it MAY also refuse
to combine PSBTs that have inconsistencies in that field, or cause a
conflict when combined."

regards
m.

[1] https://github.com/bitcoin/bips/pull/694#discussion_r199232318

On 29.6.2018 21:12, Achow101 wrote:
> Hi,
> 
> I do not think that protobuf is the way to go for this. Not only is it another dependency
> which many wallets do not want to add (e.g. Armory has not added BIP 70 support because
> of its dependency on protobuf), but it is a more drastic change than the currently proposed
> changes. The point of this email thread isn't to rewrite and design a new BIP (which is effectively
> what is currently going on). The point is to modify and improve the current one. In particular,
> we do not want such drastic changes that people who have already implemented the current
> BIP would have to effectively rewrite everything from scratch again.
> 
> I believe that this discussion has become bikeshedding and is really no longer constructive. Neither
> of us are going to convince the other to use or not use protobuf. ASeeing how no one else
> has really participated in this discussion about protobuf and key uniqueness, I do not think
> that these suggested changes are really necessary nor useful to others. It boils down to personal preference
> rather than technical merit. As such, I have opened a PR to the BIPs repo (https://github.com/bitcoin/bips/pull/694)
> which contains the changes that I proposed in an earlier email.
> 
> Additionally, because there have been no objections to the currently proposed changes, I propose
> to move the BIP from Draft to Proposed status.
> 
> Andrew
> 
> 
> ​​
> 
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> 
> On June 29, 2018 2:53 AM, matejcik via bitcoin-dev <bitcoin-dev@lists•linuxfoundation.org> wrote:
> 
>> ​​
>>
>> Short version:
>>
>> -   I propose that conflicting "values" for the same "key" are considered
>>     
>>     invalid.
>>     
>> -   Let's not optimize for invalid data.
>> -   Given that, there's an open question on how to handle invalid data
>>     
>>     when encountered
>>     
>>     In general, I don't think it's possible to enforce correctness at the
>>     
>>     format level. You still need application level checks - and that calls
>>     
>>     into question what we gain by trying to do this on the format level.
>>     
>>     Long version:
>>     
>>     Let's look at this from a different angle.
>>     
>>     There are roughly two possible "modes" for the format with regard to
>>     
>>     possibly-conflicting data. Call them "permissive" and "restrictive".
>>     
>>     The spec says:
>>     
>>     """
>>     
>>     Keys within each scope should never be duplicated; all keys in the
>>     
>>     format are unique. PSBTs containing duplicate keys are invalid. However
>>     
>>     implementors will still need to handle events where keys are duplicated
>>     
>>     when combining transactions with duplicated fields. In this event, the
>>     
>>     software may choose whichever value it wishes.
>>     
>>     """
>>     
>>     The last sentence of this paragraph sets the mode to permissive:
>>     
>>     duplicate values are pretty much OK. If you see them, just pick one.
>>     
>>     You seem to argue that Combiners, in particular simple ones that don't
>>     
>>     understand field semantics, should merge keys permissively, but
>>     
>>     deduplicate values restrictively.
>>     
>>     IOW: if you receive two different values for the same key, just pick
>>     
>>     whichever, but $deity forbid you include both!
>>     
>>     This choice doesn't make sense to me.
>>     
>>     What would make sense is fully restrictive mode: receiving two
>>     
>>     different values for the same key is a fatal condition with no recovery.
>>     
>>     If you have a non-deterministic scheme, put a differentiator in the key.
>>     
>>     Or all the data, for that matter.
>>     
>>     (Incidentally, this puts key-aware and keyless Combiners on the same
>>     
>>     footing. As long as all participants uphold the protocol, different
>>     
>>     value = different key = different full record.)
>>     
>>     Given that, it's nice to have the Combiner perform the task of detecting
>>     
>>     this and failing. But not at all necessary. As the quoted paragraph
>>     
>>     correctly notes, consumers still need to handle PSBTs with duplicate keys.
>>     
>>     (In this context, your implied permissive/restrictive Combiner is
>>     
>>     optimized for dealing with invalid data. That seems like a wrong
>>     
>>     optimization.)
>>     
>>     A reasonable point to decide is whether the handling at the consumer
>>     
>>     should be permissive or restrictive. Personally I'm OK with either. I'd
>>     
>>     go with the following change:
>>     
>>     """
>>     
>>     In this event, the software MAY reject the transaction as invalid. If it
>>     
>>     decides to accept it, it MUST choose the last value encountered.
>>     
>>     """
>>     
>>     (deterministic way of choosing, instead of "whichever you like")
>>     
>>     We could also drop the first part, explicitly allowing consumers to
>>     
>>     pick, and simplifying the Combiner algorithm to `sort -u`.
>>     
>>     Note that this sort of "picking" will probably be implicit. I'd expect
>>     
>>     the consumer to look like this:
>>     
>>
>>     for key, value in parse(nextRecord()):
>>       data[key] = value
>>     
>>
>> Or we could drop the second part and switch MAY to MUST, for a fully
>>
>> restrictive mode - which, funnily enough, still lets the Combiner work
>>
>> as `sort -u`.
>>
>> To see why, remember that distinct values for the same key are not
>>
>> allowed in fully restrictive mode. If a Combiner encounters two
>>
>> conflicting values F(1) and F(2), it should fail -- but if it doesn't,
>>
>> it includes both and the same failure WILL happen on the fully
>>
>> restrictive consumer.
>>
>> This was (or is) my point of confusion re Combiners: the permissive key
>>
>> -   restrictive value mode of operation doesn't seem to help subsequent
>>     
>>     consumers in any way.
>>     
>>     Now, for the fully restrictive consumer, the key-value model is indeed
>>     
>>     advantageous (and this is the only scenario that I can imagine in which
>>     
>>     it is advantageous), because you can catch key duplication on the parser
>>     
>>     level.
>>     
>>     But as it turns out, it's not enough. Consider the following records:
>>     
>>     key(<PSBT_IN_REDEEM_SCRIPT> + abcde), value(<some redeem script>)
>>     
>>
>> and:
>>
>> key(<PSBT_IN_REDEEM_SCRIPT> + fghij), value(<some other redeem script>)
>>
>> A purely syntactic Combiner simply can't handle this case. The
>>
>> restrictive consumer needs to know whether the key is supposed to be
>>
>> repeating or not.
>>
>> We could fix this, e.g., by saying that repeating types must have high
>>
>> bit set and non-repeating must not. We also don't have to, because the
>>
>> worst failure here is that a consumer passes an invalid record to a
>>
>> subsequent one and the failure happens one step later.
>>
>> At this point it seems weird to be concerned about the "unique key"
>>
>> correctness, which is a very small subset of possibly invalid inputs. As
>>
>> a strict safety measure, I'd instead propose that a consumer MUST NOT
>>
>> operate on inputs or outputs, unless it understand ALL included fields -
>>
>> IOW, if you're signing a particular input, all fields in said input are
>>
>> mandatory. This prevents a situation where a simple Signer processes an
>>
>> input incorrectly based on incomplete set of fields, while still
>>
>> allowing Signers with different capabilities within the same PSBT.
>>
>> (The question here is whether to have either a flag or a reserved range
>>
>> for "optional fields" that can be safely ignored by consumers that don't
>>
>> understand them, but provide data for consumers who do.)
>>
>>>> To repeat and restate my central question: Why is it important,
>>>>
>>>> that an agent which doesn't understand a particular field
>>>>
>>>> structure, can nevertheless make decisions about its inclusion or
>>>>
>>>> omission from the result (based on a repeated prefix)?
>>>
>>> Again, because otherwise you may need a separate Combiner for each
>>>
>>> type of script involved. That would be unfortunate, and is very
>>>
>>> easily avoided.
>>
>> This is still confusing to me, and I would really like to get to the
>>
>> same page on this particular thing, because a lot of the debate hinges
>>
>> on it. I think I covered most of it above, but there are still pieces to
>>
>> clarify.
>>
>> As I understand it, the Combiner role (actually all the roles) is mostly
>>
>> an algorithm, with the implication that it can be performed
>>
>> independently by a separate agent, say a network node.
>>
>> So there's two types of Combiners:
>>
>> a) Combiner as a part of an intelligent consumer -- the usual scenario
>>
>> is a Creator/Combiner/Finalizer/Extractor being one participant, and
>>
>> Updater/Signers as other participants.
>>
>> In this case, the discussion of "simple Combiners" is actually talking
>>
>> about intelligent Combiners which don't understand new fields and must
>>
>> correctly pass them on. I argue that this can safely be done without
>>
>> loss of any important properties.
>>
>> b) Combiner as a separate service, with no understanding of semantics.
>>
>> Although parts of the debate seem to assume this scenario, I don't think
>>
>> it's worth considering. Again, do you have an usecase in mind for it?
>>
>> You also insist on enforcing a limited form of correctness on the
>>
>> Combiner level, but that is not worth it IMHO, as discussed above.
>>
>> Or am I missing something else?
>>
>>> Perhaps you want to avoid signing with keys that are already signed
>>>
>>> with? If you need to derive all the keys before even knowing what
>>>
>>> was already signed with, you've already performed 80% of the work.
>>
>> This wouldn't concern me at all, honestly. If the user sends an already
>>
>> signed PSBT to the same signer, IMHO it is OK to sign again; the
>>
>> slowdown is a fault of the user/workflow. You could argue that signing
>>
>> again is the valid response. Perhaps the Signer should even "consume"
>>
>> its keys and not pass them on after producing a signature? That seems
>>
>> like a sensible rule.
>>
>>> To your point: proto v2 afaik has no way to declare "whole record
>>>
>>> uniqueness", so either you drop that (which I think is unacceptable
>>>
>>> -   see the copy/sign/combine argument above), or you deal with it in
>>>     
>>>     your application code.
>>>     
>>
>> Yes. My argument is that "whole record uniqueness" isn't in fact an
>>
>> important property, because you need application-level checks anyway.
>>
>> Additionally, protobuf provides awareness of which fields are repeated
>>
>> and which aren't, and implicitly implements the "pick last" resolution
>>
>> strategy for duplicates.
>>
>> The simplest possible protobuf-based Combiner will:
>>
>> -   assume all fields are repeating
>> -   concatenate and parse
>> -   deduplicate and reserialize.
>>     
>>     More knowledgeable Combiner will intelligently handle non-repeating
>>     
>>     fields, but still has to assume that unknown fields are repeating and
>>     
>>     use the above algorithm.
>>     
>>     For "pick last" strategy, a consumer can simply parse the message and
>>     
>>     perform appropriate application-level checks.
>>     
>>     For "hard-fail" strategy, it must parse all fields as repeating and
>>     
>>     check that there's only one of those that are supposed to be unique.
>>     
>>     This is admittedly more work, and yes, protobuf is not perfectly suited
>>     
>>     for this task.
>>     
>>     But:
>>     
>>     One, this work must be done by hand anyway, if we go with a custom
>>     
>>     hand-parsed format. There is a protobuf implementation for every
>>     
>>     conceivable platform, we'll never have the same amount of BIP174 parsing
>>     
>>     code.
>>     
>>     (And if you're hand-writing a parser in order to avoid the dependency,
>>     
>>     you can modify it to do the checks at parser level. Note that this is
>>     
>>     not breaking the format! The modifed parser will consume well-formed
>>     
>>     protobuf and reject that which is valid protobuf but invalid bip174 - a
>>     
>>     correct behavior for a bip174 parser.)
>>     
>>     Two, it is my opinion that this is worth it in order to have a standard,
>>     
>>     well described, well studied and widely implemented format.
>>     
>>     Aside: I ha that there is no advantage to a record-set based
>>     
>>     custom format by itself, so IMHO the choice is between protobuf vs
>>     
>>     a custom key-value format. Additionally, it's even possible to implement
>>     
>>     a hand-parsable key-value format in terms of protobuf -- again, arguing
>>     
>>     that "standardness" of protobuf is valuable in itself.
>>     
>>     regards
>>     
>>     m.
>>     
>>
>> bitcoin-dev mailing list
>>
>> bitcoin-dev@lists•linuxfoundation.org
>>
>> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
> 
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-07-04 13:19                           ` matejcik
@ 2018-07-04 18:35                             ` Achow101
  2018-07-05 17:23                               ` Jason Les
  2018-07-04 19:09                             ` Pieter Wuille
  1 sibling, 1 reply; 54+ messages in thread
From: Achow101 @ 2018-07-04 18:35 UTC (permalink / raw)
  To: matejcik; +Cc: Bitcoin Protocol Discussion

Hi,​


On July 4, 2018 6:19 AM, matejcik <jan.matejek@satoshilabs•com> wrote:

> ​​
> 
> hello,
> 
> we still have some concerns about the BIP as currently proposed - not
> 
> about the format or data contents, but more about strictness and
> 
> security properties. I have raised some in the previous e-mails, but
> 
> they might have been lost in the overall talk about format.
> 
> -   Choosing from duplicate keys when combining.
>     
>     We believe that "choose whichever value it wishes" is not a good
>     
>     resolution strategy. We propose to either change this to "in case of
>     
>     conflicts, software MUST reject the conflicting PSBTs", or explain in
>     
>     more detail why picking at random is a safe choice.

You cannot simply reject PSBTs for having conflicting values for the same key. Especially
for the Partial Signatures, you can have two signatures for the same pubkey that are both
completely valid. This situation could happen, for example, if a signer that does not use deterministic
k values can sign multiple inputs but one input is missing a UTXO so it doesn't sign it. So it receives
 one PSBT and signs the first input but not the second. It receives a PSBT for the same transaction
which has the second input's UTXO but does not have its signatures for the first input. The signer
would sign both inputs. When the two PSBTs are combined (suppose the first PSBT has other 
signatures too), you will have two keys that have different values. The different values are both
valid signatures, just with different k values since they were randomly generated instead of
deterministically. If we fail to merge these, then you could potentially have a situation where
nothing can be done with the PSBTs now, or now everyone has to resign and in some specific
order to avoid the conflict. That complicates things and is much more annoying to deal with.
So a simple solution is to allow the combiner to choose any value it wants as it is likely that
both values are valid.

Allowing combiners to choose any value also allows for intelligent combiners to choose the
correct values in the case of conflicts. A smart combiner could, when combining redeem scripts
and witness scripts, check that the redeem scripts and witness scripts match the hash provided
in the UTXO (or in the redeem script) and choose the correct redeem script and witness script
accordingly if there were, for some reason, a conflict there.

Can you explain why it would be unsafe for combiners to arbitrarily choose a value?

>     
> -   Signing records with unknown keys.
>     
>     There's been some talk about this at start, but there should be a clear
>     
>     strategy for Signers when unknown fields are encountered. We intend to
>     
>     implement the rule: "will not sign an input with any unknown fields
>     
>     present".
>     
>     Maybe it is worth codifying this behavior in the standard, or maybe
>     
>     there should be a way to mark a field as "optional" so that strict
>     
>     Signers know they can safely ignore the unknown field.

I think that requiring there to be no unknowns is a safe change.

>     
>     And two minor points:
>     
> -   Fields with empty keys.
>     
>     This might be inferred from the definition, but is probably worth
>     
>     spelling out explicitly: If a field definition states that the key data
>     
>     is empty, an implementation MUST enforce this and reject PSBTs that
>     
>     contain non-empty data.
>     
>     We suggest adding something to the effect of:
>     
>     "If a key or value data in a field doesn't match the specified format,
>     
>     the PSBT is invalid. In particular, if key data is specified as "none"
>     
>     but the key contains data beyond the type specifier, implementation MUST
>     
>     reject the PSBT."
>     
>     (not sure about the languge, this should of course allow processing
>     
>     unknown fields)

Agreed.

>     
> -   "Combiner can detect inconsistencies"
>     
>     Added in response to this comment [1], the current wording looks like
>     
>     it's describing what the Combiner is capable of, as opposed to
>     
>     prescribing what the combiner is allowed to do.
>     
>     We suggest changing to something like:
>     
>     "For every field type that the Combiner understands, it MAY also refuse
>     
>     to combine PSBTs that have inconsistencies in that field, or cause a
>     
>     conflict when combined."

Agreed.


Andrew


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-07-04 13:19                           ` matejcik
  2018-07-04 18:35                             ` Achow101
@ 2018-07-04 19:09                             ` Pieter Wuille
  2018-07-05 11:52                               ` matejcik
  1 sibling, 1 reply; 54+ messages in thread
From: Pieter Wuille @ 2018-07-04 19:09 UTC (permalink / raw)
  To: matejcik; +Cc: Bitcoin Protocol Discussion

On Wed, Jul 4, 2018 at 6:19 AM, matejcik <jan.matejek@satoshilabs•com> wrote:
> hello,
>
> we still have some concerns about the BIP as currently proposed - not
> about the format or data contents, but more about strictness and
> security properties. I have raised some in the previous e-mails, but
> they might have been lost in the overall talk about format.
>
> * Choosing from duplicate keys when combining.
> We believe that "choose whichever value it wishes" is not a good
> resolution strategy. We propose to either change this to "in case of
> conflicts, software MUST reject the conflicting PSBTs", or explain in
> more detail why picking at random is a safe choice.

Outlawing conflicting values would imply forcing all Signers to
implement fixed deterministic nonce generation, which I don't think it
very desirable. Otherwise PSBTs that got copied and signed and
combined again may fail. So I think we should see it the other way: we
choose the keys in such a way that picking arbitrarily is safe. If
there really is a future extension for which it would not be the case
that picking arbitrarily is acceptable, more data can be moved to the
keys, and leave the actual resolution strategy to the Finalizer. That
way Combiners can remain dumb and not need script-specific logic in
every interaction.

An alternative would be to have a fixed resolution strategy (for
example, when combining multiple PSBTs, pick the value from the first
one that has a particular key set), but I don't think this adds very
much - if picking the first is fine, picking a arbitrary one should be
fine too.

> * Signing records with unknown keys.
> There's been some talk about this at start, but there should be a clear
> strategy for Signers when unknown fields are encountered. We intend to
> implement the rule: "will not sign an input with any unknown fields
> present".
> Maybe it is worth codifying this behavior in the standard, or maybe
> there should be a way to mark a field as "optional" so that strict
> Signers know they can _safely_ ignore the unknown field.

Can you envision a situation in which this is needed? In every
scenario I can come up with, the worst that can happen is that the
resulting signature is just invalid. For example, if PSBT existed
before segwit, and then was later extended to support it, a pre-segwit
signer would not recognize that BIP143 would need to be used for
segwit inputs, and produce signatures using the old sighashing
algorithm. The result is just an invalid signature.

I believe that what you're trying to accomplish is preventing signing
something you don't understand, but that's an independent issue.
Signers generally will want to inspect the transaction they're
signing, or ask for confirmation w.r.t. fees or payment destinations
involved. The case where unknown fields are present for a reason you'd
want to withhold signing for will generally also just be the situation
where you don't understand the transaction you're signing.

Here is (perhaps far fetched) example of why it may not be desirable
to reject unknown fields when signing. Imagine an extension is defined
which adds pay-to-contract derivation for keys (Q = P + H(Q||C)G);
this would be a field similar to the current BIP32 derivation one, but
instead give a base key P and a contract C. Now say there is a 2-of-2
multisig in which you're one signer, and the other signer is (unknown
to you) using P2C. After the other party Updating, the input would
contain a P2C field which you don't understand - but it also isn't
something you care about or affects you.

I would not be opposed to having fields with an explicit flag bit that
says "Don't sign if you don't understand this", but I expect that that
can also be left for future extensions.

> * Fields with empty keys.
> This might be inferred from the definition, but is probably worth
> spelling out explicitly: If a field definition states that the key data
> is empty, an implementation MUST enforce this and reject PSBTs that
> contain non-empty data.
> We suggest adding something to the effect of:
> "If a key or value data in a field doesn't match the specified format,
> the PSBT is invalid. In particular, if key data is specified as "none"
> but the key contains data beyond the type specifier, implementation MUST
> reject the PSBT."
> (not sure about the languge, this should of course allow processing
> unknown fields)

Completely agree here. Any implementation that understands a
particular field must enforce whatever structure the field is known to
have.

> * "Combiner can detect inconsistencies"
> Added in response to this comment [1], the current wording looks like
> it's describing what the Combiner is _capable of_, as opposed to
> prescribing what the combiner is _allowed to_ do.
> We suggest changing to something like:
> "For every field type that the Combiner understands, it MAY also refuse
> to combine PSBTs that have inconsistencies in that field, or cause a
> conflict when combined."

Agree, just because Combiners are expected to work correctly on
unknown fields doesn't mean they can't enforce extra consistency
checks on known fields.

Cheers,

-- 
Pieter


^ permalink raw reply	[flat|nested] 54+ messages in thread

* [bitcoin-dev]  BIP 174 thoughts
  2018-07-04 19:09                             ` Pieter Wuille
@ 2018-07-05 11:52                               ` matejcik
  2018-07-05 22:06                                 ` Pieter Wuille
  0 siblings, 1 reply; 54+ messages in thread
From: matejcik @ 2018-07-05 11:52 UTC (permalink / raw)
  To: Pieter Wuille; +Cc: Bitcoin Protocol Discussion


[-- Attachment #1.1: Type: text/plain, Size: 5762 bytes --]


On 4.7.2018 20:35, Achow101 wrote:
> You cannot simply reject PSBTs for having conflicting values for the same key. Especially
> for the Partial Signatures, you can have two signatures for the same pubkey that are both

(...)

> order to avoid the conflict. That complicates things and is much more annoying to deal with.
> So a simple solution is to allow the combiner to choose any value it wants as it is likely that
> both values are valid.
>
> Allowing combiners to choose any value also allows for intelligent combiners to choose the
> correct values in the case of conflicts. A smart combiner could, when combining redeem scripts
> and witness scripts, check that the redeem scripts and witness scripts match the hash provided
> in the UTXO (or in the redeem script) and choose the correct redeem script and witness script
> accordingly if there were, for some reason, a conflict there.
>
> Can you explain why it would be unsafe for combiners to arbitrarily choose a value?

We're worried that the "pick one of non-deterministic signatures" is a
special case and that most fields don't have this property:

* conflicts in UTXOs, sighash type, redeem/witness scripts, derivation
paths, are at best a recoverable error, usually an unrecoverable error,
at worst malicious activity.

* conflict in finalized scripts, in case more than one valid
finalization exists, might indicate that the Finalizers picked different
ND signatures, or it might indicate two possible interpretations of the
transaction (see next point). Picking arbitrarily in the latter case
would be an error.

* even for partial signatures: if two Signers with the same public key
use different sighash types, the Combiner shouldn't pick the winning one
arbitrarily.

It seems generally safer to default to rejecting conflicts, and
explicitly allowing the Combiner to process them intelligently if it
understands the relevant fields.


On 4.7.2018 21:09, Pieter Wuille wrote:
> combined again may fail. So I think we should see it the other way: we
> choose the keys in such a way that picking arbitrarily is safe. If
> there really is a future extension for which it would not be the case
> that picking arbitrarily is acceptable, more data can be moved to the
> keys, and leave the actual resolution strategy to the Finalizer.

I like this explanation and I think that if nothing else, this should be
spelled out explicitly in the spec.

But I don't think it answers the above points very well.


> An alternative would be to have a fixed resolution strategy (for
> example, when combining multiple PSBTs, pick the value from the first
> one that has a particular key set), but I don't think this adds very
> much - if picking the first is fine, picking a arbitrary one should be
> fine too.

Agreed.


>> * Signing records with unknown keys.
>> There's been some talk about this at start, but there should be a clear
>> strategy for Signers when unknown fields are encountered. We intend to
>> implement the rule: "will not sign an input with any unknown fields
>> present".
>> Maybe it is worth codifying this behavior in the standard, or maybe
>> there should be a way to mark a field as "optional" so that strict
>> Signers know they can _safely_ ignore the unknown field.
> 
> Can you envision a situation in which this is needed? In every
> scenario I can come up with, the worst that can happen is that the
> resulting signature is just invalid.

(...)

> I believe that what you're trying to accomplish is preventing signing
> something you don't understand, but that's an independent issue.

We're actually trying to prevent signing something we don't _intend_.

I agree with your response, and I also think that in technical sense,
the worst that can happen is an invalid signature. Our concern is twofold:

1. the produced signature is most likely valid, _for a different
transaction_ than the Creator intended. It is a transaction that the
Signer must have authorized, so we could argue that they would not mind
if that unintended transaction was published. Nevertheless, this opens
an attack surface.

2. defence in depth: the "worst that can happen" assumption is only
valid if the rest of the protocol does things right.

At an intersection lies an example: say there's a fork that changes
format of inputs in the network serialized tx, in a way that happens to
be invisible to PSBT (because scripts must be set to empty). To
differentiate, we add a "Fork ID", but old Signers will still produce
signatures valid on the original chain - and, importantly, this will be
invisible to users.

This is of course contrived and several mistakes would have to happen at
the same time, but that's what defence in depth is for.


> Here is (perhaps far fetched) example of why it may not be desirable
> to reject unknown fields when signing. Imagine an extension is defined

This is definitely worth taking into consideration. But I'd argue that
some way of signalling "optionalness" is a better match here.
Alternately, a pre-processor with the appropriate knowledge can strip
the new fields for a legacy Signer - that's what I expect to happen in
practice.

> I would not be opposed to having fields with an explicit flag bit that
> says "Don't sign if you don't understand this", but I expect that that
> can also be left for future extensions.

It seems safer to consider this flag be on by default, and leave it to a
future extension to allow non-mandatory fields. The worst case here is
that legacy Signers can't natively process new PSBTs (solvable by a
preprocessor) - as opposed to legacy Signers signing unintended values.

regards
m.






[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-07-04 18:35                             ` Achow101
@ 2018-07-05 17:23                               ` Jason Les
  0 siblings, 0 replies; 54+ messages in thread
From: Jason Les @ 2018-07-05 17:23 UTC (permalink / raw)
  To: Achow101, Bitcoin Protocol Discussion

[-- Attachment #1: Type: text/plain, Size: 434 bytes --]

Has there been any thought to standardizing file names used when creating
.psbt files? Maybe something that gives some reliability of being collision
resistant and descriptive. For example:

[8 char trim of hash of unsigned tx]+[Role that created file (Ex:
Signer)]+[4 char trim of hash of data unique to that role (Ex: partial sig)]

It may be useful to especially the combiner to have some idea of what files
they have.

-Jason Les

[-- Attachment #2: Type: text/html, Size: 528 bytes --]

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-06-27 17:55                     ` Achow101
  2018-06-28 20:42                       ` Rodolfo Novak
@ 2018-07-05 19:20                       ` William Casarin
  2018-07-06 18:59                         ` Achow101
  1 sibling, 1 reply; 54+ messages in thread
From: William Casarin @ 2018-07-05 19:20 UTC (permalink / raw)
  To: Achow101; +Cc: Bitcoin Protocol Discussion


I have another concern with the format. (my original bip comment for some context: [1])

It looks like the one of the reasons I was confused is because you can
only parse the format properly by first deserializing the transaction.
Since there is no "length" field for the key-value map arrays, you must
count the number of transaction input/outputs, and use that as the
number of kv maps to parse.

This is pretty brittle, because now if a Combiner writes the wrong
number of key-value maps that don't align with the number of inputs and
outputs in the transaction, then the psbt will not be able to be
deserialized properly, but is still a valid PSBT. It can't even detect
these situations, because the input and output types share the same enum
values. I don't see anywhere that says the number of key value maps MUST
match the number of inputs/outputs, perhaps it's implied?

I think I think we should either make this explicit in the BIP, add an
array length prefix, or make all (global/input/output) types share the
same enum.

Cheers,
William

[1] https://github.com/bitcoin/bips/pull/694#issuecomment-402812041


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-07-05 11:52                               ` matejcik
@ 2018-07-05 22:06                                 ` Pieter Wuille
  2018-07-10 12:10                                   ` matejcik
  0 siblings, 1 reply; 54+ messages in thread
From: Pieter Wuille @ 2018-07-05 22:06 UTC (permalink / raw)
  To: matejcik; +Cc: Bitcoin Protocol Discussion

On Thu, Jul 5, 2018 at 4:52 AM, matejcik <jan.matejek@satoshilabs•com> wrote:
>> Allowing combiners to choose any value also allows for intelligent combiners to choose the
>> correct values in the case of conflicts. A smart combiner could, when combining redeem scripts
>> and witness scripts, check that the redeem scripts and witness scripts match the hash provided
>> in the UTXO (or in the redeem script) and choose the correct redeem script and witness script
>> accordingly if there were, for some reason, a conflict there.
>>
>> Can you explain why it would be unsafe for combiners to arbitrarily choose a value?
>
> We're worried that the "pick one of non-deterministic signatures" is a
> special case and that most fields don't have this property:
>
> * conflicts in UTXOs, sighash type, redeem/witness scripts, derivation
> paths, are at best a recoverable error, usually an unrecoverable error,
> at worst malicious activity.
>
> * conflict in finalized scripts, in case more than one valid
> finalization exists, might indicate that the Finalizers picked different
> ND signatures, or it might indicate two possible interpretations of the
> transaction (see next point). Picking arbitrarily in the latter case
> would be an error.
>
> * even for partial signatures: if two Signers with the same public key
> use different sighash types, the Combiner shouldn't pick the winning one
> arbitrarily.
>
> It seems generally safer to default to rejecting conflicts, and
> explicitly allowing the Combiner to process them intelligently if it
> understands the relevant fields.

So consider two possible topologies for a multiparty signing:

A) Creator and Updater produce a PSBT T. T is sent to signer 1 who
turns it into PSBT T1. T1 is then forwarded to Signer 2 who turns it
into T12. A Finalizer extracts the transaction.
B) Creator and Updater produce a PSBT T. T is sent to signer 1 and 2
simultaneously, who then produce T1 and T2 respectively. A Combiner
combines those into T12. A Finalizer extracts the transaction.

The only case where "malicious" conflicting values can occur is when
one of the Signers produces an invalid signature, or modifies any of
the other fields already present in the PSBT for consumption by
others. If this were an issue, it would be an issue regardless of the
Combiner's operation, as in topology A no Combiner is even present.
This is generally true I think - Combiners can always be replaced with
just a different (and possibly less parallel) topology of data flow.

So the question is independent of Combiners IMHO, and really about how
we deal with roles that intentionally or unintentionally produce
invalid values. I believe this is mostly not an issue. Let's go over
the cases:
* If a partial signature is invalid, the resulting transaction will be invalid.
* if a non-witness UTXO is incorrect, you'll fail to sign because the
txid mismatches the input's prevout (which you do have to check)
* If a witness UTXO is incorrect, the resulting signature will be invalid.
* If a derivation path is incorrect, the signer will fail to find the
key, or sign with the wrong key resulting in an invalid transaction.
* If a witnessscript or redeemscript is incorrect, the resulting
signature will be invalid (as those go into the scriptCode of the
sighash, and affect the type of sighashing)
* If a sighash type is incorrect, the resulting transaction may be
useless for its intended purpose (but still something every signer
agreed to sign).

So all of this boils down to dealing with the possibility that there
can be roles which intentionally or unintentionally create incorrect
fields in a PSBT, and the solution is (a) checking that prevout txids
match non-witness utxos (b) checking that the transaction you're
signing is one you want to sign (including sighash type) (c) worst
case accepting that the resulting transaction may be invalid.

Now, (c) can sometimes be caught early, by implementing additional
sanity checks for known fields. For example, rejecting PSBTs with
partial signatures that are invalid (feed them through a verifier).
This is something a Combiner can of course optionally do, but so can a
Signer or any other role.

The bottom line is that a Combiner which picks arbitrarily in case of
conflicts will never end up with something worse than what you already
need to deal with. If you disregard the case of invalid fields
(because the result will just be an invalid transaction), then any
choice the Combiner makes is fine, because all the values it can pick
from are valid.

> I agree with your response, and I also think that in technical sense,
> the worst that can happen is an invalid signature. Our concern is twofold:
>
> 1. the produced signature is most likely valid, _for a different
> transaction_ than the Creator intended. It is a transaction that the
> Signer must have authorized, so we could argue that they would not mind
> if that unintended transaction was published. Nevertheless, this opens
> an attack surface.

If you're worried about attack surface, I don't believe rejecting
invalid fields ever matters. An attacker can always drop the fields
you don't understand before giving you the PSBT, making your behavior
identical to one where you'd have ignore those fields in the first
place.

At best, you can make it protect against accidental mistakes that
would result in invalid transactions anyway.

If there is a way to sign a message in a way that can be
misinterpreted as a signature on a different message with a different
meaning, then that is a huge flaw in Bitcoin itself, and not going to
be solved by rejecting to sign unknown fields.

With regard to defense in depth:

>> I would not be opposed to having fields with an explicit flag bit that
>> says "Don't sign if you don't understand this", but I expect that that
>> can also be left for future extensions.
>
> It seems safer to consider this flag be on by default, and leave it to a
> future extension to allow non-mandatory fields. The worst case here is
> that legacy Signers can't natively process new PSBTs (solvable by a
> preprocessor) - as opposed to legacy Signers signing unintended values.

There could be some rule like "if the highest bit of the field type is
set, don't sign", but I don't think there is any current field where
such a flag would be necessary right now.

Cheers,

-- 
Pieter


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-07-05 19:20                       ` William Casarin
@ 2018-07-06 18:59                         ` Achow101
  0 siblings, 0 replies; 54+ messages in thread
From: Achow101 @ 2018-07-06 18:59 UTC (permalink / raw)
  To: William Casarin; +Cc: Bitcoin Protocol Discussion

Hi,

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐

On July 5, 2018 12:20 PM, William Casarin <jb55@jb55•com> wrote:

> ​​
> 
> I have another concern with the format. (my original bip comment for some context: [1])
> 
> It looks like the one of the reasons I was confused is because you can
> 
> only parse the format properly by first deserializing the transaction.
> 
> Since there is no "length" field for the key-value map arrays, you must
> 
> count the number of transaction input/outputs, and use that as the
> 
> number of kv maps to parse.

I don't think this is really a problem.

Almost all roles have to deserialize the unsigned tx anyways before they can do anything.
The only role that doesn't is a simple combiner (a combiner that does sanity checks would
still have to deserialize the unsigned tx), and even then it doesn't matter. It just shoves
key value pairs together and doesn't need to know whether the map is for an input or for
an output.

> 
> This is pretty brittle, because now if a Combiner writes the wrong
> 
> number of key-value maps that don't align with the number of inputs and
> 
> outputs in the transaction, then the psbt will not be able to be
> 
> deserialized properly, but is still a valid PSBT. It can't even detect
> 
> these situations, because the input and output types share the same enum
> 
> values. 

If a combiner writes the wrong number of key-value maps, then it would simply be invalid
to the next person that receives the PSBT. It would not deserialize properly because the
key value pairs would have incorrect values for their types. Not deserializing properly means
that the PSBT is simply invalid. The same numerical types might
be shared, but their meanings are different between the input and output types.

I don't see anywhere that says the number of key value maps MUST
> 
> match the number of inputs/outputs, perhaps it's implied?

I have added that to the BIP.

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐

On July 5, 2018 10:23 AM, Jason Les <jasonles@gmail•com> wrote:

> Has there been any thought to standardizing file names used when creating .psbt files? Maybe something that gives some reliability of being collision resistant and descriptive. For example:
> 
> [8 char trim of hash of unsigned tx]+[Role that created file (Ex: Signer)]+[4 char trim of hash of data unique to that role (Ex: partial sig)]
> 
> It may be useful to especially the combiner to have some idea of what files they have.
> 
> -Jason Les

I haven't considered this, but I'm not sure if it is really useful. I don't think it is really necessary
for any role to know who created the PSBT. If it did, this information would generally come
out-of-band anyways as someone has to give the PSBT to that person.



Andrew




^ permalink raw reply	[flat|nested] 54+ messages in thread

* [bitcoin-dev]  BIP 174 thoughts
  2018-07-05 22:06                                 ` Pieter Wuille
@ 2018-07-10 12:10                                   ` matejcik
  2018-07-11 18:27                                     ` Pieter Wuille
  0 siblings, 1 reply; 54+ messages in thread
From: matejcik @ 2018-07-10 12:10 UTC (permalink / raw)
  To: Pieter Wuille; +Cc: Bitcoin Protocol Discussion


[-- Attachment #1.1: Type: text/plain, Size: 2826 bytes --]

On 6.7.2018 00:06, Pieter Wuille wrote:> The only case where "malicious"
conflicting values can occur is when
> one of the Signers produces an invalid signature, or modifies any of
> the other fields already present in the PSBT for consumption by
> others. If this were an issue, it would be an issue regardless of the
> Combiner's operation, as in topology A no Combiner is even present.
> This is generally true I think - Combiners can always be replaced with
> just a different (and possibly less parallel) topology of data flow.

This is an interesting thesis, and also an unspoken assumption ISTM. It
seems worth adding something like this to the spec:
"""
In general, the result of Combiner combining two PSBTs from independent
participants A and B should be functionally equivalent to a result
obtained from processing the original PSBT by A and then B in a sequence.
or, for participants performing fA(psbt) and fB(psbt):
Combine(fA(psbt), fB(psbt)) == fA(fB(psbt)) == fB(fA(psbt))
"""

(...)

> The bottom line is that a Combiner which picks arbitrarily in case of
> conflicts will never end up with something worse than what you already
> need to deal with. If you disregard the case of invalid fields
> (because the result will just be an invalid transaction), then any
> choice the Combiner makes is fine, because all the values it can pick
> from are valid.

This sounds reasonable and IMHO it would be good to have a summary of
this argument in the Rationale section.


> If you're worried about attack surface, I don't believe rejecting
> invalid fields ever matters. An attacker can always drop the fields
> you don't understand before giving you the PSBT, making your behavior
> identical to one where you'd have ignore those fields in the first
> place.

Modifying the PSBT requires an active attacker. A passive attacker could
possibly sniff the invalid signatures and misuse them.
Where an active attacker can likely do more than drop fields.


In general, this comes down to a philosophical difference again. I'm
reluctant to sign an input with unknown data, on the premise that there
could be *anything* in that data; the fact that right now I can't come
up with a field that would be problematic does not mean that tomorrow
won't bring one. (in particular, a potential failure here is silent,
invisible to the user)

We are most likely to implement the "do not sign with unknown fields"
rule in any case (technically a whitelist of "known OK" field types),
and resolve potential problems as they arise. I raised this point mainly
because I think discussing this explicitly in the spec is beneficial: a
distinction between mandatory and optional fields is one way, mentioning
or prescribing possible signing strategies is another.

regards
m.


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-07-10 12:10                                   ` matejcik
@ 2018-07-11 18:27                                     ` Pieter Wuille
  2018-07-11 20:05                                       ` Gregory Maxwell
  0 siblings, 1 reply; 54+ messages in thread
From: Pieter Wuille @ 2018-07-11 18:27 UTC (permalink / raw)
  To: matejcik; +Cc: Bitcoin Protocol Discussion

On Tue, Jul 10, 2018 at 5:10 AM, matejcik <jan.matejek@satoshilabs•com> wrote:
> On 6.7.2018 00:06, Pieter Wuille wrote:> The only case where "malicious"
> conflicting values can occur is when
>> one of the Signers produces an invalid signature, or modifies any of
>> the other fields already present in the PSBT for consumption by
>> others. If this were an issue, it would be an issue regardless of the
>> Combiner's operation, as in topology A no Combiner is even present.
>> This is generally true I think - Combiners can always be replaced with
>> just a different (and possibly less parallel) topology of data flow.
>
> This is an interesting thesis, and also an unspoken assumption ISTM. It
> seems worth adding something like this to the spec:
> """
> In general, the result of Combiner combining two PSBTs from independent
> participants A and B should be functionally equivalent to a result
> obtained from processing the original PSBT by A and then B in a sequence.
> or, for participants performing fA(psbt) and fB(psbt):
> Combine(fA(psbt), fB(psbt)) == fA(fB(psbt)) == fB(fA(psbt))
> """

Adding that sounds like a good idea, indeed.

>> The bottom line is that a Combiner which picks arbitrarily in case of
>> conflicts will never end up with something worse than what you already
>> need to deal with. If you disregard the case of invalid fields
>> (because the result will just be an invalid transaction), then any
>> choice the Combiner makes is fine, because all the values it can pick
>> from are valid.
>
> This sounds reasonable and IMHO it would be good to have a summary of
> this argument in the Rationale section.

Sounds good.

>> If you're worried about attack surface, I don't believe rejecting
>> invalid fields ever matters. An attacker can always drop the fields
>> you don't understand before giving you the PSBT, making your behavior
>> identical to one where you'd have ignore those fields in the first
>> place.
>
> I'm reluctant to sign an input with unknown data, on the premise that there could be *anything* in that data

But the point is: you are not signing an input with unknown data. You
are signing your own interpretation (since you compute the sighash
yourself), which doesn't include what you don't understand. If that
interpretation doesn't match reality, the signature is at worst
useless. Who cares that someone added information about a transaction
that doesn't affect what you sign?

> We are most likely to implement the "do not sign with unknown fields"
> rule in any case (technically a whitelist of "known OK" field types),
> and resolve potential problems as they arise. I raised this point mainly
> because I think discussing this explicitly in the spec is beneficial: a
> distinction between mandatory and optional fields is one way, mentioning
> or prescribing possible signing strategies is another.

I don't think that's a particularly useful policy, but certainly
Signers are allowed to implement any policy they like about what they
accept in signing.

Cheers,

-- 
Pieter


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts
  2018-07-11 18:27                                     ` Pieter Wuille
@ 2018-07-11 20:05                                       ` Gregory Maxwell
  2018-07-11 20:54                                         ` [bitcoin-dev] BIP 174 thoughts on graphics vv01f
  0 siblings, 1 reply; 54+ messages in thread
From: Gregory Maxwell @ 2018-07-11 20:05 UTC (permalink / raw)
  To: Pieter Wuille, Bitcoin Protocol Discussion

On Wed, Jul 11, 2018 at 6:27 PM, Pieter Wuille via bitcoin-dev
<bitcoin-dev@lists•linuxfoundation.org> wrote:
> I don't think that's a particularly useful policy, but certainly
> Signers are allowed to implement any policy they like about what they
> accept in signing.

Do we really want the specification to permit conforming
implementations to refuse to sign because there is extra metadata?
ISTM this would make it very hard to implement new features that
require extra data. For example, say you have a checkmultisig with one
key in it using schnorr multisignature which require the extra rounds
to establish an R, and the other keys are legacy stuff.  If the other
signer(s) suddenly stop working when there are extra fields irrelevant
to them, then this will create a substantial pressure to not extend
the PSBT in the intended way, but to instead find places to stuff the
extra data where it won't interfere with already deployed signers.
This would be really unfortunate since PSBT was created specifically
to avoid field stuffing (otherwise we could have accomplished all the
intended goals by field stuffing a bitcoin transaction encoding).

Obviously no signer should be signing data they don't understand,  but
extra data that they ignore which doesn't change their signature
should not stop them.  Another way of looking at it, perhaps somewhere
someplace some idiot defined signatures starting with 0xdead to give
away all the users funds or whatever.  That's something you "can't
understand" either, ... but no one would conclude because something
could happen somewhere that you don't know about that you just can't
sign at all... yet it is possible. :)

If someone wants to make a non-conforming signer, that is cool too and
they may have good reason to do so-- but I think it would be sad if
new applications get gunked up, slowed down or forced to use ugly
hacks, due to the intentional extension support in the protocol being
blocked by things claiming to support the spec.  The whole reason the
spec doesn't lock in exactly the possible fields and allow no others
is to allow extensions without breaking compatibility.


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [bitcoin-dev] BIP 174 thoughts on graphics
  2018-07-11 20:05                                       ` Gregory Maxwell
@ 2018-07-11 20:54                                         ` vv01f
  0 siblings, 0 replies; 54+ messages in thread
From: vv01f @ 2018-07-11 20:54 UTC (permalink / raw)
  To: bitcoin-dev


[-- Attachment #1.1: Type: text/plain, Size: 335 bytes --]

this is intended to fix the graphics

* as not scaleable bitmap/png
* with broken capitalization
* not easy editable plaintext for git

have a view[1] on the suggestion for an example[0].


[0]:
https://github.com/bitcoin/bips/blob/master/bip-0174/coinjoin-workflow.png
[1]: https://de.sharelatex.com/read/hrvvwcvhrbyz



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 898 bytes --]

^ permalink raw reply	[flat|nested] 54+ messages in thread

end of thread, other threads:[~2018-07-11 20:54 UTC | newest]

Thread overview: 54+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-15 23:34 [bitcoin-dev] BIP 174 thoughts Pieter Wuille
2018-06-16 15:00 ` Peter D. Gray
2018-06-19  9:38 ` Jonas Schnelli
2018-06-19 14:20 ` matejcik
2018-06-19 15:20   ` Jonas Schnelli
2018-06-21 20:28     ` Peter D. Gray
2018-06-19 17:16   ` Pieter Wuille
2018-06-21 11:29     ` matejcik
2018-06-21 17:39       ` Pieter Wuille
2018-06-21 11:44     ` Tomas Susanka
2018-06-19 14:22 ` matejcik
2018-06-21  0:39 ` Achow101
2018-06-21 14:32   ` Tomas Susanka
2018-06-21 15:40     ` Greg Sanders
2018-06-21 19:56     ` Peter D. Gray
2018-06-21 21:39       ` Gregory Maxwell
2018-06-22 19:10       ` Pieter Wuille
2018-06-22 22:28         ` Achow101
2018-06-23 17:00           ` William Casarin
2018-06-23 20:33             ` Andrew Chow
2018-06-24  8:19               ` Andrea
2018-06-24  8:28                 ` Andrew Chow
2018-06-24  9:00                   ` Andrea
2018-06-23 18:27           ` Peter D. Gray
2018-06-25 19:47           ` Tomas Susanka
2018-06-25 20:10             ` Jonas Schnelli
2018-06-25 20:30             ` Achow101
2018-06-26 15:33               ` matejcik
2018-06-26 16:58                 ` William Casarin
2018-06-26 17:11                   ` Marek Palatinus
2018-06-27 14:11                   ` matejcik
2018-06-26 20:30                 ` Pieter Wuille
2018-06-27 14:04                   ` matejcik
2018-06-27 15:06                     ` Pieter Wuille
2018-06-29  9:53                       ` matejcik
2018-06-29 19:12                         ` Achow101
2018-06-29 20:31                           ` Peter D. Gray
2018-07-04 13:19                           ` matejcik
2018-07-04 18:35                             ` Achow101
2018-07-05 17:23                               ` Jason Les
2018-07-04 19:09                             ` Pieter Wuille
2018-07-05 11:52                               ` matejcik
2018-07-05 22:06                                 ` Pieter Wuille
2018-07-10 12:10                                   ` matejcik
2018-07-11 18:27                                     ` Pieter Wuille
2018-07-11 20:05                                       ` Gregory Maxwell
2018-07-11 20:54                                         ` [bitcoin-dev] BIP 174 thoughts on graphics vv01f
2018-06-26 21:56                 ` [bitcoin-dev] BIP 174 thoughts Achow101
2018-06-27  6:09                   ` William Casarin
2018-06-27 13:39                     ` Andrea
2018-06-27 17:55                     ` Achow101
2018-06-28 20:42                       ` Rodolfo Novak
2018-07-05 19:20                       ` William Casarin
2018-07-06 18:59                         ` Achow101

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox