public inbox for bitcoindev@googlegroups.com
* [bitcoin-dev] proposal: new opcode OP_ZKP to enable ZKP-based spending authorization
@ 2023-04-28  8:29 Weiji Guo
  2023-04-30  2:15 ` ZmnSCPxj
  0 siblings, 1 reply; 8+ messages in thread
From: Weiji Guo @ 2023-04-28  8:29 UTC (permalink / raw)
  To: Bitcoin Protocol Discussion

[-- Attachment #1: Type: text/plain, Size: 14819 bytes --]

Hey everyone,


I am writing to propose a new opcode to enable zero-knowledge-based
spending authorization for Bitcoin. This new opcode, OP_ZKP, will
enable the Bitcoin network to authorize spending based on off-chain
computation, provided an acceptable proof is supplied. This will not only
equip Bitcoin script with Turing completeness, but also enable building
more flexible payment channels, stablecoins, decentralized
exchanges, DeFi, etc. directly on the Bitcoin network, or even a layer 2.
All of this could be accomplished with a soft fork (and follow-up building).


Before any BIP could be requested, I’d like to discuss all the aspects in
more detail, to cover as many corners as possible, and hopefully to build
consensus among developers and the community.


*### 0. Overview*

Here is what I am currently considering, listed as starting points for
discussion. I hope I have covered all the major issues, but please feel
free to add more.



   1. How it works: we add OP_ZKP and OP_ZKPVERIFY using some unused OP_NOP
   codes. OP_ZKP works similarly to OP_CHECKMULTISIG, with a number parameter
   indicating how many public inputs are to be read;
   2. Security: to bind the spending conditions to a particular set of UTXOs,
   amounts, and recipients, the hash of all this information shall be used as
   a public input to the proof;
   3. Dealing with system limitations: verification keys could be very long
   and exceed MAX_SCRIPT_ELEMENT_SIZE (520 bytes). They could be stored in
   configuration, with only their hash used in the scriptPubKey. Configuration
   information such as new verification keys could be propagated through
   P2P messages (we might need a separate BIP for this);
   4. Scalability: arrangements for miners or computing-power vendors to
   aggregate some proofs for scalability and fee reduction. Alternatively,
   this could be accomplished with recursive proofs.
   5. ZKP scheme and curve choices.
   6. Potential uses are unlimited, although there is no apparent secret
   key (private key or seed) in play.
   7. Ecosystem implications: impacts on wallets, etc.


Below, I will discuss the above-listed items one by one. Mostly I
have to keep the discussion minimal and brief. It is still a bit lengthy;
please bear with me.


*### 1. How it works*

Consider the script below:


scriptPubKey: <hash of the verification key> <Groth16_BN254> OP_ZKP

scriptSig: <pubInput_1> <pubInput_2> … <pubInput_n> <n> <proof>


<Groth16_BN254> is only an example indicator for the ZKP scheme and curve
parameters. Other combinations are possible. Further discussion is
provided in section 5.


<hash of the verification key> - the node implementation should look up a
verification key that hashes to this value. Verification keys tend to be
long and exceed the limit imposed by MAX_SCRIPT_ELEMENT_SIZE (520 bytes),
and for various schemes/circuits the size may differ, so a hash is used
here. Further discussion is covered in section 3.


<proof> refers to the proof data to be verified. Its size is also subject
to MAX_SCRIPT_ELEMENT_SIZE. This might limit our choice of ZKP scheme,
although proof data could be split into several parts.


<n> refers to how many public inputs are to be read; it depends on the
circuit in use. However, this is *not* the total number: there is also
an implicit public input, the hash of the transaction, calculated
according to the script context.


The evaluation is straightforward. The implementation of OP_ZKP
reads (and removes) all its parameters from the stack, calculates the
implicit public input from the transaction (UTXO inputs, amounts,
recipients, etc.), and then leaves true or false on the stack after ZKP
verification. The security rationale behind this implicit public input
is discussed in the next section.
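The evaluation order described above can be sketched in a few lines of Python. This is only an illustration of the stack discipline, not an implementation: the verifier is stubbed out, and all names (`op_zkp`, `vk_store`, the byte values) are hypothetical.

```python
def verify_proof(vk, scheme, public_inputs, proof):
    # Stub standing in for a real verifier (e.g. Groth16 over BN254).
    return proof == b"valid-proof"

def op_zkp(stack, tx_hash, vk_store):
    # Stack, top-down, when OP_ZKP executes (scriptSig then scriptPubKey):
    # <scheme> <vk hash> <proof> <n> <pubInput_n> ... <pubInput_1>
    scheme = stack.pop()
    vk_hash = stack.pop()
    proof = stack.pop()
    n = stack.pop()
    pub_inputs = [stack.pop() for _ in range(n)]
    vk = vk_store[vk_hash]        # hash -> verification key lookup (section 3)
    pub_inputs.append(tx_hash)    # the implicit public input (section 2)
    stack.append(verify_proof(vk, scheme, pub_inputs, proof))
```

OP_ZKPVERIFY would differ only in the final step, failing the script instead of pushing false.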


OP_ZKPVERIFY works the same except for the last step: it leaves nothing
on success, or fails the script on verification failure.


*### 2. Security: replay protection*

A proof should not be replayable to authorize spending of another set of
UTXOs. Therefore it is critical to bind the proof to the transaction. This
is similar to OP_CHECKSIG: a transaction hash is needed to verify the proof
for OP_ZKP. To calculate the hash, just follow what the context requires.
For a SegWit transaction, BIP 143 seems natural, and most of the time this
will be the case. But if these opcodes are used outside of SegWit, the
pre-SegWit hashing algorithm should be fine.
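As a much-simplified stand-in for such a transaction hash, the sketch below shows what the implicit public input would commit to. This is *not* the actual BIP 143 serialization; the field layout is purely illustrative.

```python
import hashlib

def dsha256(data: bytes) -> bytes:
    # Bitcoin's double-SHA256.
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def implicit_public_input(prevouts, amounts, outputs) -> bytes:
    # Commit to the UTXOs being spent, their amounts, and the recipients,
    # so a proof verified against this value cannot be replayed on a
    # different transaction.
    buf = b"".join(prevouts)
    buf += b"".join(a.to_bytes(8, "little") for a in amounts)
    buf += b"".join(outputs)
    return dsha256(buf)
```

Any change to an input, amount, or output yields a different hash, so the same proof no longer verifies.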


Binding the proof to the transaction, especially the UTXO entries, also
protects against proof malleability, since once spent, the UTXO entries
are removed from the available UTXO set.


On the other hand, since a proof could be calculated by anybody given the
witness, circuit developers must take this into account: the private
inputs should be kept confidential. In other uses there might be no
private inputs but only an authoritative signature; then it is the
authority’s responsibility to sign only valid information, and the circuit
developers’ responsibility to ensure proper protection. Further discussion
is deferred for now.


*### 3. Dealing with system limitations: verification key*

The size of the proof data or verification key could exceed the limit
imposed by MAX_SCRIPT_ELEMENT_SIZE, which is 520 bytes. In this section,
we discuss the implications for the verification key.


The size of a verification key varies with the scheme and circuit.
However, the number of distinct circuits will be rather limited, at
least at the Bitcoin network level. Therefore, using the hash of the
verification key instead of the key itself yields fixed-size data and
flexibility, without worrying about the verification key getting too long.


Further, it is not advisable to hard-code verification keys in node
codebases, as new verification keys will arise every few days or months.
Rather, P2P messaging seems a suitable way to propagate new verification
keys. Once received, a node could save a key to a configuration file and
forward it to other nodes. If an unknown hash is encountered, the node
could request the key from its connected peers. If the node cannot obtain
a new verification key in time, it has to fail the transaction
verification; and if too many nodes cannot get the new verification key in
time, a new block containing the offending transaction will not be
accepted by the Bitcoin network. It is suggested that a separate BIP be
requested to address these issues.
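The lookup-and-fetch flow described above could look roughly as follows. This is a sketch only: the class name, persistence, and the peer-request callback are all hypothetical, not part of any specified P2P message.

```python
import hashlib

class VerificationKeyStore:
    """Hypothetical node-side cache of verification keys, keyed by hash."""

    def __init__(self):
        self._keys = {}                       # hash -> verification key

    def add(self, vk: bytes) -> bytes:
        h = hashlib.sha256(vk).digest()
        self._keys[h] = vk                    # a real node would also persist
        return h                              # this and forward it to peers

    def lookup(self, vk_hash: bytes, request_from_peers=None):
        vk = self._keys.get(vk_hash)
        if vk is None and request_from_peers is not None:
            candidate = request_from_peers(vk_hash)   # getdata-style request
            if (candidate is not None
                    and hashlib.sha256(candidate).digest() == vk_hash):
                self._keys[vk_hash] = candidate       # hash-verified, cache it
                vk = candidate
        return vk                             # None -> fail the verification
```

A node that still gets `None` back after asking its peers would fail the transaction's verification, as described above.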


*### 4. Scalability*

ZKP verification usually costs at least tens of milliseconds, and may
also consume more data. So it should be fine to have a few dozen ZKP
transactions in a block, but a few thousand would cost too much time to
verify and too much block space. In this section, we discuss how this
could be mitigated.


There are two options: proof aggregation (batching), and recursive
verification.


Some ZKP proofs are known to support aggregation, though with certain
limitations. For example, aggregating Groth16 proofs seems to require
that all proofs are of the same circuit, so that the values of γ and δ
remain the same across proofs; both parameters come from the phase-2
setup and are thus circuit-specific. We can also aggregate proofs based
on the KZG polynomial commitment scheme. Still, the aggregator needs to
ensure that the hash of each transaction has been verified as well.


On the other hand, recursive verification can handle these transaction-hash
verifications, along with verifying the correctness of each proof. It might
be very computationally intensive to generate a recursive proof for a few
thousand proofs, but there is a lot of room for further optimization.


It is also possible to develop a hybrid strategy: some proofs are
aggregated, with additional verification being proved, and then both the
aggregation and the additional proof are verified by a recursive proof.
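The core trick behind batch verification is a random linear combination: many individual checks collapse into one combined check that fails, with overwhelming probability, if any single claim is false. A toy sketch over integers modulo a prime, standing in for the pairing-group equations of a real scheme (the modulus and generator are illustrative):

```python
import secrets

P = 2**127 - 1     # toy prime modulus, stand-in for a pairing-group order
G = 5              # toy generator

def verify_one(a, Q):
    # Individual claim: a * G == Q (mod P).
    return (a * G) % P == Q

def verify_batch(claims):
    # One combined check replaces len(claims) individual checks:
    #   sum(r_i * a_i) * G == sum(r_i * Q_i)  (mod P)
    # with a fresh random scalar r_i per claim, so a cheater cannot
    # craft claims that cancel each other out.
    lhs = rhs = 0
    for a, Q in claims:
        r = secrets.randbelow(P - 1) + 1
        lhs = (lhs + r * a) % P
        rhs = (rhs + r * Q) % P
    return (lhs * G) % P == rhs
```

Real aggregation schemes (such as the Groth16 aggregation in [1]) apply this same idea to pairing products, which is why the savings grow with the number of proofs.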


We might need a new SegWit version so that transactions can refer to
their proper proof once aggregated or recursively verified.


Replay protection: once aggregated or proved via recursion to the network,
a transaction cannot be replayed to the Bitcoin network, as its UTXO
entries have been spent.


*### 5. ZKP scheme and curve choice*

The major concerns are the size of the script and proof, the cost of
verification, and the feasibility of aggregating/batching many proofs.


Groth16 checks out here, with a short proof size, cheap verification, and
feasible aggregation. For example, the standard draft ([1]) shows an
implementation that can aggregate 1024 proofs in 2s and verify them in
23ms, although it seems to require that all proofs belong to the same
circuit (due to γ and δ, which are circuit-specific). Still, the short
proof size and cheap verification make Groth16 suitable for verifying
other proofs recursively. And when the upper-layer proving schemes (and
setups) are the same, so are the verification circuits, so those proofs
could be aggregated.


On the other hand, the KZG commitment scheme supports batching, so
zk-SNARKs based on KZG should support it as well. Further, this paper by
Boneh et al. ([2]) enhances the KZG commitment to allow a one- or
two-group-element opening for multiple polynomials, each evaluated at a
different set of points.


This is not an exhaustive list. Overall, if we can aggregate/batch many
proofs together, then proof size is a smaller concern. In that case,
Plonk might be a better scheme than Groth16, if we can implement a
batching technique that batches proofs from different circuits together.
But still, in the beginning, there might not be many proofs to be
batched/aggregated at all.


We shall designate a parameter to OP_ZKP and OP_ZKPVERIFY to indicate which
scheme (and curve) to use.


For the curve, BN254 is a natural choice, as it has been adopted by many
blockchain projects. However, this is largely an open issue.


All that being said, as a proposal, it is my opinion that we should
support Groth16-BN254 at the beginning. On that basis we could build
recursive verification circuits for other schemes, which does not require
a soft fork. Other schemes and curves, if much desired by the community,
could be included as well. Maybe we can designate a separate BIP for each
scheme-curve pair that gets direct support (instead of recursive
verification).


*### 6. Potential uses*

The potential uses are unlimited: payment aggregation for fee reduction
or anonymity, stablecoins, DeFi, decentralized exchanges, DAOs, NFTs,
etc., just to name a few. Still, we can take some uses as examples to
check whether any issues arise and discuss how to address them.


A. ZKP-based smart contracts. We can reasonably expect ZKP-based smart
contracts to be rather different from consensus-based smart contracts
(although OP_ZKP is part of a script). A notable difference is that with
Bitcoin’s UTXO model, rather than an account model, account-balance-style
contracts are hard to build without involving external storage or state
management. And even with external storage managing the states, the
security of that external storage is a concern: data might be lost or
corrupted, and Bitcoin’s security does not extend to those contracts
automatically. It takes innovation to address these issues.


On the other hand, there could be smart contracts without a need for
explicit state management. These could be gaming or decentralized
identity, where a ZKP proof is used to prove that "something happened or
exists according to a designated algorithm or criteria".


B. Payment aggregation, or Bitcoin Layer 2. Every address can participate
in such a protocol by sending some Bitcoin to a designated address
(the fund-holding address). Once signed up, participants can send Bitcoin
to each other for a very small fee, or pay a higher fee to send to Layer 1.
Bookkeeping each participant’s “balance” is a challenge, but might be
achievable via some complex circuits proving the ownership of certain UTXO
entries. Once ownership is proven, the fund-holding address can send
Bitcoin to the address requested by the participant. It is another
challenge to maintain those UTXO entries off the Bitcoin network with high
security. We might also need to segregate those UTXOs from the Bitcoin
network so that no transaction can be replayed to the Bitcoin network or
to another Layer 2 protocol.


*### 7. Ecosystem implications*

Proof generation and computing-power services/vendors. Some proofs might
be very computationally intensive to generate, for example, the recursive
verification of thousands of OP_ZKP transactions. In that case,
computing-power services or vendors can work with miners to speed up the
generation, or simply propose a bundle of transactions to be included in
a block. With a higher fee paid, miners are motivated to include the
bundle, while the service or vendor can still profit handsomely. The
immediate reward could incentivize a lot of effort to further optimize
the implementation, engineering, algorithms, or even the ZKP schemes.


Contract composability. We need to figure out a systematic way for one
smart contract to call another, either on-chain or assisted off-chain.
For ZKP-based smart contracts, a proof will have to be generated
off-chain and then used to call the target contract. Off-chain vendors
could monitor closely related signals, generate the proof, and submit it
to the Bitcoin network. There should be incentives to ensure the proper
running of the system. How to define a cross-contract API, how to signal
the call, and how to incentivize computing-power vendors are still open
issues.


Impact on wallet applications. Generating a proof to authorize spending
seems a heavy-duty computation task, and does not seem to fit mobile
applications or hardware wallets, especially those with an SoC powered by
a small battery. However, we argue that this is not the case. If no
secret (private key or seed) is involved, there is no need to involve a
wallet tasked with safekeeping secrets. In such cases, the ZKP proof
could be proof that something has indeed happened or exists (including a
signature issued from a wallet), which makes up the spending conditions.




That's it for now. I hope you are interested in this idea. I look forward
to your comments.


Thanks,

Weiji


[1] -
https://docs.zkproof.org/pages/standards/accepted-workshop4/proposal-aggregation.pdf

[2] - https://eprint.iacr.org/2020/081.pdf


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [bitcoin-dev] proposal: new opcode OP_ZKP to enable ZKP-based spending authorization
  2023-04-28  8:29 [bitcoin-dev] proposal: new opcode OP_ZKP to enable ZKP-based spending authorization Weiji Guo
@ 2023-04-30  2:15 ` ZmnSCPxj
  2023-05-01 12:46   ` Weiji Guo
  0 siblings, 1 reply; 8+ messages in thread
From: ZmnSCPxj @ 2023-04-30  2:15 UTC (permalink / raw)
  To: Weiji Guo, Bitcoin Protocol Discussion

Good morning Weiji,

Have not completed reading, but this jumped out to me:



> 3.  Dealing with system limitation: verification keys could be very long and exceed the MAX_SCRIPT_ELEMENT_SIZE (520 bytes). They could be put into configurations and only use their hash in the scriptPubKey. The configuration information such as new verification keys could be propagated through P2P messages (we might need a separate BIP for this);

`scriptPubKey` is consensus-critical, and these new P2P messages would have to be consensus-critical.

As all nodes need to learn the new verification keys, we should consider how much resources are spent on each node just to maintain and forever remember verification keys.

Currently our resource-tracking methodology is the synthetic "weight units" computation.
This reflects resources spent on acquiring block data, as well as on maintaining the UTXO database.
For instance, the "witness discount", where witness data (i.e. the modern equivalent of `scriptSig`) is charged 1/4 the weight units of other data, exists because spending a UTXO reduces the resources spent on the UTXO database, although it still consumes resources in downloading block data (hence only a discount, not free or a negative/rebate).

Similarly, any propagation of verification keys would need a similar adjustment for weight units.

As verification keys MUST be seen by all nodes before they can validate an `OP_ZKP`, I would suggest that it is best included in block data (which similarly needs to be seen by all nodes), together with some weight unit adjustment for that data, depending on how much resources verification keys would consume.
This is similar to how `scriptPubKey`s and amounts are included in block data, as those data are kept in the UTXO database, which nodes must maintain in order to validate the blockchain.

If verification keys are permanent, they should probably be weighted heavier than `scriptPubKey`s and amounts --- UTXOs can theoretically be deleted later by spending the UTXO (which reduces UTXO database size), while any data that must be permanently stored in a database must correspondingly be weighted higher.

Similarly, my understanding is that the CPU resources needed for validation of generic ZKPs are higher than those required for validation of ECC signatures.
Much of the current weight calculation assumes that witness data is primarily ECC signatures, so if ZKP witnesses translate to higher resource consumption, the weighting of ZKP witnesses should also be higher (i.e. greater than the 1/4 witness-discounted weight of current witness data).
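To make the discount concrete, here is the BIP 141 weight formula, plus a hypothetical variant in which ZKP witness data receives no discount. The `zkp_factor` is purely illustrative, not a proposed value:

```python
WITNESS_SCALE_FACTOR = 4   # BIP 141: non-witness bytes count 4x

def tx_weight(non_witness_bytes: int, witness_bytes: int) -> int:
    # Witness data is charged 1/4 the weight of other data.
    return non_witness_bytes * WITNESS_SCALE_FACTOR + witness_bytes

def zkp_tx_weight(non_witness_bytes: int, zkp_witness_bytes: int,
                  zkp_factor: int = 4) -> int:
    # Hypothetical: weight ZKP witnesses like base data, or heavier,
    # reflecting their higher validation cost relative to ECC signatures.
    return (non_witness_bytes * WITNESS_SCALE_FACTOR
            + zkp_witness_bytes * zkp_factor)
```

At `zkp_factor = 4` a ZKP witness byte costs four times what a discounted signature byte costs today; benchmarking would determine the right value.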

Regards,
ZmnSCPxj



* Re: [bitcoin-dev] proposal: new opcode OP_ZKP to enable ZKP-based spending authorization
  2023-04-30  2:15 ` ZmnSCPxj
@ 2023-05-01 12:46   ` Weiji Guo
  2023-05-02 15:01     ` ZmnSCPxj
  0 siblings, 1 reply; 8+ messages in thread
From: Weiji Guo @ 2023-05-01 12:46 UTC (permalink / raw)
  To: ZmnSCPxj; +Cc: Bitcoin Protocol Discussion

[-- Attachment #1: Type: text/plain, Size: 3360 bytes --]

Hi ZmnSCPxj,

Thank you very much for your insights. You are definitely right about
making the verification keys consensus-critical and about the weight
units. I totally agree that the weighting of the ZKP witness should be
higher. We will carry out some benchmarking to recommend a reasonable
weight when we start to develop a GitHub PR.

Meanwhile, as we can potentially aggregate many proofs or recursively
verify even more, the average cost might still be manageable.

Regards,
Weiji

On Sun, Apr 30, 2023 at 10:16 AM ZmnSCPxj <ZmnSCPxj@protonmail•com> wrote:

> Good morning Weiji,
>
> Have not completed reading, but this jumped out to me:
>
>
>
> > 3.  Dealing with system limitation: verification keys could be very long
> and exceed the MAX_SCRIPT_ELEMENT_SIZE (520 bytes). They could be put into
> configurations and only use their hash in the scriptPubKey. The
> configuration information such as new verification keys could be propagated
> through P2P messages (we might need a separate BIP for this);
>
> `scriptPubKey` is consensus-critical, and these new P2P messages would
> have to be consensus-critical.
>
> As all nodes need to learn the new verification keys, we should consider
> how much resources are spent on each node just to maintain and forever
> remember verification keys.
>
> Currently our resource-tracking methodology is via the synthetic "weight
> units" computation.
> This reflects resources spent on acquiring block data, as well as
> maintaining the UTXO database.
> For instance, the "witness discount" where witness data (i.e. modern
> equivalent of `scriptSig`) is charged 1/4 the weight units of other data,
> exists because spending a UTXO reduces the resources spent in the UTXO
> database, although still consumes resources in downloading block data
> (hence only a discount, not free or negative/rebate).
>
> Similarly, any propagation of verification keys would need a similar
> adjustment for weight units.
>
> As verification keys MUST be seen by all nodes before they can validate an
> `OP_ZKP`, I would suggest that it is best included in block data (which
> similarly needs to be seen by all nodes), together with some weight unit
> adjustment for that data, depending on how much resources verification keys
> would consume.
> This is similar to how `scriptPubKey`s and amounts are included in block
> data, as those data are kept in the UTXO database, which nodes must
> maintain in order to validate the blockchain.
>
> If verification keys are permanent, they should probably be weighted
> heavier than `scriptPubKey`s and amounts --- UTXOs can theoretically be
> deleted later by spending the UTXO (which reduces UTXO database size),
> while any data that must be permanently stored in a database must
> correspondingly be weighted higher.
>
> Similarly, my understanding is that the CPU resources needed by validation
> of generic ZKPs is higher than that required for validation of ECC
> signatures.
> Much of the current weight calculation assumes that witness data is
> primarily ECC signatures, so if ZKP witnesses translate to higher resource
> consumption, the weighting of ZKP witnesses should also be higher (i.e.
> greater than the 1/4 witness-discounted weight of current witness data).
>
> Regards,
> ZmnSCPxj
>



* Re: [bitcoin-dev] proposal: new opcode OP_ZKP to enable ZKP-based spending authorization
  2023-05-01 12:46   ` Weiji Guo
@ 2023-05-02 15:01     ` ZmnSCPxj
  2023-05-04 15:31       ` Weiji Guo
  0 siblings, 1 reply; 8+ messages in thread
From: ZmnSCPxj @ 2023-05-02 15:01 UTC (permalink / raw)
  To: Weiji Guo; +Cc: Bitcoin Protocol Discussion


Good morning Weiji,

> Meanwhile, as we can potentially aggregate many proofs or recursively verify even more, the average cost might still be manageable.

Are miners supposed to do this aggregation?

If miners do this aggregation, then that implies that all fullnodes must also perform the **non**-aggregated validation as transactions flow from transaction creators to miners, and that is the cost (viz. the **non**-aggregated cost) that must be reflected in the weight.
We should note that fullnodes are really miners with 0 hashpower, and any cost you impose on miners is a cost you impose on all fullnodes.

If you want to aggregate, you might want to do that in a separate network that does ***not*** involve Bitcoin fullnodes, and possibly allow for some kind of extraction of fees to do aggregation, then have already-aggregated transactions in the Bitcoin mempool, so that fullnodes only need validate already-aggregated transactions.

Remember, validation is run when a transaction enters the mempool, and is **not** re-run when an in-mempool transaction is seen in a block (`blocksonly` of course does not follow this as it has no mempool, but most fullnodes are not `blocksonly`).
If you intend to aggregate transactions in the mempool, then at the worst case a fullnode will be validating every non-aggregated transaction, and that is what we want to limit by increasing the weight of heavy-validation transactions.

Regards,
ZmnSCPxj



* Re: [bitcoin-dev] proposal: new opcode OP_ZKP to enable ZKP-based spending authorization
  2023-05-02 15:01     ` ZmnSCPxj
@ 2023-05-04 15:31       ` Weiji Guo
  2023-05-04 17:13         ` ZmnSCPxj
  0 siblings, 1 reply; 8+ messages in thread
From: Weiji Guo @ 2023-05-04 15:31 UTC (permalink / raw)
  To: ZmnSCPxj; +Cc: Bitcoin Protocol Discussion

[-- Attachment #1: Type: text/plain, Size: 3913 bytes --]

Hi ZmnSCPxj,

I do mean to have specialized computing-power vendors, which may or may
not be miners. Optimizing ZKP computations is rather different from
Bitcoin mining, so I expect those vendors to come from more
research-driven teams focused on cryptographic engineering.

I am open on whether to put those transactions in the mempool or not. I
apologize for giving an inaccurate number earlier for the verification
cost: I just ran gnark-bench on my Mac M2, and it turns out Groth16
verification can be as fast as 1ms; for Plonk it is around 1.6ms. So it
seems even a common fullnode could handle thousands of OP_ZKP
transactions. In that case, ZKP transactions could be put into the
mempool and left open to aggregation by some vendor. Fullnodes should
verify these transactions as well; it does not seem a good idea to treat
them with special rules, as there is no guarantee that a given OP_ZKP
transaction will be aggregated or recursively verified. Of course, the
weighting should be well benchmarked and calculated. The cost of
*standalone* OP_ZKP transactions might be higher due to more data and/or
a higher weighting. This incentivizes vendors to develop
aggregation/recursive verification services to drive down fee
requirements and profit from doing so (fee extraction). I also expect an
open market where various vendors compete against each other, so it
makes sense to have these transactions openly visible to all
participants.

Meanwhile, some transactions are meant to be off-chain. For example, a
would-be smart contract could aggregate many related transactions into
one OP_ZKP transaction. Those aggregated transactions should *not* be
transmitted within the Bitcoin network; they might not even be valid
Bitcoin transactions. Usually the smart contract operator or its
community would host such a service.

Consider a potential situation in a few years: there are thousands of
active smart contracts based on OP_ZKP, and each block contains a few
hundred OP_ZKP transactions, each of which aggregates or recursively
verifies many transactions. The effective TPS of the Bitcoin network
could then far exceed the current value, reaching thousands or more.
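A back-of-envelope version of this projection, using the benchmark figures above (purely illustrative; real numbers depend on hardware, scheme, and proof contents):

```python
def block_verify_seconds(zkp_txs: int, ms_per_verify: float) -> float:
    # Serial verification cost of the ZKP transactions in one block.
    return zkp_txs * ms_per_verify / 1000.0

def effective_tps(zkp_txs_per_block: int, txs_verified_per_proof: int,
                  block_interval_s: int = 600) -> float:
    # If each OP_ZKP transaction recursively verifies many off-chain
    # transactions, effective throughput scales with that multiplier.
    return zkp_txs_per_block * txs_verified_per_proof / block_interval_s
```

For example, 300 OP_ZKP transactions per block, each recursively verifying 100 off-chain transactions, would yield an effective 50 TPS while costing well under a second of Groth16 verification per block at the ~1ms figure.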

Hope this clarifies.
Weiji

On Tue, May 2, 2023 at 11:01 PM ZmnSCPxj <ZmnSCPxj@protonmail•com> wrote:

>
> Good morning Weiji,
>
> > Meanwhile, as we can potentially aggregate many proofs or recursively
> verify even more, the average cost might still be manageable.
>
> Are miners supposed to do this aggregation?
>
> If miners do this aggregation, then that implies that all fullnodes must
> also perform the **non**-aggregated validation as transactions flow from
> transaction creators to miners, and that is the cost (viz. the
> **non**-aggregated cost) that must be reflected in the weight.
> We should note that fullnodes are really miners with 0 hashpower, and any
> cost you impose on miners is a cost you impose on all fullnodes.
>
> If you want to aggregate, you might want to do that in a separate network
> that does ***not*** involve Bitcoin fullnodes, and possibly allow for some
> kind of extraction of fees to do aggregation, then have already-aggregated
> transactions in the Bitcoin mempool, so that fullnodes only need validate
> already-aggregated transactions.
>
> Remember, validation is run when a transaction enters the mempool, and is
> **not** re-run when an in-mempool transaction is seen in a block
> (`blocksonly` of course does not follow this as it has no mempool, but most
> fullnodes are not `blocksonly`).
> If you intend to aggregate transactions in the mempool, then at the worst
> case a fullnode will be validating every non-aggregated transaction, and
> that is what we want to limit by increasing the weight of heavy-validation
> transactions.
>
> Regards,
> ZmnSCPxj
>



* Re: [bitcoin-dev] proposal: new opcode OP_ZKP to enable ZKP-based spending authorization
  2023-05-04 15:31       ` Weiji Guo
@ 2023-05-04 17:13         ` ZmnSCPxj
  2023-05-05 23:06           ` Weiji Guo
  0 siblings, 1 reply; 8+ messages in thread
From: ZmnSCPxj @ 2023-05-04 17:13 UTC (permalink / raw)
  To: Weiji Guo; +Cc: Bitcoin Protocol Discussion

Good morning Weiji,

The issue here is that non-aggregated transactions are a potential attack vector.

As the network is pseudonymous, an anonymous attacker can flood the fullnode mempool network with large numbers of non-aggregated transactions, then in cooperation with a miner confirm a single aggregated transaction with lower feerate than what it put in the several non-aggregated transactions.
The attacker ends up paying lower for the single confirmed transaction, even though it cost the fullnode network a significant amount of CPU to process and validate all the non-aggregated transactions.

Once the single aggregate transaction is confirmed, the fullnodes will remove the non-aggregated transactions from the mempool, clearing out their mempool limit.
Then the attacker can once again flood the fullnode mempool network with more non-aggregated transactions, and again repeat with an aggregated transaction that pays below the total of the non-aggregated transactions, repeatedly increasing the load on the mempool.

Thus, we should really make transactions that could appear in the mempool non-aggregatable with other transactions in the mempool.
You should arrange for aggregation before the blockchain-level transaction hits the mempool.

One can compare cross-input signature aggregation designs.
Signature aggregation is only allowed within a single blockchain-level transaction, not across transactions, precisely so that a transaction that appears in the mempool cannot have its signatures aggregated with other transactions, and preventing the above attack.
Anyone trying to take advantage of signature aggregation needs to cooperatively construct the blockchain-level transaction outside of the mempool with other cooperating actors, all of which perform the validation themselves before anything hits the mempool.

Similarly I can imagine that cross-input ZKP aggregation would be acceptable, but not cross-transaction ZKP aggregation.
(And if you want to push for ZKP aggregation, you should probably push for cross-input signature aggregation first, as you would probably need to solve similar problems in detail and I imagine signature aggregation is simpler than general ZKP aggregation.)

Always expect that the blockchain and its supporting network is attackable.
Do ***NOT*** focus on blocks --- focus on the load on the mempool (the block weight limit is a limit on the mempool load, not a limit on the block CPU load!).
The mempool is a free service, we should take care not to make it abusable.
On the other hand, blockspace is a paid service, so load on it is less important; it is already paid for.
I strongly recommend **DISALLOWING** aggregation of ZKPs once a transaction is in a form that could potentially hit the mempool, and to require paid services for aggregation, outside of the unpaid, free mempool.

Regards,
ZmnSCPxj



* Re: [bitcoin-dev] proposal: new opcode OP_ZKP to enable ZKP-based spending authorization
  2023-05-04 17:13         ` ZmnSCPxj
@ 2023-05-05 23:06           ` Weiji Guo
  2023-05-06  2:51             ` ZmnSCPxj
  0 siblings, 1 reply; 8+ messages in thread
From: Weiji Guo @ 2023-05-05 23:06 UTC (permalink / raw)
  To: ZmnSCPxj; +Cc: Bitcoin Protocol Discussion

[-- Attachment #1: Type: text/plain, Size: 5079 bytes --]

Hi ZmnSCPxj,

> As the network is pseudonymous, an anonymous attacker can flood the
> fullnode mempool network with large numbers of non-aggregated transactions,
> then in cooperation with a miner confirm a single aggregated transaction
> with lower feerate than what it put in the several non-aggregated
> transactions.

Arguably this is hardly a feasible attack. Suppose the attacker creates
1000 such transactions and attaches a small fee X to each. The total fee of
1000*X is collectible by the aggregation vendor, who pays the miner a fee
Y. We can reasonably assume that 1000*X is much larger than Y, while X is
much smaller than Y. Note that Y is already much larger than the regular
fee for other transactions, since the aggregated transaction contains many
inputs and many outputs and is therefore very large.

Now, the attacker has to generate proofs for all 1000 transactions, which
is non-trivial, and pay the 1000*X upfront. The aggregation vendor has to
spend extra computing power on the aggregation (or recursive verification)
and takes (1000*X - Y) as profit. The miner gets Y.
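
The fee arithmetic above can be made concrete with a quick back-of-the-envelope sketch; the actual numbers here are illustrative assumptions, not figures from the thread.

```python
# Illustrative back-of-the-envelope for the attack economics.
# X = fee per flooded transaction, Y = fee the aggregator pays the miner.
# All concrete values below are assumed for illustration only.

n = 1000          # non-aggregated transactions flooded by the attacker
x = 200           # sats of fee attached to each one (assumed)
y = 50_000        # fee paid to the miner for the aggregated tx (assumed, Y >> X)

attacker_upfront = n * x      # attacker must fund every fee up front: 200_000
vendor_profit = n * x - y     # aggregation vendor's margin: 150_000
miner_revenue = y             # 50_000

# The argument's stated assumptions:
assert attacker_upfront > y   # 1000*X is much larger than Y
assert x < y                  # X is much smaller than Y
```

With these (assumed) numbers the vendor keeps a large margin, which is why the vendor has little incentive to collude, and the attacker bears the full 1000*X cost before anything confirms.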

Miners are unlikely to collude with the attacker, and I don't think the
vendor would either, given a profit of 1000*X - Y. The attacker could play
the vendor itself; even then it is not a trivial attack, after spending a
lot of computing power generating all the proofs and the
aggregation/recursion, and paying at least Y, which is itself non-trivial
given the transaction's size.

All that being said, let's focus on OP_ZKP for now and leave aggregation
and recursive verification for future discussion. I brought up the
scalability issue only to stress that there is room for further
improvement; the research and implementation might take much longer. As
far as I know, CISA (cross-input signature aggregation) is still
experimental. Again, thank you very much for the detailed analysis and
replies.

Regards,
Weiji

On Fri, May 5, 2023 at 1:13 AM ZmnSCPxj <ZmnSCPxj@protonmail•com> wrote:

> Good morning Weiji,
>
> The issue here is that non-aggregated transaction are a potential attack
> vector.
>
> As the network is pseudonymous, an anonymous attacker can flood the
> fullnode mempool network with large numbers of non-aggregated transactions,
> then in cooperation with a miner confirm a single aggregated transaction
> with lower feerate than what it put in the several non-aggregated
> transactions.
> The attacker ends up paying lower for the single confirmed transaction,
> even though it cost the fullnode network a significant amount of CPU to
> process and validate all the non-aggregated transactions.
>
> Once the single aggregate transaction is confirmed, the fullnodes will
> remove the non-aggregated transactions from the mempool, clearing out their
> mempool limit.
> Then the attacker can once again flood the fullnode mempool network with
> more non-aggregated transactions, and again repeat with an aggregated
> transaction that pays below the total of the non-aggregated transactions,
> repeatedly increasing the load on the mempool.
>
> Thus, we should really make transactions that could appear in the mempool
> non-aggregatable with other transactions in the mempool.
> You should arrange for aggregation before the blockchain-level transaction
> hits the mempool.
>
> One can compare cross-input signature aggregation designs.
> Signature aggregation is only allowed within a single blockchain-level
> transaction, not across transactions, precisely so that a transaction that
> appears in the mempool cannot have its signatures aggregated with other
> transactions, and preventing the above attack.
> Anyone trying to take advantage of signature aggregation needs to
> cooperatively construct the blockchain-level transaction outside of the
> mempool with other cooperating actors, all of which perform the validation
> themselves before anything hits the mempool.
>
> Similarly I can imagine that cross-input ZKP aggregation would be
> acceptable, but not cross-transaction ZKP aggregation.
> (And if you want to push for ZKP aggregation, you should probably push for
> cross-input signature aggregation first, as you would probably need to
> solve similar problems in detail and I imagine signature aggregation is
> simpler than general ZKP aggregation.)
>
> Always expect that the blockchain and its supporting network is attackable.
> Do ***NOT*** focus on blocks --- focus on the load on the mempool (the
> block weight limit is a limit on the mempool load, not a limit on the block
> CPU load!).
> The mempool is a free service, we should take care not to make it abusable.
> On the other hand, blockspace is a paid service, so load on it is less
> important; it is already paid for.
> I strongly recommend **DISALLOWING** aggregation of ZKPs once a
> transaction is in a form that could potentially hit the mempool, and to
> require paid services for aggregation, outside of the unpaid, free mempool.
>
> Regards,
> ZmnSCPxj
>


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [bitcoin-dev] proposal: new opcode OP_ZKP to enable ZKP-based spending authorization
  2023-05-05 23:06           ` Weiji Guo
@ 2023-05-06  2:51             ` ZmnSCPxj
  0 siblings, 0 replies; 8+ messages in thread
From: ZmnSCPxj @ 2023-05-06  2:51 UTC (permalink / raw)
  To: Weiji Guo; +Cc: Bitcoin Protocol Discussion

Good morning Weiji,


> Hi ZmnSCPxj,
> > As the network is pseudonymous, an anonymous attacker can flood the fullnode mempool network with large numbers of non-aggregated transactions, then in cooperation with a miner confirm a single aggregated transaction with lower feerate than what it put in the several non-aggregated transactions.
> 
> Arguably this is hardly a feasible attack. Suppose the attacker creates 1000 such transactions and attaches a small fee X to each. The total fee of 1000*X is collectible by the aggregation vendor, who pays the miner a fee Y. We can reasonably assume that 1000*X is much larger than Y, while X is much smaller than Y. Note that Y is already much larger than the regular fee for other transactions, since the aggregated transaction contains many inputs and many outputs and is therefore very large.
> 
> Now, the attacker has to generate proofs for all 1000 transactions, which is non-trivial, and pay the 1000*X upfront. The aggregation vendor has to spend extra computing power on the aggregation (or recursive verification) and takes (1000*X - Y) as profit. The miner gets Y.

The entire point is that there has to be a separate, paid aggregator, in order to ensure that the free mempool service is not overloaded.
Basically, keep the aggregation outside the mempool, not in the mempool.
If aggregation is paid for, that is indeed sufficient to stop the attack, as you noted.

Regards,
ZmnSCPxj


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2023-05-06  2:51 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-04-28  8:29 [bitcoin-dev] proposal: new opcode OP_ZKP to enable ZKP-based spending authorization Weiji Guo
2023-04-30  2:15 ` ZmnSCPxj
2023-05-01 12:46   ` Weiji Guo
2023-05-02 15:01     ` ZmnSCPxj
2023-05-04 15:31       ` Weiji Guo
2023-05-04 17:13         ` ZmnSCPxj
2023-05-05 23:06           ` Weiji Guo
2023-05-06  2:51             ` ZmnSCPxj

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox