(Apologies for the double-post -- I'm resending this message to the list with much of the quoted text trimmed, because my first message was placed in the moderation queue for being too large)

Hi,

Thanks for sharing your thoughts on packaged transaction relay.

The sole objective, as expressed in the OP proposal, is to:
"Propagate transactions that are incentive-compatible to mine, even if they don't meet minimum feerate alone."
 
I actually do think there are additional goals we should include in any protocol change involving transaction relay, such as ensuring that we minimize bandwidth waste as much as possible (as I mentioned in a previous message in this thread).

While I understand your proposal seeks to improve on an idea of static packages in favor of dynamic package construction based on knowledge a node should have of its peers, I think the main drawback of your proposal is that it doesn't take into account the complexities of what a peer's "minimum feerate" might mean.  The consequence of this is that it's not generally possible for a node to accurately guess whether a given transaction should be sent in a package to a given peer, or not, and so in addition to any "packaged transaction relay" mechanism that is implemented by a transaction announcer, we'd still need to add protocol support for a receiving peer to retrieve a package as well.

First of all, a node's feerate is a dynamic value.  BIP 133 allows for nodes to send feefilter messages at any time during the lifetime of a peer connection.  If we were to compare the feerate of ancestors of a relayed transaction to the feerate in place at a peer as indicated by feefilter messages, and use that determine whether those ancestors would have been successfully relayed or not, then doing so accurately would seem to require caching relay success for each transaction, for each peer, at the time such transaction is relayed (or perhaps caching feefilter values that are received from a peer?).  This seems likely to be undesirable, and, at any rate, is more complex than merely comparing a pair of feerates.

But even more fundamental than feerates being dynamic is that feerate itself is not a well-defined concept when applied to arbitrary transaction (sub-)graphs, and this is at the crux of the difficulty in coming up with proposals that would meet the objective of ensuring that transactions which are incentive-compatible to mine all get relayed successfully across the network.  Here are a couple examples that illustrate this:

- Let A be a low feerate transaction (below any peer's minimum feerate).  Let B and C be descendants of A (but are otherwise unrelated).  Suppose these transactions are relayed to a node in the order A, B, C.  In the algorithm you proposed, I believe the determination for whether C should be announced to a given peer as a package (A, C) or as a singleton would either (1) depend on whether the package (A, B) was sufficient to meet the peer's feerate, or (2) waste bandwidth by engaging in packaged relay whenever A was already successfully relayed as part of a package.  Both of these approaches seem undesirable.

- Let A be a high fee, but low feerate transaction.  Suppose B is a transaction that conflicts with A, has a high feerate, but lower total fee.  In this situation, two different nodes that learned of these two transactions in opposite order [(A, B) vs (B, A)] might be expected to have differing mempools -- this at least would be the case in the BIP 125 algorithm (which requires that both feerate and total fee must increase when replacing an existing transaction), and at any rate it's not obvious from the information given which would be more optimal to include in a block, as that depends on factors that go beyond just these two transactions.  Suppose further that a new transaction C is relayed on the network, which is a child of B and very high feerate, such that B + C have higher fee and higher feerate than A, when taken together.  In this case we'd want our relay protocol to ensure that even nodes which saw A first should still have a chance to consider transaction C, but the packaging design you propose (which would compare transaction B's feerate to the peer's, and conclude packaging is unnecessary because B's feerate might already exceed the peer's feerate) would not facilitate this.

To summarize, the point I'd make from these examples is that we should not expect that "feerate" (whatever that means) alone will be a sufficient predictor of what is in our peer's mempools.  So while there may be some situations where a transaction relayer might be able to usefully package up a transaction with its dependencies (perhaps in narrowly defined situations), there will also always be situations where this isn't possible, and what I conclude from that is that it should be helpful to add to the protocol some way for the recipient of a transaction to request the dependencies directly.

Taken together, I roughly understand Gloria's efforts here to be a combination of these two approaches: add some amount of packaged transaction relay for simple cases (ie where the transaction graph has been sufficiently narrowed, to minimize bandwidth waste while reducing latency), and also allow for a fallback mechanism where the recipient of a transaction can efficiently retrieve dependencies.  There seems to be a tradeoff involving latency, bandwidth, and robustness of these two approaches (and maybe also implementation complexity), so I think it's natural to expect that it will take some discussion and understanding of what practices are common on the network and what behaviors wallet or other software might adopt, given potential protocol changes, to figure out how best to balance these ideas.

On Wed, Jun 8, 2022 at 6:43 PM <eric@voskuil.org> wrote:
Hi Suhas/Gloria,

Good questions. I've started a new thread because it became something else...

Various ideas about packaging seem to be focused on the idea of an atomic message that is gossiped around the network like a transaction or block. From my perspective that seems to create a set of problems without good solutions, and it is not a proper analogy to those atomic structures. It may be worth taking the time to step back and take a close look at the underlying objective.

The sole objective, as expressed in the OP proposal, is to:

"Propagate transactions that are incentive-compatible to mine, even if they don't meet minimum feerate alone."

Effectively producing this outcome with an atomic packaging approach while at the same time maintaining network invariants seems unlikely, if not impossible.

Fees:

A node knows what fee rate a peer will accept, and announces individual txs that satisfy peer.feerate. Similarly a node knows its own feerate, and SHOULD drop any peer that announces txs that do not satisfy node.feerate.

Orphans:

A node MAY drop a peer that announces txs that the node sees as orphans against its DAG. It SHOULD drop the orphan tx and MAY request missing ancestors. Presumably after some amount of time connected to peer, node does not expect to see any more orphans from that peer, so these choices could evolve with the channel. However, the design that can only consider each tx in isolation will continue to cause orphan announcements on the channel. A below peer.feerate tx does not get announced to peer, and later a descendant high peer.feerate does get announced to the peer - as an orphan.

BIP133 (feefilter):

"There could be a small number of edge cases where a node's mempool min fee is actually less than the filter value a peer is aware of and transactions with fee rates between these values will now be newly inhibited."

https://github.com/bitcoin/bips/blob/master/bip-0133.mediawiki

Whether the problem is "small" or not depends on the disparity between node fee rates, which is not a matter of protocol. This is an existing problem that can and should be dealt with in packaging, as part of the above objective.

Packaged Transaction Relay:

One might instead think of packaging as a per-connection function, operating over its transaction (input->output) DAG and the feerate of its own node and that of the peer. Logically a "package" is nothing more than a set of transactions (optimized by announcement). Only a node can effectively determine the packaging required by each of its peers, since only the node is aware of peer.feerate.

The only way to avoid dead-ending packages (including individual transactions, as is the objective) is for a node to package txs for each peer. The origination of any package is then just a wallet peer doing what a node does - packaging transactions that satisfy peer.feerate (i.e. that of its node).

Current transaction relay (txB->txA):
===============================
Node0
txA.feerate > node.feerate, and not orphaned (accept txA)
txA.feerate > peer1.feerate (announce txA to peer1)
txA.feerate < peer2.feerate (do not announce txA to peer2)
-----
txB.feerate > node.feerate (accept txB)
txB.feerate > peer1.feerate (announce txB to peer1)
txB.feerate > peer2.feerate (announce txB to peer2)

Node1
Sees/accepts txA and txB.

Node2
Never sees txA, sees/rejects txB (as an orphan).

Packaged transaction relay (txB->txA):
===============================
Node0
txA.feerate > node.feerate, and not orphaned (accept txA)
txA.feerate > peer1.feerate (announce txA to peer1)
txA.feerate < peer2.feerate (do not announce txA to peer2)
-----
txB.feerate > node1.feerate (accept txB)
txB.feerate > peer1.feerate (announce txB to peer1)
txB.feerate > peer2.feerate (do not announce txB to peer2) <== avoid predictable orphan
txA.feerate + txB.feerate > peer2.feerate (announce pkg(A, B) to peer2) <= create minimal package

Node1
Sees/accepts txA and txB.

Node2
pkg(A, B) > node2.feerate (accept txA, txB)
txA.feerate > peer3.feerate (announce txA to peer3)
txB.feerate > peer3.feerate (announce txB to peer3)

Sees/accepts pkg(A, B).

Node3
Sees/accepts txA and txB. <= avoided unnecessary packaging

Summary:

In this design, any node that receives an announcement for a pkg (or tx) later determined to be less than node.feerate SHOULD drop the announcing peer. Unlike with existing tx relay, a node can become "current" and subsequently see few if any tx or pkg orphans, and MAY at some point decide to drop any peer that announces one. Notice that packages are created dynamically, and any package that doesn't need to be grouped gets trimmed down to individual transactions. Furthermore any tx that is "stuck" can be freed by simply sending another tx. The nodes at which the tx has become stuck will just package it up and relay it to peers. In other words, there is no impact on wallet implementation apart from raising the aggregate fee using a descendant transaction.

This is barely a protocol change - it's primarily implementation. All that should be required is an additional INV element type, such as MSG_TX_PACKAGE.

Additional constraints:

* All elements of MSG_TX_PACKAGE in one INV message MUST to be of the same package.
* A package MUST must define a set that can be mined into one block (size/sigops constraint).
* A package SHOULD not contain confirmed txs (a race may cause this).
* A package MUST minimally satisfy peer.feerate.
* A partial tx order, as in the manner of the block.txs ordering, MUST be imposed.
* A node SHOULD drop a peer that sends a package (or tx) below node.feerate.
* A node MAY drop a peer that sends a non-minimal package according to node.feerate.

The partial ordering of block.txs introduces an ordering constraint that precludes full parallelism in validating input attachment. This is an implementation artifact that made its way into consensus. However in the case of packaging, the set of txs is not presumed to be valid under the proof of work DoS guard. As such constraints should minimize the work/traffic required to invalidate the message. The partial order constraint ensures that the DAG can be built incrementally, dropping the attempt (and peer as desired) as soon as the first orphan is discovered. As a result the network traffic and work required is not materially different than with tx relay, with two exceptions.

These are the two central aspects of this approach (Avoiding Predictable Orphans and Creating Minimal Packages). These are graph search algorithms, some basic computer science. Minimality requires only that the package does not introduce txs that are not necessary to reach the peer.feerate (as these can always be packaged separately). It does not require that nodes all generate the same packages. It does not require negotiation, package identity, cryptography, or hashing. As a graph search it should be O(n) where n is the unconfirmed ancestry of the package, but should typically be much lower, if not a single step.

Sufficiently-low-fee nodes will see only single txs. Moderate-fee nodes may cause partial breakup of packages. Sufficiently high fee nodes will cause peers (having received and completed the acceptance of a tx/pkg with pkg.feerate < peer.feerate) to navigate from each tx/package external input until reaching txs above peer.feerate, or confirmed (both of which the peer is presumed to already have). If the pkg.feerate is sufficiently high to connect all external inputs to the intervening txs, they are added to the package and it is announced to the high fee peer. Note that the individual tx.feerate > peer.feerate is insufficient to ensure that the peer should have the tx, as there may be ancestor txs that do not, and for which the tx was insufficient to cause them to be packaged. So a non-caching algorithm must be able to chase each package external input to a confirmed tx (or cache the unconfirmed ancestry fee rate at each tx). Note that fee rates are not directly additive, both size/weight and fee are required for summation (and aggregate sigops should be considered).

This makes no assumptions about current implementations. The design would call for maintenance of a transaction (input->output) DAG with tx.feerate on each tx. This could be the unconfirmed tx graph (i.e. "memory pool") though it does not require maintenance of anything more than the parameters necessary to confirm a set of validated txs within a block. It is very reasonable to require this of any participating node. A simple version negotiation can identify a package-accepting/sending nodes.

I have thought about this for some time, but have not implemented either the graph search, source code, or BIP. Just wrote this off the top of my head. So I am sure there are some things I have incorrect or failed to consider. But I think it's worth discussing it at this point.

e