public inbox for bitcoindev@googlegroups.com
 help / color / mirror / Atom feed
From: <eric@voskuil•org>
To: "'Suhas Daftuar'" <sdaftuar@gmail•com>,
	"'Bitcoin Protocol Discussion'"
	<bitcoin-dev@lists•linuxfoundation.org>
Subject: Re: [bitcoin-dev] Packaged Transaction Relay
Date: Tue, 4 Oct 2022 17:01:04 -0700	[thread overview]
Message-ID: <02fc01d8d84d$90a7b120$b1f71360$@voskuil.org> (raw)
In-Reply-To: <CAFp6fsF=fLVq4=PSEpK+4yD+SZ+uVMJLM616q3F--zcuuqL3pg@mail.gmail.com>

> Hi,
> 
> Thanks for sharing your thoughts on packaged transaction relay.

Hello, thanks for the reply.

>> The sole objective, as expressed in the OP proposal, is to:
>> "Propagate transactions that are incentive-compatible to mine, even
>> if they don't meet minimum feerate alone."
> 
> I actually do think there are additional goals we should include in any protocol
> change involving transaction relay, such as ensuring that we minimize
> bandwidth waste as much as possible (as I mentioned in a previous message
> in this thread).

Yes - there is always the presumption of an optimally-performing protocol (not limited to bandwidth), this is just a restatement from the OP.

The OP fails to eliminate orphan announcement, fails to prevent packages with insufficient fee from getting stuck in the same manner as txs (without explicitly re-announcing them again in an even larger package of higher feerate), and results in orphaned package announcements for the same reason (a static package is effectively just a larger tx).

Due to the resulting orphaning, a node must allow its peer to continue to broadcast unverifiable orphans to it, potentially chasing ancestry. So in addition to bandwidth waste, there is also an inherent problem of bandwidth DOS. These are problems specifically addressed by packaged relay.

[Regarding bandwidth waste: I've pointed out in years past that breaking the Bitcoin versioning scheme creates a requirement that any unknown message type be considered valid. Up until a recently-deployed protocol change, it had always been possible to validate messages by type. I noticed recently that validating nodes have been dropping peers at an increasing rate (a consequence of that deployment). Despite being an undocumented compatibility break, it is now unfortunately a matter of protocol that a peer must allow its peers to waste its bandwidth to remain compatible - something which we should eliminate.]

> While I understand your proposal seeks to improve on an idea of static
> packages in favor of dynamic package construction based on knowledge a
> node should have of its peers, I think the main drawback of your proposal is
> that it doesn't take into account the complexities of what a peer's "minimum
> feerate" might mean. The consequence of this is that it's not generally
> possible for a node to accurately guess whether a given transaction should
> be sent in a package to a given peer, or not, and so in addition to any
> "packaged transaction relay" mechanism that is implemented by a
> transaction announcer, we'd still need to add protocol support for a receiving
> peer to retrieve a package as well.

It is certainly possible that there is ambiguity in BIP133 (and BIPs that modify it). However it's not clear to me which such ambiguity you are referring to. There is no guessing proposed.

> First of all, a node's feerate is a dynamic value.  BIP 133 allows for nodes to
> send feefilter messages at any time during the lifetime of a peer connection.
> If we were to compare the feerate of ancestors of a relayed transaction to
> the feerate in place at a peer as indicated by feefilter messages, and use that
> determine whether those ancestors would have been successfully relayed or
> not, then doing so accurately would seem to require caching relay success for
> each transaction, for each peer, at the time such transaction is relayed (or
> perhaps caching feefilter values that are received from a peer?).  This seems
> likely to be undesirable,

This is a possible implementation. What makes it undesirable?

> and, at any rate, is more complex than merely comparing a pair of feerates.

There are no necessary protocol changes (though a new INV type is ideal), so the relative complexity you are implying could only arise from implementation. While implementation considerations are relevant, achieving simplicity in the protocol is presumably the priority. Further, implementation complexity must be considered from what is necessary to actually achieve the objectives, and not from the perspective of any given implementation.

Merely comparing a pair of feerates produces the problems described above, which includes not resolving the central problem, so this is an apples-to-oranges comparison. It's also more complex than doing nothing, but that also doesn't resolve the problem.

> But even more fundamental than feerates being dynamic is that feerate
> itself is not a well-defined concept when applied to arbitrary transaction
> (sub-)graphs, and this is at the crux of the difficulty in coming up with
> proposals that would meet the objective of ensuring that transactions which
> are incentive-compatible to mine all get relayed successfully across the
> network.  Here are a couple examples that illustrate this:
> 
> - Let A be a low feerate transaction (below any peer's minimum feerate).  Let
> B and C be descendants of A (but are otherwise unrelated).  Suppose these
> transactions are relayed to a node in the order A, B, C.  In the algorithm you
> proposed, I believe the determination for whether C should be announced
> to a given peer as a package (A, C) or as a singleton would either (1) depend
> on whether the package (A, B) was sufficient to meet the peer's feerate,

Yes

> or (2) waste bandwidth by engaging in packaged relay whenever A was already
> successfully relayed as part of a package.  Both of these approaches seem
> undesirable.
> 
> - Let A be a high fee, but low feerate transaction.

Low feerate means low fee (as high/low can only be relative to size), it's not clear how these can both be true.

>  Suppose B is a transaction
> that conflicts with A, has a high feerate, but lower total fee.  In this situation,
> two different nodes that learned of these two transactions in opposite order
> [(A, B) vs (B, A)] might be expected to have differing mempools -- this at
> least would be the case in the BIP 125 algorithm (which requires that both
> feerate and total fee must increase when replacing an existing transaction),

As a node knows which txs it has relayed to its peer, order of arrival is inconsequential.

> and at any rate it's not obvious from the information given which would be
> more optimal to include in a block, as that depends on factors that go beyond
> just these two transactions. 

Block inclusion in not pertinent, either may be included.

> Suppose further that a new transaction C is
> relayed on the network, which is a child of B and very high feerate, such that
> B + C have higher fee and higher feerate than A, when taken together.  In
> this case we'd want our relay protocol to ensure that even nodes which saw
> A first should still have a chance to consider transaction C, but the packaging
> design you propose (which would compare transaction B's feerate to the
> peer's, and conclude packaging is unnecessary because B's feerate might
> already exceed the peer's feerate) would not facilitate this.

Given that A and B are conflicts, A and B+C are conflicts. This is no different than the original conflict of A and B, and decided based on the required BIP125 fee increment. If A arrives first, B must be an increment over A, and vice versa. Upon the arrival of C, given prior acceptance of both A and B, B+C must be an increment over A, as the conflict arises from A/B.

> To summarize, the point I'd make from these examples is that we should not
> expect that "feerate" (whatever that means) alone will be a sufficient
> predictor of what is in our peer's mempools.

Predicting a peer's set of confirmable transactions ("mempool") is not the objective or a requirement. As far as I can tell it is sufficient to achieve requirements to the extent there are no unpublished policies.

Upon startup and connection to a peer, a node may have an empty mempool. In this case there will be no orphan or redundant announcements from the peer.

A node with a populated mempool may connect to a peer. In that case the peer will receive announcements from the peer for txs which it may already have. This will also occur due to multiple peer connections. Redundancy is not within the scope of this or the OP proposals (there are other proposals for limiting this redundancy in both scenarios to the extent desirable).

Once a node has been connected to the network for some amount of time, there will be no orphans or connection-specific redundancies announced to it. This provides a basis for the node to drop non-conforming peers (that support packaged relay), improving protection against bandwidth-wasting peers.

Any node which implements unpublished policies can expect to receive orphan announcements. This raises the question of whether the protocol should incorporate a facility for such a node to chase down orphans in the case where it is orphaning them by deleting their ancestors, even though it publishes that it accepts them based on feerate/rbf. This would imply that there is some other discretionary aspect of transactions that, as a matter of protocol, should be considered for relay.

Any such aspect internal to a tx would be economically-irrational to consider (which includes censorship), in which case it would seem preferrable to let such nodes simply accept the fact that they are creating orphans for themselves.

Any such aspect external to the tx would also be economically-irrational (mining wants the greatest possible fee opportunity), but may be a consequence of resource limitations. Storing confirmable txs in RAM constrains such resources by orders of magnitude, and an assumption that an implementation must do this would be an error (some do not). Yet storage remains finite, so this remains a possible though marginal consideration given the presumption that all accepted txs are both confirmable and satisfy feerate/rbf policy. In the case where a node is discarding previously accepted and yet-unconfirmed txs, the node accepts the possibility that this will result in it receiving orphan announcements.

The rational way to reduce the size of the mempool is to raise the published feerate, discarding txs that no longer conform. While this was not specifically described in the fairly informal proposal, the receipt of a reduced peer feerate message can cause a node to update its state for that peer, eliminating the possibility of orphan announcements to it.

>  So while there may be some
> situations where a transaction relayer might be able to usefully package up a
> transaction with its dependencies (perhaps in narrowly defined situations),
> there will also always be situations where this isn't possible, and what I
> conclude from that is that it should be helpful to add to the protocol some
> way for the recipient of a transaction to request the dependencies directly.

I don't think this has been shown, but finding such issues is of course why we discuss it.

> Taken together, I roughly understand Gloria's efforts here to be a
> combination of these two approaches: add some amount of packaged
> transaction relay for simple cases (ie where the transaction graph has been
> sufficiently narrowed, to minimize bandwidth waste while reducing latency),
> and also allow for a fallback mechanism where the recipient of a transaction
> can efficiently retrieve dependencies.  There seems to be a tradeoff
> involving latency, bandwidth, and robustness of these two approaches (and
> maybe also implementation complexity), so I think it's natural to expect that
> it will take some discussion and understanding of what practices are common
> on the network and what behaviors wallet or other software might adopt,
> given potential protocol changes, to figure out how best to balance these
> ideas.

The problems that I see with static packaging are:

1) Does not fully resolve the problem - a static package itself can get stuck for the same reason as a tx.
2) Requires wallet formation of a static packages - to resolve what is fundamentally a network issue.
3) Adds a complex new protocol to achieve what can be accomplished with almost no protocol change.
4) Is not bandwidth optimal, as it continues to relay/chase orphans (singular and packaged).
5) Is not DOS optimal, as it requires allowance for orphans and redundancies.

Regarding complexity - searching and marking a DAG is basic CS, and there isn't much else to the necessary implementation. There are no new messages, just a version bump and a preferably a new INV type. In terms of performance and therefore any possible increase to relay latency, this is easily measured in terms of complexity. Whether this would be a latency increase or decrease in relation to the OP is unclear.

The complexity of the search for is linear in the size of the mempool subgraph (1) induced by the traversal of ancestors from the potential package's last tx, and (2) reduced by the set of txs known to the peer. (2) includes both txs sent to and received from the peer. This would appear to be a marginal cost.

A draft of the search algorithm is available here:

https://github.com/libbitcoin/libbitcoin-network/wiki/Iterative-Channel-Package-Computation

Best,
e



  reply	other threads:[~2022-10-05  0:01 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-08 22:43 eric
2022-09-26 17:50 ` alicexbt
2022-09-26 21:19   ` eric
2022-09-27  9:29     ` alicexbt
2022-10-04 15:15 ` Suhas Daftuar
2022-10-05  0:01   ` eric [this message]
2022-10-05  6:55     ` Anthony Towns
     [not found] <A485FF21-3B14-49B4-BC53-99AFAA90E38D@voskuil.org>
2022-09-27 19:21 ` Eric Voskuil
2022-10-05 20:43 Eric Voskuil
2022-10-06  4:32 ` eric
2022-10-07  6:31   ` Anthony Towns
2022-10-08 19:58     ` eric
2022-10-09  5:52       ` Anthony Towns
2022-10-09  7:00         ` eric
2022-10-09 13:27           ` Anthony Towns
2022-10-10 22:05             ` eric

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='02fc01d8d84d$90a7b120$b1f71360$@voskuil.org' \
    --to=eric@voskuil$(echo .)org \
    --cc=bitcoin-dev@lists$(echo .)linuxfoundation.org \
    --cc=sdaftuar@gmail$(echo .)com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox