public inbox for bitcoindev@googlegroups.com
From: Gloria Zhao <gloriajzhao@gmail•com>
To: eric@voskuil•org
Cc: Bitcoin Protocol Discussion
	<bitcoin-dev@lists•linuxfoundation.org>,
	Anthony Towns <aj@erisian•com.au>
Subject: Re: [bitcoin-dev] Package Relay Proposal
Date: Tue, 7 Jun 2022 18:44:45 +0100
Message-ID: <CAFXO6=LGX4zRN3rBPs89cgYKrM5H3kViR1QZRdMeyaS_HELPTQ@mail.gmail.com>
In-Reply-To: <001201d870ac$8d7a06a0$a86e13e0$@voskuil.org>

Hi Eric, aj, all,

Sorry for the delayed response. @aj I'm including some paraphrased points
from our offline discussion (thanks).

> Other idea: what if you encode the parent txs as a short hash of the
> wtxid (something like bip152 short ids? perhaps seeded per peer so
> collisions will be different per peer?) and include that in the inv
> announcement? Would that work to avoid a round trip almost all of the
> time, while still giving you enough info to save bw by deduping parents?

> As I suggested earlier, a package is fundamentally a compact block (or
> block) announcement without the header. Compact block (BIP152)
> announcement is already well-defined and widely implemented...

> Let us not reinvent the wheel and/or introduce accidental complexity. I
> see no reason why packaging is not simply BIP152 without the 'header'
> field, an updated protocol version, and the following sort of changes to
> names
Interestingly, "why not use BIP 152 shortids to save bandwidth?" is by far
the most common suggestion I hear (including offline feedback). Here's a
full explanation:

BIP 152 shortens transaction hashes (32 bytes) to shortids (6 bytes) to
save a significant amount of network bandwidth, which is extremely
important in block relay. However, this comes at the cost of additional
computation. There is no way to directly calculate a
transaction hash from a shortid; upon receipt of a compact block, a node is
expected to calculate the shortids of every unconfirmed transaction it
knows about to find the matches (BIP 152: [1], Bitcoin Core: [2]). This is
expensive but appropriate for block relay, since the block must have a
valid Proof of Work and new blocks only come every ~10 minutes. On the
other hand, if we require nodes to calculate shortids for every transaction
in their mempools every time they receive a package, we are creating a DoS
vector. Unconfirmed transactions don't need PoW and, to have a live
transaction relay network, we should expect nodes to handle transactions at
a high-ish rate (i.e. at least 1000s of times more transactions than
blocks). We can't pre-calculate or cache shortids for mempool transactions,
since the SipHash key depends on the block hash and a per-connection salt.
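
The cost described above can be illustrated with a minimal sketch. It is
not Bitcoin Core's code: keyed BLAKE2b from Python's hashlib stands in for
SipHash-2-4, and the mempool contents, header bytes, nonces, and the helper
names shortid_key, shortid, and match_shortids are all hypothetical. The
point it shows is that the key changes with every message, so the receiver
has to re-hash its whole mempool each time and cannot reuse earlier work.

```python
import hashlib

def shortid_key(block_header: bytes, nonce: bytes) -> bytes:
    """Derive a 16-byte key from SHA256(header || nonce), as BIP 152 does for SipHash."""
    return hashlib.sha256(block_header + nonce).digest()[:16]

def shortid(key: bytes, wtxid: bytes) -> bytes:
    """6-byte keyed hash. ASSUMPTION: keyed BLAKE2b stands in for SipHash-2-4."""
    return hashlib.blake2b(wtxid, key=key, digest_size=6).digest()

def match_shortids(key, announced, mempool_wtxids):
    """Map announced shortids back to wtxids; returns (matches, hashes_computed)."""
    # Every mempool entry must be hashed under this message's key.
    table = {shortid(key, w): w for w in mempool_wtxids}
    return [table.get(s) for s in announced], len(mempool_wtxids)

# Hypothetical mempool of 1000 transactions.
mempool = [hashlib.sha256(i.to_bytes(4, "little")).digest() for i in range(1000)]

# Two messages with different (made-up) headers and nonces yield different keys.
key_a = shortid_key(b"header-A", b"\x01" * 8)
key_b = shortid_key(b"header-B", b"\x02" * 8)

# A sender announces three transactions by shortid under key_a.
announced = [shortid(key_a, w) for w in mempool[:3]]
matches, hashes = match_shortids(key_a, announced, mempool)
```

Resolving three announced ids costs one hash per mempool entry here, and a
table built under key_a is of no use for a message keyed with key_b.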

Additionally, shortid calculation is not designed to prevent intentional
individual collisions. If we were to use these shortids to deduplicate
transactions we've supposedly already seen, we may have a censorship
vector. Again, these tradeoffs make sense for compact block relay (see
shortid section in BIP 152 [3]), but not package relay.
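
A toy sketch of the deduplication concern follows. It makes several
assumptions: the id is shortened to 2 bytes so that grinding finishes
quickly (real BIP 152 shortids are 6 bytes, so a targeted collision would
take on the order of 2^48 attempts), keyed BLAKE2b again stands in for
SipHash, the attacker is assumed to know the key, and toy_shortid,
grind_collision, and the byte strings are hypothetical.

```python
import hashlib
from itertools import count

TOY_BYTES = 2  # ASSUMPTION: shortened from 6 bytes so the search is fast

def toy_shortid(key: bytes, wtxid: bytes) -> bytes:
    """Truncated keyed hash; a stand-in for a SipHash-based shortid."""
    return hashlib.blake2b(wtxid, key=key, digest_size=TOY_BYTES).digest()

def grind_collision(key: bytes, target_wtxid: bytes) -> bytes:
    """Search for a different 32-byte id with the same toy shortid as the target."""
    target = toy_shortid(key, target_wtxid)
    for i in count():
        candidate = hashlib.sha256(b"attacker" + i.to_bytes(8, "little")).digest()
        if candidate != target_wtxid and toy_shortid(key, candidate) == target:
            return candidate

key = hashlib.sha256(b"hypothetical per-peer salt").digest()[:16]
honest = hashlib.sha256(b"honest parent").digest()
decoy = grind_collision(key, honest)

# A receiver that already holds the decoy and dedups purely by shortid
# would conclude it has the honest parent and never request it.
already_have = {toy_shortid(key, decoy)}
would_skip_honest = toy_shortid(key, honest) in already_have
```

In this sketch would_skip_honest ends up True, which is the censorship
outcome the paragraph above warns about.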

TLDR: DoSy if we calculate shortids on every package and censorship vector
if we use shortids for deduplication.

> Given this message there is no reason
> to send a (potentially bogus) fee rate with every package. It can only be
> validated by obtaining the full set of txs, and the only recourse is
> dropping (etc.) the peer, as is the case with single txs.

Yeah, I agree with this. Combined with the previous discussion with aj
(i.e. we can't accurately communicate the incentive compatibility of a
package without sending the full graph, and this whole dance is to avoid
downloading a few low-fee transactions in uncommon edge cases), I've
realized I should remove the fee + weight information from pkginfo. Yay for
less complexity!

Also, this might be pedantic, but I said something incorrect earlier and
would like to correct myself:

>> In theory, yes, but maybe it was announced earlier (while our node was
>> down?) or had dropped from our mempool or similar, either way we don't
>> have those txs yet.

I said "It's fine if they have Erlay, since a sender would know in advance
that B is missing and announce it as a package." But this isn't true since
we're only using reconciliation in place of flooding to announce
transactions as they arrive, not for rebroadcast, and we're not doing full
mempool set reconciliation. In any case, making sure a node receives the
transactions announced when it was offline is not something we guarantee,
not an intended use case for package relay, and not worsened by this.

Thanks for your feedback!

Best,
Gloria

[1]: https://github.com/bitcoin/bips/blob/master/bip-0152.mediawiki#cmpctblock
[2]: https://github.com/bitcoin/bitcoin/blob/master/src/blockencodings.cpp#L49
[3]: https://github.com/bitcoin/bips/blob/master/bip-0152.mediawiki#short-transaction-id-calculation

On Thu, May 26, 2022 at 3:59 AM <eric@voskuil•org> wrote:

> Given that packages have no header, the package requires identity in a
> BIP152 scheme. For example 'header' and 'blockhash' fields can be replaced
> with a Merkle root (e.g. "identity" field) for the package, uniquely
> identifying the partially-ordered set of txs. And use of 'getdata' (to
> obtain a package by hash) can be eliminated (not a use case).
>
> e
>
> > -----Original Message-----
> > From: eric@voskuil•org <eric@voskuil•org>
> > Sent: Wednesday, May 25, 2022 1:52 PM
> > To: 'Anthony Towns' <aj@erisian•com.au>; 'Bitcoin Protocol Discussion'
> > <bitcoin-dev@lists•linuxfoundation.org>; 'Gloria Zhao'
> > <gloriajzhao@gmail•com>
> > Subject: RE: [bitcoin-dev] Package Relay Proposal
> >
> > > From: bitcoin-dev <bitcoin-dev-bounces@lists•linuxfoundation.org> On
> > Behalf
> > > Of Anthony Towns via bitcoin-dev
> > > Sent: Wednesday, May 25, 2022 11:56 AM
> >
> > > So the other thing is what happens if the peer announcing packages to
> us
> > is
> > > dishonest?
> > >
> > > They announce pkg X, say X has parents A B C and the fee rate is
> garbage.
> > But
> > > actually X has parent D and the fee rate is excellent. Do we request
> the
> > > package from another peer, or every peer, to double check? Otherwise
> > we're
> > > allowing the first peer we ask about a package to censor that tx from
> us?
> > >
> > > I think the fix for that is just to provide the fee and weight when
> > announcing
> > > the package rather than only being asked for its info? Then if one peer
> > makes
> > > it sound like a good deal you ask for the parent txids from them,
> dedupe,
> > > request, and verify they were honest about the parents.
> >
> > Single tx broadcasts do not carry an advertised fee rate, however the
> > 'feefilter' message (BIP133) provides this distinction. This should be
> > interpreted as applicable to packages. Given this message there is no
> reason
> > to send a (potentially bogus) fee rate with every package. It can only be
> > validated by obtaining the full set of txs, and the only recourse is
> > dropping (etc.) the peer, as is the case with single txs. Relying on the
> > existing message is simpler, more consistent, and more efficient.
> >
> > > >> Is it plausible to add the graph in?
> > >
> > > Likewise, I think you'd have to have the graph info from many nodes if
> > you're
> > > going to make decisions based on it and don't want hostile peers to be
> > able to
> > > trick you into ignoring txs.
> > >
> > > Other idea: what if you encode the parent txs as a short hash of the
> wtxid
> > > (something like bip152 short ids? perhaps seeded per peer so collisions
> > will
> > > be different per peer?) and include that in the inv announcement? Would
> > > that work to avoid a round trip almost all of the time, while still
> giving
> > you
> > > enough info to save bw by deduping parents?
> >
> > As I suggested earlier, a package is fundamentally a compact block (or
> > block) announcement without the header. Compact block (BIP152)
> > announcement
> > is already well-defined and widely implemented. A node should never be
> > required to retain an orphan, and BIP152 ensures this is not required.
> >
> > Once a validated set of txs within the package has been obtained with
> > sufficient fee, a fee-optimal node would accept the largest subgraph of
> the
> > package that conforms to fee constraints and drop any peer that provides
> a
> > package for which the full graph does not.
> >
> > Let us not reinvent the wheel and/or introduce accidental complexity. I
> see
> > no reason why packaging is not simply BIP152 without the 'header' field,
> an
> > updated protocol version, and the following sort of changes to names:
> >
> > sendpkg
> > MSG_CMPCT_PKG
> > cmpctpkg
> > getpkgtxn
> > pkgtxn
> >
> > > > For a maximum 25 transactions,
> > > >23*24/2 = 276, seems like 36 bytes for a child-with-parents package.
> > >
> > > If you're doing short ids that's maybe 25*4B=100B already, then the
> above
> > is
> > > up to 36% overhead, I guess. Might be worth thinking more about, but
> > maybe
> > > more interesting with ancestors than just parents.
> > >
> > > >Also side note, since there are no size/count params,
> >
> > Size is restricted in the same manner as block and transaction
> broadcasts,
> > by consensus. If the fee rate is sufficient there would be no reason to
> > preclude any valid size up to what can be mined in one block (packaging
> > across blocks is not economically rational under the assumption that one
> > miner cannot expect to mine multiple blocks in a row). Count is
> incorporated
> > into BIP152 as 'shortids_length'.
> >
> > > > wondering if we
> > > >should just have "version" in "sendpackages" be a bit field instead of
> > > >sending a message for each version. 32 versions should be enough
> right?
> >
> > Adding versioning to individual protocols is just a reflection of the
> > insufficiency of the initial protocol versioning design, and that of the
> > various ad-hoc changes to it (including yet another approach in this
> > proposal) that have been introduced to compensate for it, though I'll
> > address this in an independent post at some point.
> >
> > Best,
> > e
> >
> > > Maybe but a couple of messages per connection doesn't really seem worth
> > > arguing about?
> > >
> > > Cheers,
> > > aj
> > >
> > >
> > > --
> > > Sent from my phone.
> > > _______________________________________________
> > > bitcoin-dev mailing list
> > > bitcoin-dev@lists•linuxfoundation.org
> > > https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
>
>
>
