Re: [bitcoin-dev] death to the mempool, long live the mempool

From: Gloria Zhao <gloriajzhao@gmail•com>
To: lisa neigut <niftynei@gmail•com>,
	 Bitcoin Protocol Discussion
	<bitcoin-dev@lists•linuxfoundation.org>
Subject: Re: [bitcoin-dev] death to the mempool, long live the mempool
Date: Tue, 26 Oct 2021 19:16:51 +0100
Message-ID: <CAFXO6=Jk0MAqQ6u5JCrpC3eMv=bF3DT6wH6Y60zb_b-beU4mcg@mail.gmail.com>
In-Reply-To: <CAM1a7P04apCwwmqNp4VLRam5_uk59tVRWv74UVD_g0-DUGNghw@mail.gmail.com>

Hi Lisa,

Some background for people who are not familiar with the mempool:

The mempool is a cache of unconfirmed transactions, designed in a way
to help miners efficiently pick the highest feerate packages to
include in new blocks. It stores more than a block's worth of
transactions because transaction volume fluctuates and any rational
miner would try to maximize the fees in their blocks; in a reorg, we
also don't want to completely forget what transactions were in the
now-stale tip.
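
To make "highest feerate packages" concrete, here is a minimal sketch
of greedy selection by ancestor-package feerate. It is a simplified
illustration, not Bitcoin Core's actual block assembly code; the Tx
class, the function names and the numbers are all made up for the
example.

```python
from dataclasses import dataclass, field

@dataclass
class Tx:
    fee: int      # satoshis
    vsize: int    # virtual bytes
    parents: list = field(default_factory=list)  # txids of unconfirmed parents

def unselected_ancestors(txid, pool, selected):
    """Return txid plus every unconfirmed ancestor not already selected."""
    seen, stack = set(), [txid]
    while stack:
        cur = stack.pop()
        if cur in seen or cur in selected:
            continue
        seen.add(cur)
        stack.extend(pool[cur].parents)
    return seen

def build_template(pool, max_vsize):
    """Greedily add the highest-feerate ancestor package that still fits."""
    selected, used, total_fee = set(), 0, 0
    while True:
        best = None  # (feerate, package, fee, vsize)
        for txid in pool:
            if txid in selected:
                continue
            pkg = unselected_ancestors(txid, pool, selected)
            fee = sum(pool[t].fee for t in pkg)
            vsize = sum(pool[t].vsize for t in pkg)
            if used + vsize > max_vsize:
                continue
            if best is None or fee / vsize > best[0]:
                best = (fee / vsize, pkg, fee, vsize)
        if best is None:
            return selected, total_fee
        _, pkg, fee, vsize = best
        selected |= pkg
        used += vsize
        total_fee += fee

# Hypothetical mempool contents.
pool = {
    "A": Tx(fee=100, vsize=100),                 # 1 sat/vB on its own
    "B": Tx(fee=900, vsize=100, parents=["A"]),  # child of A; A+B pays 5 sat/vB
    "C": Tx(fee=300, vsize=100),                 # 3 sat/vB
}
selected, total_fee = build_template(pool, max_vsize=200)
```

With room for only 200 vB, the A+B package (5 sat/vB) is chosen over C
(3 sat/vB), even though A alone has the lowest feerate of the three.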

In Bitcoin Core, full nodes keep a mempool by default. The additional
requirements for keeping a mempool are kept minimal (300MB of memory by
default, configurable to be lower) because anyone, anywhere in the
world, should be able to run a node and broadcast a Bitcoin payment
without special connectivity to some specific set of people or
expensive/inaccessible hardware. Connecting directly to miners may be a
solution for some people, but I don't think it's healthy for the
network.

Some benefits of keeping a mempool as a non-mining node include:
- Fee estimation based on your node's knowledge of unconfirmed
transactions + historical data.
- Dramatically increased block validation (and thus propagation)
speed, since you cache signature and script results of transactions
before they are confirmed.
- Reduced block relay bandwidth usage (Bitcoin Core nodes use BIP152
compact block relay), as you don't need to re-request the block
transactions you already have in your mempool.
- Wallet ability to send/receive transactions that spend unconfirmed outputs.
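
As a rough illustration of the block relay point, the sketch below
shows which transactions a node would still have to request when a new
block is announced. It is deliberately simplified: real BIP152 matching
uses short transaction IDs rather than full txids, and the function
name and data here are hypothetical.

```python
def missing_from_mempool(block_txids, mempool_txids):
    """Return the block positions of transactions we still need to fetch."""
    return [i for i, txid in enumerate(block_txids) if txid not in mempool_txids]

# Hypothetical block: a coinbase plus three transactions.
block = ["coinbase", "a", "b", "c"]

# A node with a mempool that has already seen "a" and "b".
with_mempool = missing_from_mempool(block, {"a", "b"})

# A node with no mempool has to fetch everything.
without_mempool = missing_from_mempool(block, set())
```

The more of the block you already hold in your mempool, the shorter the
list of transactions you need to request.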

> I had the realization that the mempool is obsolete and should be eliminated.

I assume you mean that the mempool should still exist but be turned
off for non-mining nodes. A block template producer needs to keep
unconfirmed transactions somewhere.
In Bitcoin Core today, you can use the -blocksonly config option to
ignore incoming transactions (effectively switching off your mempool),
but there are strong reasons not to do this:
- It is trivial for your peers to detect that every transaction
broadcast by your node comes from your own wallet. Linking your node
to your transactions is a very bad privacy leak.
- You must query someone else for what feerate to put on your transaction.
- You can't use BIP152 compact block relay, so your network bandwidth
usage spikes at every block. You also can't cache validation results,
so your block validation speed slows down.

> Removing the mempool would greatly reduce the bandwidth requirement for running a node...

If you're having problems with your individual node's bandwidth usage,
you can also decrease the number of connections you make or turn off
incoming connections. There are efforts to reduce transaction relay
bandwidth usage network-wide [1].

> Removing the mempool would... keep intentionality of transactions private until confirmed/irrevocable...

I'm confused - what is the purpose of keeping a transaction private
until it confirms? Don't miners still see your transaction? A
confirmed transaction is not irrevocable; reorgs happen.

> Removing the mempool would... naturally resolve all current issues inherent in package relay and rbf rules.

Removing the mempool does not help with this. How does a miner decide
whether a conflicting transaction is an economically-advantageous
replacement or a DoS attack? How do you submit your CPFP if the parent
is below the mempool minimum feerate? Do miners already have a
different relay/mempool implementation that handles all of these
problems but that they aren't sharing with the rest of the community?
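
The CPFP question can be put in numbers. The sketch below uses
illustrative figures and a hypothetical helper (not a Bitcoin Core
function) to show a parent that fails the minimum feerate on its own,
while the parent+child package clears it easily.

```python
def meets_min_feerate(fee_sat, vsize_vb, min_feerate_sat_per_kvb):
    """Compare fee/vsize against a sat/kvB threshold using integers only."""
    return fee_sat * 1000 >= min_feerate_sat_per_kvb * vsize_vb

MEMPOOL_MIN = 1000  # sat/kvB, i.e. 1 sat/vB (assumed value for the example)

parent = {"fee": 100, "vsize": 200}   # 0.5 sat/vB
child = {"fee": 1650, "vsize": 150}   # 11 sat/vB, spends the parent's output

parent_alone_ok = meets_min_feerate(parent["fee"], parent["vsize"], MEMPOOL_MIN)
package_ok = meets_min_feerate(
    parent["fee"] + child["fee"],       # 1750 sat
    parent["vsize"] + child["vsize"],   # 350 vB, so 5 sat/vB as a package
    MEMPOOL_MIN,
)
```

Whoever receives these transactions, a miner included, still needs
rules for evaluating them together rather than one at a time.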

> Initial feerate estimation would need to be based on published blocks, not pending transactions (as this information would no longer be available), or from direct interactions with block producers.

There are many reasons why using only published blocks for fee
estimates is a flawed design, including:

- The miner of a block can artificially inflate the feerates observed
in that block simply by including a few of their own transactions
that pay extremely high feerates. This costs them nothing, as they
collect the fees themselves.
- A miner constructs a block based on the transactions in their
mempool. Your transaction's feerate may have been enough to be
included 2 blocks ago or a week ago, but it will be compared to the
other unconfirmed transactions available to the miner now. They can
tell you what's in their mempool or what the next-block feerate is,
but you would be a fool to believe them.
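
To illustrate the first point, here is a toy estimator that takes the
median feerate of the latest block. The estimator and the numbers are
assumptions made up for the example; this is not how Bitcoin Core's fee
estimation works.

```python
from statistics import median

def naive_block_estimate(block_feerates):
    """Toy estimator: the median feerate (sat/vB) seen in the latest block."""
    return median(block_feerates)

# Feerates of the transactions other users paid for.
organic = [2, 3, 3, 4, 5]

# The miner adds six of their own transactions at 500 sat/vB.
# The fees flow back to the miner through the coinbase.
stuffed = organic + [500] * 6

honest_estimate = naive_block_estimate(organic)
manipulated_estimate = naive_block_estimate(stuffed)
```

An estimator that only looks at confirmed blocks has no way to tell the
miner's self-paying transactions apart from real demand.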

See also [2],[3].

> Provided the number of block template producing actors remains beneath, say 1000, it’d be quite feasible to publish a list of tor endpoints that nodes can independently  + directly submit their transactions to. In fact, merely allowing users to select their own list of endpoints to use alternatively to the mempool would be a low effort starting point for the eventual replacement.

As a thought experiment, let's imagine we have some public registry of
mining nodes' tor endpoints and we use it for this secondary
direct-to-miner transaction relay network. If the registry is
maintained (by whom?) and accurate (based on whose word?), it is a
point of failure for transaction censorship and deanonymization, as
well as an additional barrier to becoming a miner, encouraging
centralization.
The other possibility is that the registry is not accurate. In fact,
unless the registry requires miners to identify themselves (which
others on this thread have already pointed out is ill-advised), this
should be treated similarly to regular addr gossip. We would never
automatically trust that the entity behind the endpoint provides the
service it advertises, is an honest node that won't simply blackhole
our transaction, or even belongs to a Bitcoin node at all.

Best,
Gloria

[1]: https://arxiv.org/pdf/1905.10518.pdf
[2]: https://bitcointechtalk.com/an-introduction-to-bitcoin-core-fee-estimation-27920880ad0
[3]: https://gist.github.com/morcos/d3637f015bc4e607e1fd10d8351e9f41

On Tue, Oct 26, 2021 at 8:38 AM lisa neigut via bitcoin-dev <
bitcoin-dev@lists•linuxfoundation.org> wrote:

> Hi all,
>
> In a recent conversation with @glozow, I had the realization that the
> mempool is obsolete and should be eliminated.
>
> Instead, users should submit their transactions directly to mining pools,
> preferably over an anonymous communication network such as tor. This can
> easily be achieved by mining pools running a tor onion node for this
> express purpose (or via a lightning network extension etc)
>
> Mempools make sense in a world where mining is done by a large number of
> participating nodes, eg where the block template is constructed by a
> majority of the participants on the network. In this case, it is necessary
> to socialize pending transaction data to all participants, as you don’t
> know which participant will be constructing the winning block template.
>
> In reality however, mempool relay is unnecessary where the majority of
> hashpower and thus block template creation is concentrated in a
> semi-restricted set.
>
> Removing the mempool would greatly reduce the bandwidth requirement for
> running a node, keep intentionality of transactions private until
> confirmed/irrevocable, and naturally resolve all current issues inherent in
> package relay and rbf rules. It also resolves the recent minimum relay
> questions, as relay is no longer a concern for unmined transactions.
>
> Provided the number of block template producing actors remains beneath,
> say 1000, it’d be quite feasible to publish a list of tor endpoints that
> nodes can independently  + directly submit their transactions to. In fact,
> merely allowing users to select their own list of endpoints to use
> alternatively to the mempool would be a low effort starting point for the
> eventual replacement.
>
> On the other hand, removing the mempool would greatly complicate solo
> mining and would also make BetterHash proposals, which move the block
> template construction away from a centralized mining pool back to the
> individual miner, much more difficult. It also makes explicit the target
> for DoS attacks.
>
> A direct communication channel between block template construction venues
> and transaction proposers also provides a venue for direct feedback wrt
> acceptable feerates at the time, which both makes transaction confirmation
> timelines less variable as well as provides block producers a mechanism for
> (independently) enforcing their own minimum security budget. In other
> words, expressing a minimum acceptable feerate for continued operation.
>
> Initial feerate estimation would need to be based on published blocks, not
> pending transactions (as this information would no longer be available), or
> from direct interactions with block producers.
>
>
> ~niftynei
