Hi all,
I'm writing a report to disclose a new transaction-relay jamming attack
affecting bitcoin time-sensitive contracting protocols by exploiting the transaction
selection, announcement and propagation mechanisms of base-layer full-nodes.
Back in 2020, amid numerous technical conversations among bitcoin protocol developers
about how bip125 replace-by-fee rules could be abused by a lightning channel counterparty
with a competing interest (a.k.a. pinning attacks, see BIP431's motivation for more), it
was thought that a few other components of the bitcoin tx-relay p2p stack could also be
adversarially exploited [0] (if you are not familiar with lightning / contracting
protocol security risks, read the old post referenced in that footnote).
In mid-2023, I privately messaged a group of long-term bitcoin and lightning
developers detailing those new "transaction-relay throughput attacks" concerns,
especially the different exploitations and the technical feasibility of such attacks.
While this attack was deemed worthy of more investigation and practical testing, it
was put in the background due to more severe security issues affecting
lightning at the time (i.e. 2023). Recent conversations this year (i.e. 2024) about
free-relay attacks on full-node bandwidth made me reconsider the practical
cost of transaction-relay throughput attacks, and as such proceed to a full disclosure.
Since I became aware in 2020 of this potential attack vector, which enables stealing
funds from lightning channels, I have not seen exploitation in the wild, nor
any hints suggesting that this new transaction-relay jamming attack is
being used in practice.
In this report, 2 variations of those new "transaction-relay throughput attacks"
are presented, i.e. respectively the "high overflow" and the "low overflow" variants.
There is a proof-of-concept available for the "high overflow" variation, tested on
bitcoin core v27.0. The "high overflow" variation has been tested under a few topological
configurations, though not under real-world workload. Coming up with a proof-of-concept
to demonstrate the feasibility of the 2nd variation remains an open subject of investigation
(see the corresponding subsection for more).
After conversations with a few lightning maintainers, the main conclusion was
that the currently deployed random rebroadcast of time-sensitive transactions is probably
one of the most effective mitigations that can be deployed by lightning
implementations on their own.
At the time of writing, the "high overflow" attack cost is evaluated to be around
$80k to target real-world lightning channels with safety timelocks in the order
of four dozen blocks.
In my opinion, this new transaction-relay jamming attack can
be as severe as the already known state-of-the-art unmitigated attacks against
lightning channels, once the junk transaction traffic cost is accounted for. However, more
investigation of this transaction-relay throughput attack vector would be very valuable.
This new off-chain protocol attack vector is currently tracked by CVE
Request 1780258 shared with MITRE, under a single identifier for "multiple
software, same protocol / attack vector". This report should be updated when a stable
identifier is definitively assigned.
## Background: Transaction-Relay Selection, Announcement & Propagation
At the protocol level, the transaction relay flow works, in summary, in the following way.
On initially broadcasting a transaction, the full-node will issue an INV(inventory_set)
to all its connected peers. Considering strictly transaction-relay messages, this inventory
set is composed of either MSG_TX (non-wtxidrelay peers) or MSG_WTX (bip339 wtxidrelay peers) entries,
up to the inventory set limit (if the full-node implements any such inventory set limit).
For all the transaction identifiers announced in the inventory set that are not
already present in the peer's inventory or caches at large, the peer will reply to the
node with a GETDATA(requested_set(MSG_TX)) or GETDATA(requested_set(MSG_WTX)) message. For
each element in the requested set of transactions, the node can forward an individual final
TX message to the peer.
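To make the message flow above concrete, here is a minimal Python sketch of one INV -> GETDATA -> TX round trip. It is a toy model, not bitcoin core code: the `Peer` class, its method names and the `relay()` helper are hypothetical names introduced for illustration, and transaction identifiers are plain strings.

```python
from dataclasses import dataclass, field

@dataclass
class Peer:
    """Toy transaction-relay peer; `known` maps txid -> raw transaction (hypothetical model)."""
    known: dict = field(default_factory=dict)

    def on_inv(self, inventory_set):
        # Request only the announced identifiers we do not already have.
        return [txid for txid in inventory_set if txid not in self.known]

    def on_getdata(self, requested_set):
        # One TX message per requested identifier we are able to serve.
        return [(txid, self.known[txid]) for txid in requested_set if txid in self.known]

    def on_tx(self, txid, raw_tx):
        self.known[txid] = raw_tx

def relay(node, peer, inventory_set):
    """Run one INV -> GETDATA -> TX round trip from `node` to `peer`."""
    requested = peer.on_inv(inventory_set)
    for txid, raw_tx in node.on_getdata(requested):
        peer.on_tx(txid, raw_tx)
    return requested

node = Peer(known={"aa": b"tx-aa", "bb": b"tx-bb"})
peer = Peer(known={"aa": b"tx-aa"})
requested = relay(node, peer, ["aa", "bb"])  # the peer only asks for "bb"
```

The point of the sketch is that a transaction only reaches the peer if its identifier first makes it into an INV and is then requested, which are the two steps the attacks below interfere with.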
The background description is based on bitcoin core v28.0rc1 - commit 88f0419c1,
and it is assumed that the node and its connected transaction-relay peers run the same code.
At the implementation level, bitcoin core presents a number of interfaces to natively
relay a transaction in the peer-to-peer network, though I'll only give the background
for 2 of them. Both those interfaces drop transaction identifiers in the same
buffer, i.e. the "forward transaction inventory" (`m_tx_inventory_to_send`).
The 1st interface examined is the RPC method `sendrawtransaction()`, where a raw
transaction is submitted for broadcast by the node (e.g. by the wallet). `BroadcastTransaction()`
is called, and within this method run both the logic to accept the
transaction in the mempool (`ProcessTransaction()` / `AcceptToMemoryPool()`) and the
logic to relay the transaction over the peer-to-peer network (`RelayTransaction()`).
The second part, i.e. the transaction relay logic, is the one inserting the transaction
identifier in the forward transaction inventory of each connected transaction-relay peer.
The 2nd interface examined is the TX message reception (`NetMsgType::TX`), where
a raw transaction is received from any connected peer. The transaction is processed
by the mempool logic (`ProcessTransaction()`) and, if it is accepted, it is
propagated to the remainder of the network (`ProcessValidTx()` / `RelayTransaction()`)
by inserting the transaction identifier in the "forward transaction inventory".
Next comes the transaction selection phase, which yields the set of transactions
to be announced in an INV message to the connected transaction-relay peers. This
phase is scheduled per connected peer and happens at periodic ticks (`m_next_inv_send_time`).
During this transaction selection phase, the transactions are topologically ordered
and fee-rate sorted from the peer's "forward transaction inventory" (`m_tx_inventory_to_send`).
This sorted inventory set is then drained until reaching the INVENTORY_BROADCAST_MAX
limit (at most 1000 transactions), and the transaction identifiers are assembled in
an INV message (using either MSG_TX or MSG_WTX) that is then sent to the connected
transaction-relay peer.
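The selection phase can be sketched as follows in Python. This is a simplified model under stated assumptions, not the bitcoin core implementation: `select_for_inv` is a hypothetical helper, the queue is a plain dict of txid -> feerate, topological (parent-before-child) ordering is ignored, and the cap is applied exactly as described in the paragraph above.

```python
import heapq

INVENTORY_BROADCAST_MAX = 1000  # per-INV cap quoted in the text

def select_for_inv(inventory_to_send, broadcast_max=INVENTORY_BROADCAST_MAX):
    """Drain up to `broadcast_max` highest-feerate txids from a peer's queue.

    `inventory_to_send` maps txid -> feerate (sat/vbyte) and is mutated in place.
    Topological ordering is deliberately left out of this sketch.
    """
    selected = heapq.nlargest(broadcast_max, inventory_to_send, key=inventory_to_send.get)
    for txid in selected:
        del inventory_to_send[txid]
    return selected

# Illustrative numbers only: 1500 queued entries at 10 sat/vbyte and one at 5 sat/vbyte.
queue = {f"junk{i}": 10.0 for i in range(1500)}
queue["htlc"] = 5.0
inv = select_for_inv(queue)  # 1000 entries, all at 10 sat/vbyte
```

In this toy run the 5 sat/vbyte entry stays in the queue after the tick, because 1000 higher-feerate entries are drained ahead of it.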
After this selection phase comes the announcement phase on the side of the node's peer.
On reception of an INV message, the received inventory_set is iterated over, and if the
inventory element is a transaction (`IsGenTxMsg()`), the transaction identifier is
processed (`AddTxAnnouncement()`), and if the MAX_PEER_TX_ANNOUNCEMENTS limit is not
reached, the transaction is queued by the transaction requester (`TxRequestTracker`).
If the transaction has been announced only by the node, the peer should issue
a GETDATA download at a future time, driven by the requester logic (`GetRequestable()`).
If transactions are announced by the node faster than the peer can fetch and
download them, the MAX_PEER_TX_ANNOUNCEMENTS limit causes transactions to be dropped
on the floor.
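The receiver-side behaviour can be illustrated with a bounded queue. The sketch below is a toy stand-in, assuming a simple per-peer counter: `AnnouncementQueue` and its methods are hypothetical names, and the real `TxRequestTracker` logic (timers, multiple announcers, preferred peers) is not modelled.

```python
MAX_PEER_TX_ANNOUNCEMENTS = 5000  # per-peer limit, matching the 5000 figure used later in the text

class AnnouncementQueue:
    """Toy stand-in for per-peer announcement tracking on the receiving side."""

    def __init__(self, limit=MAX_PEER_TX_ANNOUNCEMENTS):
        self.limit = limit
        self.pending = []   # announced, not yet downloaded
        self.dropped = []   # announcements discarded because the queue was full

    def add_announcement(self, txid):
        # Once the limit is reached, further announcements are dropped on the floor.
        if len(self.pending) >= self.limit:
            self.dropped.append(txid)
            return False
        self.pending.append(txid)
        return True

    def download(self, n):
        """Simulate GETDATA/TX completing for the `n` oldest announcements."""
        done, self.pending = self.pending[:n], self.pending[n:]
        return done

# Tiny limit for readability: the 4th announcement arrives while the queue is full.
q = AnnouncementQueue(limit=3)
accepted = [q.add_announcement(t) for t in ["a", "b", "c", "justice"]]
```

Here the last announcement is dropped until the receiver has downloaded some of the pending entries, which is the window a "low overflow" adversary tries to keep open.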
The two limits INVENTORY_BROADCAST_MAX and MAX_PEER_TX_ANNOUNCEMENTS are the components
targeted by "transaction-relay throughput overflow attacks", and they are analyzed further below.
## Problem: Pre-Signed Time-Sensitive Transactions and Bounded Transaction-Relay Throughput
There are 2 variations of transaction-relay throughput overflow attacks identified so far,
respectively a "high overflow" attack and a "low overflow" attack. The explanation starts
with the "high overflow" attack.
### "High-Overflow" Transaction-Relay Throughput Attack
In a "high overflow" attack, the goal of the adversary is to leverage the fee-rate sort
of the "forward transaction inventory" and the INVENTORY_BROADCAST_MAX limit to artificially
hold back a subset of received transactions in the "forward transaction inventory".
The attack works in the following fashion against a Lightning routing node:
- there is the Mallet, Alice, Mallory channel topology
- Mallet routes a 0.5 BTC HTLC to Mallory through Alice with an interval timelock
- Alice and Mallory pre-sign the commitment + HTLC 2nd stage transactions at 5 sat / vbyte
- at the base-layer level, Alice's full-node has 45 inbound peers controlled by Malicia
- at the base-layer level, Alice is connected to an honest transaction-relay peer Bob
- Malicia injects 2300 small-size transactions at 10 sat / vbyte through each of the 45 inbound peers
- at the time of the outbound HTLC expiration, Mallory breaks the channel with Alice, and the commitment + HTLC 2nd stage transactions are broadcast by Alice
- the constantly injected transaction traffic prevents the lightning channel transactions from propagating on the Alice-Bob link
- after the interval timelock of blocks has elapsed, the inbound HTLC is timed out by Mallet, and the outbound HTLC is claimed by Mallory
Independently of any transaction rebroadcast done by Alice's lightning node, as long as it is done
at the same feerate, the transaction still falls under the feerate threshold set in the "forward transaction
inventory" by the throughput overflow traffic.
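The starvation effect described above can be reproduced with a small Python simulation. It is a sketch under simplifying assumptions: `ticks_until_announced` is a hypothetical helper, the queue is a max-heap on feerate only, and the per-tick injection and drain rates are illustrative parameters rather than measured values.

```python
import heapq

def ticks_until_announced(target_feerate, junk_feerate, junk_per_tick, drain_per_tick, max_ticks):
    """Return the tick at which the target is announced, or None if it never is.

    The queue is a max-heap on feerate; each tick the attacker adds junk and
    the node announces the `drain_per_tick` best entries.
    """
    heap = [(-target_feerate, 0, "target")]
    counter = 1  # unique tiebreaker so heap entries stay comparable
    for tick in range(max_ticks):
        for _ in range(junk_per_tick):
            heapq.heappush(heap, (-junk_feerate, counter, "junk"))
            counter += 1
        for _ in range(min(drain_per_tick, len(heap))):
            _, _, label = heapq.heappop(heap)
            if label == "target":
                return tick
    return None

# Junk arrives faster than the queue drains and pays more: the target is never announced.
starved = ticks_until_announced(5, 10, junk_per_tick=100, drain_per_tick=70, max_ticks=50)
# Same junk traffic, but the target outbids it: announced on the first tick.
outbid = ticks_until_announced(12, 10, junk_per_tick=100, drain_per_tick=70, max_ticks=50)
```

In this model a rebroadcast at the same feerate changes nothing, since the target re-enters the queue behind the same junk; only a higher feerate or a drop in junk inflow lets it through.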
There is a simplified proof-of-concept of the high overflow of the feerate sorting
of an alice node's "forward transaction inventory" available here, tested on
bitcoin core v27.0 - commit d822839:
https://github.com/ariard/bitcoin/commit/8fc559a4bcd0a2ef8bb4c50aa540a5c4c61a310a
Note, in this example the feerate parameters are different ('effective-feerate'), i.e.
"0.00010416" for the jammed transaction target and "0.00002083" for the junk traffic
originating from the feeder connection nodes. Running the test for a few attempts yields
around ~2000 transactions, which is consistent with the INVENTORY_BROADCAST_PER_SECOND limit
of 7 transactions per second, i.e. 4200 transactions per average block.
Let's assume targeted safety timelocks are around 40 blocks (which is the default in
some lightning network implementations), and the time-sensitive transactions are pre-signed
at 5 sat / virtual byte. If the individual transactions are 100 bytes in size and the throughput
overflow traffic is pre-signed at 10 sat / virtual byte, the attack liquidity cost is
estimated at 2_100_000 satoshis or $2k per block of delay. For a duration of 40 blocks, the
cost is estimated at 84_000_000 satoshis or $80_000.
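The scaling from the per-block figure to the 40-block total can be checked with a few lines of Python. This sketch takes the 2_100_000 satoshis per block and the $80_000 total as given from the text; `attack_cost_sats` is a hypothetical helper, and the exchange rate computed at the end is only the rate implied by those two figures, not a quoted market price.

```python
SATS_PER_BTC = 100_000_000

def attack_cost_sats(cost_per_block, blocks):
    """Liquidity cost of sustaining the junk traffic for `blocks` blocks of delay."""
    return cost_per_block * blocks

per_block = 2_100_000                       # sat per block of delay, as stated in the text
total = attack_cost_sats(per_block, 40)     # 40-block safety timelock
total_btc = total / SATS_PER_BTC            # same amount expressed in BTC

# Exchange rate implied by the $80_000 estimate (derived, not a market quote).
implied_usd_per_btc = 80_000 / total_btc
```

The total comes out at 84_000_000 satoshis, i.e. 0.84 BTC, which matches the $80k order of magnitude for an implied rate of roughly $95k per BTC.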
There are 3 observations to be made on this attack liquidity cost.
Firstly, the adversary can partition the victim's full-node mempool from the rest
of the peer-to-peer network to avoid the throughput overflow transaction traffic
being effectively burned as on-chain miner fees. All the overflow transaction traffic
can be junk children that have a parent in the victim's mempool (Alice), but not in
those of its connected transaction-relay peers (Bob).
Secondly, the adversary can batch the liquidity cost to target a number of lightning
channels concurrently, as lightning routing nodes operate their channels independently
in the lightning network. The throughput overflow limits, if bitcoin core is run
with defaults as a full-node, can be reached with a single contingent of junk transactions
originating from a single set of UTXOs.
Thirdly, the adversary could recycle the traffic of junk children transactions
to lower the liquidity cost of the overall attack to a few blocks of buffer, by
periodically spending the parent in the partitioned mempool to a new confirmed
UTXO, and re-generating a layout of junk children transactions.
### "Low-Overflow" Transaction-Relay Throughput Attack
In a "low overflow" attack, the goal of the adversary is to leverage the
MAX_PEER_TX_ANNOUNCEMENTS limit until the inbound transactions received by
the peer from a node overflow this limit, which subsequently provokes a drop
of the later inbound transaction traffic.
This attack has been tested under a few topological configurations of the
transaction-relay peer-to-peer network, in which an attempt is made to overflow the
MAX_PEER_TX_ANNOUNCEMENTS inbound limit of a target transaction-relay link.
Let's consider the following transaction-relay network topology: Alice <---> Bob.
Alice and Bob are 2 transaction-relay peers and both of them are running v28.0rc1.
It is assumed they support wtxidrelay on their connection and that Alice
opened the connection to Bob (from Alice's node viewpoint Bob is an outbound peer,
i.e. an OUTBOUND_FULL_RELAY connection).
The name of the game for a "low-overflow" adversary is to open many inbound connections
to Alice's node, and inject a high number of junk transactions to reach Alice's "ingress"
limit of MAX_PEER_TX_ANNOUNCEMENTS on each of those inbound connections, and in doing so try
to reach Bob's own MAX_PEER_TX_ANNOUNCEMENTS limit on the Alice-Bob link.
Reconsidering the Alice and Bob topology, let's enrich it with a number of
puppet nodes, which gives the following topology:
- Mallet <---> Alice
- Mallory <---> Alice
- Malicia <---> Alice
All 3 of those puppet inbound nodes support wtxidrelay on their connections
and they are the ones opening connections to Alice (from Alice's node viewpoint
Mallet, Mallory and Malicia are all inbound peers; from Bob's node viewpoint they
are completely unrelated peers, for which an ADDR message might not even exist).
Each of these peers can announce and relay up to 5000 unique and
new transactions to Alice, for an "ingress" total of 15k transactions, while
the "egress" on the Alice-Bob link is still only 5000 transactions.
Repeating the trick from the 1st observation about partitioning full-node mempools,
all those transactions might be absent from Bob's mempool, so the only way
for Bob to "discover" them is through announcements on the Alice-Bob link.
This network topology configuration has been tested, though so far without
triggering the MAX_PEER_TX_ANNOUNCEMENTS limit on the Alice-Bob connection.
The configuration was initialized with 20 inbound peers
connected to Alice, each with a stock of 4600 transactions to announce to
Alice, and it ran for 1 hour. Bob's mempool showed
a deficit of ~10k transactions at the end of the test.
At the 33rd minute, the internal buffer of the `TxRequestTracker` tracking
the number of announced transactions reached its peak size of 1252, from an
initial size of 0 (minute 0), and it had a size of 780 at the end of the 60th minute.
https://github.com/ariard/bitcoin/commit/c958f22e4cb7c193378ad7cd1c05b2c331ad6bd5
This network topology configuration had only one link to be processed by
Bob, i.e. the Alice-Bob transaction-relay link. More advanced experimental
tests of a "low-overflow throughput" attack could include, e.g. (non-exhaustive):
- Bob's node being opened up to `DEFAULT_MAX_PEER_CONNECTIONS` with average tx traffic
- Bob's node receiving a valid block every 10 min
All those elements keep busy the CPU resources available to Bob's node,
and as such should reduce the processing capacity available to handle the
"egress" traffic received from Alice, let's say on modern x86_64 chips.
In my belief, this "low overflow" variant, if its practical feasibility
can be demonstrated, is indeed more severe than the "high overflow" variant,
as there is no external feerate to meet to thwart the propagation of a
target transaction (e.g. a lightning time-sensitive justice transaction).
The liquidity cost is no more than linear in the current value of the
MAX_PEER_TX_ANNOUNCEMENTS limit and the minimal valid transaction size.
## Solutions
On the side of pure lightning / contracting protocol implementations and node
operators, there are 4 mitigations that can be envisioned. I believe
mitigation d) can be the most efficient, namely deploying more full-nodes
singly connected to the transaction-relay initiating node to absorb the
overflow of transactions.
a) Random transaction rebroadcast: for "low-overflow", a lightning node
could more aggressively re-sign and rebroadcast time-sensitive transactions,
probabilistically diminishing the odds of the overflow reaching
MAX_PEER_TX_ANNOUNCEMENTS.
b) Aggressive fee-rebroadcasting: for "high-overflow", dynamic fee-rebroadcasting
can asymmetrically increase the attack cost. On the other hand, this increases the
surface for miner harvesting attacks, where a miner could inject junk children
traffic to trigger a lightning node to fee-bump time-sensitive transactions [1].
c) Limitation of "Identical Finality" Time-Sensitive Transactions: for "low-overflow",
a lightning routing hop could limit the number of transactions with the "identical
finality" (i.e absolute or relative timelock being final at the same chain tip) to
limit the worst-case flow of time-sensitive transactions that have to propagate on
the network, at the same chain-defined time period.
d) Over-provisioning transaction-relay throughput with adjacent full-nodes: both
for "high-overflow" and "low-overflow", the initial transaction-relay node can
be made non-listening for peer connections and block-relay-only. All the
transaction-relay connections can be opened to adjacent "trusted" full-nodes,
that can absorb more of the overflow traffic. Those adjacent full-nodes can
be regularly connected to the bitcoin network.
Mitigations at the base-layer directly in the transaction-relay stack were not
seriously considered during the embargo period. In my personal opinion,
a robust mitigation against transaction-relay throughput attacks could be
more efficient at the base-layer, rather than at the lightning / contracting
protocol level.
## Timeline
- 2023-06-05: Report of the finding to Bastien Teinturier, Olaoluwa Osuntokun,
Eugene Siegel, Anthony Towns, Gloria Zhao, Greg Sanders, Matt Corallo, Rusty
Russell, Suhas Daftuar
- 2023-06: Discussions among Lightning folks on real-exercise of the finding
- 2024-07-31: Proposition of a publication date for the finding for December
- 2024-08: Realistic and deployable mitigations discussed among Lightning folks
- 2024-11-05: Communication of an exact public disclosure date
- 2024-12-03: CVE ID Request shared with MITRE
## Conclusion
In this report, a new transaction-relay jamming attack against off-chain protocols
was introduced, exploiting the throughput limits of a full-node's transaction-relay
selection, announcement and propagation algorithms. This attack appears to be
plausible against lightning channel funds under real-world scenarios, though it
deserves more investigation and experimentation.
Two variations of the attacks have been presented, a "high-overflow" variant,
exploiting the transaction announcement fee-rate sorting algorithm on the
sender-side, and a "low-overflow" variant, exploiting the announcement processing
limits on the receiver-side. The first variant is presented with a minimal
proof-of-concept, while the second variant is left open for future research.
All mistakes and opinions are my own and please verify any information reported.
[0] "...should be always evaluated with regards to base layer network topology,
*tx-relay propagation rules*, mempools behaviors, consistent policy applied by majority
of nodes and ongoing blockspace demand. All these components are direct parameters of
LN security. Due to the network being public, a malicious channel counterparty do have
an incentive to tweak them to steal from you..."
cf. Pinning: The Good, the Bad, the Ugly:
https://gnusha.org/pi/bitcoindev/CALZpt+Ea=GyzEAfJBZzdFvus4_U=x73eA+=J=sN2LONq9_V5dw@mail.gmail.com/
[1] https://diyhpl.us/~bryan/irc/bitcoin/bitcoin-dev/linuxfoundation-pipermail/lightning-dev/2020-February/002569.txt