[Bitcoin-development] Bloom io attack effectiveness

public inbox for bitcoindev@googlegroups.com
 help / color / mirror / Atom feed

* [Bitcoin-development] Bloom io attack effectiveness
@ 2013-08-19  0:13 Peter Todd
  2013-08-19  0:59 ` Gavin Andresen
  2013-08-19  9:29 ` Mike Hearn
  0 siblings, 2 replies; 6+ messages in thread
From: Peter Todd @ 2013-08-19  0:13 UTC (permalink / raw)
  To: Bitcoin Dev

[-- Attachment #1: Type: text/plain, Size: 1813 bytes --]

Did some tests with a varient of attack... In short it's fairly easy to
saturate a node's disk IO bandwidth and when that happens the node
quickly falls behind in consensus, not to mention becomes useless to
it's peers. Note that the particular varient I tried is different, and
less efficient in bandwidth, than others discussed privately.

Bandwidth required to, for example, take out a Amazon EC2 m1.small is
about 1KiB/second, and results in it getting multiple blocks behind in
consensus, or a delay on the order of minutes to tens of minutes. I had
similar results attacking a p2pool node I own that has a harddrive and
4GiB of ram - of course my orphan rate went to 100%

It'd be interesting to repeat the attack by distributing it from
multiple peers rather than from a single source. At that point the
attack could be made indistinguishable from a bunch of SPV wallets
rescanning the chain for old transactions.

In any case given that SPV peers don't contribute back to the network
they should obviously be heavily deprioritized and served only with
whatever resources a node has spare. The more interesting question is
how do you make it possible for SPV nodes to gain priority over an
attacker? It has to be some kind of limited resource - schemes that rely
on things like prioritizing long-lived identities fail against patient
attackers - time doesn't make an identity expensive if the identity is
free in the first place. Similarly summing up the fees paid by
transactions relayed from that peer also fail, because an attacker can
easily broadcast the same transaction to multiple peers at once - it's
not a limited resource. Bandwidth is limited, but orders of magnitude
cheaper for the attacker than a Android wallet on a dataplan.

-- 
'peter'[:-1]@petertodd.org

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Bitcoin-development] Bloom io attack effectiveness
  2013-08-19  0:13 [Bitcoin-development] Bloom io attack effectiveness Peter Todd
@ 2013-08-19  0:59 ` Gavin Andresen
  2013-08-19  1:34   ` Peter Todd
  2013-08-19  2:53   ` John Dillon
  2013-08-19  9:29 ` Mike Hearn
  1 sibling, 2 replies; 6+ messages in thread
From: Gavin Andresen @ 2013-08-19  0:59 UTC (permalink / raw)
  To: Peter Todd; +Cc: Bitcoin Dev

[-- Attachment #1: Type: text/plain, Size: 895 bytes --]

Peter said:
"In any case given that SPV peers don't contribute back to the network
they should obviously be heavily deprioritized and served only with
whatever resources a node has spare."

This seems very much like a "cut off your nose to spite your face" solution.

SPV peers are INCREDIBLY IMPORTANT to the growth of Bitcoin; much more
important than nodes that have the bandwidth and disk I/O capability of
being a full node.  Bitcoin will be just fine if there are never more than
10,000 big, beefy, full nodes forming the backbone of the network, but will
be NOTHING if we don't support tens of millions of lightweight SPV devices.

Ok, that's an exaggeration, Bitcoin would be just fine in an Electrum model
where tens of millions of lightweight devices rely 100% on a full node to
operate. But I would prefer the more decentralized, less-trust-required SPV
model.

-- 
--
Gavin Andresen

[-- Attachment #2: Type: text/html, Size: 1631 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Bitcoin-development] Bloom io attack effectiveness
  2013-08-19  0:59 ` Gavin Andresen
@ 2013-08-19  1:34   ` Peter Todd
  2013-08-19  2:53   ` John Dillon
  1 sibling, 0 replies; 6+ messages in thread
From: Peter Todd @ 2013-08-19  1:34 UTC (permalink / raw)
  To: Gavin Andresen; +Cc: Bitcoin Dev

[-- Attachment #1: Type: text/plain, Size: 2185 bytes --]

On Mon, Aug 19, 2013 at 10:59:18AM +1000, Gavin Andresen wrote:
> Peter said:
> "In any case given that SPV peers don't contribute back to the network
> they should obviously be heavily deprioritized and served only with
> whatever resources a node has spare."
> 
> This seems very much like a "cut off your nose to spite your face" solution.
> 
> SPV peers are INCREDIBLY IMPORTANT to the growth of Bitcoin; much more
> important than nodes that have the bandwidth and disk I/O capability of
> being a full node.  Bitcoin will be just fine if there are never more than
> 10,000 big, beefy, full nodes forming the backbone of the network, but will
> be NOTHING if we don't support tens of millions of lightweight SPV devices.
> 
> Ok, that's an exaggeration, Bitcoin would be just fine in an Electrum model
> where tens of millions of lightweight devices rely 100% on a full node to
> operate. But I would prefer the more decentralized, less-trust-required SPV
> model.

Don't read too much into what I said; under normal circumstances when a
Bitcoin node isn't being attacked, there will be plenty of spare
capacity for SPV nodes. All I'm suggesting is that we make sure serving
those nodes doesn't come at the expense of maintaining consensus -
mainly distributing new blocks around the network so the blockchain
keeps moving forward. (something many SPV peers can help with anyway)

I'd much rather my Android wallet take a long time to sync up, than
blocks get re-organized because miners found themselves separated from
one another, let along someone clever using that to do double-spend
attacks during those re-orgs. After all, I can always find a private
node to connect to that won't be overloaded because it doesn't accept
connections from anonymous users. It's also easy to just use really
basic measures to limit abuse, like "sign up with your bitcointalk"
account, or "pay 10 cents for an account to discourage spammers"

Anyway all things less likely to be needed if attackers know all they
are going to manage to do is inconvenience people temporarily because
the network itself will keep running.

-- 
'peter'[:-1]@petertodd.org

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Bitcoin-development] Bloom io attack effectiveness
  2013-08-19  0:59 ` Gavin Andresen
  2013-08-19  1:34   ` Peter Todd
@ 2013-08-19  2:53   ` John Dillon
  2013-08-19 21:57     ` Wendell
  1 sibling, 1 reply; 6+ messages in thread
From: John Dillon @ 2013-08-19  2:53 UTC (permalink / raw)
  To: Gavin Andresen; +Cc: Bitcoin Dev

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

On Mon, Aug 19, 2013 at 12:59 AM, Gavin Andresen
<gavinandresen@gmail•com> wrote:
> Peter said:
> "In any case given that SPV peers don't contribute back to the network
> they should obviously be heavily deprioritized and served only with
> whatever resources a node has spare."
>
> This seems very much like a "cut off your nose to spite your face" solution.
>
> SPV peers are INCREDIBLY IMPORTANT to the growth of Bitcoin; much more
> important than nodes that have the bandwidth and disk I/O capability of
> being a full node.  Bitcoin will be just fine if there are never more than
> 10,000 big, beefy, full nodes forming the backbone of the network, but will
> be NOTHING if we don't support tens of millions of lightweight SPV devices.
>
> Ok, that's an exaggeration, Bitcoin would be just fine in an Electrum model
> where tens of millions of lightweight devices rely 100% on a full node to
> operate. But I would prefer the more decentralized, less-trust-required SPV
> model.

So tell us how is your "vision" of 10,000 big beefy full nodes with SPV peers
any different from the Electrum model? These days Electrum clients have block
headers and verify that transactions have merkle paths to the block headers.
The only difference I see is that SPV uses bloom filtering and Electrum can
query by transaction. But Mike wants to add querying by transaction to full
nodes anyway, and one of the purported advantages of this UTXO proof stuff is
that you can query servers for UTXO's by address, so I see no difference at
all. A patch to do bloom filtering on Electrum would be amusing to me.

Here you have Peter talking about clever ways to actually get decentralization
by having SPV peers donate back to the network with spare bandwidth, like
relaying blocks, not to mention his partial UTXO set ideas, and you completely
ignore that. But I guess that would raise ugly questions when people realize
they can't now contribute back to Bitcoin, because the blocksize is a gigabyte
of microtransactions... It may also raise ugly questions with regulators that
may find the idea of "full node == data chokepoint == regulatory chokepoint" an
attractive notion. Why are there not any competent people other than Peter who
really have the guts to bring up these proposals? I've little luck getting
proof-of-concepts built for money anyway. Maybe we just have a darth of smart
competent people in this space.

You do a good job of signaling your priorities Gavin. The payment protocol
includes no notion that you may want to pay anyone but a SSL certified
merchant. Yes I know the crypto can be upgraded, but it says volumes that you
pushed for that first, without even the slightest token effort to allow
individuals to participate in any way. Sad given you have made things *less*
secure because there is no safe way to get money *into* my wallet with the
payment protocol, but could have been.

Tell me, when my decentralization pull-req is voted on, which way are you
planning on voting?

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)

iQEcBAEBCAAGBQJSEYg/AAoJEEWCsU4mNhiPQBIH/A2cef0NDzu72CY0+N1HdPO+
fdixwncAg1ok6YdJj5WALjHbkhJ+QRVEZoRr6rHPxxxTywI+HiPN1oaopIrq3StP
bNpvouaWXLyw6xHMrMYefVOluHNZg3lu1akLdGuYA7rDHLwP/RhlF1FFzXSxKFsp
ANcw4WW7U5r5nfBHYc/a9xo8S6THI7Nv2NDW6WaRQO4X9sbSKSdwanoe75CLsRzE
E2cPNvwG4WA/MUgkl3Ao6dMsEPPa8dJK98LaS4BE/m9iFWQiV8t35/FQ0GAFQoJo
PQUs8aAWiI0caAxI0vddxKQ+YlwPw2m1QH6h19wUO7+KLKOtxMFmWoDu/OLdTRM=
=IfkA
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Bitcoin-development] Bloom io attack effectiveness
  2013-08-19  0:13 [Bitcoin-development] Bloom io attack effectiveness Peter Todd
  2013-08-19  0:59 ` Gavin Andresen
@ 2013-08-19  9:29 ` Mike Hearn
  1 sibling, 0 replies; 6+ messages in thread
From: Mike Hearn @ 2013-08-19  9:29 UTC (permalink / raw)
  To: Peter Todd; +Cc: Bitcoin Dev

[-- Attachment #1: Type: text/plain, Size: 3894 bytes --]

On Mon, Aug 19, 2013 at 2:13 AM, Peter Todd <pete@petertodd•org> wrote:

> In any case given that SPV peers don't contribute back to the network
> they should obviously be heavily deprioritized and served only with
> whatever resources a node has spare.

Well, I'm glad we're making progress towards this kind of model :)

If I had to write a scoring function for node importance, I'd start by
making nodes I connected to more important than nodes that connected to me.
That should prevent the kind of attacks you're talking about. You can then
score within those subsets with greater subtlety, like using how long the
connection has been active (or extending that with signed timestamps).

This doesn't have any in-built bias against SPV nodes, which is probably
very hard to technically implement anyway. But it encodes the intuitive
notion that nodes I selected myself are less likely to be DoS attackers
than nodes which connected to me.

But the trick is to implement the prioritisation code. The usual way to do
this is to have a thread pool that pops requests off a queue. You can
either have multiple queues for different priority bands, or code that
locks the queue and re-orders it when something new is added. I tend to
find the multiple queues approach simpler, especially, it's simpler to
export statistics about that via RPC that make it easy to understand what's
going on underneath the hood.

So IMHO a patch to address I/O exhaustion should look something like this:

   1. Add a thread pool of 2-3 threads (to give the kernel room to overlap
   IO) which take in CBlock load requests and then do the load/parse/filter in
   the background.

   2. Each thread starts by blocking on a counting semaphore which
   represents the total number of requests.

   3. The network thread message loop is adjusted so it can receive some
   kind of futures/callbacks/closure object (I guess Boost provides this,
   alternatively we could switch to using C++11). The closures should also
   have the score of the node they were created for (note: score not a CNode*
   as that complicates memory management).

   4. At the start of the network loop a thread-local (or global) variable
   is set that contains the nodes current score, which is just an n-of-m score
   where M is the total number of connected nodes and N is the ranked
   importance. At that point any code that needs to prioritise nodes off
   against each other can just check that variable whilst doing work. The
   network loop looks at which file descriptors are select()able and their
   scores, which closures are pending execution and their scores, then decides
   whether to handle new network data or run a closure. If there is a draw
   between the scores, closures take priority to reduce memory pressure and
   lower latency.

   5. Handling of "getdata" then ends up calling a function that requests a
   load of a block from disk, and runs a closure when it's finished. The
   closure inherits the nodes current score, of course, so when the block load
   is completed execution of the rest of the getdata handling takes priority
   over handling new traffic from network nodes. When the closure executes, it
   writes the loaded/filtered data out over the network socket and deletes

The function that takes a CBlockIndex and yields a future<CBlock> or
closure or whatever would internally lock the job queue(s), add the new
task and then do a stable sort of the queue using the scoring function,
which in this case would simply use the node score as the job score.

It's a fair amount of work, but should ensure that "good" nodes outcompete
"bad" nodes for disk IO. Any other disk IO operations can be done in the
same way. Note that the bulk of LevelDB write work is already handled on a
background thread. The foreground thread only writes a log entry to disk
and updates some in-memory data structures.

[-- Attachment #2: Type: text/html, Size: 4443 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Bitcoin-development] Bloom io attack effectiveness
  2013-08-19  2:53   ` John Dillon
@ 2013-08-19 21:57     ` Wendell
  0 siblings, 0 replies; 6+ messages in thread
From: Wendell @ 2013-08-19 21:57 UTC (permalink / raw)
  To: John Dillon; +Cc: Bitcoin Dev

[-- Attachment #1: Type: text/plain, Size: 3230 bytes --]

John,

I for one support your rallying cry of decentralization.

If you are implying that even 10,000 full nodes seems far, far too few for a distributed system that may ultimately face a very well-connected and well-funded threat model, I agree with you completely. However, I took Gavin's statement to mean something like a factual statement about the load-bearing nature of that many nodes, rather than an actual target number for some future iteration of the network.

Partial UTXO sets sound like a great idea -- are they really being ignored? I am pretty new to the development process here, but I assumed (as with many open source projects) that ideation, debate and implementation take a while to churn. Has a prototype of that been developed already, are you implying that you funded something like that and it never got built? If there are some GitHub links that I missed, please send them over.

Maybe you should open that topic back up in its own thread, so we can bring it back into view?

-wendell

grabhive.com | twitter.com/grabhive | gpg: 6C0C9411

On Aug 19, 2013, at 4:53 AM, John Dillon wrote:

> So tell us how is your "vision" of 10,000 big beefy full nodes with SPV peers
> any different from the Electrum model? These days Electrum clients have block
> headers and verify that transactions have merkle paths to the block headers.
> The only difference I see is that SPV uses bloom filtering and Electrum can
> query by transaction. But Mike wants to add querying by transaction to full
> nodes anyway, and one of the purported advantages of this UTXO proof stuff is
> that you can query servers for UTXO's by address, so I see no difference at
> all. A patch to do bloom filtering on Electrum would be amusing to me.
> 
> Here you have Peter talking about clever ways to actually get decentralization
> by having SPV peers donate back to the network with spare bandwidth, like
> relaying blocks, not to mention his partial UTXO set ideas, and you completely
> ignore that. But I guess that would raise ugly questions when people realize
> they can't now contribute back to Bitcoin, because the blocksize is a gigabyte
> of microtransactions... It may also raise ugly questions with regulators that
> may find the idea of "full node == data chokepoint == regulatory chokepoint" an
> attractive notion. Why are there not any competent people other than Peter who
> really have the guts to bring up these proposals? I've little luck getting
> proof-of-concepts built for money anyway. Maybe we just have a darth of smart
> competent people in this space.
> 
> You do a good job of signaling your priorities Gavin. The payment protocol
> includes no notion that you may want to pay anyone but a SSL certified
> merchant. Yes I know the crypto can be upgraded, but it says volumes that you
> pushed for that first, without even the slightest token effort to allow
> individuals to participate in any way. Sad given you have made things *less*
> secure because there is no safe way to get money *into* my wallet with the
> payment protocol, but could have been.
> 
> Tell me, when my decentralization pull-req is voted on, which way are you
> planning on voting?


[-- Attachment #2: Message signed with OpenPGP using GPGMail --]
[-- Type: application/pgp-signature, Size: 841 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2013-08-19 22:28 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-08-19  0:13 [Bitcoin-development] Bloom io attack effectiveness Peter Todd
2013-08-19  0:59 ` Gavin Andresen
2013-08-19  1:34   ` Peter Todd
2013-08-19  2:53   ` John Dillon
2013-08-19 21:57     ` Wendell
2013-08-19  9:29 ` Mike Hearn

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox