public inbox for bitcoindev@googlegroups.com
 help / color / mirror / Atom feed
From: Jared Lee Richardson <jaredr26@gmail•com>
To: David Vorick <david.vorick@gmail•com>
Cc: Bitcoin Dev <bitcoin-dev@lists•linuxfoundation.org>
Subject: Re: [bitcoin-dev] Hard fork proposal from last week's meeting
Date: Fri, 31 Mar 2017 23:15:09 -0700	[thread overview]
Message-ID: <CAD1TkXuyAH++1nwzZ-4BYv=kT=-8jycMsYx3xFriBiYeLmwvNg@mail.gmail.com> (raw)
In-Reply-To: <CAFVRnyr-Z9YWtT3r+7-fGejzgxKhH3-kQuo8JQFqKDpyZNBBdg@mail.gmail.com>

> So your cluster isn't going to need to plan to handle 15k transactions per second, you're really looking at more like 200k or even 500k transactions per second to handle peak-volumes. And if it can't, you're still going to see full blocks.

When I first began to enter the blocksize debate slime-trap that we
have all found ourselves in, I had the same line of reasoning that you
have now.  It is clearly untenable that blockchains are an incredibly
inefficient and poorly designed system for massive scales of
transactions, as I'm sure you would agree.  Therefore, I felt it was
an important point for people to accept this reality now and stop
trying to use Blockchains for things they weren't good for, as much
for their own good as anyone elses.  I backed this by calculating some
miner fee requirements as well as the very issue you raised.  A few
people argued with me rationally, and gradually I was forced to look
at a different question: Granted that we cannot fit all desired
transactions on a blockchain, how many CAN we effectively fit?

It took another month before I actually changed my mind.  What changed
it was when I tried to make estimations, assuming all the reasonable
trends I could find held, about future transaction fees and future
node costs.  Did they need to go up exponentially?  How fast, what
would we be dealing with in the future?  After seeing the huge
divergence in node operational costs without size increases($3 vs
$3000 after some number of years stands out in my memory), I tried to
adjust various things, until I started comparing the costs in BTC
terms.  I eventually realized that comparing node operational costs in
BTC per unit time versus transaction costs in dollars revealed that
node operational costs per unit time could decrease without causing
transaction fees to rise.  The transaction fees still had to hit $1 or
$2, sometimes $4, to remain a viable protection, but otherwise they
could become stable around those points and node operational costs per
unit time still decreased.

None of that may mean anything to you, so you may ignore it all if you
like, but my point in all of that is that I once used similar logic,
but any disagreements we may have does not mean I magically think as
you implied above.  Some people think blockchains should fit any
transaction of any size, and I'm sure you and I would both agree
that's ridiculous.  Blocks will nearly always be full in the future.
There is no need to attempt to handle unusual volume increases - The
fee markets will balance it and the use-cases that can barely afford
to fit on-chain will simply have to wait for awhile.  The question is
not "can we handle all traffic," it is "how many use-cases can we
enable without sacrificing our most essential features?"  (And for
that matter, what is each essential feature, and what is it worth?)

There are many distinct cut-off points that we could consider.  On the
extreme end, Raspberry Pi's and toasters are out.  Data-bound mobile
phones are out for at least the next few years if ever.  Currently the
concern is around home user bandwidth limits.  The next limit after
that may either be the CPU, memory, or bandwidth of a single top-end
PC.  The limit after that may be the highest dataspeeds that large,
remote Bitcoin mining facilities are able to afford, but after fees
rise and a few years, they may remove that limit for us.  Then the
next limit might be on the maximum amount of memory available within a
single datacenter server.

At each limit we consider, we have a choice of killing off a number of
on-chain usecases versus the cost of losing the nodes who can't reach
the next limit effectively.  I have my inclinations about where the
limits would be best set, but the reality is I don't know the numbers
on the vulnerability and security risks associated with various node
distributions.  I'd really like to, because if I did I could begin
evaluating the costs on each side.

> How much RAM do you need to process blocks like that?

That's a good question, and one I don't have a good handle on.  How
does Bitcoin's current memory usage scale?  It can't be based on the
UTXO, which is 1.7 GB while my node is only using ~450mb of ram.  How
does ram consumption increase with a large block versus small ones?
Are there trade-offs that can be made to write to disk if ram usage
grew too large?

If that proved to be a prohibitively large growth number, that becomes
a worthwhile number to consider for scaling.  Of note, you can
currently buy EC2 instances with 256gb of ram easily, and in 14 years
that will be even higher.

> So you have to rework the code to operate on a computer cluster.

I believe this is exactly the kind of discussion we should be having
14 years before it might be needed.  Also, this wouldn't be unique -
Some software I have used in the past (graphite metric collection)
came pre-packaged with the ability to scale out to multiple machines
split loads and replicate the data, and so could future node software.

> Further, are storage costs consistent when we're talking about setting up clusters? Are bandwidth costs consistent when we're talking about setting up clusters? Are RAM and CPU costs consistent when we're talking about setting up clusters? No, they aren't.

Bandwidth costs are, as intra-datacenter bandwidth is generally free.
The other ones warrant evaluation for the distant future.  I would
expect that CPU resources is the first thing we would have to change -
13 thousand transactions per second is an awful lot to process.  I'm
not intimately familiar with the processing - Isn't it largely
signature verification of the transaction itself, plus a minority of
time spent checking and updating utxo values, and finally a small
number of hashes to check block validity?  If signature verification
was controlling, a specialized asic chip(on a plug-in card) might be
able to verify signatures hundreds of times faster, and it could even
be on a cheap 130nm chipset like the first asic miners rushed to
market.  Point being, there are options and it may warrant looking
into after the risk to node reductions.

> You'd need a handful of experts just to maintain such a thing.

I don't think this is as big a deal as it first might seem.  The
software would already come written to be spanned onto multiple
machines - it just needs to be configured.  For the specific question
at hand, the exchange would already have IT staff and datacenter
capacity/operations for their other operations.  In the more general
case, the numbers involved don't work out to extreme concerns at that
level.  The highest cpu usage I've observed on my nodes is less than
5%, less than 1% for the time I just checked, handling ~3 tx/s.  So
being conservative, if it hits 100% on one core at 60-120 tx/s, that
works out to ~25-50 8-core machines.  But again, that's a 2-year old
laptop CPU and we're talking about 14 years into the future.  Even if
it was 25 machines, that's the kind of operation a one or two man IT
team just runs on the side with their extra duties.  It isn't enough
to hire a fulltime tech for.

> Disks are going to be failing every day when you are storing multiple PB, so you can't just count a flat cost of $20/TB and expect that to work.

I mean, that's literally what Amazon does for you with S3, which was
even cheaper than the EBS datastore pricing I was looking at.  So....
Even disregarding that, raid operation was a solved thing more than 10
years ago, and hard drives 14 years out would be roughly ~110 TB for a
$240 hard drive at a 14%/year growth rate.  In 2034 the blockchain
would fit on 10 of those.  Not exactly a "failing every day" kind of
problem.  By 2040, you'd need *gasp* 22 $240 hard drives.  I mean, it
is a lot, but not a lot like you're implying.

> And you need a way to rebuild everything without taking the system offline.

That depends heavily upon the tradeoffs the businesses can make.  I
don't think node operation at an exchange is a five-nines uptime
operation.  They could probably tolerate 3 nines.  The worst that
happens is occasionally people's withdrawals and deposit are delayed
slightly.  It won't shut down trading.

> I'm sure there are a dozen other significant issues that one of the Visa architects could tell you about when dealing with mission-critical data at this scale.

Visa stores the only copy.  They can't afford to lose the data.
Bitcoin isn't like that, as others pointed out.  And for most
businesses, if their node must be rebooted periodically, it isn't a
huge deal.

> Once we grow the blocksize large enough that a single computer can't do all the processing all by itself we get into a world of much harder, much more expensive scaling problems.

Ok, when is that point, and what is the tradeoff in terms of nodes?
Just because something is hard doesn't mean it isn't worth doing.
That's just a defeatist attitude.  How big can we get, for what
tradeoffs, and what do we need to do to get there?

> You have to check each transaction against each other transaction to make sure that they aren't double spending eachother.

This is really not that hard.  Have a central database, update/check
the utxo values in block-store increments.  If a utxo has already been
used this increment, the block is invalid.  If the database somehow
got too big(not going to happen at these scales, but if it did), it
can be sharded trivially on the transaction information.  These are
solved problems, the free database software that's available is pretty
powerful.

> You have to be a lot more clever than that to get things working and consistent.

NO, NOT CLEVER.  WE CAN'T DO THAT.

Sorry, I had to. :)

> None of them have cost structures in the 6 digit range, and I'd bet (without actually knowing) that none of them have cost structures in the 7 digit range either.

I know of and have experience working with systems that handled
several orders of magnitude more data than this.  None of the issues
brought up above are problems that someone hasn't solved.  Transaction
commitments to databases?  Data consistency across multiple workers?
Data storage measured in exabytes?  Data storage and updates
approaching hundreds of millions of datapoints per second?  These
things are done every single day at numerous companies.

On Fri, Mar 31, 2017 at 11:23 AM, David Vorick <david.vorick@gmail•com> wrote:
> Sure, your math is pretty much entirely irrelevant because scaling systems
> to massive sizes doesn't work that way.
>
> At 400B transactions per year we're looking at block sizes of 4.5 GB, and a
> database size of petabytes. How much RAM do you need to process blocks like
> that? Can you fit that much RAM into a single machine? Okay, you can't fit
> that much RAM into a single machine. So you have to rework the code to
> operate on a computer cluster.
>
> Already we've hit a significant problem. You aren't going to rewrite Bitcoin
> to do block validation on a computer cluster overnight. Further, are storage
> costs consistent when we're talking about setting up clusters? Are bandwidth
> costs consistent when we're talking about setting up clusters? Are RAM and
> CPU costs consistent when we're talking about setting up clusters? No, they
> aren't. Clusters are a lot more expensive to set up per-resource because
> they need to talk to eachother and synchronize with eachother and you have a
> LOT more parts, so you have to build in redundancies that aren't necessary
> in non-clusters.
>
> Also worth pointing out that peak transaction volumes are typically 20-50x
> the size of typical transaction volumes. So your cluster isn't going to need
> to plan to handle 15k transactions per second, you're really looking at more
> like 200k or even 500k transactions per second to handle peak-volumes. And
> if it can't, you're still going to see full blocks.
>
> You'd need a handful of experts just to maintain such a thing. Disks are
> going to be failing every day when you are storing multiple PB, so you can't
> just count a flat cost of $20/TB and expect that to work. You're going to
> need redundancy and tolerance so that you don't lose the system when a few
> of your hard drives all fail within minutes of eachother. And you need a way
> to rebuild everything without taking the system offline.
>
> This isn't even my area of expertise. I'm sure there are a dozen other
> significant issues that one of the Visa architects could tell you about when
> dealing with mission-critical data at this scale.
>
> --------
>
> Massive systems operate very differently and are much more costly per-unit
> than tiny systems. Once we grow the blocksize large enough that a single
> computer can't do all the processing all by itself we get into a world of
> much harder, much more expensive scaling problems. Especially because we're
> talking about a distributed system where the nodes don't even trust each
> other. And transaction processing is largely non-parallel. You have to check
> each transaction against each other transaction to make sure that they
> aren't double spending eachother. This takes synchronization and prevents
> 500 CPUs from all crunching the data concurrently. You have to be a lot more
> clever than that to get things working and consistent.
>
> When talking about scalability problems, you should ask yourself what other
> systems in the world operate at the scales you are talking about. None of
> them have cost structures in the 6 digit range, and I'd bet (without
> actually knowing) that none of them have cost structures in the 7 digit
> range either. In fact I know from working in a related industry that the
> cost structures for the datacenters (plus the support engineers, plus the
> software management, etc.) that do airline ticket processing are above $5
> million per year for the larger airlines. Visa is probably even more
> expensive than that (though I can only speculate).


  parent reply	other threads:[~2017-04-01  6:15 UTC|newest]

Thread overview: 81+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-29 19:33 Daniele Pinna
2017-03-29 20:28 ` Peter R
2017-03-29 22:17   ` Jared Lee Richardson
2017-03-29 20:28 ` David Vorick
2017-03-29 22:08   ` Jared Lee Richardson
2017-03-30  7:11     ` Luv Khemani
2017-03-30 17:16       ` Jared Lee Richardson
2017-03-31  4:21         ` Luv Khemani
2017-03-31  5:28           ` Jared Lee Richardson
2017-03-31  8:19             ` Luv Khemani
2017-03-31 15:59               ` Jared Lee Richardson
2017-03-31 16:14                 ` David Vorick
2017-03-31 16:46                   ` Jared Lee Richardson
2017-03-31 18:23                     ` David Vorick
2017-03-31 18:58                       ` Eric Voskuil
2017-04-01  6:15                       ` Jared Lee Richardson [this message]
  -- strict thread matches above, loose matches on Subject: below --
2017-03-31 21:23 Rodney Morris
2017-03-31 23:13 ` Eric Voskuil
     [not found]   ` <CABerxhGeofH4iEonjB1xKOkHcEVJrR+D4QhHSw5cWYsjmW4JpQ@mail.gmail.com>
2017-04-01  1:41     ` Rodney Morris
2017-04-01  6:18   ` Jared Lee Richardson
2017-04-01  7:41     ` Eric Voskuil
     [not found]       ` <CAAt2M1_sHsCD_AX-vm-oy-4tY+dKoDAJhfVUc4tnoNBFn-a+Dg@mail.gmail.com>
     [not found]         ` <CAAt2M19Gt8PmcPUGUHKm2kpMskpN4soF6M-Rb46HazKMV2D9mg@mail.gmail.com>
2017-04-01 14:45           ` Natanael
     [not found]       ` <CAD1TkXusCe-O3CGQkXyRw_m3sXS9grGxMqkMk8dOvFNXeV5zGQ@mail.gmail.com>
2017-04-01 18:42         ` Jared Lee Richardson
     [not found]   ` <CAAt2M1_kuCBQWd9dis5UwJX8+XGVPjjiOA54aD74iS2L0cYcTQ@mail.gmail.com>
     [not found]     ` <CAAt2M19Nr2KdyRkM_arJ=LBnqDQQyLQ2QQ-UBC8=gFnemCdPMg@mail.gmail.com>
2017-04-01 13:26       ` Natanael
2017-03-29 19:50 Raystonn .
2017-03-30 10:34 ` Tom Zander
2017-03-30 11:19   ` David Vorick
2017-03-30 21:42     ` Jared Lee Richardson
2017-03-30 11:24   ` Aymeric Vitte
2017-03-28 19:56 Paul Iverson
2017-03-28 20:16 ` Pieter Wuille
2017-03-28 20:43 ` Tom Zander
2017-03-28 20:53   ` Alphonse Pace
2017-03-28 21:06     ` Luke Dashjr
2017-03-28 16:59 Wang Chun
2017-03-28 17:13 ` Matt Corallo
2017-03-29  8:45   ` Jared Lee Richardson
2017-03-28 17:23 ` Alphonse Pace
2017-03-28 17:31   ` Wang Chun
2017-03-28 17:33     ` Jeremy
2017-03-28 17:50     ` Douglas Roark
2017-03-28 17:33   ` Juan Garavaglia
2017-03-28 17:53     ` Alphonse Pace
2017-03-28 22:36       ` Juan Garavaglia
2017-03-29  2:59         ` Luv Khemani
2017-03-29  6:24         ` Emin Gün Sirer
2017-03-29 15:34           ` Johnson Lau
2017-04-01 16:15             ` Leandro Coutinho
2017-03-29  9:16       ` Jared Lee Richardson
2017-03-29 16:00         ` Aymeric Vitte
2017-03-28 17:34 ` Johnson Lau
2017-03-28 17:46   ` Luke Dashjr
2017-03-28 20:50   ` Tom Zander
2017-03-29  4:21     ` Johnson Lau
2017-03-28 20:48 ` Tom Zander
2017-03-29  6:32 ` Bram Cohen
2017-03-29  9:37   ` Jorge Timón
2017-03-29 19:07     ` Jared Lee Richardson
2017-04-02 19:02       ` Staf Verhaegen
2017-03-29  7:49 ` Martin Lízner
2017-03-29 15:57   ` David Vorick
2017-03-29 16:08     ` Aymeric Vitte
     [not found]       ` <CAFVRnyo1XGNbq_F8UfqqJWHCVH14iMCUMU-R5bOh+h3mtwSUJg@mail.gmail.com>
2017-03-29 16:18         ` David Vorick
2017-03-29 16:20           ` Andrew Johnson
2017-03-29 16:25             ` David Vorick
2017-03-29 16:41               ` Andrew Johnson
2017-03-29 17:14                 ` Aymeric Vitte
2017-03-29 20:53               ` Jared Lee Richardson
2017-03-29 20:32           ` Jared Lee Richardson
2017-03-29 21:36             ` praxeology_guy
2017-03-29 22:33             ` Aymeric Vitte
2017-03-30  5:23               ` Ryan J Martin
2017-03-30 10:30                 ` Tom Zander
2017-03-30 16:44                   ` Jared Lee Richardson
2017-03-30 20:51                   ` Jared Lee Richardson
2017-03-30 21:57                     ` Tom Zander
     [not found]               ` <CAD1TkXvx=RKvjC8BUstwtQxUUQwG4eiU9XmF1wr=bU=xcVg5WQ@mail.gmail.com>
2017-03-30 10:13                 ` Aymeric Vitte
2017-03-29 19:46     ` Jared Lee Richardson
2017-03-29 19:10   ` Jared Lee Richardson
2017-03-29 19:36     ` praxeology_guy
2017-04-02 19:12     ` Staf Verhaegen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAD1TkXuyAH++1nwzZ-4BYv=kT=-8jycMsYx3xFriBiYeLmwvNg@mail.gmail.com' \
    --to=jaredr26@gmail$(echo .)com \
    --cc=bitcoin-dev@lists$(echo .)linuxfoundation.org \
    --cc=david.vorick@gmail$(echo .)com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox