Re: [Bitcoin-development] Even simpler minimum fee calculation formula: f > bounty*fork_rate/average

public inbox for bitcoindev@googlegroups.com
 help / color / mirror / Atom feed

* Re: [Bitcoin-development] Even simpler minimum fee calculation formula: f > bounty*fork_rate/average_blocksize
@ 2013-11-13 23:52 Gavin Andresen
  0 siblings, 0 replies; 16+ messages in thread
From: Gavin Andresen @ 2013-11-13 23:52 UTC (permalink / raw)
  To: Bitcoin Dev

[-- Attachment #1: Type: text/plain, Size: 2159 bytes --]

Couple of thoughts:

RE: the marvelous coincidence that the average fee these days is very close
to the modeled minimum orphan cost:

Engineers tend to underestimate the power of markets, even inefficient
markets, to arrive at the 'correct' price. It would not surprise me at all
if the messy, chaotic inefficient market with tens of thousands of
individual decisions ("which mining pool should I join" and "how high
should my dice site set fees" and "how large should the minimum payout be"
and "should I make my blocks bigger or smaller") might arrive at the
'correct' price, even if NOBODY involved has any clue how or why it
happened.

Or it might just be a coincidence.

RE: orphan rate:

The network-wide orphan rate has been very steady apart from the March
blockchain fork. Kudos to Ben Reeves for keeping track of the data and
giving us a nice chart:
  http://blockchain.info/charts/n-orphaned-blocks

RE: new block latency:

We should be able to reduce the size of new block announcements by about a
factor of ten with very little additional effort (transmit/relay as
"merkleblock" with full bloom filters-- the factor of 10 is because a
transaction id hash is 32 bytes, average transaction size is a few hundred
bytes).

Mining revenue is a fixed-size pie, so if EVERYBODY agreed to accept
(somewhat) higher orphan rates for more transaction volume then, in the
long run, there is no difference.  Well, except that more transaction
volume means more utility for Bitcoin as a whole, so everybody should
benefit from a higher bitcoin price.

That's a classic free-rider problem, though-- a miner could defect to try
to get a lower orphan rate.

This is one of the reasons why I think relaying all blocks in a race is
probably the right thing to do; if a miner is mildly punished (by losing
the occasional block race) for creating blocks that don't include "enough"
already-relayed transactions, that is a strong incentive to go along with
whatever consensus has been established.

The same argument applies for a miner producing too-large blocks, or blocks
with lots of transactions that were never relayed across the network.

-- 
--
Gavin Andresen

[-- Attachment #2: Type: text/html, Size: 3310 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Bitcoin-development] Even simpler minimum fee calculation formula: f > bounty*fork_rate/average_blocksize
  2013-11-15 19:19     ` Peter Todd
@ 2013-11-20 10:01       ` Peter Todd
  0 siblings, 0 replies; 16+ messages in thread
From: Peter Todd @ 2013-11-20 10:01 UTC (permalink / raw)
  To: Michael Gronager; +Cc: bitcoin-development

[-- Attachment #1: Type: text/plain, Size: 633 bytes --]

On Fri, Nov 15, 2013 at 02:19:56PM -0500, Peter Todd wrote:
> On Fri, Nov 15, 2013 at 12:47:46PM +0100, Michael Gronager wrote:
> > -----BEGIN PGP SIGNED MESSAGE-----
> > Hash: SHA1
> > 
> > On 15/11/13, 11:32 , Peter Todd wrote:
> > 
> > > alpha = (1/113)*600s/134kBytes = 39.62uS/byte = 24kB/second
> > > 
> > > Which is atrocious... 
> > 
> > alpha = P_fork*t_block/S = 1/113*454000/134 = 29ms/kb
> 
> Huh? Where did 454000 come from?

Oh right, you're using the actual block interval, not the steady state
one.

-- 
'peter'[:-1]@petertodd.org
00000000000000056032432f186a8276d3feecb805d064c1def85905670a453b

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 685 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Bitcoin-development] Even simpler minimum fee calculation formula: f > bounty*fork_rate/average_blocksize
  2013-11-15 11:47   ` Michael Gronager
@ 2013-11-15 19:19     ` Peter Todd
  2013-11-20 10:01       ` Peter Todd
  0 siblings, 1 reply; 16+ messages in thread
From: Peter Todd @ 2013-11-15 19:19 UTC (permalink / raw)
  To: Michael Gronager; +Cc: bitcoin-development

[-- Attachment #1: Type: text/plain, Size: 1610 bytes --]

On Fri, Nov 15, 2013 at 12:47:46PM +0100, Michael Gronager wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> On 15/11/13, 11:32 , Peter Todd wrote:
> 
> > alpha = (1/113)*600s/134kBytes = 39.62uS/byte = 24kB/second
> > 
> > Which is atrocious... 
> 
> alpha = P_fork*t_block/S = 1/113*454000/134 = 29ms/kb

Huh? Where did 454000 come from?

> > , which shows you how little attention they are paying to
> > optimizing profits.  Right now mining pulls in $1.8 million/day, so
> > that's up to $16k wasted.
> 
> yup, but the relevant comparison is not 16k vs 1.8m, but the pool
> operator earnings which are on the order of 1% of the 1.8m so it is 18k
> vs 16k - I wouldn't mind doubling my income...

That's only true for a PPS pool though, not the more usual pools that
pay relative to blocks actually found. Heh, actually, that might be part
of the problem... also doesn't help how varience is going to make
noticing 1% hard.

> > In
> > theory anyway, could just as easily be the case that larger pools have
> > screwed up relaying still such that p2pool's forwarding wins.
> 
> Yeah, we should resurrect p2pool ;)

P2Pool has 1% hashing power right now; I mine on it myself with what
little hashing power I have.

The more interesting thing is how do you grow P2Pool - requiring a full
node is going to make that tricky. Also the once we start adding more
efficient block propagation by transmitting headers + txids p2pool's
current advantage goes away.

-- 
'peter'[:-1]@petertodd.org
0000000000000005fbc1840bbd5bd71d4d6cd2930e20da9e697710f58bd4f69d

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Bitcoin-development] Even simpler minimum fee calculation formula: f > bounty*fork_rate/average_blocksize
  2013-11-15 11:58         ` Michael Gronager
@ 2013-11-15 19:09           ` Peter Todd
  0 siblings, 0 replies; 16+ messages in thread
From: Peter Todd @ 2013-11-15 19:09 UTC (permalink / raw)
  To: Michael Gronager; +Cc: bitcoin-development

[-- Attachment #1: Type: text/plain, Size: 2081 bytes --]

On Fri, Nov 15, 2013 at 12:58:14PM +0100, Michael Gronager wrote:
> My Q and q are meant differently, I agree to your Q vs Q-1 argument,
> but the q is "me as a miner" participating in "a pool" Q. If I
> participate in a pool I pay the pool owner a fraction, e, but at the
> same time I become part of an economy of scale (well actually a math
> of scale...) and that can end up paying for the lost e. The question
> is what is the ratio q/Q where I should rather mine on my own ? This
> question is interesting as it will make bigger miners break away from
> pools into solo mining, but I also agree that from pure math the most
> advantageous scenario is the 100% mining rig.

The underlying issue is what is the pools expenses compared to yours.
There is an overhead to mining, you need to spend money and time (and
hence money) running and administering full nodes at the very minimum.
The pool can amortise that cost over many hashers; the solo miner can't.

Pools will of course have some profit margin, but why would you expect
that margin to not be sufficiently low to make it in a solo-miner's
interest to join the pool? Both the pool and the former solo-miner earn
more return after all if they centralize.

The fundemental issue is that in the design of Bitcoin there is an
incentive for miners to join into pools, and that incentive exists at
any amount of hashing power. Sure second order effects like regulation
and social pressure can counteract that incentive in some circumstances,
but that's not very strong protection.

> > The equations give an incentive to centralize all the way up to 1
> > miner with 100% hashing power.
> > 
> > Of course, if that one pool were p2pool, that might be ok!
> 
> Ha, yes, and then the math for p2pool starts... a math where we have
> much more stales...

However p2pool doesn't necessarily need a linear blockchain to function,
so there is a potential for stales to be much less relevant.

-- 
'peter'[:-1]@petertodd.org
000000000000000772f720b0a231150f22af20760c1463ef920f71ba3daab819

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Bitcoin-development] Even simpler minimum fee calculation formula: f > bounty*fork_rate/average_blocksize
  2013-11-15 11:12       ` Peter Todd
@ 2013-11-15 11:58         ` Michael Gronager
  2013-11-15 19:09           ` Peter Todd
  0 siblings, 1 reply; 16+ messages in thread
From: Michael Gronager @ 2013-11-15 11:58 UTC (permalink / raw)
  To: bitcoin-development

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

>> 

>> Q = Total pool size (fraction of all mining power) q = My mining
>> power (do.) e = fraction of block fee that pool reserves
>> 
> 
> Unfortunately the math doesn't work that way. For any Q, a bigger
> Q gives you a higher return. Remember that the way I setup those
> equations in section 3.2 is such that I'm actually modeling two
> pools, one with Q hashing power and one with (1-Q) hashing power.
> Or maybe more accurately, it's irrelevant if the (1-Q) hashing
> power is or isn't a unified pool.

My Q and q are meant differently, I agree to your Q vs Q-1 argument,
but the q is "me as a miner" participating in "a pool" Q. If I
participate in a pool I pay the pool owner a fraction, e, but at the
same time I become part of an economy of scale (well actually a math
of scale...) and that can end up paying for the lost e. The question
is what is the ratio q/Q where I should rather mine on my own ? This
question is interesting as it will make bigger miners break away from
pools into solo mining, but I also agree that from pure math the most
advantageous scenario is the 100% mining rig.

> The equations give an incentive to centralize all the way up to 1
> miner with 100% hashing power.
> 
> Of course, if that one pool were p2pool, that might be ok!

Ha, yes, and then the math for p2pool starts... a math where we have
much more stales...


-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.22 (Darwin)
Comment: GPGTools - http://gpgtools.org
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJShgxWAAoJEKpww0VFxdGRoiwH/3RGTH503PJ8UWuyKjrxscb4
dG3TyThZCDs12DvtC+2TPKnIkQFinGx9442tZU/O+qmwsGJsNVoEcnGmKEYz/vlI
XzFF30ugslB4FKwHZYRqXELaKR4RvUtSzu6td8P3n+e6d0MZsuemMornpbXZkw3n
CbMlYuiG4h3iUAwTaOTS26cFbZoo6eyogydDjnS7Ogi2Ur85Rydi/Lj24rj7UxYB
+WUkYAv3bCqCzTkv1LxO7HwY1SICZDmoGRbuil5M7bJ+MftYt6Q6DVprGSVP0mOV
9eEVeMVY/WmMZCI/01ruXpzC3gxU60vOd/a3q9G2hd9Tn00HzugAllEXh7ZzzUs=
=unP8
-----END PGP SIGNATURE-----



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Bitcoin-development] Even simpler minimum fee calculation formula: f > bounty*fork_rate/average_blocksize
  2013-11-15 10:32 ` Peter Todd
@ 2013-11-15 11:47   ` Michael Gronager
  2013-11-15 19:19     ` Peter Todd
  0 siblings, 1 reply; 16+ messages in thread
From: Michael Gronager @ 2013-11-15 11:47 UTC (permalink / raw)
  To: bitcoin-development

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 15/11/13, 11:32 , Peter Todd wrote:

> alpha = (1/113)*600s/134kBytes = 39.62uS/byte = 24kB/second
> 
> Which is atrocious... 

alpha = P_fork*t_block/S = 1/113*454000/134 = 29ms/kb

or 272kbit pr second - if you assume this is a bandwidth then I agree it
is strikingly small (ISDN like), but this is not the case, the size
dependence of this number originates both from the limited network
bandwidth and from the validation and verification time of the blocks as
well as the latency in sending thee again.

The connection between propagation time and fork rate cannot be denied,
and the bandwidth can be deducted from that alone - see Decket et al.

t_0 on a 10000km link is on the order of 40ms, and that is only counting
the finite light speed in the fibers - if you ping the same distance you
get roughly 1-200ms (due to latencies in network equipment). at a size
of ~100kbyte t_0 hence becomes irrelevant.

> This also indicates that pools haven't taken the simple step of peering
> with each other using high-bandwidth nodes with restricted numbers of
> peers

agree

> , which shows you how little attention they are paying to
> optimizing profits.  Right now mining pulls in $1.8 million/day, so
> that's up to $16k wasted.

yup, but the relevant comparison is not 16k vs 1.8m, but the pool
operator earnings which are on the order of 1% of the 1.8m so it is 18k
vs 16k - I wouldn't mind doubling my income...

> 
> However, because miners don't orphan themselves, that $16k loss is born
> disproportionately by smaller miners... which also means the 24kB/sec
> bandwidth estimate is wrong, and the real number is even worse.

Yes, agree

> In
> theory anyway, could just as easily be the case that larger pools have
> screwed up relaying still such that p2pool's forwarding wins.

Yeah, we should resurrect p2pool ;)

> 
> 
> 
> ------------------------------------------------------------------------------
> DreamFactory - Open Source REST & JSON Services for HTML5 & Native Apps
> OAuth, Users, Roles, SQL, NoSQL, BLOB Storage and External API Access
> Free app hosting. Or install the open source package on any LAMP server.
> Sign up and see examples for AngularJS, jQuery, Sencha Touch and Native!
> http://pubads.g.doubleclick.net/gampad/clk?id=63469471&iu=/4140/ostg.clktrk
> 
> 
> 
> _______________________________________________
> Bitcoin-development mailing list
> Bitcoin-development@lists•sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/bitcoin-development
> 

-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.22 (Darwin)
Comment: GPGTools - http://gpgtools.org
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJShgniAAoJEKpww0VFxdGRrwQIALKsOtBUaAaQTX9ikN+10mSE
pE2dp2VnUvfUqpXf3MgJtAvg2RFqHjziyBMYmpMw5tLJPpeUthpNXm6Vm/Yg0DdL
JXSESIrd4Pdb/xPk2Fh9OKHmR1SB/8VxtRL2Vj1HmzzBcBiCylcaBuKlRkizvGSF
KrUm3EOFUfzgGYFUnqNceZ3CuQHWFAXbsitNqU6Vop8JOTgiSLhUrvb7r3W7Ewuy
jM3H2KAk/PrdGXwna3sUfDXmmOxmPm1pBy6+OaBTHEv+ALkreD++XSUnLUUTky9N
nZt2g7eMEFHIkVooj/HOGiwAvVwd7r86etiyUi8c2Pd46ff2OP5h1uiP/Qr28MA=
=Bsv9
-----END PGP SIGNATURE-----



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Bitcoin-development] Even simpler minimum fee calculation formula: f > bounty*fork_rate/average_blocksize
  2013-11-15 10:47     ` Michael Gronager
@ 2013-11-15 11:12       ` Peter Todd
  2013-11-15 11:58         ` Michael Gronager
  0 siblings, 1 reply; 16+ messages in thread
From: Peter Todd @ 2013-11-15 11:12 UTC (permalink / raw)
  To: Michael Gronager; +Cc: Bitcoin Dev

[-- Attachment #1: Type: text/plain, Size: 4171 bytes --]

On Fri, Nov 15, 2013 at 11:47:53AM +0100, Michael Gronager wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Hi Peter,
> 
> Love to see things put into formulas - nice work!
> 
> Fully agree on the your fist section: As latency determines maximum
> block earnings, define a 0-latency (big-miner never orphans his own
> blocks) island and growing that will of course result in increased earnings.
> 
> So build your own huge mining data center and you rock.
> 
> However, that is hardly the real work scenario today. Instead we have
> pools (Huge pools). It would be interesting to do the calculation:
> 
> 	Q = Total pool size (fraction of all mining power)
> 	q = My mining power (do.)
> 	e = fraction of block fee that pool reserves
> 
> It is pretty obvious that given your formulas small miners are better
> off in a pool (can't survive as solo miners), but there will be a
> threshold q_min above which you are actually better off on you own -
> depending also on e. (excluding here all benefits of a stable revenue
> stream provided by pools)

Unfortunately the math doesn't work that way. For any Q, a bigger Q
gives you a higher return. Remember that the way I setup those equations
in section 3.2 is such that I'm actually modeling two pools, one with Q
hashing power and one with (1-Q) hashing power. Or maybe more
accurately, it's irrelevant if the (1-Q) hashing power is or isn't a
unified pool.

The other thing is the fraction of the block fee the pool reserves
indicates you're talking about real-world costs... and the moment you do
that you find that pools themselves have economies of scale simply by
virtue of using a small overhead infrastructure, their nodes etc., for a
large number of miners. On that basis alone a small miner joining a
larger pool would always be financially advantageous modulo situations
where the large pool had legal restrictions that artificially increased
their overheads.

> Next interesting calculation would be bitcoin rate as a function of pool
> size, I expect a sharp dip somewhere in the 40%s of hardware controlled
> by one entity ;)

Bitcoin rate?

> Finally, as you mention yourselves, qualification of the various
> functions is needed. This could e.g. suggest if we are like to get 3 or
> 10 miners on the long run.

The equations give an incentive to centralize all the way up to 1 miner
with 100% hashing power.

Of course, if that one pool were p2pool, that might be ok!

> And now for section 2. You insert a definition of f(L) = a-bL. I think
> the whole idea of letting f depend on L is superfluous. As a miner you
> are always free to choose which transactions to include. You will always
> choose those with the biggest fee, so really it is only the average fee
> that is relevant: f(L) = c. Any dependence in L will be removed by the
> reshuffeling. To include an extra transaction will require either that
> it has a fee larger than another (kicking that out out) or that it has a
> fee so large that it covers for the other transaction too. Also recall
> that there is a logical minimum fee (as I have already shown), and a
> maximum optimal block size - that is until the bounty becomes 0 (which
> is where other effects kick in).

By defining f(L) you can model supply and demand, which can be relevant
in that a steep demand curve with a small number of high-fee
transactions can reduce centralization pressure in my model.

Of course, by defining f(L) = a-bL you also wind up with mathematica
spitting out some truly hideous polynomials. :P Setting f(L) = c as you
suggest is something I looked at, and results in equations that are more
reasonable, so I think I'll likely wind up doing that. You can make a
good argument anyway that the centralization would cause a flattening of
any demand curve anyway, as in the no-blocksize-limit case the larger
pools cost per transaction tends towards zero as their hashing power
increases - why pay high fees when the large pool will mine them almost
as fast?

-- 
'peter'[:-1]@petertodd.org
0000000000000000b4ff49cd2cad865d6cbca99828987a02f3d5f41067eab00a

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 685 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Bitcoin-development] Even simpler minimum fee calculation formula: f > bounty*fork_rate/average_blocksize
  2013-11-15  9:54   ` Peter Todd
  2013-11-15  9:59     ` Gregory Maxwell
@ 2013-11-15 10:47     ` Michael Gronager
  2013-11-15 11:12       ` Peter Todd
  1 sibling, 1 reply; 16+ messages in thread
From: Michael Gronager @ 2013-11-15 10:47 UTC (permalink / raw)
  Cc: Bitcoin Dev

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Peter,

Love to see things put into formulas - nice work!

Fully agree on the your fist section: As latency determines maximum
block earnings, define a 0-latency (big-miner never orphans his own
blocks) island and growing that will of course result in increased earnings.

So build your own huge mining data center and you rock.

However, that is hardly the real work scenario today. Instead we have
pools (Huge pools). It would be interesting to do the calculation:

	Q = Total pool size (fraction of all mining power)
	q = My mining power (do.)
	e = fraction of block fee that pool reserves

It is pretty obvious that given your formulas small miners are better
off in a pool (can't survive as solo miners), but there will be a
threshold q_min above which you are actually better off on you own -
depending also on e. (excluding here all benefits of a stable revenue
stream provided by pools)

Next interesting calculation would be bitcoin rate as a function of pool
size, I expect a sharp dip somewhere in the 40%s of hardware controlled
by one entity ;)

Finally, as you mention yourselves, qualification of the various
functions is needed. This could e.g. suggest if we are like to get 3 or
10 miners on the long run.

And now for section 2. You insert a definition of f(L) = a-bL. I think
the whole idea of letting f depend on L is superfluous. As a miner you
are always free to choose which transactions to include. You will always
choose those with the biggest fee, so really it is only the average fee
that is relevant: f(L) = c. Any dependence in L will be removed by the
reshuffeling. To include an extra transaction will require either that
it has a fee larger than another (kicking that out out) or that it has a
fee so large that it covers for the other transaction too. Also recall
that there is a logical minimum fee (as I have already shown), and a
maximum optimal block size - that is until the bounty becomes 0 (which
is where other effects kick in).

> Here's what I've got to date. The first two sections is just a
> relatively simple proof that mining is more profitable as centralization
> increases under any circumstance, even before any real-world factors are
> taken into account. (other than non-zero latency and bandwidth) Nice
> homework problem, and neat that you can easily get a solid proof, but
> academic because it doesn't say anything about the magnitude of the
> incentives.
> 
> The latter part is the actual derivation with proper model of
> supply-and-demand for fees. Or will be: while you can of course solve
> the equations with mathematica or similar - getting a horrid mess - I'm
> still trying to see if I can simplify them sanely in a way that's
> step-by-step understandable. Taking me longer than I'd like; sobering to
> realize how rusty I am. That said if any you do just throw it at
> Mathematica, looks like you get a result where the slope of your
> expected block return is at least quadratic with increasing hashing
> power. (though I spent all of five minutes eyeballing that result)
> 
> 
> \documentclass{article}
> \usepackage{url}
> \usepackage{mathtools}
> \begin{document}
> \title{Expected Return}
> \author{Peter Todd}
> \date{FIXME}
> \maketitle
> 
> \section{Expected return of a block}
> \label{sec:exp-return-of-a-block}
> 
> Let $f(L)$, a continuous function,\footnote{Transactions do of course give a
> discontinuous $f$. For a large $L$ the approximation error is negligible.} be
> the fee-per-byte available to a rational miner for the last transaction
> included in a block of size $L$. $f(L)$ is a continuous function defined for $L
> \ge 0$. Supply and demand dictates that:
> 
> \begin{equation}
>     f(L) \ge f(L+\epsilon) \label{eq:f-increases}
> \end{equation}
> 
> A reasonable example for $f$ might be $f(L) = kL$, representing the demand side
> of a linear supply and demand plot. For a block of size $L$ that is optimally
> filled with transactions the value of those fees is just the integral:
> 
> \begin{equation}
>     E_f(L) = \int_0^L f(l)\,dl
> \end{equation}
> 
> Let $P(Q,L)$, a continuous function, be the probability that a block of size
> $L$ produced by a miner with relative hashing power $Q$ will be orphaned.
> Because a miner will never orphan their own blocks the following holds true:
> 
> \begin{equation}
>     P(Q,L) \le P(Q + \epsilon,L) \label{eq:p-increases}
> \end{equation}
> 
> Similarly because larger blocks take longer to propagate and thus risk getting
> orphaned by another miner finding a block at the same time:
> 
> \begin{equation}
>     P(Q,L) \ge P(Q,L + \epsilon)
> \end{equation}
> 
> By combining $P(Q, L)$, $E_f(L)$ and the inflation subsidy $B$, gives us the
> expected return of a block for a given size and hashing power:\footnote{Note
> how real world marginal costs can be accommodated easily in the definitions of
> $f$ and $B$.}
> 
> \begin{equation}
>     E(Q,L) = P(Q,L)[E_f(L) + B]
> \end{equation}
> 
> The optimal size is simply the size $L$ at which $E(Q, L)$ no longer increases:
> 
> \begin{equation}
>     \frac{d}{dL}\big[E(Q, L(Q))\big] = 0
> \end{equation}
> 
> We will define the function $L(Q)$ as the optimal value for a given $Q$. A
> miner creating optimal blocks will thus have an expected return per block found
> of $E'(Q)=E(Q,L(Q))$. Note how this definition is per unit hashing power by
> virtue of being per block found.
> 
> 
> \section{Optimal return $E'$ vs. hashing power $Q$}
> 
> We want to know if a large miner has a larger return for a given amount of
> hashing power. We do this by taking the derivative with respect to $Q$ of the
> expected return given optimal strategy:
> 
> \begin{align*}
>     \frac{d}{dQ}\big[E'(Q)\big] &= \frac{d}{dQ}\big[P(Q,L(Q))\big]\big[E_f(L(Q)) + B\big] + P(Q,L(Q))\frac{d}{dQ}\big[E_f(L(Q))\big] \\
>                                 &= \frac{dL(Q)}{dQ}\Big[\frac{dP(Q,L(Q))}{dQ}\big[E_f(L(Q)) + B\big] + P(Q,L)\frac{dE_f(L(Q))}{dQ}\Big]
> \end{align*}
> 
> We know that $L(Q)$, $E_f$, $P$, and $B$ are all $\ge 0$. Thus for $dE'/dQ$ to
> be negative requires either $dL/dQ$ to be negative, or for $dL/dQ$ to be
> positive and one of $dP/dQ$ or $dE_f/dQ$ negative.
> 
> Suppose $dP/dQ$ negative and $dL/dQ$ positive:
> 
> \begin{align}
>     \frac{dL(Q)}{dQ} > 0    &\implies L(Q + \epsilon) > L(Q) \notag \\
>     \frac{dP(L(Q))}{dQ} < 0 &\implies P(Q + \epsilon, L(Q + \epsilon)) < P(Q, L(Q)) \label{eq:dl-pos-dp-neg}
> \end{align}
> 
> But that contradicts our definition \eqref{eq:p-increases} of $P$ as continuous
> and increasing. Suppose instead that $dE_f/dQ$ is negative and $dL/dQ$
> positive:
> 
> \begin{align}
>     \frac{dL(Q)}{dQ} > 0      &\implies L(Q) < L(Q + \epsilon) \notag \\
>     \frac{dE_f(L(Q))}{dL} < 0 &\implies E_f(L(Q)) > E_f(L(Q + \epsilon)) \notag \\
>                               &\implies \int_0^{L(Q)} f(l)\,dl > \int_0^{L(Q+\epsilon)} f(l)\,dl \notag \\
>                               &\implies f(l) < 0 \label{eq:dl-pos-de-neg}
> \end{align}
> 
> Again we have a contradiction with our definition \eqref{eq:f-increases} of
> $f$. Finally suppose $dL/dQ$ is negative:
> 
> \begin{align}
>     \frac{dL(Q)}{dQ} < 0 &\implies L(Q) > L(Q + \epsilon) \notag \\
>                          &\implies P(Q + \epsilon, L(Q + \epsilon)) < P(Q, L(Q)) \notag \\
>                          &\implies \frac{dP(Q, L(Q))}{dQ} < 0 \notag \\
>                          &\implies \frac{dL(Q)}{dQ}\frac{dP(Q, L(Q))}{dQ} > 0 \label{eq:dl-neg-dp-neg} \\
>                          &\implies E_f(L(Q + \epsilon)) < E_f(L(Q)) \implies \frac{dE_f(L(Q))}{dQ} < 0 \notag \\
>                          &\implies \frac{dL(Q)}{dQ}\frac{dE_f(L(Q))}{dQ} > 0 \label{eq:dl-neg-de-neg}
> \end{align}
> 
> Even if $dL/dQ$ is negative \eqref{eq:dl-neg-dp-neg} and
> \eqref{eq:dl-neg-de-neg} show that $dE'/dQ > 0$. In conjunction with
> \eqref{eq:dl-pos-dp-neg} and \eqref{eq:dl-pos-de-neg} we prove that increased
> hashing power always leads to increased return on investment per unit hashing
> power.
> 
> 
> \subsection{Real-world implications to centralization}
> 
> While the author has shown that they still remember first-year, is this result
> relevant?
> 
> The proof holds regardless of what any of the functions actually are, provided
> that they meet the requirements set out in section
> \ref{sec:exp-return-of-a-block}. The requirements are met by any reasonable
> real-world scenario\footnote{Negative fees are not reasonable!}, and show an
> incentive for mining to centralize even in an ideal situation where all miners
> are on a level playing field and have no fixed costs.
> 
> However the proof is abstract, and doesn't tell us anything about how strong
> that pressure is; it may be insignificant enough to be outweighed by effects
> such as social pressure.
> 
> We need to investigate $dE'/dQ$ in detail.
> 
> 
> \section{Detailed derivation of of $P(Q,L)$}
> 
> \subsection{Assumptions}
> 
> The difficulty is assumed to be in a steady state condition and the
> percentage of hashing power for any given miner is fixed. Unconfirmed
> transactions are assumed to be known to all miners, giving everyone an
> equal opportunity of mining any given transaction.
> 
> We assume that the graph of all Bitcoin miners is fully connected and
> that the bandwidth, $1/k$, and latency, $t_0$, is identical for all
> connections and unchanging. We assume that miners always attempt to
> build upon the first block they see on the longest chain known to them,
> and when they find a block, they always broadcast it to all other miners
> simultaneously. From that we see that the time taken for a block of size
> $L$ to propagate to $100\%$ of the hashing power is simply:
> 
> \begin{equation}
>     t(L) = t_0 + kL
> \end{equation}
> 
> 
> \subsection{Analysis}
> 
> When miner $Q$ finds a block during the condition of full consensus the
> outcomes can be described by the following state tree.  The numbers in brackets
> are the "scorecard" of blocks found by $Q$ and all other miners should a given
> state be reached:
> 
> \begin{description}
> 
>     \item[1)] No other block is found prior to full propagation. (1:0)
> 
>     \item[2)] $Q$ finds another block prior to full propagation. (2:0)
>     \begin{description}
>         \item[2.1)] $Q$'s second block is not orphaned. (2:0)
>         \item[2.2)] $Q$'s second block is orphaned. (2:3)
>     \end{description}
> 
>     \item[3)] $(1-Q)$ finds another block prior to full propagation. (1:1)
>     \begin{description}
>         \item[3.1)] $(1-Q)$'s block is orphaned. (2:1)
>         \item[3.2)] $(1-Q)$'s block is not orphaned. (1:2)
>     \end{description}
> \end{description}
> 
> Miner $Q$ wins if states $1$, $2.1$, or $3.1$ are reached. Though it is
> possible to derive an equation for $P$ that accurately models possible states -
> the author did exactly that in a fit of madness - the resulting equation is
> unwieldly and offers no additional insight.
> 
> We want to end up with a $dE'/dQ$ that captures second order effects. Since
> $L(Q)$ and thus $E'(Q)$ will depend on $Q$ our approximation of $P$ should be
> such that $dP/dQ$ is at least linear.
> 
> With $\lambda$ as the block interval the probabilities of reaching states $1$,
> $2$, and $3$ are as follows:
> \begin{align}
>     p_1 &= 1 - \frac{t}{\lambda} \\
>     p_2 &= \frac{t}{\lambda} Q \\
>     p_3 &= \frac{t}{\lambda} (1-Q)
> \end{align}
> 
> We could assume that states $2$ and $3$ both lead to the block being orphaned,
> thus giving us:
> \begin{equation}
>     P(Q, L) = 1 - \frac{t}{\lambda} = 1 - \frac{t_o + kL}{\lambda}
> \end{equation}
> 
> However this gives us a linear $E(Q, L)$, linear $L(Q)$, and thus only a
> quadratic $E'(Q)$. We need at least one more state in our model; state $2.1$ is
> a good choice. Reaching state $2.2$ is exceptionally improbable - the miners
> $(1-Q)$ have to find three blocks in time $t$ - so ignoring state $2.2$ and
> thus using the probability for state $2$ instead has negligible impact on the
> model. Meanwhile state $3$ requires that state $3.1$ be used directly and would
> result in a third-order terms in $P$ when treating state $3$ as an always loss
> is a conservative lower-bound.
> 
> This gives us:
> \begin{align}
>     P(Q, L) &= p_1 + p_2 = 1 - \frac{t}{\lambda} + \frac{t}{\lambda} Q = 1 - (1-Q)\frac{t}{\lambda} \notag \\
>             &= 1 - (1-Q)\frac{t_o + kL}{\lambda}
> \end{align}
> 
> 
> \subsection{Detailed derivation of E'(Q)}
> 
> Some preliminaries:
> 
> \begin{align}
>     \frac{dP(Q,L)}{dL} &= -(1-Q)\frac{k}{\lambda} \\
>     \notag\\
>     \frac{dE(Q,L)}{dL} &= \frac{dP(Q,L)}{dL}\big[E_f(L) + B\big] + P(Q,L)\frac{dE_f(L)}{dL} \notag\\
>                        &= \frac{dP(Q,L)}{dL}\big[E_f(L) + B\big] + P(Q,L)\,f(L)
> \end{align}
> 
> We're not going to get very far without a definition for $f$ so we'll use a
> simple linear demand model:
> 
> \begin{align}
>     f(L) &= a - bL \\
>     E_f(L) &= aL - \frac{1}{2}bL^2
> \end{align}
> 
> Now we set $dE/dL=0$ and solve for $L$. To simplify the problem we will consider the no-subsidy, $B=0$ case:
> 
> \begin{align}
>     0 &= \frac{dP(Q,L)}{dL}E_f(L) + P(Q,L)\,f(L) \\
>       &= -(1-Q)\frac{k}{\lambda}\big[aL - \frac{1}{2}bL^2] + \big[1 - (1-Q)\frac{t_o + kL}{\lambda}\big](a - bL) \\
> \end{align}
> 
> 
> \end{document}
> 
>>> Luckily the fork frequency and the average block size are easily
>>> measurable. blockchain.info keeps historical graphs of number of
>>> orphaned blocks pr day
>>
>> Are those stats accurate? Have any pool operators at least confirmed that the
>> orphaned blocks that blockchain.info reports match their own records?
>>
>> My gut feeling is to relay all orphaned blocks. We know that with a high
>> investment and sybil attack as blockchain.info has done you can have better
>> awareness of orphaned blocks than someone without those resources. If having
>> that awareness is ever a profitable thing we have both created an incentive to
>> sybil attack the network and we have linked profitability to high up-front
>> capital investments.
>>
>> On those grounds alone I will argue that we should relay all orphans to even
>> the playing field. If there is a circumstance where we do not want the attacker
>> to have that knowledge we have failed anyway, as blockchain.info's sybil attack
>> on the network clearly shows.
> 
> Agreed.
> 

-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.22 (Darwin)
Comment: GPGTools - http://gpgtools.org
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJShfvZAAoJEKpww0VFxdGRDVoIALEgjxC8PQvj4hyp8CmTM8wP
4ASL72gs/V6cRZuVPXjKJrrWxs2GjvxASQWaZa+9Oe5pXTg1Qa9yo5/3vBnB4kmK
SgeJNo+C1rQjd3KuunAV0vG4pkIYnMa9GyBYnWf8mNuP1oysy8NSDOVt2jhtO5A3
gKra0YFJYIEyOgewfefDrxokP0iSfQnJO7mPYfkoaLQm0ugoAi1IR8EiAuZX3oT9
v80o9yhKqilz0wxhvsFAFf8txfpJw7LWTne5L/gQkHIV3v3dY7fLoWTfil/mqsAq
6+d6xf+9s1tOXD18C/QTvhZIAyE3yiW7ZxbOyAYbQmbjORRZBdgWzaxCQbTHQNM=
=k1i2
-----END PGP SIGNATURE-----



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Bitcoin-development] Even simpler minimum fee calculation formula: f > bounty*fork_rate/average_blocksize
  2013-11-13 12:34 ` Michael Gronager
@ 2013-11-15 10:46   ` Peter Todd
  0 siblings, 0 replies; 16+ messages in thread
From: Peter Todd @ 2013-11-15 10:46 UTC (permalink / raw)
  To: Michael Gronager; +Cc: Bitcoin Dev

[-- Attachment #1: Type: text/plain, Size: 554 bytes --]

On Wed, Nov 13, 2013 at 01:34:07PM +0100, Michael Gronager wrote:
> Just a quick comment on the actual fees (checked at blockchain.info) the
> average fee over the last 90 days is actually ~0.0003BTC/txn - so not
> too far behind the theoretical minimum of 0.00037BTC/txn.

How did you get those numbers exactly?

Also fee per txn is *not* useful and we really shouldn't quote it so
that newbies reading this stuff get the right understanding.

-- 
'peter'[:-1]@petertodd.org
00000000000000075ed91531e07d2045b5823da050fe373bde7bb363965e44ae

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 685 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Bitcoin-development] Even simpler minimum fee calculation formula: f > bounty*fork_rate/average_blocksize
  2013-11-13 11:52 Michael Gronager
  2013-11-13 12:34 ` Michael Gronager
  2013-11-13 20:01 ` John Dillon
@ 2013-11-15 10:32 ` Peter Todd
  2013-11-15 11:47   ` Michael Gronager
  2 siblings, 1 reply; 16+ messages in thread
From: Peter Todd @ 2013-11-15 10:32 UTC (permalink / raw)
  To: Michael Gronager; +Cc: Bitcoin Dev

[-- Attachment #1: Type: text/plain, Size: 2536 bytes --]

On Wed, Nov 13, 2013 at 12:52:21PM +0100, Michael Gronager wrote:
> Last week I posted a writeup: "On the optimal block size and why
> transaction fees are 8 times too low (or transactions 8 times too big)".
> 
> Peter Todd made some nice additions to it including different pool sizes
> into the numbers.
> 
> However, it occurred to me that things can in fact be calculated even
> simpler: The measured fork rate will mean out all the different pool
> sizes and network latencies and will as such provide a simple number we
> can use to estimate the minimum fee. Key assumption is that the latency
> will depend on block size (# txns) and the fork rate will depend on latency.
> 
> Using the formulas from last week:
> 
> P_fork = t_propagate/t_blocks
> 
> and:
> 
> t_propagate = t_0 + alpha*S ~= alpha*S

Assuming t_0 is negligible is wrong in this case. Or, it should be...

> We get a measure for alpha as a function of the average fork rate and
> average block size:
> 
> alpha = P_fork*t_block/S

So alpha has units of seconds/byte, which lets us indirectly figure out
the bandwidth the blocks are propagating at assuming t_0=0 and all links
are equal. When you realize that P_fork is basically a multiplier on the
bandwidth required to get a block out fast enough, the derivation makes
sense. In any case we get:

alpha = (1/113)*600s/134kBytes = 39.62uS/byte = 24kB/second

Which is atrocious... but when you remember that Bitcoin nodes send
blocks to all peers simultaneously,(1) thus dividing up the bandwidth and
ruining latency you see why. t_0 shouldn't be at all negligible due to
speed of light, but with this low bandwidth it is anyway.

1) To be precise, nodes answer queries for blocks from all peers
simultaneously.

This also indicates that pools haven't taken the simple step of peering
with each other using high-bandwidth nodes with restricted numbers of
peers, which shows you how little attention they are paying to
optimizing profits.  Right now mining pulls in $1.8 million/day, so
that's up to $16k wasted.

However, because miners don't orphan themselves, that $16k loss is born
disproportionately by smaller miners... which also means the 24kB/sec
bandwidth estimate is wrong, and the real number is even worse. In
theory anyway, could just as easily be the case that larger pools have
screwed up relaying still such that p2pool's forwarding wins.

-- 
'peter'[:-1]@petertodd.org
000000000000000658459cd64e63243e719106014257870d073207c2d5460137

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 685 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Bitcoin-development] Even simpler minimum fee calculation formula: f > bounty*fork_rate/average_blocksize
  2013-11-15  9:54   ` Peter Todd
@ 2013-11-15  9:59     ` Gregory Maxwell
  2013-11-15 10:47     ` Michael Gronager
  1 sibling, 0 replies; 16+ messages in thread
From: Gregory Maxwell @ 2013-11-15  9:59 UTC (permalink / raw)
  To: Peter Todd; +Cc: Bitcoin Dev, Michael Gronager

On Fri, Nov 15, 2013 at 1:54 AM, Peter Todd <pete@petertodd•org> wrote:
> \documentclass{article}

LaTeX moon language to PDF moon language conversion:

https://people.xiph.org/~greg/peter_todd_mining_ev.pdf



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Bitcoin-development] Even simpler minimum fee calculation formula: f > bounty*fork_rate/average_blocksize
  2013-11-13 20:01 ` John Dillon
  2013-11-13 20:32   ` Michael Gronager
@ 2013-11-15  9:54   ` Peter Todd
  2013-11-15  9:59     ` Gregory Maxwell
  2013-11-15 10:47     ` Michael Gronager
  1 sibling, 2 replies; 16+ messages in thread
From: Peter Todd @ 2013-11-15  9:54 UTC (permalink / raw)
  To: John Dillon; +Cc: Bitcoin Dev, Michael Gronager

[-- Attachment #1: Type: text/plain, Size: 12968 bytes --]

On Wed, Nov 13, 2013 at 08:01:27PM +0000, John Dillon wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA256
> 
> > Last week I posted a writeup: "On the optimal block size and why
> > transaction fees are 8 times too low (or transactions 8 times too big)".
> >
> > Peter Todd made some nice additions to it including different pool sizes
> > into the numbers.
> 
> Peter claims on IRC that he is writing a paper of some kind on this topic. I
> suggest he submit it to that crypto-currency thing the foundation is
> sponsoring. Given the Nov 24th deadline, I also suggest at least making part of
> it public ASAP so some peer review can be done. It would be a shame for a
> simple math error to cause embarassment later.

Here's what I've got to date. The first two sections is just a
relatively simple proof that mining is more profitable as centralization
increases under any circumstance, even before any real-world factors are
taken into account. (other than non-zero latency and bandwidth) Nice
homework problem, and neat that you can easily get a solid proof, but
academic because it doesn't say anything about the magnitude of the
incentives.

The latter part is the actual derivation with proper model of
supply-and-demand for fees. Or will be: while you can of course solve
the equations with mathematica or similar - getting a horrid mess - I'm
still trying to see if I can simplify them sanely in a way that's
step-by-step understandable. Taking me longer than I'd like; sobering to
realize how rusty I am. That said if any you do just throw it at
Mathematica, looks like you get a result where the slope of your
expected block return is at least quadratic with increasing hashing
power. (though I spent all of five minutes eyeballing that result)

\documentclass{article}
\usepackage{url}
\usepackage{mathtools}
\begin{document}
\title{Expected Return}
\author{Peter Todd}
\date{FIXME}
\maketitle

\section{Expected return of a block}
\label{sec:exp-return-of-a-block}

Let $f(L)$, a continuous function,\footnote{Transactions do of course give a
discontinuous $f$. For a large $L$ the approximation error is negligible.} be
the fee-per-byte available to a rational miner for the last transaction
included in a block of size $L$. $f(L)$ is a continuous function defined for $L
\ge 0$. Supply and demand dictates that:

\begin{equation}
    f(L) \ge f(L+\epsilon) \label{eq:f-increases}
\end{equation}

A reasonable example for $f$ might be $f(L) = kL$, representing the demand side
of a linear supply and demand plot. For a block of size $L$ that is optimally
filled with transactions the value of those fees is just the integral:

\begin{equation}
    E_f(L) = \int_0^L f(l)\,dl
\end{equation}

Let $P(Q,L)$, a continuous function, be the probability that a block of size
$L$ produced by a miner with relative hashing power $Q$ will be orphaned.
Because a miner will never orphan their own blocks the following holds true:

\begin{equation}
    P(Q,L) \le P(Q + \epsilon,L) \label{eq:p-increases}
\end{equation}

Similarly because larger blocks take longer to propagate and thus risk getting
orphaned by another miner finding a block at the same time:

\begin{equation}
    P(Q,L) \ge P(Q,L + \epsilon)
\end{equation}

By combining $P(Q, L)$, $E_f(L)$ and the inflation subsidy $B$, gives us the
expected return of a block for a given size and hashing power:\footnote{Note
how real world marginal costs can be accommodated easily in the definitions of
$f$ and $B$.}

\begin{equation}
    E(Q,L) = P(Q,L)[E_f(L) + B]
\end{equation}

The optimal size is simply the size $L$ at which $E(Q, L)$ no longer increases:

\begin{equation}
    \frac{d}{dL}\big[E(Q, L(Q))\big] = 0
\end{equation}

We will define the function $L(Q)$ as the optimal value for a given $Q$. A
miner creating optimal blocks will thus have an expected return per block found
of $E'(Q)=E(Q,L(Q))$. Note how this definition is per unit hashing power by
virtue of being per block found.

\section{Optimal return $E'$ vs. hashing power $Q$}

We want to know if a large miner has a larger return for a given amount of
hashing power. We do this by taking the derivative with respect to $Q$ of the
expected return given optimal strategy:

\begin{align*}
    \frac{d}{dQ}\big[E'(Q)\big] &= \frac{d}{dQ}\big[P(Q,L(Q))\big]\big[E_f(L(Q)) + B\big] + P(Q,L(Q))\frac{d}{dQ}\big[E_f(L(Q))\big] \\
                                &= \frac{dL(Q)}{dQ}\Big[\frac{dP(Q,L(Q))}{dQ}\big[E_f(L(Q)) + B\big] + P(Q,L)\frac{dE_f(L(Q))}{dQ}\Big]
\end{align*}

We know that $L(Q)$, $E_f$, $P$, and $B$ are all $\ge 0$. Thus for $dE'/dQ$ to
be negative requires either $dL/dQ$ to be negative, or for $dL/dQ$ to be
positive and one of $dP/dQ$ or $dE_f/dQ$ negative.

Suppose $dP/dQ$ negative and $dL/dQ$ positive:

\begin{align}
    \frac{dL(Q)}{dQ} > 0    &\implies L(Q + \epsilon) > L(Q) \notag \\
    \frac{dP(L(Q))}{dQ} < 0 &\implies P(Q + \epsilon, L(Q + \epsilon)) < P(Q, L(Q)) \label{eq:dl-pos-dp-neg}
\end{align}

But that contradicts our definition \eqref{eq:p-increases} of $P$ as continuous
and increasing. Suppose instead that $dE_f/dQ$ is negative and $dL/dQ$
positive:

\begin{align}
    \frac{dL(Q)}{dQ} > 0      &\implies L(Q) < L(Q + \epsilon) \notag \\
    \frac{dE_f(L(Q))}{dL} < 0 &\implies E_f(L(Q)) > E_f(L(Q + \epsilon)) \notag \\
                              &\implies \int_0^{L(Q)} f(l)\,dl > \int_0^{L(Q+\epsilon)} f(l)\,dl \notag \\
                              &\implies f(l) < 0 \label{eq:dl-pos-de-neg}
\end{align}

Again we have a contradiction with our definition \eqref{eq:f-increases} of
$f$. Finally suppose $dL/dQ$ is negative:

\begin{align}
    \frac{dL(Q)}{dQ} < 0 &\implies L(Q) > L(Q + \epsilon) \notag \\
                         &\implies P(Q + \epsilon, L(Q + \epsilon)) < P(Q, L(Q)) \notag \\
                         &\implies \frac{dP(Q, L(Q))}{dQ} < 0 \notag \\
                         &\implies \frac{dL(Q)}{dQ}\frac{dP(Q, L(Q))}{dQ} > 0 \label{eq:dl-neg-dp-neg} \\
                         &\implies E_f(L(Q + \epsilon)) < E_f(L(Q)) \implies \frac{dE_f(L(Q))}{dQ} < 0 \notag \\
                         &\implies \frac{dL(Q)}{dQ}\frac{dE_f(L(Q))}{dQ} > 0 \label{eq:dl-neg-de-neg}
\end{align}

Even if $dL/dQ$ is negative \eqref{eq:dl-neg-dp-neg} and
\eqref{eq:dl-neg-de-neg} show that $dE'/dQ > 0$. In conjunction with
\eqref{eq:dl-pos-dp-neg} and \eqref{eq:dl-pos-de-neg} we prove that increased
hashing power always leads to increased return on investment per unit hashing
power.

\subsection{Real-world implications to centralization}

While the author has shown that they still remember first-year, is this result
relevant?

The proof holds regardless of what any of the functions actually are, provided
that they meet the requirements set out in section
\ref{sec:exp-return-of-a-block}. The requirements are met by any reasonable
real-world scenario\footnote{Negative fees are not reasonable!}, and show an
incentive for mining to centralize even in an ideal situation where all miners
are on a level playing field and have no fixed costs.

However the proof is abstract, and doesn't tell us anything about how strong
that pressure is; it may be insignificant enough to be outweighed by effects
such as social pressure.

We need to investigate $dE'/dQ$ in detail.

\section{Detailed derivation of of $P(Q,L)$}

\subsection{Assumptions}

The difficulty is assumed to be in a steady state condition and the
percentage of hashing power for any given miner is fixed. Unconfirmed
transactions are assumed to be known to all miners, giving everyone an
equal opportunity of mining any given transaction.

We assume that the graph of all Bitcoin miners is fully connected and
that the bandwidth, $1/k$, and latency, $t_0$, is identical for all
connections and unchanging. We assume that miners always attempt to
build upon the first block they see on the longest chain known to them,
and when they find a block, they always broadcast it to all other miners
simultaneously. From that we see that the time taken for a block of size
$L$ to propagate to $100\%$ of the hashing power is simply:

\begin{equation}
    t(L) = t_0 + kL
\end{equation}

\subsection{Analysis}

When miner $Q$ finds a block during the condition of full consensus the
outcomes can be described by the following state tree.  The numbers in brackets
are the "scorecard" of blocks found by $Q$ and all other miners should a given
state be reached:

\begin{description}

    \item[1)] No other block is found prior to full propagation. (1:0)

    \item[2)] $Q$ finds another block prior to full propagation. (2:0)
    \begin{description}
        \item[2.1)] $Q$'s second block is not orphaned. (2:0)
        \item[2.2)] $Q$'s second block is orphaned. (2:3)
    \end{description}

    \item[3)] $(1-Q)$ finds another block prior to full propagation. (1:1)
    \begin{description}
        \item[3.1)] $(1-Q)$'s block is orphaned. (2:1)
        \item[3.2)] $(1-Q)$'s block is not orphaned. (1:2)
    \end{description}
\end{description}

Miner $Q$ wins if states $1$, $2.1$, or $3.1$ are reached. Though it is
possible to derive an equation for $P$ that accurately models possible states -
the author did exactly that in a fit of madness - the resulting equation is
unwieldly and offers no additional insight.

We want to end up with a $dE'/dQ$ that captures second order effects. Since
$L(Q)$ and thus $E'(Q)$ will depend on $Q$ our approximation of $P$ should be
such that $dP/dQ$ is at least linear.

With $\lambda$ as the block interval the probabilities of reaching states $1$,
$2$, and $3$ are as follows:
\begin{align}
    p_1 &= 1 - \frac{t}{\lambda} \\
    p_2 &= \frac{t}{\lambda} Q \\
    p_3 &= \frac{t}{\lambda} (1-Q)
\end{align}

We could assume that states $2$ and $3$ both lead to the block being orphaned,
thus giving us:
\begin{equation}
    P(Q, L) = 1 - \frac{t}{\lambda} = 1 - \frac{t_o + kL}{\lambda}
\end{equation}

However this gives us a linear $E(Q, L)$, linear $L(Q)$, and thus only a
quadratic $E'(Q)$. We need at least one more state in our model; state $2.1$ is
a good choice. Reaching state $2.2$ is exceptionally improbable - the miners
$(1-Q)$ have to find three blocks in time $t$ - so ignoring state $2.2$ and
thus using the probability for state $2$ instead has negligible impact on the
model. Meanwhile state $3$ requires that state $3.1$ be used directly and would
result in a third-order terms in $P$ when treating state $3$ as an always loss
is a conservative lower-bound.

This gives us:
\begin{align}
    P(Q, L) &= p_1 + p_2 = 1 - \frac{t}{\lambda} + \frac{t}{\lambda} Q = 1 - (1-Q)\frac{t}{\lambda} \notag \\
            &= 1 - (1-Q)\frac{t_o + kL}{\lambda}
\end{align}

\subsection{Detailed derivation of E'(Q)}

Some preliminaries:

\begin{align}
    \frac{dP(Q,L)}{dL} &= -(1-Q)\frac{k}{\lambda} \\
    \notag\\
    \frac{dE(Q,L)}{dL} &= \frac{dP(Q,L)}{dL}\big[E_f(L) + B\big] + P(Q,L)\frac{dE_f(L)}{dL} \notag\\
                       &= \frac{dP(Q,L)}{dL}\big[E_f(L) + B\big] + P(Q,L)\,f(L)
\end{align}

We're not going to get very far without a definition for $f$ so we'll use a
simple linear demand model:

\begin{align}
    f(L) &= a - bL \\
    E_f(L) &= aL - \frac{1}{2}bL^2
\end{align}

Now we set $dE/dL=0$ and solve for $L$. To simplify the problem we will consider the no-subsidy, $B=0$ case:

\begin{align}
    0 &= \frac{dP(Q,L)}{dL}E_f(L) + P(Q,L)\,f(L) \\
      &= -(1-Q)\frac{k}{\lambda}\big[aL - \frac{1}{2}bL^2] + \big[1 - (1-Q)\frac{t_o + kL}{\lambda}\big](a - bL) \\
\end{align}

\end{document}

> > Luckily the fork frequency and the average block size are easily
> > measurable. blockchain.info keeps historical graphs of number of
> > orphaned blocks pr day
> 
> Are those stats accurate? Have any pool operators at least confirmed that the
> orphaned blocks that blockchain.info reports match their own records?
> 
> My gut feeling is to relay all orphaned blocks. We know that with a high
> investment and sybil attack as blockchain.info has done you can have better
> awareness of orphaned blocks than someone without those resources. If having
> that awareness is ever a profitable thing we have both created an incentive to
> sybil attack the network and we have linked profitability to high up-front
> capital investments.
> 
> On those grounds alone I will argue that we should relay all orphans to even
> the playing field. If there is a circumstance where we do not want the attacker
> to have that knowledge we have failed anyway, as blockchain.info's sybil attack
> on the network clearly shows.

Agreed.

-- 
'peter'[:-1]@petertodd.org
0000000000000004fe7b45f3bbc4c7edbd9ff86c963fe77282453e1b38f66503

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 685 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Bitcoin-development] Even simpler minimum fee calculation formula: f > bounty*fork_rate/average_blocksize
  2013-11-13 20:01 ` John Dillon
@ 2013-11-13 20:32   ` Michael Gronager
  2013-11-15  9:54   ` Peter Todd
  1 sibling, 0 replies; 16+ messages in thread
From: Michael Gronager @ 2013-11-13 20:32 UTC (permalink / raw)
  To: bitcoin-development

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi John,

Thanks for the feedback - comments below:

>> However, it occurred to me that things can in fact be calculated even
>> simpler: The measured fork rate will mean out all the different pool
>> sizes and network latencies and will as such provide a simple number we
>> can use to estimate the minimum fee.
> 
> Are you sure about that? You are assuming linearity where none may exist.

Well, my work from last week and now is a model. A model enabling you to
easily calculate the minimum fee and as a miner which transaction to
include to not shoot yourselves in the foot risking to create an
orphaned block.

The assumption that there is a linearity between block size and latency
is shown pretty well in the paper by Decker et. al (see last weeks
post). What I add this week is mainly more up to date numbers and a
formula dependent only of data that is easy to measure. (fork rate and
block size).

> 
> Are those stats accurate? Have any pool operators at least confirmed that the
> orphaned blocks that blockchain.info reports match their own records?

Probably not - but the are at least a minimum - in case they are higher,
the fee should go up further.

> 
> My gut feeling is to relay all orphaned blocks. We know that with a high
> investment and sybil attack as blockchain.info has done you can have better
> awareness of orphaned blocks than someone without those resources. If having
> that awareness is ever a profitable thing we have both created an incentive to
> sybil attack the network and we have linked profitability to high up-front
> capital investments.

Another way to measure latency is to setup a node that only listens but
do not relay data. By measuring the propagation of blocks of different
size as well as transactions, you can get a propagation distribution and
from that an average. However, the relevant propagation time is the one
between the pools/(single miners). Which you cannot assess using this
scheme - however, it would be nice to compare it to the orphan block scheme.

> 
> With relayed orphans you could even have P2Pool enforce an optimal tx inclusion
> policy based on a statistical model by including proof of those orphans into
> the P2Pool share chain. P2Pool needs to take fees into account soon, but simply
> asking for blocks with the highest total fees or even highest fee/kb appears to
> be incomplete according to what your and Peter's analysis is suggesting.

Indeed, and nice... But note that it is never of benefit for the miner
to include a transaction with a fee of less than ~0.0004BTC - unless it
is linked to another transaction that pay an extra fee.

There have been a lot of assumptions on the fee size and generally it
has been linked to the bitcoin exchange rate. This analysis shows that
this is wrong. Also it shows that the scalability of bitcoin is directly
linked to the network and node latency (with the current latency it will
never beneficial for miners to include more than ~30k transactions in a
block or ~70 pr second resulting in ~10MB blocks).
However, halving the latency will double the capacity, down to the
minimum which is governed by the speed of light.

> 
> 
> ------------------------------------------------------------------------------
> DreamFactory - Open Source REST & JSON Services for HTML5 & Native Apps
> OAuth, Users, Roles, SQL, NoSQL, BLOB Storage and External API Access
> Free app hosting. Or install the open source package on any LAMP server.
> Sign up and see examples for AngularJS, jQuery, Sencha Touch and Native!
> http://pubads.g.doubleclick.net/gampad/clk?id=63469471&iu=/4140/ostg.clktrk
> _______________________________________________
> Bitcoin-development mailing list
> Bitcoin-development@lists•sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/bitcoin-development
> 

-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.22 (Darwin)
Comment: GPGTools - http://gpgtools.org
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJSg+H7AAoJEKpww0VFxdGRn+gIAIgju90DED5r//USqKvkQsYI
JDj0tLBLMg9BPXOOt3eJ+NX4YE4lW+QkwqDd/swuJxLmj0l9BQKgt1lTb/f0P/cY
GdE14gh5EYlvNzY1h0TGKcMe8NTWXU0/tC+Clpy4sqBHPXW/eF/77sLQUnFRrLKi
sT48aHOOFUdBLdlyylUzzevh/FFVLidkKqV031tv52+BFHcTFd4kRPwZXgBSs9YH
U66MkJ4ytAqeOfJue9n7Qn4kJF9kNIhRpqTrtapqu8jglLfuYlJ3s5fwaw9FxQdR
+On4IWeXzURQ6tcVRCovCq/2lxRKIbYGlW7HGVASjRmm68/+8YUAfFsYFl6DIgA=
=9tbL
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Bitcoin-development] Even simpler minimum fee calculation formula: f > bounty*fork_rate/average_blocksize
  2013-11-13 11:52 Michael Gronager
  2013-11-13 12:34 ` Michael Gronager
@ 2013-11-13 20:01 ` John Dillon
  2013-11-13 20:32   ` Michael Gronager
  2013-11-15  9:54   ` Peter Todd
  2013-11-15 10:32 ` Peter Todd
  2 siblings, 2 replies; 16+ messages in thread
From: John Dillon @ 2013-11-13 20:01 UTC (permalink / raw)
  To: Michael Gronager; +Cc: Bitcoin Dev

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

> Last week I posted a writeup: "On the optimal block size and why
> transaction fees are 8 times too low (or transactions 8 times too big)".
>
> Peter Todd made some nice additions to it including different pool sizes
> into the numbers.

Peter claims on IRC that he is writing a paper of some kind on this topic. I
suggest he submit it to that crypto-currency thing the foundation is
sponsoring. Given the Nov 24th deadline, I also suggest at least making part of
it public ASAP so some peer review can be done. It would be a shame for a
simple math error to cause embarassment later.

> However, it occurred to me that things can in fact be calculated even
> simpler: The measured fork rate will mean out all the different pool
> sizes and network latencies and will as such provide a simple number we
> can use to estimate the minimum fee.

Are you sure about that? You are assuming linearity where none may exist.

> Luckily the fork frequency and the average block size are easily
> measurable. blockchain.info keeps historical graphs of number of
> orphaned blocks pr day

Are those stats accurate? Have any pool operators at least confirmed that the
orphaned blocks that blockchain.info reports match their own records?

My gut feeling is to relay all orphaned blocks. We know that with a high
investment and sybil attack as blockchain.info has done you can have better
awareness of orphaned blocks than someone without those resources. If having
that awareness is ever a profitable thing we have both created an incentive to
sybil attack the network and we have linked profitability to high up-front
capital investments.

On those grounds alone I will argue that we should relay all orphans to even
the playing field. If there is a circumstance where we do not want the attacker
to have that knowledge we have failed anyway, as blockchain.info's sybil attack
on the network clearly shows.

> Anyway - the all important number is alpha, the network latency which we
> expect to be dependent of various things such as interconnectivity,
> bandwidths, software quality etc, where mainly the latter is within our
> hands to bring down the fee. And you can actually setup the standard
> client to choose a better fee, as all the parameters in the formula are
> easily measured!

With relayed orphans you could even have P2Pool enforce an optimal tx inclusion
policy based on a statistical model by including proof of those orphans into
the P2Pool share chain. P2Pool needs to take fees into account soon, but simply
asking for blocks with the highest total fees or even highest fee/kb appears to
be incomplete according to what your and Peter's analysis is suggesting.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)

iQEcBAEBCAAGBQJSg9pfAAoJEEWCsU4mNhiP5mcH/jKd2Rpl9gEJ7WhTndS5gYJ9
Ep151NyD/iKpAA4E/d9QVYalo8595LCqnrXnV6wuvuiifB6EJD5WBJq3MAMyaJLA
agl920ygY98slhDmFhnwlU9lkJVim5FoUkZgE7lQ5dr0MIhvoLQiF2Ywky49Izf0
IqL+nyW83AQweSalvktA+XGkDfGDV/EnJN7SdNqKDNtE7E9NeMl61NNOWNndsYy6
uT4PF2YB7rh8wGyHXMTC4Z192pfW4S4s60ZAflG/sTtWCcEwWi+5V/RIu0o5Hmog
RFpEPvc6d6ykdqtPfTRADMGkT2wC1yXsgeos9oFFVVuVSj8EqHb2db0B+psHRBk=
=76Qs
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Bitcoin-development] Even simpler minimum fee calculation formula: f > bounty*fork_rate/average_blocksize
  2013-11-13 11:52 Michael Gronager
@ 2013-11-13 12:34 ` Michael Gronager
  2013-11-15 10:46   ` Peter Todd
  2013-11-13 20:01 ` John Dillon
  2013-11-15 10:32 ` Peter Todd
  2 siblings, 1 reply; 16+ messages in thread
From: Michael Gronager @ 2013-11-13 12:34 UTC (permalink / raw)
  To: Bitcoin Dev

Just a quick comment on the actual fees (checked at blockchain.info) the
average fee over the last 90 days is actually ~0.0003BTC/txn - so not
too far behind the theoretical minimum of 0.00037BTC/txn.

I suppose, though, that it has more to do with old clients and fee
settings (0.0005) than network wisdom ;)

On 13/11/13, 12:52 , Michael Gronager wrote:
> Last week I posted a writeup: "On the optimal block size and why
> transaction fees are 8 times too low (or transactions 8 times too big)".
> 
> Peter Todd made some nice additions to it including different pool sizes
> into the numbers.
> 
> However, it occurred to me that things can in fact be calculated even
> simpler: The measured fork rate will mean out all the different pool
> sizes and network latencies and will as such provide a simple number we
> can use to estimate the minimum fee. Key assumption is that the latency
> will depend on block size (# txns) and the fork rate will depend on latency.
> 
> Using the formulas from last week:
> 
> P_fork = t_propagate/t_blocks
> 
> and:
> 
> t_propagate = t_0 + alpha*S ~= alpha*S
> 
> We get a measure for alpha as a function of the average fork rate and
> average block size:
> 
> alpha = P_fork*t_block/S
> 
> Further, take the formula for the minimum fee:
> 
> f > alpha*E_bounty/t_block
> 
> And insert the formula for alpha:
> 
> f > P_fork*E_bounty/S_average
> 
> Luckily the fork frequency and the average block size are easily
> measurable. blockchain.info keeps historical graphs of number of
> orphaned blocks pr day - average over the last year is 1.5. Average
> number of blocks per day over the last year is 169, which yields a
> P_fork of ~1/113. Average block size in the same time is 134kBytes,
> which yields a minimum fee:
> 
> f > 0.00165XBT/kb or 0.00037XBT/txn
> 
> So the 0.0001 is only 4 times too small. Further, let us look at the
> trend over the last 12 months. Pieter Wuille claimed that there has been
> several improvements over the last half year that would bring down the
> latency, there has also been speculations regarding direct connections
> between the major pools etc - lets see if this is indeed true.
> 
> If you look instead of 360 days, only at the last 90 days the average
> block size has been 131kBytes, and the fork rate has been ~1/118, which
> results in a minimum fee of:
> 
> f > 0.00162XBT/kb or 0.00037XBT/txn
> 
> So a small improvement but not statistically important...
> 
> Last question, recalling that optimal revenue block size is a function
> of the txn-fee (from the last writeup) - lets see what fee it takes to
> support a block size of 131kBytes:
> 
> S = 1/2 * (t_block/alpha - E_bounty/f)
> 
> S = 1/2 * (S/P_fork - E_bounty/f)
> 
> f = E_bounty/[(1/P_fork-2)*S] = 0.00165XBT/kB
> 
> So a 4 times increase is still sufficient for the current load.
> 
> Anyway - the all important number is alpha, the network latency which we
> expect to be dependent of various things such as interconnectivity,
> bandwidths, software quality etc, where mainly the latter is within our
> hands to bring down the fee. And you can actually setup the standard
> client to choose a better fee, as all the parameters in the formula are
> easily measured!
> 
> ------------------------------------------------------------------------------
> DreamFactory - Open Source REST & JSON Services for HTML5 & Native Apps
> OAuth, Users, Roles, SQL, NoSQL, BLOB Storage and External API Access
> Free app hosting. Or install the open source package on any LAMP server.
> Sign up and see examples for AngularJS, jQuery, Sencha Touch and Native!
> http://pubads.g.doubleclick.net/gampad/clk?id=63469471&iu=/4140/ostg.clktrk
> _______________________________________________
> Bitcoin-development mailing list
> Bitcoin-development@lists•sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/bitcoin-development
> 




^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bitcoin-development] Even simpler minimum fee calculation formula: f > bounty*fork_rate/average_blocksize
@ 2013-11-13 11:52 Michael Gronager
  2013-11-13 12:34 ` Michael Gronager
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Michael Gronager @ 2013-11-13 11:52 UTC (permalink / raw)
  To: Bitcoin Dev

Last week I posted a writeup: "On the optimal block size and why
transaction fees are 8 times too low (or transactions 8 times too big)".

Peter Todd made some nice additions to it including different pool sizes
into the numbers.

However, it occurred to me that things can in fact be calculated even
simpler: The measured fork rate will mean out all the different pool
sizes and network latencies and will as such provide a simple number we
can use to estimate the minimum fee. Key assumption is that the latency
will depend on block size (# txns) and the fork rate will depend on latency.

Using the formulas from last week:

P_fork = t_propagate/t_blocks

and:

t_propagate = t_0 + alpha*S ~= alpha*S

We get a measure for alpha as a function of the average fork rate and
average block size:

alpha = P_fork*t_block/S

Further, take the formula for the minimum fee:

f > alpha*E_bounty/t_block

And insert the formula for alpha:

f > P_fork*E_bounty/S_average

Luckily the fork frequency and the average block size are easily
measurable. blockchain.info keeps historical graphs of number of
orphaned blocks pr day - average over the last year is 1.5. Average
number of blocks per day over the last year is 169, which yields a
P_fork of ~1/113. Average block size in the same time is 134kBytes,
which yields a minimum fee:

f > 0.00165XBT/kb or 0.00037XBT/txn

So the 0.0001 is only 4 times too small. Further, let us look at the
trend over the last 12 months. Pieter Wuille claimed that there has been
several improvements over the last half year that would bring down the
latency, there has also been speculations regarding direct connections
between the major pools etc - lets see if this is indeed true.

If you look instead of 360 days, only at the last 90 days the average
block size has been 131kBytes, and the fork rate has been ~1/118, which
results in a minimum fee of:

f > 0.00162XBT/kb or 0.00037XBT/txn

So a small improvement but not statistically important...

Last question, recalling that optimal revenue block size is a function
of the txn-fee (from the last writeup) - lets see what fee it takes to
support a block size of 131kBytes:

S = 1/2 * (t_block/alpha - E_bounty/f)

S = 1/2 * (S/P_fork - E_bounty/f)

f = E_bounty/[(1/P_fork-2)*S] = 0.00165XBT/kB

So a 4 times increase is still sufficient for the current load.

Anyway - the all important number is alpha, the network latency which we
expect to be dependent of various things such as interconnectivity,
bandwidths, software quality etc, where mainly the latter is within our
hands to bring down the fee. And you can actually setup the standard
client to choose a better fee, as all the parameters in the formula are
easily measured!

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2013-11-20 10:01 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-11-13 23:52 [Bitcoin-development] Even simpler minimum fee calculation formula: f > bounty*fork_rate/average_blocksize Gavin Andresen
  -- strict thread matches above, loose matches on Subject: below --
2013-11-13 11:52 Michael Gronager
2013-11-13 12:34 ` Michael Gronager
2013-11-15 10:46   ` Peter Todd
2013-11-13 20:01 ` John Dillon
2013-11-13 20:32   ` Michael Gronager
2013-11-15  9:54   ` Peter Todd
2013-11-15  9:59     ` Gregory Maxwell
2013-11-15 10:47     ` Michael Gronager
2013-11-15 11:12       ` Peter Todd
2013-11-15 11:58         ` Michael Gronager
2013-11-15 19:09           ` Peter Todd
2013-11-15 10:32 ` Peter Todd
2013-11-15 11:47   ` Michael Gronager
2013-11-15 19:19     ` Peter Todd
2013-11-20 10:01       ` Peter Todd

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox