On 11/16/2016 06:47 PM, Pieter Wuille wrote:
> On Wed, Nov 16, 2016 at 6:16 PM, Eric Voskuil <eric@voskuil.org> wrote:
> 
>     On 11/16/2016 05:50 PM, Pieter Wuille wrote:
> 
>     > So are checkpoints good now?
>     > I believe we should get rid of checkpoints because they seem to be
>     misunderstood as a security feature rather than as an optimization.
> 
>     Or maybe because they place control of the "true chain" in the hands of
>     those selecting the checkpoints? It's not a great leap for the parties
>     distributing the checkpoints to become the central authority.
> 
> Yes, they can be used to control the "true chain", and this has happened
> with various forks. But developers inevitably have this possibility, if
> you ignore review. If review is good enough to catch unintended
> consensus changes, it is certainly enough to catch the introduction of
> an invalid checkpoint. The risk you point out is real, but the way to
> deal with it is good review and release practices.
> 
> I wish we had never used checkpoints the way we did, but here we are.
> Because of this, I want to get rid of them. However, it's not because I
> think they offer an excessive power to developers - but because they're
> often perceived this way (partially as a result of how they've been
> abused in other systems).
>
>     I recommend users of our node validate the full chain without
>     checkpoints and from that chain select their own checkpoints and place
>     them into config. From that point forward they can apply the
>     optimization. Checkpoints should never be hardcoded into the source.
> 
> Users with the discipline you suggest would be wonderful to have.
> I don't think it's very realistic, though, and I fear that the result
> would be that everyone copies their config from one or a few websites
> "because that's what everyone uses".

Certainly, but embedding them in the code makes that a practical
certainty. People cannot be prevented from doing dumb things, but let's
not make it hard for them to be smart.

>     > I don't think buried softforks have that problem.
> 
>     I find "buried softfork" a curious name as you are using it. You seem to
>     be implying that this type of change is itself a softfork as opposed to
>     a hardfork that changes the activation of a softfork. It was my
>     understanding that the term referred to the 3 softforks that were being
>     "buried", or the proposal, but not the burial itself.


> I do not consider the practice of "buried softforks" to be a fork at
> all. It is a change that modifies the validity of a theoretically
> construable chain from invalid to valid.

I was out at a Bitcoin meetup when I read this and I think beer actually
came out of my nose.

> However, a reorganization to
> that theoretical chain itself is likely already impossible due to the
> vast number of blocks to rewind, and economic damage that is far greater
> than chain divergence itself.

It's either possible or it is not. If it is not, there is no reason for a
proposal - just make the change and don't bother to tell anyone. The
reason we are having this discussion is that it is not impossible.

>     Nevertheless, this proposal shouldn't have "that problem" because it is
>     clearly neither a security feature nor an optimization. That is the
>     first issue that needs to be addressed.
> 
> It is clearly not a security feature, agreed. But how would you propose
> to avoid the ISM checks for BIP34 and BIP66 all the time?

I'll call straw man on the question. It is not important to avoid the
activation checks. The question is whether there is a material
performance optimization in eliminating them. This would have to be
significant enough to rise to the level of a change to the protocol.
Having said that, there are a few options:

1. The naive approach to activation is, for each new block, to query the
store for the previous 1000 block headers (to the extent there are that
many), and just do so forever, summing up after the query. This is the
most straightforward but also the most costly approach.

2. A slightly less costly approach is, for each new block, to reverse
iterate over the store until all decisions can be made. This is an
improvement below activation in that it can take as few as 251 vs. 1000
queries to make the determinations.

3. A further improvement is available by caching the height of full
activation of all three soft forks. Unless there is a subsequent reorg
with a fork point prior to that height, there is never a need to make
another query. Once fully activated the activation height is cached to
the store (otherwise just query the last 1000 versions at startup to
determine the state), eliminating any ongoing material cost.

4. We may also be interested in optimizing initial block download. A
cache of the last 1000 block versions can be maintained by adding each
to a circular buffer as they are committed. This eliminates *all*
querying for block versions unless:

(1) there is a restart prior to full activation - in which case there is
a query of up to 1000 versions to prime the cache.

(2) there is a potential reorg after full activation, and the fork point
precedes the saved full activation height - in which case the cache must
be reprimed.

(3) there is a potential reorg before reaching full activation - in
which case the cache must be backfilled with a query for a number of
versions equal to the depth of the fork point.
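
To make the "251 vs. 1000" figure in option 2 concrete, here is a minimal
sketch of the early-exit reverse scan. It assumes the 750-of-1000
enforcement threshold from the ISM rules; the function name and its
parameters are hypothetical and are not taken from any node's code. The
scan stops as soon as the outcome can no longer change: either enough
blocks signal the version, or too many do not.

```python
from itertools import islice
from typing import Iterable, Tuple

WINDOW = 1000   # sample size named in the message
ENFORCE = 750   # assumed enforcement threshold (750 of the last 1000)


def is_active(versions_newest_first: Iterable[int], min_version: int,
              threshold: int = ENFORCE,
              window: int = WINDOW) -> Tuple[bool, int]:
    """Return (active, queries_used), stopping once the answer is fixed."""
    hits = misses = queries = 0
    max_misses = window - threshold  # 250 misses can still be tolerated
    for version in islice(versions_newest_first, window):
        queries += 1
        if version >= min_version:
            hits += 1
            if hits >= threshold:
                return True, queries    # threshold reached, stop early
        else:
            misses += 1
            if misses > max_misses:
                return False, queries   # threshold is now unreachable
    # Fewer than `window` blocks exist and the threshold was not reached.
    return False, queries
```

With no block signaling the version, the scan decides after 251 queries;
with every block signaling it, after 750. The naive approach of option 1
always costs the full 1000.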

During initial block download potential reorgs are exceedingly rare
(reorgs don't have potential unless they have sufficient work to
overcome the long chain) and the cost of handling them as described
above is trivial. The cost of priming the cache is immaterial in the
context of a restart.

So even with a full chain validation one is not likely to *ever* need to
query the store. The memory cost of the cache is strictly 3 bits per
block (375 bytes total). A simpler, less memory-sensitive approach is to
use one byte (1,000 bytes total). The computational cost is trivial.
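
A rough sketch of the one-byte-per-block cache from option 4 follows. The
class and method names are hypothetical, and clamping each version into a
single byte is my assumption (it is enough to compare against versions 2,
3 and 4). push() covers normal commits, pop() plus backfill() cover the
reorg case (3), and priming after a restart is a backfill() into an empty
cache.

```python
class VersionCache:
    """Ring buffer of the versions of the most recent `window` blocks."""

    def __init__(self, window: int = 1000) -> None:
        self._ring = bytearray(window)  # one byte per block
        self._start = 0                 # index of the oldest entry
        self._count = 0                 # number of valid entries

    def __len__(self) -> int:
        return self._count

    @staticmethod
    def _clamp(version: int) -> int:
        # Assumption: only "version >= 2, 3 or 4" matters, so one byte suffices.
        return max(0, min(version, 255))

    def push(self, version: int) -> None:
        """Record a newly committed block, evicting the oldest when full."""
        window = len(self._ring)
        if self._count < window:
            self._ring[(self._start + self._count) % window] = self._clamp(version)
            self._count += 1
        else:
            self._ring[self._start] = self._clamp(version)
            self._start = (self._start + 1) % window

    def pop(self, depth: int) -> None:
        """Drop the newest `depth` entries when blocks are disconnected."""
        self._count -= min(depth, self._count)

    def backfill(self, older_newest_first) -> None:
        """Restore versions queried from the store behind the oldest entry."""
        window = len(self._ring)
        for version in older_newest_first:
            if self._count == window:
                break
            self._start = (self._start - 1) % window
            self._ring[self._start] = self._clamp(version)
            self._count += 1

    def count_at_least(self, min_version: int) -> int:
        """Count cached blocks whose version is at least `min_version`."""
        window = len(self._ring)
        return sum(
            1 for i in range(self._count)
            if self._ring[(self._start + i) % window] >= min_version
        )
```

The count is a linear pass over at most 1000 bytes per block, which is
negligible next to script validation. Packing to 3 bits per block would
reduce the buffer to 375 bytes at the cost of some bit twiddling.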

This should already be implemented. A protocol fork (or "change that
modifies the validity of a theoretically construable chain from invalid
to valid") to avoid doing so is not a performance optimization.

> I feel this
> approach is a perfectly reasonable choice for code that likely won't
> ever affect the valid chain again.

I find it to be completely unsupportable as there is no security,
performance, or feature benefit in it.

e