Hi Keagan,

I had a very similar idea. The only difference being for the node to decide
on a range of blocks to keep beforehand, rather than making the decision
block-by-block like you suggest.

I felt the other nodes would be better served by ranges due to the
sequential nature of IBD. Perhaps this would be computationally lighter as
well.

I also encourage you to read Ryosuke Abe's paper [1] that proposes a DHT
scheme to solve this same problem.

Cheers,
Igor

[1] https://arxiv.org/abs/1902.02174

On Fri, 26 Feb 2021 at 21:57, Keagan McClelland via bitcoin-dev <
bitcoin-dev@lists.linuxfoundation.org> wrote:

> Hi all,
>
> I've been thinking for quite some time about the problem of pruned nodes
> and ongoing storage costs for full nodes. One of the things that strikes me
> as odd is that we only really have two settings.
>
> A. Prune everything except the most recent blocks, down to the cache size
> B. Keep everything since genesis
>
> From my observations and conversations with various folks in the
> community, they would like to be able to run a "partially" pruned node to
> help bear the load of bootstrapping other nodes and helping with data
> redundancy in the network, but would prefer to not dedicate hundreds of
> Gigabytes of storage space to the cause.
>
> This led me to the idea that a node could randomly prune some of the
> blocks from history if it passed some predicate. A rough sketch of this
> would look as follows.
>
> 1. At node startup, it would generate a random seed, this would be unique
> to the node but not necessary that it be cryptographically secure.
> 2. In the node configuration it would also carry a "threshold" expressed
> as some percentage of blocks it wanted to keep.
> 3. As IBD occurs, based off of the threshold, the block hash, and the
> node's unique seed, the node would either decide to prune the data or keep
> it. The uniqueness of the node's hash should ensure that no block is
> systematically overrepresented in the set of nodes choosing this storage
> scheme.
> 4. Once the node's IBD is complete it would advertise this as a peer
> service, advertising its seed and threshold, so that nodes could
> deterministically deduce which of its peers had which blocks.
>
> The goals are to increase data redundancy in a way that more uniformly
> shares the load across nodes, alleviating some of the pressure of full
> archive nodes on the IBD problem. I am working on a draft BIP for this
> proposal but figured I would submit it as a high level idea in case anyone
> had any feedback on the initial design before I go into specification
> levels of detail.
>
> If you have thoughts on
>
> A. The protocol design itself
> B. The barriers to put this kind of functionality into Core
>
> I would love to hear from you,
>
> Cheers,
> Keagan
> _______________________________________________
> bitcoin-dev mailing list
> bitcoin-dev@lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
>


-- 
*Igor Cota*
Codex Apertus d.o.o.