Hi Keagan,

I had a very similar idea. The only difference is that the node would
decide on a range of blocks to keep beforehand, rather than making the
decision block by block as you suggest.

I felt the other nodes would be better served by ranges due to the
sequential nature of IBD. Perhaps this would be computationally lighter as
well.

I also encourage you to read Ryosuke Abe's paper [1], which proposes a DHT
scheme to solve this same problem.

Cheers,
Igor

[1] https://arxiv.org/abs/1902.02174

On Fri, 26 Feb 2021 at 21:57, Keagan McClelland via bitcoin-dev
<bitcoin-dev@lists.linuxfoundation.org> wrote:

> Hi all,
>
> I've been thinking for quite some time about the problem of pruned nodes
> and ongoing storage costs for full nodes. One of the things that strikes me
> as odd is that we only really have two settings.
>
> A. Prune everything except the most recent blocks, down to the cache size
> B. Keep everything since genesis
>
> From my observations and conversations with various folks in the
> community, they would like to be able to run a "partially" pruned node to
> help bear the load of bootstrapping other nodes and helping with data
> redundancy in the network, but would prefer to not dedicate hundreds of
> gigabytes of storage space to the cause.
>
> This led me to the idea that a node could randomly prune some of the
> blocks from history if it passed some predicate. A rough sketch of this
> would look as follows.
>
> 1. At node startup, it would generate a random seed; this would be unique
> to the node, but it need not be cryptographically secure.
> 2. In the node configuration it would also carry a "threshold" expressed
> as some percentage of blocks it wanted to keep.
> 3. As IBD occurs, based off of the threshold, the block hash, and the
> node's unique seed, the node would either decide to prune the data or keep
> it. The uniqueness of the node's hash should ensure that no block is
> systematically overrepresented in the set of nodes choosing this storage
> scheme.
> 4. Once the node's IBD is complete it would advertise this as a peer
> service, advertising its seed and threshold, so that nodes could
> deterministically deduce which of its peers had which blocks.
>
> The goals are to increase data redundancy in a way that more uniformly
> shares the load across nodes, alleviating some of the pressure of full
> archive nodes on the IBD problem. I am working on a draft BIP for this
> proposal but figured I would submit it as a high level idea in case anyone
> had any feedback on the initial design before I go into specification
> levels of detail.
>
> If you have thoughts on
>
> A. The protocol design itself
> B. The barriers to put this kind of functionality into Core
>
> I would love to hear from you,
>
> Cheers,
> Keagan

--
*Igor Cota*
Codex Apertus d.o.o.
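
For illustration, steps 1 to 4 of the quoted proposal could be sketched roughly as below. This is only a sketch under assumptions: the thread does not specify a hash function, a seed length, or how the threshold is encoded, so the use of SHA-256, the 16-byte seed, the fractional threshold, and the name `keeps_block` are all hypothetical choices made here.

```python
import hashlib
import os


def keeps_block(seed: bytes, block_hash: bytes, threshold: float) -> bool:
    """Decide whether a node with this seed keeps the given block.

    Hypothetical predicate: hash the seed together with the block hash,
    read the first 8 bytes as an integer in [0, 2**64), and keep the
    block if that integer falls below threshold * 2**64.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be a fraction between 0 and 1")
    digest = hashlib.sha256(seed + block_hash).digest()
    score = int.from_bytes(digest[:8], "big")
    return score < int(threshold * 2**64)


# Step 1: a per-node random seed (assumed 16 bytes here).
seed = os.urandom(16)

# Stand-in block hashes for the example; a real node would use the
# actual block hashes it sees during IBD.
block_hashes = [hashlib.sha256(i.to_bytes(4, "big")).digest() for i in range(1000)]

# Steps 2 and 3: keep roughly a quarter of the blocks.
kept = [h for h in block_hashes if keeps_block(seed, h, 0.25)]

# Step 4: a peer that learns (seed, threshold) can recompute the same set
# by calling keeps_block with the advertised values.
```

Because the predicate depends only on the advertised seed and threshold plus the block hash, any peer can evaluate it locally. A range-based variant would replace the per-block test with a check on block height against a range chosen in advance.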