public inbox for bitcoindev@googlegroups.com
 help / color / mirror / Atom feed
From: Yuval Kogman <nothingmuch@woobling•org>
To: Keagan McClelland <keagan.mcclelland@gmail•com>,
	 Bitcoin Protocol Discussion
	<bitcoin-dev@lists•linuxfoundation.org>
Subject: Re: [bitcoin-dev] A design for Probabilistic Partial Pruning
Date: Sat, 27 Feb 2021 22:09:48 +0000	[thread overview]
Message-ID: <CAAQdECD=G_UV+GBB7365iVdvLduRYcFe_G1i_eZCbLqCVpaO5w@mail.gmail.com> (raw)
In-Reply-To: <CALeFGL1WSSA69ARvJW3di-UC_gz7NV9q7=6zd7s=CHnmttdQFg@mail.gmail.com>

On Fri, 26 Feb 2021 at 20:57, Keagan McClelland via bitcoin-dev
<bitcoin-dev@lists•linuxfoundation.org> wrote:

> The goals are to increase data redundancy in a way that more uniformly shares the load across nodes, alleviating some of the pressure of full archive nodes on the IBD problem. I am working on a draft BIP for this proposal but figured I would submit it as a high level idea in case anyone had any feedback on the initial design before I go into specification levels of detail.

You might be interested in an approach (henceforth "SeF") by Swanand
Kadhe, Jichan Chung and Kannan Ramchandran which employs fountain
codes, presented at Scaling Bitcoin 2019:
https://arxiv.org/abs/1906.12140

From a cursory search it appears there is quite a bit of
related/followup work as well.

The simplest fountain code, the Luby Transform (applied in this work)
the encoder divides a message into smaller chunks, and then constructs
an infinite stream of codewords which are XORs of d randomly selected
chunks where d is sampled from the robust soliton distribution. The
simplest decoder simply XORs new k=1 codewords from the relevant k>1
codewords, and the full message can be recovered with overwhelming
probability and in linear time with a sufficiently large random sample
of codewords from the encoded stream. Note that the decoder must know
which chunks went into a codeword, but this is usually addressed using
pseudorandomness, which has other motivations in an adversarial
setting.

In SeF, the general idea is that "droplet nodes" are pruning nodes
which retain some number (equivalent to your threshold parameter) of
codewords from the encoding concatenated blocks (to obtain a fixed
message size), and serve these to compatible clients. This is more
robust than retaining a random sample of blocks, and also performs
well according to their simulations.

Even if this paper is not directly applicable to your efforts, the
theory of fountain codes should be of interest (many variants exist),
and there is work on fountain codes. There is also some work on
fountain codes in an adversarial setting (Falcon codes) but I'm only
vaguely familiar with it, and if i'm not mistaken most of the
considerations are either already implicitly addressed by the Bitcoin
protocol or explicitly addressed in the SeF paper, whose results also
take into account malicious droplet nodes.


  parent reply	other threads:[~2021-02-27 22:40 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-26 18:40 Keagan McClelland
2021-02-27  7:10 ` Igor Cota
2021-02-28  3:41   ` Leo Wandersleb
2021-03-01  6:22     ` Craig Raw
2021-03-01  9:37       ` eric
2021-03-01 20:55         ` Keagan McClelland
2021-02-27 19:19 ` David A. Harding
2021-02-27 23:37   ` David A. Harding
2021-02-27 22:09 ` Yuval Kogman [this message]
2021-02-27 22:13   ` Yuval Kogman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAAQdECD=G_UV+GBB7365iVdvLduRYcFe_G1i_eZCbLqCVpaO5w@mail.gmail.com' \
    --to=nothingmuch@woobling$(echo .)org \
    --cc=bitcoin-dev@lists$(echo .)linuxfoundation.org \
    --cc=keagan.mcclelland@gmail$(echo .)com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox