Re: [Bitcoin-development] "network disruption as a service" and proof of local storage

public inbox for bitcoindev@googlegroups.com

* Re: [Bitcoin-development] "network disruption as a service" and proof of local storage
@ 2015-03-23 10:06 Thy Shizzle
  0 siblings, 0 replies; 13+ messages in thread
From: Thy Shizzle @ 2015-03-23 10:06 UTC (permalink / raw)
  To: Sergio Lerner, bitcoin-development

[-- Attachment #1: Type: text/plain, Size: 7166 bytes --]

Wow, that's quite impressive. But what comes to my mind is whether such an extravagant solution really needs to be implemented for proof of storage. My idea whilst building my node was to do something along the lines of "tell me what you got", i.e. get the block height from the version message, then fire off your getblocks, getdata etc., and if a node does not respond with the requested data after a few attempts, simply disconnect and perhaps blacklist it until the node restarts or something. I am of course looking at it purely from the perspective of useless nodes consuming connection slots; there may be other scenarios where you require proof of storage that I am not considering. I just think that simple blacklist rules could easily avoid this without the extra resource usage. If you start doing encryption for every task, then before you know it you need to dedicate all your CPU to the node! That goes especially for tasks that are not mission critical or do not require verification. Useless nodes are more of an annoyance with the potential to disrupt the network and slow it down, but not to compromise it, so I shouldn't think this is something you would turn to encryption for, right? That's how I feel, anyway.
________________________________
From: Sergio Lerner<mailto:sergiolerner@certimix•com>
Sent: ‎17/‎03/‎2015 3:45 AM
To: bitcoin-development@lists.sourceforge.net<mailto:bitcoin-development@lists•sourceforge.net>
Subject: [Bitcoin-development] "network disruption as a service" and proof of local storage

The problem of pseudo-nodes will come up over and over. The cat and mouse
chase is just beginning.
It has been discussed several times that the easiest solution would be to
require some kind of resource consumption from each peer before it is
allowed to connect to other peers.
Gmaxwell proposed Proof of Storage here:
https://bitcointalk.org/index.php?topic=310323.msg3332919#msg3332919

I proposed what I think is a better protocol for Proof of Storage, which
I call "Proof of Local Storage", here:
https://bitslog.wordpress.com/2014/11/03/proof-of-local-blockchain-storage/
It's better because it does not need the storage of additional data,
but more importantly, it allows you to prove that a full copy of the
blockchain is being maintained by the peer.
This is especially important now that Bitnodes is trying a full-node
incentive program that may be easily cheated
(http://qntra.net/2015/02/pseudonode-proxy-fools-bitcoin-full-node-incentive-program/)

Proof of local storage allows a node to prove to another peer that it is
storing a LOCAL copy of a PUBLIC file, such as the blockchain. So the
peer need not waste more resources (well, just some resources to
encode/decode the block-chain).
The main idea is to use what I called asymmetric-time-encoding.
Basically you encode the block-chain in a way that takes 100 times
longer to write than to read. Since the block-chain is an
append-only (write-once) file, this fits our needs well. For instance
(and as a simplification), choose a global 1024-bit prime p, then
split the block-chain into 1024-bit blocks, and encrypt each block
using Pohlig-Hellman (modexp) with decryption exponent 3. Then
encryption is at least 100 times slower than decryption. Before P.H.
encryption each node must xor each block with a pseudo-random mask
derived from its public IP and the block index. So block encryption
could be:
BlockEncryptIndex(i) = E(IP+i,block(i))^inv(3) (mod p),

where inv(3) is 3^-1 mod (p-1). E() could be a fast tweaked encryption
routine (tweak = index), but we only need the PRNG properties of E() and
that E() does not share algebraic properties with P.H.
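
A minimal sketch of the encode/decode pair described above, under loud
assumptions: a 448-bit prime (the Curve448 field prime) stands in for the
global 1024-bit prime, a SHA-512-derived xor mask stands in for E(), and the
names encode_chunk/decode_chunk are hypothetical. It is meant only to show
why one direction needs a full-size exponent while the other needs a cube.

```python
import hashlib

# Assumption: toy prime in place of the 1024-bit p. It satisfies P % 3 == 2,
# so 3 is invertible modulo P - 1, which the scheme requires.
P = 2**448 - 2**224 - 1
CHUNK_BYTES = 55                 # 440 bits, so every chunk value is < P
ENC_EXP = pow(3, -1, P - 1)      # inv(3): full-size exponent, slow to apply
DEC_EXP = 3                      # tiny exponent, fast to apply


def mask(ip: str, index: int) -> int:
    """Stand-in for E(): a hash-derived mask from the IP and chunk index."""
    digest = hashlib.sha512(f"{ip}|{index}".encode()).digest()
    return int.from_bytes(digest[:CHUNK_BYTES], "big")


def encode_chunk(ip: str, index: int, chunk: bytes) -> int:
    """Slow direction: mask the chunk, then raise it to inv(3) mod P."""
    assert len(chunk) == CHUNK_BYTES
    masked = int.from_bytes(chunk, "big") ^ mask(ip, index)
    return pow(masked, ENC_EXP, P)


def decode_chunk(ip: str, index: int, encoded: int) -> bytes:
    """Fast direction: cube mod P, then remove the mask."""
    masked = pow(encoded, DEC_EXP, P)
    return (masked ^ mask(ip, index)).to_bytes(CHUNK_BYTES, "big")
```

Decoding costs two modular multiplications per chunk, while encoding costs a
modular exponentiation with an exponent about as large as the prime; that gap
is the asymmetry the proposal relies on.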

Two protocols can be performed to prove local possession:
1. (prover and verifier pay a small cost) The verifier sends a seed to
derive some n random indexes, and the prover must respond with the hash
of the decrypted blocks within a certain time bound. Suppose that
decryption of n blocks takes 100 msec (+-100 msec of network jitter).
Then an attacker must have a computer 50 times faster to be able to
consistently cheat. The last 50 blocks should not be part of the list, to
allow nodes to catch up and encrypt the blocks in the background.

2. (prover pays a high cost, verifier pays negligible cost) The verifier
chooses a seed, and then pre-computes the encrypted blocks derived
from the seed using the prover's IP. Then the verifier sends the seed,
and the prover must respond with the hash of the encrypted blocks within
a certain time bound. The prover is not required to do any P.H.
decryption; it just takes the encrypted blocks for the indexes derived
from the seed, hashes them and sends the hash back to the verifier. The
verifier validates the time bound and the hash.
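
The challenge-response flow of protocol 2 could be sketched roughly as
follows. Everything here is an illustration, not part of the proposal: the
counter-based index derivation, the function names, and the skip_last
parameter (counted in chunks rather than Bitcoin blocks) are all assumptions.

```python
import hashlib


def derive_indexes(seed: bytes, n: int, total: int, skip_last: int = 50) -> list[int]:
    """Derive n pseudo-random chunk indexes from a seed, excluding the newest ones."""
    usable = total - skip_last
    if usable <= 0:
        raise ValueError("not enough chunks to challenge")
    indexes = []
    for counter in range(n):
        digest = hashlib.sha256(seed + counter.to_bytes(4, "big")).digest()
        indexes.append(int.from_bytes(digest, "big") % usable)
    return indexes


def response(seed: bytes, encoded_chunks: list[bytes], n: int) -> bytes:
    """Prover side: hash the locally stored encoded chunks picked by the seed."""
    h = hashlib.sha256(seed)
    for i in derive_indexes(seed, n, len(encoded_chunks)):
        h.update(encoded_chunks[i])
    return h.digest()


def verify(seed: bytes, n: int, total: int, precomputed: dict[int, bytes],
           answer: bytes, elapsed_s: float, bound_s: float) -> bool:
    """Verifier side: compare against chunks it encoded itself, and check timing."""
    h = hashlib.sha256(seed)
    for i in derive_indexes(seed, n, total):
        h.update(precomputed[i])
    return elapsed_s <= bound_s and h.digest() == answer
```

The verifier only needs the encoded form of the n selected chunks, which it
computes before sending the seed; the prover does no modexp at all if it
really holds the encoded chain.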

Both protocols can be made available by the client, under different
states. For instance, new nodes are only allowed to request protocol 2
(and so they get an initial assurance that they are connecting to
full nodes). After a first-time mutual authentication, they are allowed
to periodically perform protocol 1. Also, new nodes may be allowed to
perform protocol 1 with a small index set, and increase the index set
over time, to get higher confidence.

The important difference between this protocol and classical remote
software attestation protocols is that the time gap between a good peer
and a malicious peer can be made arbitrarily large by picking a larger p.
Maybe there is even another crypto primitive which is more asymmetric
than exponent 3 decryption (the LUC or NTRU cryptosystem?).

In GMaxwell's proposal each peer builds a table for each other peer. In my
proposal, each peer builds a single table (the encrypted blockchain), so
it could still be possible to establish thousands of connections to
the network from a single peer. Nevertheless, the attacker's IP will be
easily detected (he cannot hide behind a thousand different IPs). It's
also possible to restrict the challenge-response to a portion of the
block-chain, the portion offset being derived from the hash of both IP
addresses and one random number provided by each peer. Suppose each
connection has a C-R space equivalent to 1% of the block-chain. Then
having 100 connections and responding to C-R on each connection means
storing approximately 1 copy of the block-chain (there may be overlaps,
which would need to be stored twice), while having 1K connections would
require storing 10 copies of the blockchain.
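
As a rough illustration of the per-connection window and the storage
estimate above, here is a small sketch. The hashing layout, the names
window_offset and copies_needed, and the idea of counting the chain in
chunks are assumptions made for the example only.

```python
import hashlib


def window_offset(ip_a: str, ip_b: str, nonce_a: bytes, nonce_b: bytes,
                  total_chunks: int, fraction: float = 0.01) -> tuple[int, int]:
    """Return (start, length) of the challenge window for one connection."""
    length = max(1, int(total_chunks * fraction))
    material = ip_a.encode() + b"|" + ip_b.encode() + b"|" + nonce_a + nonce_b
    digest = hashlib.sha256(material).digest()
    start = int.from_bytes(digest, "big") % (total_chunks - length + 1)
    return start, length


def copies_needed(connections: int, fraction: float = 0.01) -> float:
    """Estimated full-chain equivalents stored when each connection has its own window."""
    return connections * fraction


print(copies_needed(100))    # about 1 copy for 100 connections
print(copies_needed(1000))   # about 10 copies for 1K connections
```

Because neither peer controls the other's random number, neither side can
steer the window toward a portion of the chain it happens to hold.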

Best regards,
 Sergio

------------------------------------------------------------------------------
Dive into the World of Parallel Programming The Go Parallel Website, sponsored
by Intel and developed in partnership with Slashdot Media, is your hub for all
things parallel software development, from weekly thought leadership blogs to
news, videos, case studies, tutorials and more. Take a look and join the
conversation now. http://goparallel.sourceforge.net/
_______________________________________________
Bitcoin-development mailing list
Bitcoin-development@lists•sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bitcoin-development


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Bitcoin-development] "network disruption as a service" and proof of local storage
  2015-03-27 18:40                 ` Jeremy Spilman
@ 2015-04-01  2:34                   ` Sergio Lerner
  0 siblings, 0 replies; 13+ messages in thread
From: Sergio Lerner @ 2015-04-01  2:34 UTC (permalink / raw)
  To: Jeremy Spilman, bitcoin-development

Matt is right: the goal is to prove digital copies of a public file.
Nothing more, nothing less.

Regarding the IP, I don't claim that every machine should provide the
protocol. Mobile phones shouldn't. But machines that want to be
prioritized in some way or that want to be rewarded for hosting a node
should use a fixed IP. That's the cost of prioritization/reward. The
protocol could be a service bit, advertised in the version message.

My response to your comment below:

On 27/03/2015 03:40 p.m., Jeremy Spilman wrote:
>
> It would be extremely impressive to achieve a reliable mechanism for discerning a local copy exists under these constraints, particularly without false positives and false negatives, and without imposing very substantial one-time encoding costs, e.g. on par with doubling the verification cost. 
I see it differently. The asymmetric-time protocol is quite reliable. It
can be made to have almost no false positives/false negatives (not
considering rare communication problems, such as congestion and packet
loss for more than 5 seconds).
These are my back-of-the-envelope calculations:
Bitcoind takes approximately 1 second to serve a 1 MB block (seek time,
but mostly transfer time).
Then decryption of a block can take 150 msec without problem (15%
overhead). The last N blocks could be cached so they don't need to be
decrypted to be sent.
In 150 msec a PC can decrypt 1 MB of data split over 1024-bit blocks
decrypted by modexp 3 (0.2 msec for 3 bigint multiplications), so a full
block can be decrypted.
Encrypting such a block would take approximately 15 seconds (which is much
less than the 10 minutes available to encrypt each block).
Then the protocol works with a security margin of approximately 50x.
A communication problem lasting 5 seconds would be needed to disturb a
protocol that takes 100 msec for the prover.
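
The figures above can be restated as a short calculation. This only
rearranges the quoted totals and measures nothing; reading "1 MB" as 1 MiB
is an assumption. Note that the per-chunk budget implied by the 150 msec
total (about 0.018 msec) is smaller than the 0.2 msec figure quoted above,
so the numbers are best read as order-of-magnitude estimates.

```python
BLOCK_BYTES = 2**20            # assumption: "1 MB" taken as 1 MiB
CHUNK_BYTES = 1024 // 8        # one 1024-bit chunk is 128 bytes
DECRYPT_BUDGET_S = 0.150       # decryption time allowed per block
ENCRYPT_TIME_S = 15.0          # quoted encryption time per block
BLOCK_INTERVAL_S = 600.0       # 10 minutes between blocks

chunks_per_block = BLOCK_BYTES // CHUNK_BYTES
per_chunk_budget_ms = DECRYPT_BUDGET_S * 1000 / chunks_per_block
encrypt_to_decrypt = ENCRYPT_TIME_S / DECRYPT_BUDGET_S
encrypt_share_of_interval = ENCRYPT_TIME_S / BLOCK_INTERVAL_S

print(f"chunks per block:        {chunks_per_block}")
print(f"per-chunk budget (ms):   {per_chunk_budget_ms:.4f}")
print(f"encrypt/decrypt ratio:   {encrypt_to_decrypt:.0f}x")
print(f"share of block interval: {encrypt_share_of_interval:.1%}")
```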

Regards,
 Sergio.





^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Bitcoin-development] "network disruption as a service" and proof of local storage
@ 2015-03-28  2:55 Thy Shizzle
  0 siblings, 0 replies; 13+ messages in thread
From: Thy Shizzle @ 2015-03-28  2:55 UTC (permalink / raw)
  To: Robert McKay, Matt Whitlock; +Cc: bitcoin-development

[-- Attachment #1: Type: text/plain, Size: 10526 bytes --]

If IP discovery is your main motivation, why don't you introduce some onion routing into transactions? That would solve this problem easily. Of course there is an overhead which will slightly slow down the relay of transactions, but not significantly. It could also be made an option rather than enforced, for those worried about IP association.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Bitcoin-development] "network disruption as a service" and proof of local storage
  2015-03-27 15:16               ` Matt Whitlock
  2015-03-27 15:32                 ` Robert McKay
       [not found]                 ` <20150327155730.GB20754@amethyst.visucore.com>
@ 2015-03-27 18:40                 ` Jeremy Spilman
  2015-04-01  2:34                   ` Sergio Lerner
  2 siblings, 1 reply; 13+ messages in thread
From: Jeremy Spilman @ 2015-03-27 18:40 UTC (permalink / raw)
  To: Matt Whitlock; +Cc: bitcoin-development

> On Mar 27, 2015, at 8:16 AM, Matt Whitlock <bip@mattwhitlock•name> wrote:
> 
> Isn't the goal of this exercise to ensure more full nodes on the network?

Basically we're talking about a form of Sybil defense and better quantifying true blockchain resiliency by proof of storage.

In this case the goal is to see if we can prove the number of distinct digital copies of the blockchain. This is actually a tricky problem because it will (always?) devolve to inferences from response timing, and we are running over a heterogeneous network with heterogeneous machines.

It would be extremely impressive to achieve a reliable mechanism for discerning a local copy exists under these constraints, particularly without false positives and false negatives, and without imposing very substantial one-time encoding costs, e.g. on par with doubling the verification cost.

I think that while it's a difficult cost-benefit analysis, even with code complexity aside, it's interesting to discuss all the same!

Simply having many unique IP addresses possibly accessing the same unique copy provides a different (if any) benefit. E.g. Tor uses IPs as a cost factor, but (until recently?) didn't even factor in things like them all being the same Class C. 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Bitcoin-development] "network disruption as a service" and proof of local storage
       [not found]                 ` <20150327155730.GB20754@amethyst.visucore.com>
  2015-03-27 16:00                   ` Matt Whitlock
@ 2015-03-27 16:08                   ` Matt Whitlock
  1 sibling, 0 replies; 13+ messages in thread
From: Matt Whitlock @ 2015-03-27 16:08 UTC (permalink / raw)
  To: bitcoin-development

On Friday, 27 March 2015, at 4:57 pm, Wladimir J. van der Laan wrote:
> On Fri, Mar 27, 2015 at 11:16:43AM -0400, Matt Whitlock wrote:
> > I agree that someone could do this, but why is that a problem? Isn't the goal of this exercise to ensure more full nodes on the network? In order to be able to answer the challenges, an entity would need to be running a full node somewhere. Thus, they have contributed at least one additional full node to the network. I could certainly see a case for a company to host hundreds of lightweight (e.g., EC2) servers all backed by a single copy of the block chain. Why force every single machine to have its own copy? All you really need to require is that each agency/participant have its own copy.
> 
> They would not even have to run one. It could just pass the query to a random other node, and forward its result :)

Ah, easy way to fix that. In fact, in my first draft of my suggestion, I had the answer, but I removed it because I thought it was superfluous.

Challenge:
"Send me: SHA256(SHA256(concatenation of N pseudo-randomly selected bytes from the block chain | prover's nonce | verifier's nonce))."

The nonces are from the "version" messages exchanged at connection startup. A node can't pass the buck because it can't control the nonce that a random other node chooses.
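
A minimal sketch of the challenge above, assuming the block chain is
available as one byte string and that the verifier sends an integer seed
from which both sides derive the byte offsets. The use of Python's random
module for offset selection is a simplification for illustration (it is
deterministic but not cryptographically secure), and the function name is
hypothetical.

```python
import hashlib
import random


def challenge_response(chain: bytes, seed: int, n: int,
                       prover_nonce: bytes, verifier_nonce: bytes) -> bytes:
    """Double-SHA256 over n seed-selected bytes followed by both nonces."""
    rng = random.Random(seed)   # same seed gives the same offsets on both sides
    picked = bytes(chain[rng.randrange(len(chain))] for _ in range(n))
    inner = hashlib.sha256(picked + prover_nonce + verifier_nonce).digest()
    return hashlib.sha256(inner).digest()


# Usage: the verifier computes the same value from its own copy and compares.
chain = bytes(range(256)) * 64
answer = challenge_response(chain, seed=42, n=1024,
                            prover_nonce=b"\x01" * 8, verifier_nonce=b"\x02" * 8)
print(answer.hex())
```

Since the nonces are mixed into the hash, an answer computed by some other
node for a different connection will not match.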



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Bitcoin-development] "network disruption as a service" and proof of local storage
       [not found]                 ` <20150327155730.GB20754@amethyst.visucore.com>
@ 2015-03-27 16:00                   ` Matt Whitlock
  2015-03-27 16:08                   ` Matt Whitlock
  1 sibling, 0 replies; 13+ messages in thread
From: Matt Whitlock @ 2015-03-27 16:00 UTC (permalink / raw)
  To: bitcoin-development

On Friday, 27 March 2015, at 4:57 pm, Wladimir J. van der Laan wrote:
> On Fri, Mar 27, 2015 at 11:16:43AM -0400, Matt Whitlock wrote:
> > I agree that someone could do this, but why is that a problem? Isn't the goal of this exercise to ensure more full nodes on the network? In order to be able to answer the challenges, an entity would need to be running a full node somewhere. Thus, they have contributed at least one additional full node to the network. I could certainly see a case for a company to host hundreds of lightweight (e.g., EC2) servers all backed by a single copy of the block chain. Why force every single machine to have its own copy? All you really need to require is that each agency/participant have its own copy.
> 
> They would not even have to run one. It could just pass the query to a random other node, and forward its result :)

D'oh. Of course. Thanks. :/

The suggestion about encrypting blocks with a key tied to IP address seems like a bad idea, though. Lots of nodes are on dynamic IP addresses. It wouldn't really be practical to re-encrypt the entire block chain every time a node's IP address changes.



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Bitcoin-development] "network disruption as a service" and proof of local storage
  2015-03-27 15:16               ` Matt Whitlock
@ 2015-03-27 15:32                 ` Robert McKay
       [not found]                 ` <20150327155730.GB20754@amethyst.visucore.com>
  2015-03-27 18:40                 ` Jeremy Spilman
  2 siblings, 0 replies; 13+ messages in thread
From: Robert McKay @ 2015-03-27 15:32 UTC (permalink / raw)
  To: Matt Whitlock; +Cc: bitcoin-development

The main motivation is to try and stop a single entity running lots of 
nodes in order to harvest transaction origin IPs. That's what's behind 
this.

Probably the efforts are a waste of time.. if someone has to keep a few 
hundred copies of the blockchain around in order to keep IP specific 
precomputed data around for all the IPs they listen on then they'll just 
buy a handful of 5TB HDs and call it a day.. still some of the ideas 
proposed are quite interesting and might not have much downside.

Rob


On 2015-03-27 15:16, Matt Whitlock wrote:
> I agree that someone could do this, but why is that a problem? Isn't
> the goal of this exercise to ensure more full nodes on the network? In
> order to be able to answer the challenges, an entity would need to be
> running a full node somewhere. Thus, they have contributed at least
> one additional full node to the network. I could certainly see a case
> for a company to host hundreds of lightweight (e.g., EC2) servers all
> backed by a single copy of the block chain. Why force every single
> machine to have its own copy? All you really need to require is that
> each agency/participant have its own copy.
>
>
> On Friday, 27 March 2015, at 2:32 pm, Robert McKay wrote:
>> Basically the problem with that is that someone could set up a single
>> full node that has the blockchain and can answer those challenges, and
>> then a bunch of other non-full nodes that just proxy any such
>> challenges to the single full node.
>>
>> Rob
>>
>> On 2015-03-26 23:04, Matt Whitlock wrote:
>> > Maybe I'm overlooking something, but I've been watching this thread
>> > with increasing skepticism at the complexity of the offered solution.
>> > I don't understand why it needs to be so complex. I'd like to offer
>> > an alternative for your consideration...
>> >
>> > Challenge:
>> > "Send me: SHA256(SHA256(concatenation of N pseudo-randomly selected
>> > bytes from the block chain))."
>> >
>> > Choose N such that it would be infeasible for the responding node to
>> > fetch all of the needed blocks in a short amount of time. In other
>> > words, assume that a node can seek to a given byte in a block stored
>> > on local disk much faster than it can download the entire block from
>> > a remote peer. This is almost certainly a safe assumption.
>> >
>> > For example, choose N = 1024. Then the proving node needs to perform
>> > 1024 random reads from local disk. On spinning media, this is likely
>> > to take somewhere on the order of 15 seconds. Assuming blocks are
>> > averaging 500 KiB each, then 1024 blocks would comprise 500 MiB of
>> > data. Can 500 MiB be downloaded in 15 seconds? This data transfer
>> > rate is 280 Mbps. Almost certainly not possible. And if it is, just
>> > increase N. The challenge also becomes more difficult as average
>> > block size increases.
>> >
>> > This challenge-response protocol relies on the lack of a "partial
>> > getdata" command in the Bitcoin protocol: a node cannot ask for only
>> > part of a block; it must ask for an entire block. Furthermore, nodes
>> > could ban other nodes for making too many random requests for blocks.
>> >
>> >
>> > On Thursday, 26 March 2015, at 7:09 pm, Sergio Lerner wrote:
>> >>
>> >> > If I understand correctly, transforming raw blocks to keyed 
>> blocks
>> >> > takes 512x longer than transforming keyed blocks back to raw. 
>> The
>> >> key
>> >> > is public, like the IP, or some other value which perhaps 
>> changes
>> >> less
>> >> > frequently.
>> >> >
>> >> Yes. I was thinking that the IP could be part of a first layer of
>> >> encryption done to the blockchain data prior to the asymetric
>> >> operation.
>> >> That way the asymmetric operation can be the same for all users 
>> (no
>> >> different primers for different IPs, and then the verifiers does 
>> not
>> >> have to verify that a particular p is actually a pseudo-prime
>> >> suitable
>> >> for P.H. ) and the public exponent can be just 3.
>> >>
>> >> >
>> >> >> Two protocols can be performed to prove local possession:
>> >> >> 1. (prover and verifier pay a small cost) The verifier sends a
>> >> seed to
>> >> >> derive some n random indexes, and the prover must respond with
>> >> the hash
>> >> >> of the decrypted blocks within a certain time bound. Suppose 
>> that
>> >> >> decryption of n blocks take 100 msec (+-100 msec of network
>> >> jitter).
>> >> >> Then an attacker must have a computer 50 faster to be able to
>> >> >> consistently cheat. The last 50 blocks should not be part of 
>> the
>> >> list to
>> >> >> allow nodes to catch-up and encrypt the blocks in background.
>> >> >>
>> >> >
>> >> > Can you clarify, the prover is hashing random blocks of
>> >> *decrypted*,
>> >> > as-in raw, blockchain data? What does this prove other than,
>> >> perhaps,
>> >> > fast random IO of the blockchain? (which is useful in its own
>> >> right,
>> >> > e.g. as a way to ensure only full-node IO-bound mining if baked
>> >> into
>> >> > the PoW)
>> >> >
>> >> > How is the verifier validating the response without possession 
>> of
>> >> the
>> >> > full blockchain?
>> >>
>> >> You're right, it is incorrect. It is not the decrypted blocks that
>> >> must be sent, but the encrypted blocks. The correct protocol is
>> >> this:
>> >>
>> >> 1. (prover and verifier pay a small cost) The verifier sends a seed
>> >> to derive some n random indexes, and the prover must respond with
>> >> the encrypted blocks within a certain time bound. The verifier
>> >> decrypts those blocks to check if they are part of the block-chain.
>> >>
>> >> But then there is this improvement, which allows the verifier to
>> >> detect non-full nodes with much less computation:
>> >>
>> >> 3. (prover pays a small cost, verifier a smaller cost) The verifier
>> >> asks the prover to send the Merkle tree root of the hashes of
>> >> encrypted blocks at N indexes selected by a pseudo-random function
>> >> seeded by a challenge value, where each encrypted block is prefixed
>> >> with the seed before being hashed (e.g. N = 100). The verifier
>> >> receives the Merkle root and performs a statistical test on the
>> >> received information. From the N hashed blocks, it chooses M < N
>> >> (e.g. M = 20), and asks the prover for the blocks at these indexes.
>> >> The prover sends the blocks; the verifier validates the blocks by
>> >> decrypting them and also verifies that the Merkle tree was well
>> >> constructed for those block nodes. This proves with high
>> >> probability that the Merkle tree was built on-the-fly and
>> >> specifically for this challenge-response protocol.
>> >>
>> >> > I also wonder about the effect of spinning disk versus SSD. Seek
>> >> > time for 1,000 random reads is either nearly zero or dominating
>> >> > depending on the two modes. I wonder if a sequential read from a
>> >> > random index is a possible trade-off; it doesn't prove possession
>> >> > of the whole chain nearly as well, but at least iowait converges
>> >> > significantly. Then again, that presupposes a specific ordering
>> >> > on disk which might not exist. In X years it will all be
>> >> > solid-state, so eventually it's moot.
>> >> >
>> >> Good idea.
>> >>
>> >> Also, not every node needs to implement the protocol, only the
>> >> nodes that want to prove full-node-ness, such as the ones that want
>> >> to receive the bitnodes subsidy.
>> >
>> >
>> >
>> > 
>> ------------------------------------------------------------------------------
>> > Dive into the World of Parallel Programming The Go Parallel 
>> Website,
>> > sponsored
>> > by Intel and developed in partnership with Slashdot Media, is your
>> > hub for all
>> > things parallel software development, from weekly thought 
>> leadership
>> > blogs to
>> > news, videos, case studies, tutorials and more. Take a look and 
>> join
>> > the
>> > conversation now. http://goparallel.sourceforge.net/
>> > _______________________________________________
>> > Bitcoin-development mailing list
>> > Bitcoin-development@lists•sourceforge.net
>> > https://lists.sourceforge.net/lists/listinfo/bitcoin-development




^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Bitcoin-development] "network disruption as a service" and proof of local storage
  2015-03-27 14:32             ` Robert McKay
@ 2015-03-27 15:16               ` Matt Whitlock
  2015-03-27 15:32                 ` Robert McKay
                                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Matt Whitlock @ 2015-03-27 15:16 UTC (permalink / raw)
  To: bitcoin-development

I agree that someone could do this, but why is that a problem? Isn't the goal of this exercise to ensure more full nodes on the network? In order to be able to answer the challenges, an entity would need to be running a full node somewhere. Thus, they have contributed at least one additional full node to the network. I could certainly see a case for a company to host hundreds of lightweight (e.g., EC2) servers all backed by a single copy of the block chain. Why force every single machine to have its own copy? All you really need to require is that each agency/participant have its own copy.


On Friday, 27 March 2015, at 2:32 pm, Robert McKay wrote:
> Basically the problem with that is that someone could set up a single 
> full node that has the blockchain and can answer those challenges and 
> then a bunch of other non-full nodes that just proxy any such challenges 
> to the single full node.
> 
> Rob



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Bitcoin-development] "network disruption as a service" and proof of local storage
  2015-03-26 23:04           ` Matt Whitlock
@ 2015-03-27 14:32             ` Robert McKay
  2015-03-27 15:16               ` Matt Whitlock
  0 siblings, 1 reply; 13+ messages in thread
From: Robert McKay @ 2015-03-27 14:32 UTC (permalink / raw)
  To: bitcoin-development

Basically the problem with that is that someone could set up a single 
full node that has the blockchain and can answer those challenges and 
then a bunch of other non-full nodes that just proxy any such challenges 
to the single full node.

Rob

On 2015-03-26 23:04, Matt Whitlock wrote:
> Maybe I'm overlooking something, but I've been watching this thread
> with increasing skepticism at the complexity of the offered solution.
> I don't understand why it needs to be so complex. I'd like to offer an
> alternative for your consideration...
>
> Challenge:
> "Send me: SHA256(SHA256(concatenation of N pseudo-randomly selected
> bytes from the block chain))."
>
> Choose N such that it would be infeasible for the responding node to
> fetch all of the needed blocks in a short amount of time. In other
> words, assume that a node can seek to a given byte in a block stored
> on local disk much faster than it can download the entire block from a
> remote peer. This is almost certainly a safe assumption.
>
> For example, choose N = 1024. Then the proving node needs to perform
> 1024 random reads from local disk. On spinning media, this is likely
> to take somewhere on the order of 15 seconds. Assuming blocks are
> averaging 500 KiB each, then 1024 blocks would comprise 500 MiB of
> data. Can 500 MiB be downloaded in 15 seconds? This data transfer rate
> is 280 Mbps. Almost certainly not possible. And if it is, just
> increase N. The challenge also becomes more difficult as average block
> size increases.
>
> This challenge-response protocol relies on the lack of a "partial
> getdata" command in the Bitcoin protocol: a node cannot ask for only
> part of a block; it must ask for an entire block. Furthermore, nodes
> could ban other nodes for making too many random requests for blocks.




^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Bitcoin-development] "network disruption as a service" and proof of local storage
  2015-03-26 22:09         ` Sergio Lerner
@ 2015-03-26 23:04           ` Matt Whitlock
  2015-03-27 14:32             ` Robert McKay
  0 siblings, 1 reply; 13+ messages in thread
From: Matt Whitlock @ 2015-03-26 23:04 UTC (permalink / raw)
  To: bitcoin-development

Maybe I'm overlooking something, but I've been watching this thread with increasing skepticism at the complexity of the offered solution. I don't understand why it needs to be so complex. I'd like to offer an alternative for your consideration...

Challenge:
"Send me: SHA256(SHA256(concatenation of N pseudo-randomly selected bytes from the block chain))."

Choose N such that it would be infeasible for the responding node to fetch all of the needed blocks in a short amount of time. In other words, assume that a node can seek to a given byte in a block stored on local disk much faster than it can download the entire block from a remote peer. This is almost certainly a safe assumption.
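
A minimal sketch of this challenge in Python, assuming the block chain is available as a single byte string and that the N offsets are derived from a verifier-supplied seed by hashing a counter. The seed, the derivation rule, and the function names are illustrative assumptions; the proposal only fixes the double-SHA256 over the selected bytes.

```python
import hashlib

def challenge_offsets(seed: bytes, n: int, chain_len: int) -> list:
    """Derive n pseudo-random byte offsets from the seed (assumed rule:
    SHA256(seed || counter) reduced modulo the chain length)."""
    offsets = []
    for counter in range(n):
        digest = hashlib.sha256(seed + counter.to_bytes(4, "big")).digest()
        offsets.append(int.from_bytes(digest, "big") % chain_len)
    return offsets

def respond(chain: bytes, seed: bytes, n: int) -> bytes:
    """SHA256(SHA256(concatenation of the n selected bytes))."""
    picked = bytes(chain[i] for i in challenge_offsets(seed, n, len(chain)))
    return hashlib.sha256(hashlib.sha256(picked).digest()).digest()

# Stand-in data; a real node would read these bytes from its block files.
toy_chain = bytes(range(256)) * 64
answer = respond(toy_chain, b"verifier-seed", 1024)
```

A verifier that holds the same data recomputes `respond` with its own seed and compares the 32-byte result, so the challenge costs only a few bytes on the wire in each direction.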

For example, choose N = 1024. Then the proving node needs to perform 1024 random reads from local disk. On spinning media, this is likely to take somewhere on the order of 15 seconds. Assuming blocks are averaging 500 KiB each, then 1024 blocks would comprise 500 MiB of data. Can 500 MiB be downloaded in 15 seconds? This data transfer rate is 280 Mbps. Almost certainly not possible. And if it is, just increase N. The challenge also becomes more difficult as average block size increases.
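
The arithmetic can be checked directly. This only restates the figures above (N = 1024, 500 KiB per block, 15 seconds); the variable names are illustrative.

```python
n_blocks = 1024
avg_block_bytes = 500 * 1024           # 500 KiB
seconds = 15

total_bytes = n_blocks * avg_block_bytes
total_mib = total_bytes / (1024 * 1024)
rate_mbps = total_bytes * 8 / seconds / 1e6   # megabits per second

print(total_mib, round(rate_mbps))     # 500.0 280
```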

This challenge-response protocol relies on the lack of a "partial getdata" command in the Bitcoin protocol: a node cannot ask for only part of a block; it must ask for an entire block. Furthermore, nodes could ban other nodes for making too many random requests for blocks.






^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Bitcoin-development] "network disruption as a service" and proof of local storage
  2015-03-24  5:14       ` Jeremy Spilman
@ 2015-03-26 22:09         ` Sergio Lerner
  2015-03-26 23:04           ` Matt Whitlock
  0 siblings, 1 reply; 13+ messages in thread
From: Sergio Lerner @ 2015-03-26 22:09 UTC (permalink / raw)
  To: Jeremy Spilman, bitcoin-development

> If I understand correctly, transforming raw blocks to keyed blocks
> takes 512x longer than transforming keyed blocks back to raw. The key
> is public, like the IP, or some other value which perhaps changes less
> frequently.
>
Yes. I was thinking that the IP could be part of a first layer of
encryption done to the blockchain data prior to the asymmetric operation.
That way the asymmetric operation can be the same for all users (no
different primes for different IPs, so the verifier does not
have to verify that a particular p is actually a pseudo-prime suitable
for P.H.) and the public exponent can be just 3.
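
A rough sketch of why a public exponent of 3 makes the two directions so uneven, assuming a Pohlig-Hellman style transform modulo a prime p. The toy prime, the names, and the treatment of a block as a single integer are illustrative assumptions; the IP-dependent first layer is not modeled, and a real deployment would use a much larger prime.

```python
P = 1_000_000_007        # toy prime with P % 3 == 2, so 3 is invertible mod P - 1
E = 3                    # small public exponent: the cheap direction
D = pow(E, -1, P - 1)    # large inverse exponent: the expensive direction (Python 3.8+)

def to_keyed(raw: int) -> int:
    """Raw -> keyed: exponentiation by the large exponent D (slow)."""
    return pow(raw, D, P)

def to_raw(keyed: int) -> int:
    """Keyed -> raw: cubing modulo P (two multiplications, fast)."""
    return pow(keyed, E, P)
```

Cubing costs two modular multiplications, while raising to D costs on the order of one squaring per bit of D plus extra multiplications, which is where a ratio in the hundreds for a 1024-bit modulus would come from.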

>
>> Two protocols can be performed to prove local possession:
>> 1. (prover and verifier pay a small cost) The verifier sends a seed to
>> derive some n random indexes, and the prover must respond with the hash
>> of the decrypted blocks within a certain time bound. Suppose that
>> decryption of n blocks takes 100 msec (+-100 msec of network jitter).
>> Then an attacker must have a computer 50 times faster to be able to
>> consistently cheat. The last 50 blocks should not be part of the list, to
>> allow nodes to catch up and encrypt the blocks in the background.
>>
>
> Can you clarify, the prover is hashing random blocks of *decrypted*,
> as-in raw, blockchain data? What does this prove other than, perhaps,
> fast random IO of the blockchain? (which is useful in its own right,
> e.g. as a way to ensure only full-node IO-bound mining if baked into
> the PoW)
>
> How is the verifier validating the response without possession of the
> full blockchain?

You're right, it is incorrect. It is not the decrypted blocks that must be
sent, but the encrypted blocks. The correct protocol is this:

1. (prover and verifier pay a small cost) The verifier sends a seed to
derive some n random indexes, and the prover must respond with the
encrypted blocks within a certain time bound. The verifier decrypts
those blocks to check if they are part of the block-chain.
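
A small sketch of the verifier side of protocol 1, under stated assumptions: indexes are derived by hashing the seed with a counter, the newest 50 blocks are excluded as suggested earlier, `unkey` stands for whatever fast decryption is used, and the verifier knows a plain SHA-256 of each raw block (a simplification of real block hashes). All names are hypothetical.

```python
import hashlib

RECENT_EXCLUDED = 50  # newest blocks are skipped so nodes can key them in the background

def derive_indexes(seed: bytes, n: int, height: int) -> list:
    """Map a seed to n block indexes in [0, height - RECENT_EXCLUDED)."""
    eligible = height - RECENT_EXCLUDED
    if eligible <= 0:
        raise ValueError("chain too short for this challenge")
    indexes = []
    for counter in range(n):
        digest = hashlib.sha256(seed + counter.to_bytes(4, "big")).digest()
        indexes.append(int.from_bytes(digest, "big") % eligible)
    return indexes

def verify_response(keyed_blocks, indexes, unkey, known_hashes,
                    elapsed: float, time_bound: float) -> bool:
    """Accept only if the reply is on time and every block decrypts to a known one."""
    if elapsed > time_bound or len(keyed_blocks) != len(indexes):
        return False
    return all(hashlib.sha256(unkey(kb)).digest() == known_hashes[i]
               for kb, i in zip(keyed_blocks, indexes))
```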

But then there is this improvement, which allows the verifier to detect
non-full nodes with much less computation:

3. (prover pays a small cost, verifier a smaller cost) The verifier asks
the prover to send the Merkle tree root of the hashes of encrypted blocks
at N indexes selected by a pseudo-random function seeded by a challenge
value, where each encrypted block is prefixed with the seed
before being hashed (e.g. N = 100). The verifier receives the Merkle root
and performs a statistical test on the received information. From the N
hashed blocks, it chooses M < N (e.g. M = 20), and asks the prover for
the blocks at these indexes. The prover sends the blocks; the verifier
validates the blocks by decrypting them and also verifies that the
Merkle tree was well constructed for those block nodes. This proves with
high probability that the Merkle tree was built on-the-fly and
specifically for this challenge-response protocol.
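
A sketch of the Merkle part of protocol 3, assuming SHA-256, leaves of the form H(seed || encrypted block), duplication of the last node on odd levels, and that the prover returns a Merkle branch with each of the M requested blocks. None of these details are fixed by the description above; they are illustrative choices.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def leaf(seed: bytes, keyed_block: bytes) -> bytes:
    """Leaf hash: the encrypted block prefixed with the challenge seed."""
    return h(seed + keyed_block)

def _parents(level):
    if len(level) % 2:
        level = level + [level[-1]]      # pad odd levels by repeating the last node
    return level, [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]

def merkle_root(leaves) -> bytes:
    level = list(leaves)
    while len(level) > 1:
        _, level = _parents(level)
    return level[0]

def merkle_branch(leaves, index: int) -> list:
    """Sibling hashes from the leaf at `index` up to the root."""
    branch, level = [], list(leaves)
    while len(level) > 1:
        padded, level = _parents(level)
        branch.append(padded[index ^ 1])
        index //= 2
    return branch

def verify_branch(leaf_hash: bytes, index: int, branch, root: bytes) -> bool:
    node = leaf_hash
    for sibling in branch:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root
```

Because the seed is mixed into every leaf, a root computed for one challenge cannot be reused for another; the verifier then only decrypts the M sampled blocks and checks their branches instead of all N.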

> I also wonder about the effect of spinning disk versus SSD. Seek time
> for 1,000 random reads is either nearly zero or dominating depending
> on the two modes. I wonder if a sequential read from a random index is
> a possible trade-off; it doesn't prove possession of the whole chain
> nearly as well, but at least iowait converges significantly. Then
> again, that presupposes a specific ordering on disk which might not
> exist. In X years it will all be solid-state, so eventually it's moot.
>
Good idea.

Also, not every node needs to implement the protocol, only the
nodes that want to prove full-node-ness, such as the ones that want to
receive the bitnodes subsidy.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Bitcoin-development] "network disruption as a service" and proof of local storage
  2015-03-16 16:29     ` [Bitcoin-development] "network disruption as a service" and proof of local storage Sergio Lerner
@ 2015-03-24  5:14       ` Jeremy Spilman
  2015-03-26 22:09         ` Sergio Lerner
  0 siblings, 1 reply; 13+ messages in thread
From: Jeremy Spilman @ 2015-03-24  5:14 UTC (permalink / raw)
  To: bitcoin-development

On Mon, 16 Mar 2015 09:29:03 -0700, Sergio Lerner  
<sergiolerner@certimix•com> wrote:
> I proposed what I think is a better protocol for Proof of Storage, which
> I call "Proof of Local storage", here
> https://bitslog.wordpress.com/2014/11/03/proof-of-local-blockchain-storage/

Thanks so much for publishing this. It could be useful in any application  
to try to prove a keyed copy of some data.

If I understand correctly, transforming raw blocks to keyed blocks takes  
512x longer than transforming keyed blocks back to raw. The key is public,  
like the IP, or some other value which perhaps changes less frequently.

The prover keeps blocks in the keyed format, and can decrypt quickly to  
provide raw data, or use the keyed data for hashing to try to demonstrate  
they have a pre-keyed copy.

>
> Two protocols can be performed to prove local possession:
> 1. (prover and verifier pay a small cost) The verifier sends a seed to
> derive some n random indexes, and the prover must respond with the hash
> of the decrypted blocks within a certain time bound. Suppose that
> decryption of n blocks takes 100 msec (+-100 msec of network jitter).
> Then an attacker must have a computer 50 times faster to be able to
> consistently cheat. The last 50 blocks should not be part of the list, to
> allow nodes to catch up and encrypt the blocks in the background.
>

Can you clarify, the prover is hashing random blocks of *decrypted*, as-in  
raw, blockchain data? What does this prove other than, perhaps, fast  
random IO of the blockchain? (which is useful in its own right, e.g. as a  
way to ensure only full-node IO-bound mining if baked into the PoW)

How is the verifier validating the response without possession of the full  
blockchain?

> 2. (prover pays a high cost, verifier pays negligible cost). The verifier
> chooses a seed n, and then pre-computes the encrypted blocks derived
> from the seed using the prover's IP. Then the verifier sends the seed,
> and the prover must respond with the hash of the encrypted blocks within
> a certain time bound. The prover is not required to do any PH
> decryption, just take the encrypted blocks for indexes derived from the
> seed, hash them and send the hash back to the verifier. The verifier
> validates the time bound and the hash.

The challenger requests a hash-sum of a random sequence of indices of the  
keyed data, based on a challenge seed. So in a few bytes round-trip we can  
see how fast the computation is completed. If the data is already keyed,  
the hash of 1,000 random 1024-bit blocks should come back much faster than  
if the data needs to be keyed on-the-fly.

To verify the response, the challenger would have to use the peer's  
identity key and perform the slower transforms on those same 1,000 blocks  
and see that the result matches, so cost to challenger is higher than  
prover, assuming they actually do the computation.

Which brings up a good tweak: a full-node challenger could be required to
do the computation first, and then also include something like
HMAC(identityKey, expectedResult). The prover could then know if the
challenger was honest before returning a result, and blacklist them if not.

>
> Both protocols can be made available by the client, under different
> states. For instance, new nodes are only allowed to request protocol 2
> (and so they get an initial assurance they are connecting to
> full-nodes). After a first-time mutual authentication, they are allowed
> to periodically perform protocol 1. Also new nodes may be allowed to
> perform protocol 1 with a small index set, and increase the index set
> over time, to get higher confidence.

I guess a new-node could see if different servers all returned the same  
challenge response, but they would have no way to know if the challenge  
response was technically correct, or sybil.

I also wonder about the effect of spinning disk versus SSD. Seek time for
1,000 random reads is either nearly zero or dominant, depending on which
of the two is used. I wonder if a sequential read from a random index is a
possible trade-off; it doesn't prove possession of the whole chain nearly as well,
but at least iowait converges significantly. Then again, that presupposes  
a specific ordering on disk which might not exist. In X years it will all  
be solid-state, so eventually it's moot.

* [Bitcoin-development] "network disruption as a service" and proof of local storage
  2015-03-16  8:44   ` Jan Møller
@ 2015-03-16 16:29     ` Sergio Lerner
  2015-03-24  5:14       ` Jeremy Spilman
  0 siblings, 1 reply; 13+ messages in thread
From: Sergio Lerner @ 2015-03-16 16:29 UTC (permalink / raw)
  To: bitcoin-development

The problem of pseudo-nodes will come up over and over. The cat and mouse
chase is just beginning.
It has been discussed several times that the easiest solution would be to
require some kind of resource consumption from each peer before it is
allowed to connect to other peers.
Gmaxwell proposed Proof of Storage here:
https://bitcointalk.org/index.php?topic=310323.msg3332919#msg3332919

I proposed what I think is a better protocol for Proof of Storage, which
I call "Proof of Local Storage", here:
https://bitslog.wordpress.com/2014/11/03/proof-of-local-blockchain-storage/
. It's better because it does not need the storage of additional data,
but more importantly, it allows you to prove that a full copy of the
blockchain is being maintained by the peer.
This is especially important now that Bitnodes is trying a full-node
incentive program that may be easily cheated
(http://qntra.net/2015/02/pseudonode-proxy-fools-bitcoin-full-node-incentive-program/)

Proof of local storage allows a node to prove to another peer that it is
storing a LOCAL copy of a PUBLIC file, such as the blockchain. So the
peer need not waste more resources (well, just some resources to
encode/decode the block-chain).
The main idea is to use what I called asymmetric-time-encoding.
Basically you encode the block-chain in a way that it takes 100 times
longer to write it than to read it. Since the block-chain is an
append-only (write-only) file, this fits our needs well. For instance
(and as a simplification), choose a global 1024-bit prime p, split
the block-chain into 1024-bit blocks, and encrypt each block
using Pohlig-Hellman (modexp) with decryption exponent 3. Then
encryption is at least 100 times slower than decryption. Before PH
encryption each node must xor each block with a pseudo-random mask
derived from its public IP and the block index. So block encryption
could be:
BlockEncryptIndex(i) = E(IP+i,block(i))^inv(3) (mod p),

where inv(3) is 3^-1 mod (p-1). E() could be a fast tweaked encryption
routine (tweak = index), but we only need the PRNG properties of E() and
that E() does not share algebraic properties with P.H.
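
As an illustration only, the encoding above can be sketched in Python. The
assumptions are mine: a toy 130-bit prime (2^130 - 5, chosen because p mod 3
is 2, so 3 is invertible mod p-1) instead of a 1024-bit one, 16-byte blocks,
and SHA-256 over "IP|index" as a stand-in for the mask E(). All function
names are hypothetical.

```python
import hashlib

P = 2**130 - 5                 # toy prime; a real deployment would use ~1024 bits
ENC_EXP = pow(3, -1, P - 1)    # inv(3): large exponent, so encoding is slow
BLOCK_BYTES = 16               # every 128-bit block value is smaller than P


def mask(ip: str, index: int) -> int:
    """Pseudo-random mask derived from the node's IP and the block index."""
    digest = hashlib.sha256(f"{ip}|{index}".encode()).digest()
    return int.from_bytes(digest[:BLOCK_BYTES], "big")


def encode_block(ip: str, index: int, block: bytes) -> int:
    """Slow direction: mask the block, then raise it to inv(3) mod P."""
    if len(block) != BLOCK_BYTES:
        raise ValueError("block must be exactly BLOCK_BYTES long")
    masked = int.from_bytes(block, "big") ^ mask(ip, index)
    return pow(masked, ENC_EXP, P)


def decode_block(ip: str, index: int, encoded: int) -> bytes:
    """Fast direction: cube mod P, then remove the mask."""
    masked = pow(encoded, 3, P)
    return (masked ^ mask(ip, index)).to_bytes(BLOCK_BYTES, "big")
```

Decoding costs two modular multiplications per block, while encoding needs a
full-size modular exponentiation; that gap is the asymmetry the scheme
relies on.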

Two protocols can be performed to prove local possession:
1. (prover and verifier pay a small cost) The verifier sends a seed to
derive some n random indexes, and the prover must respond with the hash
of the decrypted blocks within a certain time bound. Suppose that
decryption of n blocks takes 100 msec (+-100 msec of network jitter).
Then an attacker must have a computer 50 times faster to be able to
consistently cheat. The last 50 blocks should not be part of the list, to
allow nodes to catch up and encrypt the blocks in the background.

2. (prover pays a high cost, verifier pays negligible cost). The verifier
chooses a seed n, and then pre-computes the encrypted blocks derived
from the seed using the prover's IP. Then the verifier sends the seed,
and the prover must respond with the hash of the encrypted blocks within
a certain time bound. The prover does not need to do any PH
decryption, just take the encrypted blocks for the indexes derived from
the seed, hash them and send the hash back to the verifier. The verifier
validates the time bound and the hash.
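
A rough Python sketch of the challenge-response skeleton shared by both
protocols, shown for the hash-of-encrypted-blocks case of protocol 2. The
index derivation (SHA-256 of seed plus a counter), the function names and
the parameters are assumptions for illustration, and the skip of the last
50 blocks is the rule stated for protocol 1, applied here as an optional
parameter.

```python
import hashlib
from typing import List, Sequence


def derive_indexes(seed: bytes, n: int, num_blocks: int,
                   skip_recent: int = 50) -> List[int]:
    """Expand a seed into n block indexes, leaving out the newest blocks."""
    eligible = num_blocks - skip_recent
    if eligible <= 0:
        raise ValueError("not enough blocks to challenge")
    indexes = []
    for counter in range(n):
        digest = hashlib.sha256(seed + counter.to_bytes(4, "big")).digest()
        indexes.append(int.from_bytes(digest, "big") % eligible)
    return indexes


def response_hash(encoded_blocks: Sequence[bytes], seed: bytes, n: int) -> str:
    """Prover side: hash the stored encoded blocks selected by the seed."""
    hasher = hashlib.sha256()
    for index in derive_indexes(seed, n, len(encoded_blocks)):
        hasher.update(encoded_blocks[index])
    return hasher.hexdigest()


def accept(expected: str, answer: str, elapsed_seconds: float,
           time_bound_seconds: float) -> bool:
    """Verifier side: the answer must be both correct and on time."""
    return elapsed_seconds <= time_bound_seconds and answer == expected
```

The verifier would compute the expected value from blocks it encoded itself
with the prover's IP, measure the round trip, and call accept() on the
result.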

Both protocols can be made available by the client, under different
states. For instance, new nodes are only allowed to request protocol 2
(and so they get an initial assurance they are connecting to
full-nodes). After a first-time mutual authentication, they are allowed
to periodically perform protocol 1. Also new nodes may be allowed to
perform protocol 1 with a small index set, and increase the index set
over time, to get higher confidence.

The important difference between this protocol and classical remote
software attestation protocols is that the time gap between a good peer
and a malicious peer can be made arbitrarily high by picking a larger p.
Maybe there is even another crypto primitive which is more asymmetric
than exponent 3 decryption (the LUC or NTRU cryptosystem?).

In GMaxwell's proposal each peer builds a table for each other peer. In my
proposal, each peer builds a single table (the encrypted blockchain), so
it could still be possible to establish thousands of connections to
the network from a single peer. Nevertheless, the attacker's IP will be
easily detected (he cannot hide behind a thousand different IPs). It's
also possible to restrict the challenge-response to a portion of the
block-chain, the portion offset being derived from the hash of both IP
addresses and one random number provided by each peer. Suppose each
connection has a C-R space equivalent to 1% of the block-chain. Then
having 100 connections and responding to C-R on each connection means
storing approximately 1 copy of the block-chain (there may be overlaps,
which would need to be stored twice), while having 1K connections would
require storing 10 copies of the blockchain.
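
To make the per-connection portion and the storage arithmetic concrete,
here is a small Python sketch. The way the inputs are concatenated, the use
of SHA-256, and the function names are assumptions for illustration only.

```python
import hashlib


def portion_offset(ip_a: str, ip_b: str, nonce_a: bytes, nonce_b: bytes,
                   num_blocks: int) -> int:
    """Start of the C-R portion for one connection, derived from both IP
    addresses and one random number contributed by each peer."""
    material = b"|".join([ip_a.encode(), ip_b.encode(), nonce_a, nonce_b])
    digest = hashlib.sha256(material).digest()
    return int.from_bytes(digest, "big") % num_blocks


def copies_needed(connections: int, percent_per_connection: int = 1) -> float:
    """Rough storage cost, in block-chain copies, of answering C-R on
    every connection (overlapping portions are counted separately)."""
    return connections * percent_per_connection / 100
```

With a 1% portion per connection, copies_needed(100) gives 1.0 and
copies_needed(1000) gives 10.0, matching the figures above.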

Best regards,
 Sergio

end of thread, other threads:[~2015-04-01  2:47 UTC | newest]

Thread overview: 13+ messages
2015-03-23 10:06 [Bitcoin-development] "network disruption as a service" and proof of local storage Thy Shizzle
  -- strict thread matches above, loose matches on Subject: below --
2015-03-28  2:55 Thy Shizzle
2015-03-13 20:01 [Bitcoin-development] Criminal complaints against "network disruption as a service" startups Justus Ranvier
2015-03-13 21:48 ` Mike Hearn
2015-03-16  8:44   ` Jan Møller
2015-03-16 16:29     ` [Bitcoin-development] "network disruption as a service" and proof of local storage Sergio Lerner
2015-03-24  5:14       ` Jeremy Spilman
2015-03-26 22:09         ` Sergio Lerner
2015-03-26 23:04           ` Matt Whitlock
2015-03-27 14:32             ` Robert McKay
2015-03-27 15:16               ` Matt Whitlock
2015-03-27 15:32                 ` Robert McKay
     [not found]                 ` <20150327155730.GB20754@amethyst.visucore.com>
2015-03-27 16:00                   ` Matt Whitlock
2015-03-27 16:08                   ` Matt Whitlock
2015-03-27 18:40                 ` Jeremy Spilman
2015-04-01  2:34                   ` Sergio Lerner
