Hi Greg, list,
re:
> A point of clarification, that's really a scheme to keep arbitrary data
> out of unprunable data. The proofs that the values in question are
> what they're supposed to be are themselves arbitrary data channels. But
> these proofs are prunable.
Could you expand on the "arbitrary data channels" here?
I was thinking through specifically the case of (pre-taproot) raw pubkey outputs, and what it would mean to prove that such an output is not data; here, the obvious approach is "ECDSA-sign using the key", which trivially fails to prove it's not data if the message for the signature is unconstrained. But obviously the message would be constrained to be, e.g., the hash of the pubkey, which fixes it and prevents data storage in the s component of ECDSA's (r, s). (I.e., if you wanted to bypass this scheme to still store 32 bytes of data, you'd just choose s as that data and then use the standard pubkey recovery algorithm to derive a matching P; except you can't if P is in the message.)
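To make that bypass concrete, here is a minimal sketch (pure Python over secp256k1; the payload, message and names are mine, purely illustrative) of choosing s as 32 bytes of data and deriving a matching pubkey with the standard recovery equation P = r^-1 * (s*R - e*G), which works precisely because e doesn't commit to P:

import hashlib, secrets

# secp256k1 parameters
p = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFC2F
n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(P, Q):
    # affine point addition; None = point at infinity
    if P is None: return Q
    if Q is None: return P
    if P[0] == Q[0] and (P[1] + Q[1]) % p == 0: return None
    if P == Q:
        lam = 3 * P[0] * P[0] * pow(2 * P[1], -1, p) % p
    else:
        lam = (Q[1] - P[1]) * pow(Q[0] - P[0], -1, p) % p
    x = (lam * lam - P[0] - Q[0]) % p
    return (x, (lam * (P[0] - x) - P[1]) % p)

def mul(k, P):
    # double-and-add scalar multiplication
    R = None
    while k:
        if k & 1: R = add(R, P)
        P = add(P, P)
        k >>= 1
    return R

def verify(P, e, r, s):
    # textbook ECDSA verification
    w = pow(s, -1, n)
    X = add(mul(e * w % n, G), mul(r * w % n, P))
    return X is not None and X[0] % n == r

data = b'32 bytes of arbitrary payload!!!'   # exactly 32 bytes, must be < n
s = int.from_bytes(data, 'big')

k = secrets.randbelow(n - 1) + 1             # nonce
R = mul(k, G)
r = R[0] % n

# any message at all, so long as it does NOT commit to P:
e = int.from_bytes(hashlib.sha256(b'whatever').digest(), 'big') % n

# standard pubkey recovery: P = r^-1 * (s*R - e*G)
eG = mul(e, G)
P = mul(pow(r, -1, n), add(mul(s, R), (eG[0], p - eG[1])))

assert verify(P, e, r, s)                    # P "proves" itself with (r, s)...
assert s.to_bytes(32, 'big') == data         # ...yet s is pure payload

Whereas if e is forced to include P (or Hash(P)), P becomes an input to e and the recovery above turns circular.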
All of that seems spectacularly irrelevant, not only because you trivially avoid it by using Hash(P) in the sig message, but more importantly because, since this would be a new piece of consensus, you could just use BIP340 or any similar Schnorr with key-prefixing anyway, no matter what style of scriptPubKey is involved.
The question is, what is the arbitrary data channel that you refer to, that remains when doing this? The R-value is ofc arbitrary, but it's still an "image", not a "preimage" (the x-coord of the nonce secret * G). As I write this, one answer occurs to me: if you used the same R value twice you leak the nonce and the secret key, which in this weird setup means you are "broadcasting" two 32-byte random-looking values, in two outputs, which I guess is the same embedding ratio? A horrible idea in practice given you lose control of the outputs; I know that at least some schemes that embed data in utxos deliberately do so to keep them in the utxo set permanently. So I somehow feel that that's not what you meant ...
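For concreteness, that leak is just two linear equations (a sketch reusing the curve helpers p, n, G, mul from above; here the key and nonce play the role of the embedded data):

import hashlib, secrets

def sign(x, e, k):
    # textbook ECDSA with an externally supplied nonce k
    r = mul(k, G)[0] % n
    s = pow(k, -1, n) * (e + r * x) % n
    return r, s

x = secrets.randbelow(n - 1) + 1   # "secret key" -- 32 bytes of payload
k = secrets.randbelow(n - 1) + 1   # nonce, reused -- 32 more bytes of payload

e1 = int.from_bytes(hashlib.sha256(b'message one').digest(), 'big') % n
e2 = int.from_bytes(hashlib.sha256(b'message two').digest(), 'big') % n
r, s1 = sign(x, e1, k)
_, s2 = sign(x, e2, k)

# anyone can solve s_i = k^-1 * (e_i + r*x) mod n for k, and then x:
k_rec = (e1 - e2) * pow(s1 - s2, -1, n) % n
x_rec = (s1 * k_rec - e1) * pow(r, -1, n) % n
assert (k_rec, x_rec) == (k, x)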
Cheers,
AdamISZ/waxwing
On Friday, May 2, 2025 at 9:00:25 PM UTC-3 Greg Maxwell wrote:
On Friday, May 2, 2025 at 10:23:45 PM UTC Peter Todd wrote:
> # _Uninterrupted_ Illicit Data
To refine that: _illicit data_ is a problem that encryption at rest does not address, particularly insofar as possession of some data is a strict liability crime.
Uninterrupted, however, means that it's more likely to get caught by random scanning tools and whatnot -- and encryption does address that, and probably eliminates most of the difference between interrupted and not, which is Peter Todd's point.
But I heard someone last night say that encryption solves the illicit data issue, and it absolutely doesn't. It solves a particular unexciting but more immediate sub-part of the problem, which is stuff like AV scanners. But I think that issue is orthogonal to this proposed change.
As an aside, I'd been thinking there was a consensus limit on output sizes of 10kB, but now I'm remembering that it's only enforced at spend time and so obviously wouldn't be relevant here.
> to make data publication somewhat more expensive with consensus changes.
> Gregory Maxwell outlined how to do so on this mailing list years ago
A point of clarification, that's really a scheme to keep arbitrary data out of unprunable data. The proofs that the values in question are what they're supposed to be are themselves arbitrary data channels. But these proofs are prunable.
It's true that they only need to be carried near the tip, so you could even consider them *super prunable*. And while perhaps you can get many existing transaction patterns into that model, I'm pretty confident you can't eliminate high-bandwidth channels in script without massively hobbling Bitcoin overall. (Though hey, there are a lot of people out there these days who would like to hobble bitcoin, so ::shrugs::)
Even if the functionality reduction were worth it, I dunno that the gain between prunable (where most data storage stuff is) and super-prunable is that interesting, particularly since you're looking at on the order of a 20%-30% increase in bandwidth for transactions and blocks to carry those proofs. Though for context, I think eventually most nodes will sync through some kind of utxo fast-forward, just due to practical considerations, and with that the difference in prunability degree is diminished further.
It might make sense for just *outputs*, if data stuffing into the UTXO set continues to be a problem, as I think it can be done for just outputs without huge functionality loss... though even so, the disruption and overheads, yuck. But before even considering such a disruptive change you'd want to be really sure everything was done to get the storage out of the unprunable data first, e.g. by getting rid of limits on op_return size.
> have an overhead of about 6.6x. Existing data encoders have been happy
> to pay even more money than that in terms of increased fees during fee
> spikes; the difference in cost between witness space and txout space is
> already 4x, and some are happy to publish data that way anyway.
A point I raised on bitcointalk: If you work out how much it costs to store data on S3 (by far not the cheapest internet data storage) for *forever* you end up with a rate that is less than a hundred thousandth the current Bitcoin minimum fee rate-- maybe way less if you also factor in the cost of storage decreasing, but I didn't. Data stuffers are not particularly price sensitive, if they were they wouldn't be using Bitcoin at all. Schemes to discourage them by causing them increased costs (e.g. by forcing them to encode in ways that use more block capacity) shouldn't be expected to work.
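A back-of-envelope version of that comparison, with every input an assumption (S3 price, discount rate, BTC price are illustrative stand-ins, not the exact figures from the bitcointalk post), just to show the orders of magnitude:

s3_usd_per_gb_month = 0.023       # assumed S3 standard storage price
discount_rate = 0.05              # per year; a perpetuity stands in for "forever"
s3_forever_usd_per_gb = s3_usd_per_gb_month * 12 / discount_rate   # ~ $5.5/GB

btc_usd = 97_000                  # assumed spot price
# at the 1 sat/vB minimum relay feerate: 1e9 vbytes/GB -> 1e9 sats = 10 BTC/GB
chain_usd_per_gb = 1e9 / 1e8 * btc_usd                             # ~ $970,000/GB

print(s3_forever_usd_per_gb / chain_usd_per_gb)   # ~ 6e-6, well under 1/100,000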
And to the extent that what many of these things have been doing is trying to profit off seigniorage -- creating a rare 'asset' to sell to some greater fool and profiting off the difference -- further restricting them could increase their volume, because the resource they need has been made more rare. For the vast majority of users the ire about this stuff comes from the fact that it has driven up fees at times, but that is dependent on what the stuffers are willing to spend, which is likely not particularly related to the marginal data rates. (And one could always embed smaller jpegs, compress them better, or use an efficient encoding instead of raw json, if they cared... which they clearly don't.)