Hi Greg, list,
re:
> A point of clarification, that's really a scheme to keep arbitrary data
> out of unprunable data. The proofs that the values in question are
> what they're supposed to be are themselves arbitrary data channels. But
> these proofs are prunable.
Could you expand on the "arbitrary data channels" here?
I was thinking through specifically the case of (pre-taproot) raw pubkey outputs, and what it would mean to prove that such an output is not data; here, the obvious approach is "ECDSA-sign using the key", which trivially fails to prove it's not data if the message for the signature is unconstrained. But obviously the message would be constrained to be, e.g., the hash of the pubkey, which fixes it and prevents data storage in the s component of ECDSA's (r, s). (I.e., if you wanted to bypass this scheme to still store 32 bytes of data, you'd just choose s as that data and then use the standard pubkey recovery algorithm to derive a matching P; except you can't if P is in the message.)
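To make that bypass concrete, here is a minimal sketch (pure Python over secp256k1; the payload, message and names are mine, purely illustrative) of choosing s as 32 bytes of data and deriving a matching pubkey with the standard recovery equation P = r^-1 * (s*R - e*G), which works precisely because e doesn't commit to P:

import hashlib, secrets

# secp256k1 parameters
p = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFC2F
n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(P, Q):
    # affine point addition; None = point at infinity
    if P is None: return Q
    if Q is None: return P
    if P[0] == Q[0] and (P[1] + Q[1]) % p == 0: return None
    if P == Q:
        lam = 3 * P[0] * P[0] * pow(2 * P[1], -1, p) % p
    else:
        lam = (Q[1] - P[1]) * pow(Q[0] - P[0], -1, p) % p
    x = (lam * lam - P[0] - Q[0]) % p
    return (x, (lam * (P[0] - x) - P[1]) % p)

def mul(k, P):
    # double-and-add scalar multiplication
    R = None
    while k:
        if k & 1: R = add(R, P)
        P = add(P, P)
        k >>= 1
    return R

def verify(P, e, r, s):
    # textbook ECDSA verification
    w = pow(s, -1, n)
    X = add(mul(e * w % n, G), mul(r * w % n, P))
    return X is not None and X[0] % n == r

data = b'32 bytes of arbitrary payload!!!'   # exactly 32 bytes, must be < n
s = int.from_bytes(data, 'big')

k = secrets.randbelow(n - 1) + 1             # nonce
R = mul(k, G)
r = R[0] % n

# any message at all, so long as it does NOT commit to P:
e = int.from_bytes(hashlib.sha256(b'whatever').digest(), 'big') % n

# standard pubkey recovery: P = r^-1 * (s*R - e*G)
eG = mul(e, G)
P = mul(pow(r, -1, n), add(mul(s, R), (eG[0], p - eG[1])))

assert verify(P, e, r, s)                    # P "proves" itself with (r, s)...
assert s.to_bytes(32, 'big') == data         # ...yet s is pure payload

Whereas if e is forced to include P (or Hash(P)), P becomes an input to e and the recovery above turns circular.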
All of that seems spectacularly irrelevant, not only because you trivially avoid it by using Hash(P) in the sig message, but more importantly because, since this would be a new piece of consensus, you could just use BIP340 or any similar Schnorr with key-prefixing anyway, no matter what style of scriptPubKey is involved.
The question is, what is the arbitrary data channel that you refer to, that remains when doing this? The R-value is ofc arbitrary, but it's still an "image", not a "preimage" (the x-coord of the nonce secret * G). As I write this, one answer occurs to me: if you used the same R value twice you leak the nonce and the secret key, which in this weird setup means you are "broadcasting" two 32-byte random-looking values, in two outputs, which I guess is the same embedding ratio? A horrible idea in practice given you lose control of the outputs; I know that at least some schemes that embed data in utxos deliberately do so to keep them in the utxo set permanently. So I somehow feel that that's not what you meant ...
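For concreteness, that leak is just two linear equations (a sketch reusing the curve helpers p, n, G, mul from above; here the key and nonce play the role of the embedded data):

import hashlib, secrets

def sign(x, e, k):
    # textbook ECDSA with an externally supplied nonce k
    r = mul(k, G)[0] % n
    s = pow(k, -1, n) * (e + r * x) % n
    return r, s

x = secrets.randbelow(n - 1) + 1   # "secret key" -- 32 bytes of payload
k = secrets.randbelow(n - 1) + 1   # nonce, reused -- 32 more bytes of payload

e1 = int.from_bytes(hashlib.sha256(b'message one').digest(), 'big') % n
e2 = int.from_bytes(hashlib.sha256(b'message two').digest(), 'big') % n
r, s1 = sign(x, e1, k)
_, s2 = sign(x, e2, k)

# anyone can solve s_i = k^-1 * (e_i + r*x) mod n for k, and then x:
k_rec = (e1 - e2) * pow(s1 - s2, -1, n) % n
x_rec = (s1 * k_rec - e1) * pow(r, -1, n) % n
assert (k_rec, x_rec) == (k, x)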
Cheers,
AdamISZ/waxwing
On Friday, May 2, 2025 at 9:00:25 PM UTC-3 Greg Maxwell wrote:
On Friday, May 2, 2025 at 10:23:45 PM UTC Peter Todd wrote:
> # _Uninterrupted_ Illicit Data
To refine that: _illicit data_ is a problem that encryption at rest does not address, particularly insofar as possession of some data is a strict liability crime.
Uninterrupted, however, means that it's more likely to get caught by random scanning tools and whatnot -- and encryption does address that, and probably eliminates most of the difference between interrupted and not, which is Peter Todd's point.
But I heard someone last night say that encryption solves the illicit data issue, and it absolutely doesn't. It solves a particular unexciting but more immediate sub-part of the problem, which is stuff like AV scanners. But I think that issue is orthogonal to this proposed change.
As an aside, I'd been thinking there was a consensus limit on output sizes of 10kB, but now I'm remembering that it's only enforced at spend time and so obviously wouldn't be relevant here.
> to make data publication somewhat more expensive with consensus changes.
> Gregory Maxwell outlined how to do so on this mailing list years ago
A point of clarification, that's really a scheme to keep arbitrary data out of unprunable data. The proofs that the values in question are what they're supposed to be are themselves arbitrary data channels. But these proofs are prunable.
It's true that they only need to be carried near the tip, so you could even consider them *super prunable*. And while perhaps you can get many existing transaction patterns into that model, I'm pretty confident you can't eliminate high-bandwidth channels in script without massively hobbling Bitcoin overall. (Though hey, there are a lot of people out there these days who would like to hobble bitcoin, so ::shrugs::)
Even if the functionality reduction were worth it, I dunno that the gain between prunable (where most data storage stuff is) and super-prunable is that interesting, particularly since you're looking at on the order of a 20%-30% increase in bandwidth for transactions and blocks to carry those proofs. Though for context, I think eventually most nodes will sync through some kind of utxo fast-forward, just due to practical considerations, and with that the difference in prunability degree is diminished further.
It might make sense for just *outputs*, if data stuffing into the UTXO set continues to be a problem, as I think it can be done for just outputs without huge functionality loss... though even so, the disruption and overheads, yuck. But before even considering such a disruptive change you'd want to be really sure everything was done to get the storage out of the unprunable data first, e.g. by getting rid of limits on op_return size.
> have an overhead of about 6.6x. Existing data encoders have been happy
> to pay even more money than that in terms of increased fees during fee
> spikes; the difference in cost between witness space and txout space is
> already 4x, and some are happy to publish data that way anyway.
A point I raised on bitcointalk: If you work out how much it costs to store data on S3 (by far not the cheapest internet data storage) for *forever* you end up with a rate that is less than a hundred thousandth the current Bitcoin minimum fee rate-- maybe way less if you also factor in the cost of storage decreasing, but I didn't. Data stuffers are not particularly price sensitive, if they were they wouldn't be using Bitcoin at all. Schemes to discourage them by causing them increased costs (e.g. by forcing them to encode in ways that use more block capacity) shouldn't be expected to work.
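A back-of-envelope version of that comparison, with every input an assumption (S3 price, discount rate, BTC price are illustrative stand-ins, not the exact figures from the bitcointalk post), just to show the orders of magnitude:

s3_usd_per_gb_month = 0.023       # assumed S3 standard storage price
discount_rate = 0.05              # per year; a perpetuity stands in for "forever"
s3_forever_usd_per_gb = s3_usd_per_gb_month * 12 / discount_rate   # ~ $5.5/GB

btc_usd = 97_000                  # assumed spot price
# at the 1 sat/vB minimum relay feerate: 1e9 vbytes/GB -> 1e9 sats = 10 BTC/GB
chain_usd_per_gb = 1e9 / 1e8 * btc_usd                             # ~ $970,000/GB

print(s3_forever_usd_per_gb / chain_usd_per_gb)   # ~ 6e-6, well under 1/100,000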
And to the extent that what many of these things have been doing is trying to profit off seigniorage -- creating a rare 'asset' to sell to some greater fool and profiting off the difference -- further restricting them could increase their volume, because the resource they need has been made more rare. For the vast majority of users the ire about this stuff comes from the fact that it has driven up fees at times, but that is dependent on what the stuffers are willing to spend, which is likely not particularly related to the marginal data rates. (And one could always embed smaller jpegs, compress them better, or use an efficient encoding instead of raw json, if they cared... which they clearly don't.)