Hi ZmnSCPxj,

To me it seems that more space can be saved.

The data-“transaction” need not specify any output. The network could subtract the fee amount of the transaction directly from the specified UTXO. A fee also need not be specified. It can be calculated in advance by both the network and the transaction sender, based on the size of the data.

The calculation of the fee should be such that it is only marginally cheaper to use this new construct than to use one or more transactions. For instance, sending 81 bytes should cost as much as two OP_RETURN transactions (minus some marginal discount to incentivize the use of this more efficient way to store data).

If the balance of the selected UTXO is insufficient to pay for the data then the transaction will be invalid.
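
A rough sketch of the fee rule described above, only to make the idea concrete. The function names, the 200-vbyte size of an equivalent OP_RETURN transaction, the feerate and the 5% discount are all hypothetical placeholders. The 80-byte payload limit per OP_RETURN output is the common relay-policy default and is assumed here.

```python
# Hypothetical sketch of the implied-fee rule; all constants are assumptions.
MAX_OP_RETURN_PAYLOAD = 80  # assumed per-output payload limit (relay policy default)


def required_fee(payload_len: int,
                 equivalent_tx_vbytes: int = 200,   # hypothetical size of one OP_RETURN tx
                 feerate_sat_per_vb: int = 1,       # hypothetical feerate
                 discount_percent: int = 5) -> int:  # hypothetical marginal discount
    """Fee in satoshis implied by the payload size alone."""
    if payload_len <= 0:
        raise ValueError("payload must be at least one byte")
    # Number of ordinary OP_RETURN transactions the payload would otherwise need
    # (ceiling division).
    n_txs = -(-payload_len // MAX_OP_RETURN_PAYLOAD)
    full_fee = n_txs * equivalent_tx_vbytes * feerate_sat_per_vb
    # Apply the small discount that makes the new construct marginally cheaper.
    return full_fee * (100 - discount_percent) // 100


def is_valid(utxo_value_sat: int, payload_len: int) -> bool:
    """The data-"transaction" is invalid if the UTXO cannot cover the fee."""
    return utxo_value_sat >= required_fee(payload_len)
```

With these placeholder numbers, 81 bytes is priced like two OP_RETURN transactions minus the discount, and a UTXO holding less than that amount makes the data-“transaction” invalid.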

I can’t judge whether this particular approach would require a hardfork, sadly.

Zac

On Fri, 25 Feb 2022 at 04:19, ZmnSCPxj <ZmnSCPxj@protonmail.com> wrote:

Good morning Zac,

> Hi ZmnSCPxj,
>
> Any benefits of my proposal depend on my presumption that using a standard transaction for storing data must be inefficient. Presumably a transaction takes up significantly more on-chain space than the data it carries within its OP_RETURN. Therefore, not requiring a standard transaction for data storage should be more efficient. Facilitating data storage within some specialized, more space-efficient data structure at marginally lower fee per payload-byte should enable reducing the footprint of storing data on-chain.
>
> In case storing data through OP_RETURN embedded within a transaction is optimal in terms of on-chain footprint then my proposal doesn’t seem useful.

You need to have some assurance that, if you pay a fee, this data gets on the blockchain.
And you also need to pay a fee for the blockchain space.
In order to do that, you need to indicate an existing UTXO, and of course you have to provably authorize the spend of that UTXO.
But that is already an existing transaction structure, the transaction input.
If you are not going to pay an entire UTXO for it, you need a transaction output as well to store the change.

Your signature needs to cover the data being published, and it is more efficient to have a single signature that covers the transaction input, the transaction output, and the data being published.
We already have a structure for that, the transaction.

So an `OP_RETURN` transaction output is added and you put published data there, and existing constructions make everything Just Work (TM).

Now I admit we can shave off some bytes.
Pure published data does not need an amount, and using a transaction output means there is always an amount field.
We do not want the `OP_RETURN` opcode itself, though if the data is variable-size we do need an equivalent to the `OP_PUSH` opcode (which has many variants depending on the size of the data).
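
To put rough numbers on this, here is a minimal sketch that counts the serialized bytes of an `OP_RETURN` output around a given payload. The function names are made up for illustration; the layout assumed is the usual one (8-byte amount, compact-size script length, then the script consisting of `OP_RETURN` plus a single data push).

```python
def compact_size_len(n: int) -> int:
    """Bytes used by Bitcoin's compact-size (varint) encoding of n."""
    if n < 0xFD:
        return 1
    if n <= 0xFFFF:
        return 3
    if n <= 0xFFFFFFFF:
        return 5
    return 9


def push_prefix_len(data_len: int) -> int:
    """Bytes of push opcode (plus length field) needed before the data."""
    if data_len <= 75:
        return 1      # direct push: the opcode itself encodes the length
    if data_len <= 0xFF:
        return 2      # OP_PUSHDATA1 + 1-byte length
    if data_len <= 0xFFFF:
        return 3      # OP_PUSHDATA2 + 2-byte length
    return 5          # OP_PUSHDATA4 + 4-byte length


def op_return_output_size(data_len: int) -> int:
    """Total serialized size of an OP_RETURN output carrying data_len bytes."""
    script_len = 1 + push_prefix_len(data_len) + data_len  # 1 byte for OP_RETURN
    return 8 + compact_size_len(script_len) + script_len   # 8-byte amount field


def overhead(data_len: int) -> int:
    """Bytes in the output that are not payload."""
    return op_return_output_size(data_len) - data_len
```

Under these assumptions the non-payload overhead is 11 bytes for a 40-byte payload and 12 bytes for an 80-byte payload, of which only the amount field and the `OP_RETURN` opcode (9 bytes) could be dropped.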

But that is not really a lot of bytes, and adding a separate field to the transaction would require a hardfork.
We cannot use the SegWit technique of just adding a new field that is not serialized for `txid` and `wtxid` calculations, but is committed in a new id, let us call it `dtxid`, and a new Merkle Tree added to the coinbase.
If we *could*, then a separate field for data publication would be softforkable, but the technique does not apply here.
The reason we cannot use that technique is that we want to save bytes by having the signature cover the data to be published, and signatures need to be validated by pre-softfork nodes looking at just the data committed to in `wtxid`.
If you have a separate signature that is in the `dtxid`, then you spend more actual bytes to save a few bytes.
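
As a back-of-the-envelope check of that tradeoff, assuming the separate signature is a 64-byte BIP340 Schnorr signature and ignoring any length prefix or public key it might also need (the names below are illustrative only):

```python
AMOUNT_FIELD_BYTES = 8       # amount field a pure data field would not need
OP_RETURN_OPCODE_BYTES = 1   # the OP_RETURN opcode itself
SCHNORR_SIG_BYTES = 64       # size of a BIP340 signature (assumed signature type)


def net_bytes_saved(extra_sig_bytes: int = SCHNORR_SIG_BYTES) -> int:
    """Bytes saved by a separate data field, minus the extra signature it needs."""
    saved = AMOUNT_FIELD_BYTES + OP_RETURN_OPCODE_BYTES
    return saved - extra_sig_bytes
```

A negative result means the softfork variant costs more bytes than it saves; with the numbers assumed here it comes out at -55.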

Saving a few bytes for an application that is arguably not the "job" of Bitcoin (Bitcoin is supposed to be for value transfer, not data archiving) is not enough to justify a **hard**fork.
And any softfork seems likely to spend more bytes than what it could save.

Regards,
ZmnSCPxj