* [bitcoindev] On (in)ability to embed data into Schnorr
@ 2025-10-01 14:24 waxwing/ AdamISZ
2025-10-01 22:10 ` Greg Maxwell
` (4 more replies)
0 siblings, 5 replies; 19+ messages in thread
From: waxwing/ AdamISZ @ 2025-10-01 14:24 UTC (permalink / raw)
To: Bitcoin Development Mailing List
Hi all,
https://github.com/AdamISZ/schnorr-unembeddability/
Here I'm analyzing whether the following statement is true: "if you can
embed data into a (P, R, s) tuple (Schnorr pubkey and signature, BIP340
style), without grinding or using a sidechannel to "inform" the reader, you
must be leaking your private key".
See the abstract for a slightly more fleshed out context.
I'm curious about the case of P, R, s published in utxos to prevent usage
of utxos as data. I think this answers in the half-affirmative: you can
only embed data by leaking the privkey so that it (can) immediately fall
out of the utxo set.
(To emphasize, this is different to the earlier observations (including by
me!) that just say it is *possible* to leak data by leaking the private
key; here I'm trying to prove that there is *no other way*).
However I still am probably in the large majority that thinks it's
appalling to imagine a sig attached to every pubkey onchain.
Either way, I found it very interesting! Perhaps others will find the
analysis valuable.
Feedback (especially of the "that's wrong/that's not meaningful" variety)
appreciated.
Regards,
AdamISZ/waxwing
--
You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups•com.
To view this discussion visit https://groups.google.com/d/msgid/bitcoindev/0f6c92cc-e922-4d9f-9fdf-69384dcc4086n%40googlegroups.com.
* Re: [bitcoindev] On (in)ability to embed data into Schnorr
2025-10-01 14:24 [bitcoindev] On (in)ability to embed data into Schnorr waxwing/ AdamISZ
@ 2025-10-01 22:10 ` Greg Maxwell
2025-10-01 23:11 ` Andrew Poelstra
2025-10-03 13:24 ` Peter Todd
` (3 subsequent siblings)
4 siblings, 1 reply; 19+ messages in thread
From: Greg Maxwell @ 2025-10-01 22:10 UTC (permalink / raw)
To: waxwing/ AdamISZ; +Cc: Bitcoin Development Mailing List
Intuitively it sounds likely, just in that the available values are an
image on the curve and a value summed with a hash dependent on everything
else. I think it would be hard to prove.
But is it even really worth the analysis when grinding gets you a 12%
embedding rate in that signature at not that significant a cost (because you
can independently grind the nonce and the signature itself, or the nonce and
the pubkey)? And when, beyond the cost of the additional signature (making the
output 3x its cost), requiring signing when forming the address completely
kills public derivation, multisig with cold keys, etc.? And then any of
whatever spam concerns people have would likely be exacerbated by the
spammers using more resources due to the embedding rate?
Also, re the private key leak letting the output fall out of the utxo set:
well, not so if it's part of an explicit multisig. E.g. 2 of 2 with leaked
key and a secure one.
--
To view this discussion visit https://groups.google.com/d/msgid/bitcoindev/CAAS2fgQRz%3DEJ%2BNm2rxrB_SEpqroFbcc%2BhUhmghJJ1jrJc-WUDA%40mail.gmail.com.
* Re: [bitcoindev] On (in)ability to embed data into Schnorr
2025-10-01 22:10 ` Greg Maxwell
@ 2025-10-01 23:11 ` Andrew Poelstra
2025-10-02 0:25 ` waxwing/ AdamISZ
0 siblings, 1 reply; 19+ messages in thread
From: Andrew Poelstra @ 2025-10-01 23:11 UTC (permalink / raw)
To: Bitcoin Development Mailing List
On Wed, Oct 01, 2025 at 10:10:16PM +0000, Greg Maxwell wrote:
> Intuitively it sounds likely, just in that the available values are an
> image on the curve and a value summed with a hash dependent on everything
> else. I think it would be hard to prove.
>
> But is it even really worth the analysis when grinding gets you a 12%
> embedding rate in that signature at not that significant a cost (because you
> can independently grind the nonce and the signature itself, or the nonce and
> the pubkey)? And when, beyond the cost of the additional signature (making the
> output 3x its cost), requiring signing when forming the address completely
> kills public derivation, multisig with cold keys, etc.? And then any of
> whatever spam concerns people have would likely be exacerbated by the
> spammers using more resources due to the embedding rate?
>
Some time ago, I talked to Ethan Heilman about this in the context of PQ
signatures, and he made the interesting point that you can think of
12% embedding rate as representing an 8x discount for real signatures vs
embedded data. And that maybe that's okay, incentive-wise.
Needing to grind out portions of 32-byte blocks probably also reduces
the risk from people trying to embed virus signatures or other malicious
data.
As for waxwing's original question -- I also intuitively believe that
the only way to embed data in a Schnorr signature is by grinding or
revealing your key ... and I'm not convinced you can do it even by
revealing your key. (R is an EC point that you can't force to be any
particular value except by making a NUMS point, which you then can't use
to sign; and s = k + ex where e is a hash of kG (among other things)
so I don't think you can force that value at all.)
--
Andrew Poelstra
Director, Blockstream Research
Email: apoelstra at wpsoftware.net
Web: https://www.wpsoftware.net/andrew
The sun is always shining in space
-Justin Lewis-Webster
--
To view this discussion visit https://groups.google.com/d/msgid/bitcoindev/aN21KbXTORgXAVH0%40mail.wpsoftware.net.
* Re: [bitcoindev] On (in)ability to embed data into Schnorr
2025-10-01 23:11 ` Andrew Poelstra
@ 2025-10-02 0:25 ` waxwing/ AdamISZ
2025-10-02 15:56 ` waxwing/ AdamISZ
0 siblings, 1 reply; 19+ messages in thread
From: waxwing/ AdamISZ @ 2025-10-02 0:25 UTC (permalink / raw)
To: Bitcoin Development Mailing List
Hi Greg, Andrew, list,
Answers to Greg then Andrew:
> E.g. 2 of 2 with leaked key and a secure one.
That's a very good point! I was narrowly focused on the signature scheme,
but Bitcoin is more than a signature scheme!
> But is it even really worth the analysis when grinding gets you a 12%
> embedding rate in that signature at not that significant a cost (because you
> can independently grind the nonce and the signature itself, or the nonce and
> the pubkey)? And when, beyond the cost of the additional signature (making the
> output 3x its cost), requiring signing when forming the address completely
> kills public derivation, multisig with cold keys, etc.? And then any of
> whatever spam concerns people have would likely be exacerbated by the
> spammers using more resources due to the embedding rate?
I certainly don't think it's worth *doing* (hence my use of the term
"appalling idea" :) ), as per the things you mention there.
I wrote the document as a mostly academic investigation. It would be nice
to be surer what the limits are, although I suspect we're all reasonably
confident of what is/isn't possible.
> 12% embedding rate
Where do you get that number from? 33% for embedding 256 bits in (P, R, s)
(but as per this discussion, according to me, at the cost of key leakage).
If we include the other bytes in a (taproot anyway) utxo that's not much
less, I guess 30% ish. I could try to guess but it'd be easier if you told
me :)
to Andrew:
> As for waxwing's original question -- I also intuitively believe that
> the only way to embed data in a Schnorr signature is by grinding or
> revealing your key ... and I'm not convinced you can do it even by
> revealing your key. (R is an EC point that you can't force to be any
> particular value except by making a NUMS point, which you then can't use
> to sign; and s = k + ex where e is a hash of kG (among other things)
> so I don't think you can force that value at all.)
Ah, I see what you're saying, it's a subtly different target. ECDSA allows
that s be controlled, Schnorr doesn't, but I set up the game as "adversary
must be able to publish a function f such that f(any published R, s, (e)) =
data", i.e. not just f = identity function. That was why I wrote in the
introduction (copied here for convenience:)
"Data can effectively be embedded in signatures by using a publicly
inferrable nonce, as was noted here:
https://groups.google.com/g/bitcoindev/c/d6ZO7gXGYbQ/m/Y8BfxMVxAAAJ
and was later fleshed out in detail here:
https://blog.bitmex.com/the-unstoppable-jpg-in-private-keys/
(note: both these sources discuss nonce-reuse but it's worse than that: any
*publicly inferrable* nonce can achieve the same thing, such as the block
hash of the parent block; this will have the same embedding rate and cannot
be disallowed)."
It may be a different target "politically" :) but I was only thinking
technically, in terms of how people might end up using outputs. From a
technical point of view it makes no difference if f is the identity or
something more complex (as long as it's efficiently computable).
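To make the "f need not be the identity" point concrete, here is a minimal
sketch of the publicly inferrable nonce case. It is an illustration under
stated assumptions, not BIP340 itself: plain SHA-256 stands in for the tagged
hash, there is no even-y normalisation, and the names (public_nonce, extract,
the context and message strings) are hypothetical. The reader's function f
recomputes k from public context and solves s = k + e*x for x, so the 32
bytes of "data" are the private key itself.

```python
import hashlib

# secp256k1 parameters
FIELD = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(a, b):
    """Affine point addition; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    if a[0] == b[0] and (a[1] + b[1]) % FIELD == 0:
        return None
    if a == b:
        lam = 3 * a[0] * a[0] * pow(2 * a[1], -1, FIELD) % FIELD
    else:
        lam = (b[1] - a[1]) * pow((b[0] - a[0]) % FIELD, -1, FIELD) % FIELD
    x = (lam * lam - a[0] - b[0]) % FIELD
    return (x, (lam * (a[0] - x) - a[1]) % FIELD)

def mul(k, pt):
    """Double-and-add scalar multiplication (not constant time)."""
    acc = None
    while k:
        if k & 1:
            acc = add(acc, pt)
        pt = add(pt, pt)
        k >>= 1
    return acc

def challenge(R, P, msg):
    """e = H(R.x || P.x || m) mod n (assumption: plain SHA-256, not a tagged hash)."""
    data = R[0].to_bytes(32, "big") + P[0].to_bytes(32, "big") + msg
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

def sign(x, k, msg):
    """Return (P, R, s) with s = k + e*x mod n."""
    P, R = mul(x, G), mul(k, G)
    return P, R, (k + challenge(R, P, msg) * x) % N

def verify(P, R, s, msg):
    """Check s*G == R + e*P."""
    return mul(s, G) == add(R, mul(challenge(R, P, msg), P))

def public_nonce(context):
    """A nonce anyone can recompute, e.g. from a block hash (hypothetical)."""
    return int.from_bytes(hashlib.sha256(context).digest(), "big") % N

def extract(P, R, s, msg, context):
    """The reader's f: x = (s - k) / e mod n, using only public values."""
    k = public_nonce(context)
    return (s - k) * pow(challenge(R, P, msg), -1, N) % N

# The 32-byte payload doubles as the private key.
payload = b"hello, utxo set".rjust(32, b"\x00")
x = int.from_bytes(payload, "big")
context = b"hypothetical parent block hash"
msg = b"hypothetical message"
P, R, s = sign(x, public_nonce(context), msg)
recovered = extract(P, R, s, msg, context).to_bytes(32, "big")
```

The signature verifies like any other, and anyone who knows the nonce
convention gets the payload back, along with the ability to spend.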
Cheers,
AdamISZ/waxwing
--
To view this discussion visit https://groups.google.com/d/msgid/bitcoindev/2e366b25-f789-4c9d-acf9-b87149d6a796n%40googlegroups.com.
* Re: [bitcoindev] On (in)ability to embed data into Schnorr
2025-10-02 0:25 ` waxwing/ AdamISZ
@ 2025-10-02 15:56 ` waxwing/ AdamISZ
2025-10-02 19:49 ` Greg Maxwell
0 siblings, 1 reply; 19+ messages in thread
From: waxwing/ AdamISZ @ 2025-10-02 15:56 UTC (permalink / raw)
To: Bitcoin Development Mailing List
> > 12% embedding rate
> Where do you get that number from? 33% for embedding 256 bits in (P, R, s)
> (but as per this discussion, according to me, at the cost of key leakage).
> If we include the other bytes in a (taproot anyway) utxo that's not much
> less, I guess 30% ish. I could try to guess but it'd be easier if you told
> me :)
Thinking about it again: to publish data, you have to publish a
transaction! I guess the most economical, paying taproot to taproot, is
about 192 bytes with script path plus the posited extra 64 for the (R,s) in
the output, so yeah that'd be 32 out of 256, 12.5%. Isn't the figure a bit
different for key path though, because no control block? Well it hardly
matters, it's some small fraction in that range.
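The arithmetic above can be written out as a small sketch. The byte counts
are the rough figures from this message (about 192 bytes for a taproot to
taproot spend with script path, plus a posited 64 bytes for the attached
(R, s)), not measured transaction sizes; embedding_rate is a hypothetical
helper name.

```python
def embedding_rate(payload_bytes, total_bytes):
    """Fraction of the published bytes that carry the embedded payload."""
    return payload_bytes / total_bytes

TX_BYTES = 192        # rough taproot-to-taproot figure with script path (estimate from the thread)
ATTACHED_SIG = 64     # posited extra (R, s) attached to the output
PAYLOAD = 32          # one leaked 32-byte private key

base_rate = embedding_rate(PAYLOAD, TX_BYTES + ATTACHED_SIG)

# Repeated publishing is guessed further down to add another 50-100 bytes.
rate_plus_50 = embedding_rate(PAYLOAD, TX_BYTES + ATTACHED_SIG + 50)
rate_plus_100 = embedding_rate(PAYLOAD, TX_BYTES + ATTACHED_SIG + 100)

for label, rate in [("base", base_rate), ("+50 bytes", rate_plus_50), ("+100 bytes", rate_plus_100)]:
    print(f"{label}: {rate:.1%}")
```

With these inputs the base case is exactly 32/256, and the heavier variants
land between roughly 9% and 10.5%, consistent with "some small fraction in
that range".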
An interesting mechanical detail in this near-absurd scenario is that if
you wanted to repeatedly publish off the same (presumably a few multiples
of dust level) output, you couldn't also do the leak single key thing,
since you'd lose control to re-spend. So that'd place us in the "explicit
multisig" scenario that Greg mentioned, which I think would only make sense
with legacy script? Kind of a different scenario, also it would be really
weird to update legacy script to take into account a new "you must sign the
pubkeys" rule. Though I guess in this fictional scenario, it might happen
like that. If you did do it with legacy, you'd be publishing bare 2 of 2
multisig. If you did it with taproot due to how that works, the script is
not published until the output is spent, so I think that's outside what I
was considering ("data in utxo set"). (I guess you could also use something
like a hash lock which might be more efficient). So anyway if you wanted to
do this repeatedly and minimize cost, for whatever strange reason, you'd be
adding another 50-100 bytes each time bringing that % down to like 10% or
less.
But that all became way too hypothetical to even analyze properly :)
Anyway just to reemphasize I certainly wasn't advocating this sig-attaching
system, but it seems important to know what the result of it would be: we
would still not have changed the obvious reality that embedding data in
witness gives more space for data, and is more economical, and we would
only reduce by a big factor how much can be embedded in outputs (anything
from 8% to 15% embedding rate seems possible depending on the hypothetical
details), while having to screw up much of Bitcoin's functionality in the
process.
Cheers,
AdamISZ/waxwing
--
To view this discussion visit https://groups.google.com/d/msgid/bitcoindev/cf15c24e-18d0-4221-a3d4-4177c82a6381n%40googlegroups.com.
* Re: [bitcoindev] On (in)ability to embed data into Schnorr
2025-10-02 15:56 ` waxwing/ AdamISZ
@ 2025-10-02 19:49 ` Greg Maxwell
2025-10-06 13:04 ` waxwing/ AdamISZ
0 siblings, 1 reply; 19+ messages in thread
From: Greg Maxwell @ 2025-10-02 19:49 UTC (permalink / raw)
To: waxwing/ AdamISZ; +Cc: Bitcoin Development Mailing List
I just meant in the purely grinding non-key leaking case you could get 4
bytes into the nonce pretty easily and 4 bytes into either the pubkey or
signature out of a 64 byte signature. Obviously the delivered embedding
rate in a whole txn will be lower, but maybe not that much thanks to
multisig outputs.
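A minimal sketch of what grinding the nonce could look like, assuming the
embedded bytes go into the leading bytes of R's x-coordinate. The function
name grind_nonce and the seed string are hypothetical, and the demo grinds
only one byte so it finishes quickly in pure Python; each extra byte
multiplies the expected work by 256, so 4 bytes is about 2^32 point
additions. Stepping k by one and adding G to R avoids a full scalar
multiplication per attempt.

```python
import hashlib

# secp256k1 parameters
FIELD = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(a, b):
    """Affine point addition; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    if a[0] == b[0] and (a[1] + b[1]) % FIELD == 0:
        return None
    if a == b:
        lam = 3 * a[0] * a[0] * pow(2 * a[1], -1, FIELD) % FIELD
    else:
        lam = (b[1] - a[1]) * pow((b[0] - a[0]) % FIELD, -1, FIELD) % FIELD
    x = (lam * lam - a[0] - b[0]) % FIELD
    return (x, (lam * (a[0] - x) - a[1]) % FIELD)

def mul(k, pt):
    """Double-and-add scalar multiplication (not constant time)."""
    acc = None
    while k:
        if k & 1:
            acc = add(acc, pt)
        pt = add(pt, pt)
        k >>= 1
    return acc

def grind_nonce(prefix, k_start):
    """Step k upward from k_start until the x-coordinate of R = k*G starts with prefix."""
    k, R, tries = k_start, mul(k_start, G), 1
    while R[0].to_bytes(32, "big")[:len(prefix)] != prefix:
        # One point addition per attempt: (k + 1)*G = k*G + G.
        k, R, tries = k + 1, add(R, G), tries + 1
    return k, R, tries

# Assumption: a real signer would start from a fresh secret random value.
seed = int.from_bytes(hashlib.sha256(b"hypothetical secret seed").digest(), "big") % N
k, R, tries = grind_nonce(b"\x42", seed)
print(f"found after {tries} tries, R.x = {R[0]:064x}")
```

The private key is never exposed here; the cost is paid in work instead, and
the same loop applied to the pubkey gives the second 4 bytes.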
--
To view this discussion visit https://groups.google.com/d/msgid/bitcoindev/CAAS2fgQtx_FnecKxpKryTq9o5HJfirY_Vyih6FXzHGHG2itmQQ%40mail.gmail.com.
* Re: [bitcoindev] On (in)ability to embed data into Schnorr
2025-10-01 14:24 [bitcoindev] On (in)ability to embed data into Schnorr waxwing/ AdamISZ
2025-10-01 22:10 ` Greg Maxwell
@ 2025-10-03 13:24 ` Peter Todd
2025-10-04 2:39 ` waxwing/ AdamISZ
2025-10-07 8:22 ` Anthony Towns
` (2 subsequent siblings)
4 siblings, 1 reply; 19+ messages in thread
From: Peter Todd @ 2025-10-03 13:24 UTC (permalink / raw)
To: waxwing/ AdamISZ; +Cc: Bitcoin Development Mailing List
On Wed, Oct 01, 2025 at 07:24:50AM -0700, waxwing/ AdamISZ wrote:
> Hi all,
>
> https://github.com/AdamISZ/schnorr-unembeddability/
>
> Here I'm analyzing whether the following statement is true: "if you can
> embed data into a (P, R, s) tuple (Schnorr pubkey and signature, BIP340
> style), without grinding or using a sidechannel to "inform" the reader, you
> must be leaking your private key".
>
> See the abstract for a slightly more fleshed out context.
>
> I'm curious about the case of P, R, s published in utxos to prevent usage
> of utxos as data. I think this answers in the half-affirmative: you can
> only embed data by leaking the privkey so that it (can) immediately fall
> out of the utxo set.
>
> (To emphasize, this is different to the earlier observations (including by
> me!) that just say it is *possible* to leak data by leaking the private
> key; here I'm trying to prove that there is *no other way*).
You can probably use timelock encryption to ensure that the leak of the private
key only happens in the future, after the funds are recovered by the owner in a
subsequent transaction.
--
https://petertodd.org 'peter'[:-1]@petertodd.org
--
To view this discussion visit https://groups.google.com/d/msgid/bitcoindev/aN_OlgvB-Co1BL19%40petertodd.org.
* Re: [bitcoindev] On (in)ability to embed data into Schnorr
2025-10-03 13:24 ` Peter Todd
@ 2025-10-04 2:39 ` waxwing/ AdamISZ
0 siblings, 0 replies; 19+ messages in thread
From: waxwing/ AdamISZ @ 2025-10-04 2:39 UTC (permalink / raw)
To: Bitcoin Development Mailing List
Hi Peter,
> You can probably use timelock encryption to ensure that the leak of the
> private key only happens in the future, after the funds are recovered by the
> owner in a subsequent transaction.
Another very interesting point, there, to get around the issue of key
leakage ... albeit I don't see a usecase, maybe I'm just not imaginative
enough, very possible.
If someone wants to keep something in the utxo set "forever", it doesn't
help. If they want the property of "immediately accessible in the utxo set"
(like "deposit into some fancy system with a blob of data"; I emphasize
"deposit" because that would explain why not "just put it in the witness":
your current outputs don't support that; correct me if my reasoning is
wrong here), then I guess they don't get that either: the data only becomes
accessible in the "intermediate term" instead.
Cheers,
AdamISZ/waxwing
--
To view this discussion visit https://groups.google.com/d/msgid/bitcoindev/7b4296ca-50ed-4a8b-b853-0accff46abfbn%40googlegroups.com.
* Re: [bitcoindev] On (in)ability to embed data into Schnorr
2025-10-02 19:49 ` Greg Maxwell
@ 2025-10-06 13:04 ` waxwing/ AdamISZ
0 siblings, 0 replies; 19+ messages in thread
From: waxwing/ AdamISZ @ 2025-10-06 13:04 UTC (permalink / raw)
To: Bitcoin Development Mailing List
Yes, sorry, reading fail on my part (somehow missed that you were
explicitly referring to grinding in the comment).
I still don't think the 12% figure is a good one though: in (P,R,s) it's 8
out of 96 (and as discussed, worse if the whole tx is (realistically)
included), 1/4 of the rate you get from direct key leakage. (Plus the perhaps
trivial point that it does actually require work, which might conceivably
matter at scale?) I'm not sure why one would not include P in the measure.
Even an explicit multisig that does not sacrifice control of the output
would be of the order of double the embedding rate, without having to do
work (P,R,s x 2 = 192 and embed 32 for a 1/6 rate, vs. grinding all 4 P,R
values for a 1/12 rate).
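The ratios being compared can be laid out side by side. This is only a
restatement of the figures in this exchange as exact fractions; the variable
names are hypothetical, and 4 bytes per ground value is Greg's estimate of
what is easy to grind, not a measured limit.

```python
from fractions import Fraction

FIELD_BYTES = 32
TUPLE_BYTES = 3 * FIELD_BYTES          # P, R, s = 96 bytes
GROUND_BYTES = 4                       # bytes ground into one value (Greg's estimate)

# Greg's measure: 4 bytes in the nonce plus 4 in pubkey or s, over the 64-byte signature only.
grind_over_signature = Fraction(2 * GROUND_BYTES, 2 * FIELD_BYTES)

# Same 8 ground bytes, but counting P as well.
grind_over_tuple = Fraction(2 * GROUND_BYTES, TUPLE_BYTES)

# Leaking the private key embeds a full 32 bytes in one (P, R, s).
key_leak = Fraction(FIELD_BYTES, TUPLE_BYTES)

# Explicit 2-of-2 multisig: two tuples, one leaked key, control retained.
multisig_one_leak = Fraction(FIELD_BYTES, 2 * TUPLE_BYTES)

# Explicit 2-of-2 multisig with grinding in all four P, R values instead.
multisig_grind = Fraction(4 * GROUND_BYTES, 2 * TUPLE_BYTES)

for name, rate in [
    ("grind, signature only", grind_over_signature),
    ("grind, (P,R,s)", grind_over_tuple),
    ("key leak, (P,R,s)", key_leak),
    ("2-of-2, one key leaked", multisig_one_leak),
    ("2-of-2, grind all P,R", multisig_grind),
]:
    print(f"{name}: {rate} = {float(rate):.1%}")
```

The 12.5% and 1/12 figures describe the same 8 ground bytes; the difference
is only whether P is counted in the denominator.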
--
To view this discussion visit https://groups.google.com/d/msgid/bitcoindev/b486e5dd-d5b4-43f1-9d9a-20b772d3dc1bn%40googlegroups.com.
* Re: [bitcoindev] On (in)ability to embed data into Schnorr
2025-10-01 14:24 [bitcoindev] On (in)ability to embed data into Schnorr waxwing/ AdamISZ
2025-10-01 22:10 ` Greg Maxwell
2025-10-03 13:24 ` Peter Todd
@ 2025-10-07 8:22 ` Anthony Towns
2025-10-07 12:05 ` waxwing/ AdamISZ
2025-10-31 9:10 ` Tim Ruffing
2025-10-31 13:19 ` Garlo Nicon
4 siblings, 1 reply; 19+ messages in thread
From: Anthony Towns @ 2025-10-07 8:22 UTC (permalink / raw)
To: waxwing/ AdamISZ; +Cc: Bitcoin Development Mailing List
On Wed, Oct 01, 2025 at 07:24:50AM -0700, waxwing/ AdamISZ wrote:
> I'm curious about the case of P, R, s published in utxos to prevent usage
> of utxos as data. I think this answers in the half-affirmative: you can
> only embed data by leaking the privkey so that it (can) immediately fall
> out of the utxo set.
I think you can attack the setup here.
If you allow scriptPubKeys in the utxo set whose spending conditions
are HTLC/atomic-swap-like:
(pubkey A and preimage reveal of X)
OR (pubkey B and block height > H)
then you either set H to be arbitrarily far in the future and reveal
B's privkey, or choose an NUMS X with no known preimage, and reveal
A's privkey.
If you don't allow those things (eg, by requiring such constructions
also have a (pubkey musig(A,B)) path) then I think you rule out NUMS-IPK
constructions, and end up making things like vaults ("hotkey with delay,
coldkey anytime") difficult to send to ("I have to sign with my cold
key to request funds?"), or, depending on what the utxo R,s is signing,
encourage key reuse.
> (To emphasize, this is different to the earlier observations (including by
> me!) that just say it is *possible* to leak data by leaking the private
> key; here I'm trying to prove that there is *no other way*).
That seems right to me.
I think if the signature scheme supported pubkey recovery (ie, s*G = R +
H(R,m)*P, and our "m" didn't commit to P as well), you could get around
this by just having P be the data, with no one, including the "signer"
able to recover the private key.
> However I still am probably in the large majority that thinks it's
> appalling to imagine a sig attached to every pubkey onchain.
I think the only thing achieved by embedding data in the utxo set (vs
an OP_RETURN output or witness data) is to bloat the utxo set; and if
that's the goal, it can equally easily be done with spendable outputs
that the attacker simply chooses not to ever spend. So that doesn't seem
like a terribly interesting solution to anything.
As far as embedding data in signatures goes, I think the following
scheme would allow you to publish data in a cryptographically-secure way,
with minimal lost funds:
0) Setup secret keys p and q, and a 32-byte secret k. H(a,b,..) is sha256
of a,b,.. concatenated.
1) Split your data into N 31 byte blocks, a1, a2, .., aN.
2) Calculate r0 as H(k*G). Calculate r1, .., rN as:
r(i+1) = H(p, r(i)) + a(i+1)
3) Sign N+1 transactions in a chain spending pubkey p*G, using rN, r(N-1),
.., r1, r0 as nonces. All but the final tx should pay to a p*G output to
continue the chain; the final output should pay to q*G instead.
4) Once all transactions are sufficiently confirmed, spend the final
output with k as the secret nonce (and hence R=k*G as the public
nonce).
Recover the data using the following process:
1) From the final transaction, recover R=k*G, and calculate r0 as H(R).
Recover p from the previous transaction, p = (s0-r0)/H(r0*G, P, m0).
2) Recover ri from each signature; ri = si - H(Ri, P, mi)*p. Recover
the data ai as ai = ri - H(p,r(i-1)).
Dealing with the points being 32-bytes might require carrying over a
sign-bit; but that should be possible in the spare ~7 bits since each
block was only 31 bytes not 32 bytes. Left as an exercise for the
reader, etc.
I believe that the privkey p is secure prior to k*G being revealed,
since all the nonces are distinct hashes seeded by that privkey; and q
remains secure because k is never revealed.
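For concreteness, the embedding and recovery steps above can be sketched in Python. This is only an illustration under several assumptions that are not part of the scheme as described: signatures use the plain equation s = r + H(R, P, m)*p rather than BIP340's tagged hashes and x-only keys (so the sign-bit carry-over is sidestepped by hashing full points), the helper names (point_add, point_mul, ser, ser_pt, embed, recover) are hypothetical, the messages are placeholders instead of real transaction digests, and the indexing follows the recovery step, i.e. r(i) = H(p, r(i-1)) + a(i).

```python
import hashlib

# secp256k1 domain parameters
FP = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def point_add(a, b):
    """Affine point addition; None is the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % FP == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1, -1, FP) % FP
    else:
        lam = (y2 - y1) * pow((x2 - x1) % FP, -1, FP) % FP
    x3 = (lam * lam - x1 - x2) % FP
    return x3, (lam * (x1 - x3) - y1) % FP

def point_mul(k, pt=G):
    """Double-and-add scalar multiplication (not constant time)."""
    acc = None
    while k:
        if k & 1:
            acc = point_add(acc, pt)
        pt = point_add(pt, pt)
        k >>= 1
    return acc

def ser(x):
    return x.to_bytes(32, "big")

def ser_pt(pt):
    # Assumption: hash the full (x, y) point so no parity handling is needed.
    return ser(pt[0]) + ser(pt[1])

def H(*parts):
    """sha256 of the concatenated parts, reduced to a scalar mod N."""
    return int.from_bytes(hashlib.sha256(b"".join(parts)).digest(), "big") % N

def embed(p, k, blocks, msgs):
    """Sign len(blocks)+1 messages with key p; nonce i carries block i."""
    P, R_final = point_mul(p), point_mul(k)
    r = [H(ser_pt(R_final))]                      # r0 = H(k*G)
    for a in blocks:                              # r_i = H(p, r_{i-1}) + a_i
        r.append((H(ser(p), ser(r[-1])) + int.from_bytes(a, "big")) % N)
    sigs = []
    for ri, m in zip(r, msgs):
        R = point_mul(ri)
        sigs.append((R, (ri + H(ser_pt(R), ser_pt(P), m) * p) % N))
    return P, sigs, R_final

def recover(P, sigs, msgs, R_final):
    """Given the revealed R = k*G, recover p and then every data block."""
    r0 = H(ser_pt(R_final))
    R0, s0 = sigs[0]
    e0 = H(ser_pt(R0), ser_pt(P), msgs[0])
    p = (s0 - r0) * pow(e0, -1, N) % N            # p = (s0 - r0) / e0
    blocks, prev = [], r0
    for (R, s), m in zip(sigs[1:], msgs[1:]):
        ri = (s - H(ser_pt(R), ser_pt(P), m) * p) % N
        blocks.append(((ri - H(ser(p), ser(prev))) % N).to_bytes(31, "big"))
        prev = ri
    return p, blocks
```

Calling embed(p, k, blocks, msgs) with 31-byte blocks and one message per signature, then recover(P, sigs, msgs, R_final), should return the original p and blocks. The sketch indexes signatures by nonce number; on chain they would appear in the reverse order (rN first, r0 last), as step 3 describes.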
If you wanted to not reuse the pubkey p*G repeatedly, you could tweak it
to be p0 = p, p(i+1) = p + H(k*G, p(i)), or similar. That would allow you
to use an n-of-n multisig to get multiple blocks in a single transaction
without seeming weird, eg.
I believe the only way to distinguish this from a normal transaction
pattern where a wallet has a change output, is via the final transaction
that reveals k*G, and detecting the relationship between k*G and the
spending conditions of the transaction that created the coin being spent.
That's already somewhat expensive to check for every spend, but could
be made more so by publishing k*G on some other medium (ie the data is
in the blockchain, but you obtain the txid and key to find the data
from elsewhere), or by revealing (k+x)*G where x is a random 20-bit
(?) number, and a significant but tractable amount of grinding is needed
to recover the desired k*G and decode the data -- the idea being that
that is tractable for someone who knows there is data at that txid,
but not tractable when performed on every signature in the blockchain
in order to filter data publication.
I think if you did 20 such transactions per block, each spending a single
20-of-20 tapscript multisig, you'd get 12400 bytes of data per block
(without violating standardness constraints), at a cost of ~11800vb, so
much less efficient than inscriptions, but slightly more efficient than
OP_RETURN, and significantly less detectable than either. I think Knots
default policy currently allows up to 50-of-50 multisig in tapscript,
which would give you 31kB of data in ~26.6kvB of tx weight in a block.
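The per-block figures above follow from simple multiplication, which can be checked with a short Python snippet. The function name payload_bytes is hypothetical, the vbyte costs (11800 and 26600) are taken from the paragraph as given rather than derived, and the 50-of-50 case is assumed to keep the same 20 transactions per block.

```python
BLOCK_BYTES = 31  # payload carried by each signature nonce in the scheme

def payload_bytes(txs_per_block, sigs_per_tx):
    """Data bytes per block when every signature embeds one 31-byte block."""
    return txs_per_block * sigs_per_tx * BLOCK_BYTES

standard = payload_bytes(20, 20)   # 20-of-20 tapscript multisig
knots = payload_bytes(20, 50)      # 50-of-50 under the quoted Knots policy

print(standard, round(standard / 11800, 2))  # 12400 1.05
print(knots, round(knots / 26600, 2))        # 31000 1.17
```

So both configurations land slightly above one data byte per vbyte under the stated cost estimates.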
If you're regularly making payments from a particular wallet, I think
that procedure would allow you to encode data in your change outputs at
the rate of 32B/tx for no additional cost. Though the data would only be
recoverable once complete, and it's probably worth noting that I haven't
provided any security proofs...
Cheers,
aj
--
To view this discussion visit https://groups.google.com/d/msgid/bitcoindev/aOTNvteE8PCm6yDd%40erisian.com.au.
* Re: [bitcoindev] On (in)ability to embed data into Schnorr
2025-10-07 8:22 ` Anthony Towns
@ 2025-10-07 12:05 ` waxwing/ AdamISZ
2025-10-08 5:12 ` Anthony Towns
0 siblings, 1 reply; 19+ messages in thread
From: waxwing/ AdamISZ @ 2025-10-07 12:05 UTC (permalink / raw)
To: Bitcoin Development Mailing List
Hi aj,
Interesting points! Answers inline.
On Tuesday, October 7, 2025 at 6:38:40 AM UTC-3 Anthony Towns wrote:
> On Wed, Oct 01, 2025 at 07:24:50AM -0700, waxwing/ AdamISZ wrote:
> > I'm curious about the case of P, R, s published in utxos to prevent usage
> > of utxos as data. I think this answers in the half-affirmative: you can
> > only embed data by leaking the privkey so that it (can) immediately fall
> > out of the utxo set.
> I think you can attack the setup here.
> If you allow scriptPubKeys in the utxo set whose spending conditions
> are HTLC/atomic-swap-like:
> (pubkey A and preimage reveal of X)
> OR (pubkey B and block height > H)
> then you either set H to be arbitrarily far in the future and reveal
> B's privkey, or choose an NUMS X with no known preimage, and reveal
> A's privkey.
Yes. In the paper (and my OP email) I'm trying to narrow it down completely
to a P, R, s structure. I guess if we try to be realistic about this
"publish a signature in the output always" horrible scenario, it would have
to just ditch the NUMS variant of taproot, and I agree, that is a very Bad
Thing (TM). (uh sorry you discuss this in the next paragraph but, w/e).
Alternative examples like multisig or hash lock in script to get the data
leakage without losing control of the output (necessarily) have been
mentioned but I like your 2-branch setup as a good flexible example.
> If you don't allow those things (eg, by requiring such constructions
> also have a (pubkey musig(A,B)) path) then I think you rule out NUMS-IPK
> constructions, and end up making things like vaults ("hotkey with delay,
> coldkey anytime") difficult to send to ("I have to sign with my cold
> key to request funds?"), or, depending on what the utxo R,s is signing,
> encourage key reuse.
> > (To emphasize, this is different to the earlier observations (including by
> > me!) that just say it is *possible* to leak data by leaking the private
> > key; here I'm trying to prove that there is *no other way*).
> That seems right to me.
> I think if the signature scheme supported pubkey recovery (ie, s*G = R +
> H(R,m)*P, and our "m" didn't commit to P as well), you could get around
> this by just having P be the data, with no one, including the "signer"
> able to recover the private key.
Yes, basically. I discuss this in the paper w.r.t. ECDSA. Your description
of the relevance of pubkey recovery is good, but there are some nuances.
You can't quite (with ECDSA) get P to be the data and have a valid sig, but
you can get 's' to be the data simply by backsolving for the private key x.
Lack of "pubkey prefixing" in the very funky 'commitment to the nonce' in
ECDSA causes that. And the second nuance, you did actually mention: you get
"not leaking the key" for free, here. But it's still only a 32/96 bytes
embedding rate though, the way I count it.
> > However I still am probably in the large majority that thinks it's
> > appalling to imagine a sig attached to every pubkey onchain.
> I think the only thing achieved by embedding data in the utxo set (vs
> an OP_RETURN output or witness data) is to bloat the utxo set; and if
> that's the goal, it can equally easily be done with spendable outputs
> that the attacker simply chooses not to ever spend. So that doesn't seem
> like a terribly interesting solution to anything.
I think the logic of that is not quite right. Suppose I want to embed
pictures into the unpruneable utxo set specifically (and not only 'in
transactions'). The starting point here was me trying to write out how you
can't embed data in known-privkey (Schnorr) P, R, s tuples.
And not only pictures; as Andrew pointed out above, there's always the
concern of some kind of virus-y "naughty" data.
> As far as embedding data in signatures goes, I think the following
> scheme would allow you to publish data in a cryptographically-secure way,
> with minimal lost funds:
> 0) Setup secret keys p and q, and a 32-byte secret k. H(a,b,..) is sha256
> of a,b,.. concatenated.
> 1) Split your data into N 31 byte blocks, a1, a2, .., aN.
> 2) Calculate r0 as H(k*G). Calculate r1, .., rN as:
> r(i+1) = H(p, r(i)) + a(i+1)
> 3) Sign N+1 transactions in a chain spending pubkey p*G, using rN, r(N-1),
> .., r1, r0 as nonces. All but the final tx should pay to a p*G output to
> continue the chain; the final output should pay to q*G instead.
> 4) Once all transactions are sufficiently confirmed, spend the final
> output with k as the secret nonce (and hence R=k*G as the public
> nonce).
> Recover the data using the following process:
> 1) From the final transaction, recover R=k*G, and calculate r0 as H(R).
> Recover p from the previous transaction, p = (s0-r0)/H(r0*G, P, m0).
> 2) Recover ri from each signature; ri = si - H(Ri, P, mi)*p. Recover
> the data ai as ai = ri - H(p,r(i-1)).
> Dealing with the points being 32-bytes might require carrying over a
> sign-bit; but that should be possible in the spare ~7 bits since each
> block was only 31 bytes not 32 bytes. Left as an exercise for the
> reader, etc.
> I believe that the privkey p is secure prior to k*G being revealed,
> since all the nonces are distinct hashes seeded by that privkey; and q
> remains secure because k is never revealed.
> If you wanted to not reuse the pubkey p*G repeatedly, you could tweak it
> to be p0 = p, p(i+1) = p + H(k*G, p(i)), or similar. That would allow you
> to use an n-of-n multisig to get multiple blocks in a single transaction
> without seeming weird, eg.
> I believe the only way to distinguish this from a normal transaction
> pattern where a wallet has a change output, is via the final transaction
> that reveals k*G, and detecting the relationship between k*G and the
> spending conditions of the transaction that created the coin being spent.
> That's already somewhat expensive to check for every spend, but could
> be made more so by publishing k*G on some other medium (ie the data is
> in the blockchain, but you obtain the txid and key to find the data
> from elsewhere), or by revealing (k+x)*G where x is a random 20-bit
> (?) number, and a significant but tractable amount of grinding is needed
> to recover the desired k*G and decode the data -- the idea being that
> that is tractable for someone who knows there is data at that txid,
> but not tractable when performed on every signature in the blockchain
> in order to filter data publication.
> I think if you did 20 such transactions per block, each spending a single
> 20-of-20 tapscript multisig, you'd get 12400 bytes of data per block
> (without violating standardness constraints), at a cost of ~11800vb, so
> much less efficient than inscriptions, but slightly more efficient than
> OP_RETURN, and significantly less detectable than either. I think Knots
> default policy currently allows up to 50-of-50 multisig in tapscript,
> which would give you 31kB of data in ~26.6kvB of tx weight in a block.
> If you're regularly making payments from a particular wallet, I think
> that procedure would allow you to encode data in your change outputs at
> the rate of 32B/tx for no additional cost. Though the data would only be
> recoverable once complete, and it's probably worth noting that I haven't
> provided any security proofs...
Very nice example. I am glad you took the trouble to write it out, because
I agree that examples like that are worth working through because as you
say they lean closer to being properly indistinguishable from ordinary
transaction patterns.
My analysis was narrower: output-side embedding (in a theoretical future of
P,R,s outputs). But that's a little confusing because (P, R, s) is still
there whether some of it is put in witness or not. So everyone seems to
agree that privkey reveal is necessary for that, but everyone is also
pointing out that with Bitcoin's actual consensus scripting system, that
doesn't quite mean what it seems! And the embedding rate is not very good.
In this framing, not much has changed in your "chained" example: once the
privkey p is revealed, you get the k value per chain link, so it's still
roughly a 1/3 ratio, or more realistically, as you mention (and I did
upthread), it's per *transaction* which is a much lower rate.
Your points about limits, standardness constraints are well taken; those
are the kinds of things that do actually matter today, but I was not
thinking about.
--
To view this discussion visit https://groups.google.com/d/msgid/bitcoindev/e4d271ad-9ea3-41e5-96e2-6cb0118943e4n%40googlegroups.com.
* Re: [bitcoindev] On (in)ability to embed data into Schnorr
2025-10-07 12:05 ` waxwing/ AdamISZ
@ 2025-10-08 5:12 ` Anthony Towns
2025-10-08 12:55 ` waxwing/ AdamISZ
0 siblings, 1 reply; 19+ messages in thread
From: Anthony Towns @ 2025-10-08 5:12 UTC (permalink / raw)
To: waxwing/ AdamISZ; +Cc: Bitcoin Development Mailing List
On Tue, Oct 07, 2025 at 05:05:24AM -0700, waxwing/ AdamISZ wrote:
> Yes, basically. I discuss this in the paper w.r.t. ECDSA. Your description
> of the relevance of pubkey recovery is good, but there are some nuances.
> You can't quite (with ECDSA) get P to be the data and have a valid sig, but
> you can get 's' to be the data simply by backsolving for the private key x.
> Lack of "pubkey prefixing" in the very funky 'commitment to the nonce' in
> ECDSA causes that. And the second nuance, you did actually mention: you get
> "not leaking the key" for free, here. But it's still only a 32/96 bytes
> embedding rate though, the way I count it.
You've got 4x 32-byte values to play with: s, r, p and m. The verification
equation determines one of these, reducing it to 3x. m isn't able to be
freely chosen, reducing it to 2x. And being able to reverse the equation
in order to calculate anything requires the receiver to know one of the
secrets, which reduces it to 1x. (Grinding can bump that back up to a
factor of 1.something) So that's the 32. On the other side, you need to
transmit everything but m which is otherwise determined by the setup,
so that's the 96.
> I think the logic of that is not quite right. Suppose I want to embed
> pictures into the unpruneable utxo set specifically (and not only 'in
> transactions').
Sure, but then I'll also suppose your goal is to harm Bitcoin by bloating
the utxo set. If that weren't one of your fundamental goals, you'd use
other, cheaper and easier, ways of encoding the data.
> Very nice example. I am glad you took the trouble to write it out, because
> I agree that examples like that are worth working through because as you
> say they lean closer to being properly indistinguishable from ordinary
> transaction patterns.
I think the (P,R,s) outputs could be an interesting design for a
non-programmable system that was intended purely for payments -- a
FEDwire/SWIFT replacement without the possibility of vaults, lightning,
etc. Presumably more mimblewimble friendly etc too. Presumably the "R,s"
values could also be a signature of P by the operator's well known pubkey,
giving you a KYC/CBDC-like system too.
You could get programmability back in this scenario by allowing P to sign
a script, which you then satisfy, rather than signing a payment directly
(ie, the graftroot approach).
Anyway, once you make the system programmable in interesting ways, I
think you get data embeddability pretty much immediately, and then it's
just a matter of trading off the optimal encoding rate versus how easily
identifiable your transactions can be. Forcing data to be hidden at a
cost of making it less efficient just leaves less resources available
to other users of the system, though, which doesn't seem like a win in
any way to me.
> Your points about limits, standardness constraints are well taken; those
> are the kinds of things that do actually matter today, but I was not
> thinking about.
Note that I mentioned the standardness constraints not because they're
limits today, but rather because they reflect the form existing txs take,
so mimicking that form would allow txs embedding data via this scheme to
be difficult to distinguish from other txs, and hence equally difficult
to censor/filter.
Cheers,
aj
--
To view this discussion visit https://groups.google.com/d/msgid/bitcoindev/aOXyvGaKfe7bqTXv%40erisian.com.au.
* Re: [bitcoindev] On (in)ability to embed data into Schnorr
2025-10-08 5:12 ` Anthony Towns
@ 2025-10-08 12:55 ` waxwing/ AdamISZ
0 siblings, 0 replies; 19+ messages in thread
From: waxwing/ AdamISZ @ 2025-10-08 12:55 UTC (permalink / raw)
To: Bitcoin Development Mailing List
Answers inline.
On Wednesday, October 8, 2025 at 5:45:06 AM UTC-3 Anthony Towns wrote:
> On Tue, Oct 07, 2025 at 05:05:24AM -0700, waxwing/ AdamISZ wrote:
> > Yes, basically. I discuss this in the paper w.r.t. ECDSA. Your description
> > of the relevance of pubkey recovery is good, but there are some nuances.
> > You can't quite (with ECDSA) get P to be the data and have a valid sig, but
> > you can get 's' to be the data simply by backsolving for the private key x.
> > Lack of "pubkey prefixing" in the very funky 'commitment to the nonce' in
> > ECDSA causes that. And the second nuance, you did actually mention: you get
> > "not leaking the key" for free, here. But it's still only a 32/96 bytes
> > embedding rate though, the way I count it.
> You've got 4x 32-byte values to play with: s, r, p and m. The verification
> equation determines one of these, reducing it to 3x. m isn't able to be
> freely chosen, reducing it to 2x. And being able to reverse the equation
> in order to calculate anything requires the receiver to know one of the
> secrets, which reduces it to 1x. (Grinding can bump that back up to a
> factor of 1.something) So that's the 32. On the other side, you need to
> transmit everything but m which is otherwise determined by the setup,
> so that's the 96.
Yeah I think so, roughly. It's not 100% watertight deductions but it seems
correct from where I'm sitting.
(I would only nit that 'm' isn't in consideration as it's implicit, not
published, in current signature usage; in a proposed signature-in-output, m
would obviously be constrained to something with no wiggle room (and
including P if we used ECDSA, but we wouldn't).)
> > I think the logic of that is not quite right. Suppose I want to embed
> > pictures into the unpruneable utxo set specifically (and not only 'in
> > transactions').
> Sure, but then I'll also suppose your goal is to harm Bitcoin by bloating
> the utxo set. If that weren't one of your fundamental goals, you'd use
> other, cheaper and easier, ways of encoding the data.
But the goal can be simply this: my data is more marketable if I can
plausibly claim that it's embedded into bitcoin nodes for eternity (whether
true or not, it's marketable). AFAIK this is indeed a thing, in the real
world.
> > Very nice example. I am glad you took the trouble to write it out, because
> > I agree that examples like that are worth working through because as you
> > say they lean closer to being properly indistinguishable from ordinary
> > transaction patterns.
> I think the (P,R,s) outputs could be an interesting design for a
> non-programmable system that was intended purely for payments -- a
> FEDwire/SWIFT replacement without the possibility of vaults, lightning,
> etc. Presumably more mimblewimble friendly etc too. Presumably the "R,s"
> values could also be a signature of P by the operator's well known pubkey,
> giving you a KYC/CBDC-like system too.
> You could get programmability back in this scenario by allowing P to sign
> a script, which you then satisfy, rather than signing a payment directly
> (ie, the graftroot approach).
I like this line of thought, and indeed I'd forgotten about graftroot and
the whole delegation angle.
(and just to repeat the point made earlier: we'd only need to sign over a
message including P for ECDSA, but we wouldn't use that.)
I guess if you're discussing a hypothetical permissioned system though it's
a whole different world, so I'm going to sidestep that one.
But it does sound interesting to do delegation and then ZkPOK outputs even
in a Bitcoin world. Albeit it's a long way from where we are today.
Of course we're firmly pie in the sky again here, but I think it helps
inform thinking about Bitcoin as it is concretely today.
> Anyway, once you make the system programmable in interesting ways, I
> think you get data embeddability pretty much immediately,
My main motivation in discussing this was indeed the extent to which you
get embeddability even without any programmability; as we've established,
it's not zero, and it's not restricted to grinding (exponential work). But
in *pure* unprogrammable, ZkPOK outputs of form P, R,s and nothing else
allowed, it *is*, I'm claiming, restricted to key leakage and doesn't
surpass 33%.
> and then it's
> just a matter of trading off the optimal encoding rate versus how easily
> identifiable your transactions can be. Forcing data to be hidden at a
> cost of making it less efficient just leaves less resources available
> to other users of the system, though, which doesn't seem like a win in
> any way to me.
> > Your points about limits, standardness constraints are well taken; those
> > are the kinds of things that do actually matter today, but I was not
> > thinking about.
> Note that I mentioned the standardness constraints not because they're
> limits today, but rather because they reflect the form existing txs take,
> so mimicking that form would allow txs embedding data via this scheme to
> be difficult to distinguish from other txs, and hence equally difficult
> to censor/filter.
I see. Good point.
--
To view this discussion visit https://groups.google.com/d/msgid/bitcoindev/323c2d13-e90f-49c5-bfe0-f161b8b8dbb4n%40googlegroups.com.
* Re: [bitcoindev] On (in)ability to embed data into Schnorr
2025-10-01 14:24 [bitcoindev] On (in)ability to embed data into Schnorr waxwing/ AdamISZ
` (2 preceding siblings ...)
2025-10-07 8:22 ` Anthony Towns
@ 2025-10-31 9:10 ` Tim Ruffing
2025-10-31 13:09 ` waxwing/ AdamISZ
2025-10-31 13:19 ` Garlo Nicon
4 siblings, 1 reply; 19+ messages in thread
From: Tim Ruffing @ 2025-10-31 9:10 UTC (permalink / raw)
To: waxwing/ AdamISZ, Bitcoin Development Mailing List
Hey Adam,
I think something is wrong here.
Assume a group of order n=p*2^t where p is a large enough prime such
that the DL problem is hard. For example, Curve25519 has t=3 but the DL
problem is still hard. Or, assuming n+1 is also prime, work in the
multiplicative group of integers modulo n+1 (which has group order n
then). I'm not aware of any obstacles to constructing such groups for
sufficiently large values of t.
The crucial point is that, in these groups, the Pohlig-Hellman
algorithm can be used to compute the t least significant bits of the
discrete logarithm k of a group element R efficiently. So to embed t
bits in a Schnorr signature (R, s), simply pick k such that its t least
significant bits are exactly these bits.
Of course, this does not work in BIP340 because it uses the secp256k1
group for which t=0, i.e., the group has prime order. But it appears
that the reasoning in your write up is not specific to prime-order
groups. Thus I conclude that something must be wrong or insufficient in
your argument.
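The Pohlig-Hellman observation can be illustrated with a toy Python sketch in the multiplicative group modulo a prime q with q-1 = p*2^t. Everything here is an assumption made for illustration: the group is far too small to be secure, the primes are found by naive trial division, and the function names (find_group, find_generator, low_bits_of_dlog) are hypothetical. The point is only that the t low bits of k can be read off R = g^k without knowing the rest of k.

```python
def is_prime(n):
    """Naive trial division; fine for the toy sizes used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def find_group(t, start=1000):
    """Find an odd prime p >= start such that q = p * 2^t + 1 is also prime."""
    p = start
    while True:
        q = (p << t) + 1
        if is_prime(p) and is_prime(q):
            return p, q
        p += 1

def find_generator(q, p):
    """Find g of full order n = q - 1 = p * 2^t in the integers mod q."""
    n = q - 1
    for g in range(2, q):
        if pow(g, n // 2, q) != 1 and pow(g, n // p, q) != 1:
            return g

def low_bits_of_dlog(R, g, q, p, t):
    """Recover k mod 2^t from R = g^k mod q, one bit at a time."""
    h = pow(g, p, q)   # generator of the subgroup of order 2^t
    y = pow(R, p, q)   # equals h^(k mod 2^t)
    x = 0
    for j in range(t):
        # Remove the bits found so far, then project onto the order-2 subgroup.
        z = (y * pow(h, -x, q)) % q
        if pow(z, 1 << (t - 1 - j), q) != 1:
            x |= 1 << j
    return x
```

For example, with t = 8, a signer could pick k = (high << 8) | data for any one-byte value data and any high below p, publish R = pow(g, k, q), and a reader would recover data as low_bits_of_dlog(R, g, q, p, 8). In a prime-order group such as secp256k1 there is no such subgroup to project onto, which is why the same trick does not apply to BIP340.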
Let me clarify that I do not claim that data can be embedded in a
BIP340 signature. I only claim that your arguments for why data can't
be embedded do not appear to be sound. I believe any proof that data
cannot be embedded in a Schnorr signature (or in a group element R) in
a prime-order group must somehow exploit the fact that all bits of k
are hard to compute from R; see Section 10 in Håstad-Näslund 2003 [1]
for a proof that this is the case for prime-order groups.
Best,
Tim
[1] https://www.csc.kth.se/~johanh/hnrsaacm.pdf
On Wed, 2025-10-01 at 07:24 -0700, waxwing/ AdamISZ wrote:
> Hi all,
>
> https://github.com/AdamISZ/schnorr-unembeddability/
>
> Here I'm analyzing whether the following statement is true: "if you
> can embed data into a (P, R, s) tuple (Schnorr pubkey and signature,
> BIP340 style), without grinding or using a sidechannel to "inform"
> the reader, you must be leaking your private key".
>
> See the abstract for a slightly more fleshed out context.
>
> I'm curious about the case of P, R, s published in utxos to prevent
> usage of utxos as data. I think this answers in the half-affirmative:
> you can only embed data by leaking the privkey so that it (can)
> immediately fall out of the utxo set.
>
> (To emphasize, this is different to the earlier observations
> (including by me!) that just say it is *possible* to leak data by
> leaking the private key; here I'm trying to prove that there is *no
> other way*).
>
> However I still am probably in the large majority that thinks it's
> appalling to imagine a sig attached to every pubkey onchain.
>
> Either way, I found it very interesting! Perhaps others will find the
> analysis valuable.
>
> Feedback (especially of the "that's wrong/that's not meaningful"
> variety) appreciated.
>
> Regards,
> AdamISZ/waxwing
>
--
To view this discussion visit https://groups.google.com/d/msgid/bitcoindev/5c15c2c265c92d5527fe3da510ac76c2a6e8e0e4.camel%40real-or-random.org.
* Re: [bitcoindev] On (in)ability to embed data into Schnorr
2025-10-31 9:10 ` Tim Ruffing
@ 2025-10-31 13:09 ` waxwing/ AdamISZ
0 siblings, 0 replies; 19+ messages in thread
From: waxwing/ AdamISZ @ 2025-10-31 13:09 UTC (permalink / raw)
To: Bitcoin Development Mailing List
Hi Tim,
First, thanks for the considered reply! That is a very interesting point
for sure.
I guess I have 2 or 3 responses:
First, my "theorem 1" was deliberately specific about BIP340. I am aware of
the impact of Pohlig-Hellman on non prime order groups.
However despite me being able to "defend the thesis" in that literal sense,
I still think your overall critique is valid. I think the "framework" (at
least in the updated version of the paper; the first couple of drafts were
a bit incoherent) makes sense, but it's too vague in the most important
part of the reasoning, namely the invertibility of the functions described.
But w.r.t. the values P and R, throughout, I was assuming pseudorandomness
(uncontrollable output-ness) [1] of the mappings x -> P = xG and k -> R=kG.
That assumption was both explicit and implicit in several steps (or perhaps
leaps) I took (see e.g. how I refer to the function f(P, R, s) and in at
least one place basically "ignore" the P, R dependency because they are
uncontrollable); in my head, that was justifiable based on it being a
prime order group, but at the very least, I should have been explicit.
> I believe any proof that data
> cannot be embedded in a Schnorr signature (or in a group element R) in
> a prime-order group must somehow exploit the fact that all bits of k
> are hard to compute from R; see Section 10 in Håstad-Näslund 2003 [1]
> for a proof that this is the case for prime-order groups.
Nice reference, thanks! I definitely wouldn't have found that. As per
above, I just assumed this without justifying it; so my end conclusion that
there is a reduction to hash preimage resistance is I guess incomplete.
[1] so .. k -> kG is kind of a pseudorandom function, or generator, right?
If this is a DDH assumption, then perhaps that's what we should really
reduce to (well, plus hash preimage resistance)?
Cheers,
Adam
On Friday, October 31, 2025 at 7:51:48 AM UTC-3 Tim Ruffing wrote:
> Hey Adam,
>
> I think something is wrong here.
>
> Assume a group of order n=p*2^t where p is a large enough prime such
> that the DL problem is hard. For example, Curve25519 has t=3 but the DL
> problem is still hard. Or, assuming n+1 is also prime, work in the
> multiplicative group of integers modulo n+1 (which has group order n
> then). I'm not aware of any obstacles to constructing such groups for
> sufficiently large values of t.
>
> The crucial point is that, in these groups, the Pohlig-Hellman
> algorithm can be used to compute the t least significant bits of the
> discrete logarithm k of a group element R efficiently. So to embed t
> bits in a Schnorr signature (R, s), simply pick k such that its t least
> significant bits are exactly these bits.
>
> Of course, this does not work in BIP340 because it uses the secp256k1
> group for which t=0, i.e., the group has prime order. But it appears
> that the reasoning in your write up is not specific to prime-order
> groups. Thus I conclude that something must be wrong or insufficient in
> your argument.
>
> Let me clarify that I do not claim that data can be embedded in a
> BIP340 signature. I only claim that your arguments for why data can't
> be embedded do not appear to be sound. I believe any proof that data
> cannot be embedded in a Schnorr signature (or in a group element R) in
> a prime-order group must somehow exploit the fact that all bits of k
> are hard to compute from R; see Section 10 in Håstad-Näslund 2003 [1]
> for a proof that this is the case for prime-order groups.
>
> Best,
> Tim
>
> [1] https://www.csc.kth.se/~johanh/hnrsaacm.pdf
>
--
To view this discussion visit https://groups.google.com/d/msgid/bitcoindev/61eb9abe-3e26-495d-9d00-dbda69fe018bn%40googlegroups.com.
[-- Attachment #1.2: Type: text/html, Size: 7906 bytes --]
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [bitcoindev] On (in)ability to embed data into Schnorr
2025-10-01 14:24 [bitcoindev] On (in)ability to embed data into Schnorr waxwing/ AdamISZ
` (3 preceding siblings ...)
2025-10-31 9:10 ` Tim Ruffing
@ 2025-10-31 13:19 ` Garlo Nicon
2025-11-01 14:49 ` waxwing/ AdamISZ
4 siblings, 1 reply; 19+ messages in thread
From: Garlo Nicon @ 2025-10-31 13:19 UTC (permalink / raw)
To: waxwing/ AdamISZ; +Cc: Bitcoin Development Mailing List
[-- Attachment #1: Type: text/plain, Size: 3479 bytes --]
> if you can embed data into a (P, R, s) tuple (Schnorr pubkey and
> signature, BIP340 style), without grinding or using a sidechannel to
> "inform" the reader, you must be leaking your private key
You can embed data into a valid signature. For example:
R=k*G
P=d*G
k=first_chunk_of_data
d=second_chunk_of_data
The keys are then "weak", because people can use a "known plaintext attack"
to get them. However, if you want to push random data that is unknown to
the reader, then it is known only by the holder of the data.
This means that the efficiency of this encoding is somewhere around 66%;
by grinding SHA-256 hashes, it could probably reach around 70% in practice.
Only the s-value needs any grinding; for the k-value and
d-value, you need only the data, and nothing else.
So, I guess it is a spectrum: something like 70% efficiency means that you
need a "known plaintext attack" to get the data. And then, you can use
fewer and fewer bits per public key, to make it arbitrarily weaker. Then,
instead of relying on a timelock, you can rely on computational difficulty
for the reader, for example: "how many bits do I need to leak, to make it
breakable by a lattice attack".
--
To view this discussion visit https://groups.google.com/d/msgid/bitcoindev/CAN7kyNhE39gJyV7xCRNpZAu-jkP7bu2DvkhZ7FdLsGxa-QLjQw%40mail.gmail.com.
[-- Attachment #2: Type: text/html, Size: 4568 bytes --]
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [bitcoindev] On (in)ability to embed data into Schnorr
2025-10-31 13:19 ` Garlo Nicon
@ 2025-11-01 14:49 ` waxwing/ AdamISZ
2025-11-02 9:11 ` Garlo Nicon
0 siblings, 1 reply; 19+ messages in thread
From: waxwing/ AdamISZ @ 2025-11-01 14:49 UTC (permalink / raw)
To: Bitcoin Development Mailing List
[-- Attachment #1.1: Type: text/plain, Size: 5975 bytes --]
Hi Garlo Nicon,
Before I answer your point I want to mention (to readers): probably some
things remained tacit in this thread but are worth emphasizing:
1. It's always trivial to get a 100% embedding rate if it's OK to assume
the embedder is choosing to share data off-blockchain with others (just xor
the real signature with their chosen data and call that the key). This is
of course a bit silly (though not entirely silly); if the purpose is to
*communicate* then they can use the communication channel for the data,
instead of the xor value, and forget about the blockchain. On the other
hand if their purpose is to publish data, and rely on the immutability and
persistence of the blockchain, then there is the problem that the xor key
can be lost; it's that offchain data that represents the actual semantics
of what they published, and so they're in rather the same position as they
would have been without the blockchain existing at all. (insert
finesses/caveats but, basically).
2. All of the above theoretical analysis doesn't work for ECDSA *as an
algorithm outside of Bitcoin*. You get 32 bytes of embedding without
leaking the private key, there (the s-value can literally be made to say
"hello world" 3 times or whatever). This is the non-pubkey-committing
nature of standard ECDSA. I *think* you can make it behave the same as
Schnorr in terms of pubkey-unembeddability-without-key-leakage by putting
the pubkey in the message, but it's even harder to analyze than Schnorr
(which is already hard).
3. In contrast to 2., the pubkey is in fact embedded in the message
(indirectly), at least usually, in Bitcoin (except sighash_noinput type
stuff which isn't live), so you can't put hello world in the signatures for
now, at least AFAIK. Still even then you're stuck at a 33% rate if we
include all of P, R, s, which seems reasonable (in fact, that's a generous
measure). Again, I am ignoring grinding which always adds a bit more.
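Point 2 can be sketched as follows: in plain ECDSA the signing equation s = k^-1 (z + r*x) can be solved for the private key x once s is fixed, so s can carry an arbitrary payload while the nonce k stays secret. This is a minimal, unoptimized pure-Python sketch over secp256k1; the helper names, the payload, and the use of a bare SHA-256 message hash are illustrative assumptions, not anything taken from the write-up, and the code is not constant-time.

```python
import hashlib
import secrets

# secp256k1 domain parameters.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Affine point addition; None is the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)


def point_mul(k, pt):
    """Double-and-add scalar multiplication."""
    acc = None
    while k:
        if k & 1:
            acc = point_add(acc, pt)
        pt = point_add(pt, pt)
        k >>= 1
    return acc


def ecdsa_verify(pub, z, r, s):
    """Textbook ECDSA verification of (r, s) on message hash z."""
    if not (0 < r < N and 0 < s < N):
        return False
    w = pow(s, -1, N)
    X = point_add(point_mul(z * w % N, G), point_mul(r * w % N, pub))
    return X is not None and X[0] % N == r


def embed_in_s(payload: bytes, message: bytes):
    """Return (pubkey, r, s) where s is exactly `payload` read as an integer."""
    assert len(payload) <= 32
    s = int.from_bytes(payload, "big")
    assert 0 < s < N
    z = int.from_bytes(hashlib.sha256(message).digest(), "big") % N
    k = secrets.randbelow(N - 1) + 1       # secret nonce, never published
    r = point_mul(k, G)[0] % N
    x = (s * k - z) * pow(r, -1, N) % N    # solve s*k = z + r*x for x
    return point_mul(x, G), r, s
```

The catch noted in point 3 is visible here: the public key is only computed after z is fixed, so if the signed message has to commit to the public key, this derivation becomes circular.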
Anyway, you say:
> So, I guess it is a spectrum: something like 70% efficiency means that
> you need a "known plaintext attack" to get the data. And then, you can use
> fewer and fewer bits per public key, to make it arbitrarily weaker. Then,
> instead of relying on a timelock, you can rely on computational difficulty
> for the reader, for example: "how many bits do I need to leak, to make it
> breakable by a lattice attack".
I think it's an interesting idea to use lattice attacks but I can't find a
way to agree with 66 or 70%. Here's why:
We assume a "few" signatures are all on the same private key. If there are
N such signatures, then once LLL or similar lattice method is successful,
you retrieve the 1 private key (32 bytes) and the N * 27 bytes (or so;
imagining 5 bytes are biased; it *can* go lower, requiring more signatures;
doesn't change the situation).
So you successfully embedded 27N + 32 bytes (all the nonces and the private
key) into 64N + 32N bytes [1], for a ratio that is a bit less than 33% once
N is 7 or more. Compare with just using a repeated nonce in 2 equations,
where you get 64 bytes (nonce, privkey) from 2*P + 2*(R,s), a total of 192
bytes, i.e. 33% exactly.
Basically, at least in a bitcoin context, there is no gain in doing a
partial exposure of the nonce; you may as well just reveal all of it,
either by repetition or as noted in the pdf, by using something public like
a block hash. Notice that if my note [1] did not apply, then all the above
isn't correct, the ratios work differently.
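The arithmetic behind these ratios can be checked with a short sketch. The byte counts follow the figures used above (32-byte keys, 64-byte (R, s) pairs, roughly 27 free bytes per biased nonce); the constant and function names are only illustrative.

```python
from fractions import Fraction

KEY_BYTES = 32         # one pubkey P on chain, and one recovered private key
SIG_BYTES = 64         # one (R, s) pair
FREE_NONCE_BYTES = 27  # assumed: 5 of the 32 nonce bytes are fixed (biased)


def lattice_ratio(n_sigs: int) -> Fraction:
    """Embedded bytes / published bytes for n_sigs (P, R, s) tuples on one key."""
    embedded = FREE_NONCE_BYTES * n_sigs + KEY_BYTES
    published = (SIG_BYTES + KEY_BYTES) * n_sigs
    return Fraction(embedded, published)


def repeated_nonce_ratio() -> Fraction:
    """Two (P, R, s) tuples sharing key and nonce: nonce + privkey recovered."""
    return Fraction(2 * KEY_BYTES, 2 * (SIG_BYTES + KEY_BYTES))


for n in (7, 10, 100):
    print(n, float(lattice_ratio(n)))
print("repeated nonce:", float(repeated_nonce_ratio()))
```

As N grows the lattice ratio tends to 27/96, so partial nonce exposure never beats the one third obtained from simply repeating the nonce.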
Can you let me know how you're getting 66%+? I'm guessing you're just
saying "the k and the d values" but as per above I don't see it. Maybe
write out concretely what the data-reader would be doing?
[1] It's easy to slip up here - I know I did - when considering publication
*on bitcoin* compared with just publishing signatures. In the latter case,
I can publish 100 signatures with the tacit assumption that they all refer
to the same key (or, you can verify, to check). In bitcoin the pubkey is
never tacit, it's always published in the scriptPubKey or scriptSig or
whatever, so you can't gain efficiency from repeated uses of the same key
(i.e. you can't write 64N + 32, it must be 64N + 32N for (P, R, s) tuples).
Cheers,
Adam
--
To view this discussion visit https://groups.google.com/d/msgid/bitcoindev/781840dd-b633-4d87-b05d-d389c6374d63n%40googlegroups.com.
[-- Attachment #1.2: Type: text/html, Size: 6721 bytes --]
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [bitcoindev] On (in)ability to embed data into Schnorr
2025-11-01 14:49 ` waxwing/ AdamISZ
@ 2025-11-02 9:11 ` Garlo Nicon
2025-11-02 13:30 ` waxwing/ AdamISZ
0 siblings, 1 reply; 19+ messages in thread
From: Garlo Nicon @ 2025-11-02 9:11 UTC (permalink / raw)
To: waxwing/ AdamISZ; +Cc: Bitcoin Development Mailing List
[-- Attachment #1: Type: text/plain, Size: 10875 bytes --]
> Can you let me know how you're getting 66%+?
You have three chunks, which are needed: (P,R,s). You can control "P" and
"R" directly and fully, by feeding it with your data. That means, you can
get 66%, because it is just 2/3, if you assume, that all values have the
same size.
Then, to get 70% or more, grinding s-value is needed, which is doable, if
you want to for example grind two or three bytes of s-value, and stop
there. But let's assume, that you want to make it as fast as possible, so
you don't grind anything, and then stop at 66%.
> Maybe write out concretely what the data-reader would be doing?
I already told you, when I said "known plaintext attack". If you want to
put random data into private keys or signatures, then things are hard to
break. However, if it is something useful for the reader, then usually,
that kind of data are non-random. For example: some users store
transactions inside OP_RETURNs, and they use ASCII hex representation. If
they would use binary encoding, then they would save 50% space. But people
simply don't care.
And the similar case is possible here: if you want to store random data,
then it is hard to use this method. However, if you want to store ASCII
text, where many words can be found in a dictionary, or where the format of
the data is known upfront, or can be easily guessed, then the security of
the keys, is comparable to the brainwallets.
Which means, that you can just put your data into the private key of the
user, and a "signature nonce" (which is nothing else, but yet another
private key, placed on secp256k1). And then, if you know, that your data,
is for example "ASCII string", then it means, that each and every key, that
you produce, simply leaks at least 32 bits per 256-bit key, if not more.
And then, if the attacker can get coins from brainwallets, then decoding
such data is not much harder than that. If your data contains simple words,
then even dictionary attacks can be used.
So, let's say that you want to encode 64 bytes in a signature:
d="This is a test of storing data i"=0x5468697320697320612074657374206f662073746f72696e6720646174612069
k="n private keys inside signatures"=0x6e2070726976617465206b65797320696e73696465207369676e617475726573
P=d*G=02A2EF730B26A905A7D91940E3A512C5771D8BC8BCCA153D714E328043856CBB2B
R=k*G=02E19FCA1025CFD67409309E2B1711D723BFB67EC520917D9A0AD9432414DA0D0A
And then, the s-value comes from SHA-256 hashing, so it is harder to control.
But grinding a few bytes can give something around 70%. However, even if we
stop at 66%, useful data is still regular. There are many patterns.
If something is an ASCII string, then 1 bit in every 8 is cleared, and it is
known which ones should be set to zero. If it is in English, then the
entropy is even lower. This means that the private key is not directly
"leaked" by being passed to the reader, but there is an assumption that
it will be easy enough to get.
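As a small sketch of the encoding step in the 64-byte example above: the text is split into two 32-byte chunks, each read as a big-endian scalar, and for 7-bit ASCII the top bit of every byte is known to be zero. The function names are illustrative; the scalar-to-point step (d*G, k*G) is left out.

```python
# secp256k1 group order; valid scalars lie in [1, N-1].
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141

TEXT = b"This is a test of storing data in private keys inside signatures"


def text_to_scalars(text: bytes):
    """Split 64 bytes into two big-endian 256-bit scalars (d, k)."""
    assert len(text) == 64
    d = int.from_bytes(text[:32], "big")
    k = int.from_bytes(text[32:], "big")
    assert 0 < d < N and 0 < k < N
    return d, k


def known_zero_bits(chunk: bytes) -> int:
    """Count the bits a reader knows are 0 when the chunk is 7-bit ASCII."""
    assert all(b < 0x80 for b in chunk)
    return len(chunk)  # one cleared top bit per byte


d, k = text_to_scalars(TEXT)
print(hex(d))
print(hex(k))
print(known_zero_bits(TEXT[:32]), "of 256 bits are predictable per scalar")
```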
Also, if the key won't be leaked, then it can be used as an advantage:
first, NFTs can be minted, and transferred, and then, you can pass the data
directly, and say: "See? You can confirm, that they are encoded into
private keys properly". And as long as the data in question is difficult
enough to fully guess, the key is not revealed, even if it is quite weak.
So my answer to your question is: it is a spectrum. You can
make a weak signature, have 33% encoding efficiency, and leak every
private key immediately. Or you can aim somewhere between
33% and 66%, and produce something that is "weak" but won't
be broken "on the spot, immediately after being broadcast" (so you cannot
really say that the keys are "leaked", because you need to know
"something" about the plaintext inside the private keys, or about its format).
That is good for spammers, because funds can be safely confirmed first,
and the data revealed later: "hey, I encoded that data here, by wasting 3 MB of
block space, to encode 2 MB of ASCII strings, here is your NFT, that you
can buy here".
--
To view this discussion visit https://groups.google.com/d/msgid/bitcoindev/CAN7kyNgyoA5rb8hYuxai6bSaPdon%3Dy%3D9Z%2BdAfqP6Mf%3DPyniJLw%40mail.gmail.com.
[-- Attachment #2: Type: text/html, Size: 12275 bytes --]
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [bitcoindev] On (in)ability to embed data into Schnorr
2025-11-02 9:11 ` Garlo Nicon
@ 2025-11-02 13:30 ` waxwing/ AdamISZ
0 siblings, 0 replies; 19+ messages in thread
From: waxwing/ AdamISZ @ 2025-11-02 13:30 UTC (permalink / raw)
To: Bitcoin Development Mailing List
[-- Attachment #1.1: Type: text/plain, Size: 4249 bytes --]
> I already told you, when I said "known plaintext attack". If you want to
> put random data into private keys or signatures, then things are hard to
> break. However, if it is something useful for the reader, then usually,
> that kind of data are non-random. For example: some users store
> transactions inside OP_RETURNs, and they use ASCII hex representation. If
> they would use binary encoding, then they would save 50% space. But people
> simply don't care.
>
> And the similar case is possible here: if you want to store random data,
> then it is hard to use this method. However, if you want to store ASCII
> text, where many words can be found in a dictionary, or where the format of
> the data is known upfront, or can be easily guessed, then the security of
> the keys, is comparable to the brainwallets.
>
> Which means, that you can just put your data into the private key of the
> user, and a "signature nonce" (which is nothing else, but yet another
> private key, placed on secp256k1). And then, if you know, that your data,
> is for example "ASCII string", then it means, that each and every key, that
> you produce, simply leaks at least 32 bits per 256-bit key, if not more.
Ah, right; I had originally written a response to this idea but then
discarded it on the basis that it's kinda "obvious" that we shouldn't think
about that, and focused on the more in-the-weeds concept of a lattice
attack instead.
But it isn't obvious.
So let's think of the spectrum here. First, the most trivial nonce to
break: one consisting of a single bit (OK technically you can't encode k=0,
heh, but, whatever, put it in the second bit of the string). Obviously that
is extractable, getting 32 bytes plus one bit. That one extra bit above the
33% is achievable because of "grinding" except here grinding is the most
trivial version possible: trying 2 alternatives. This still fits my
original claim, which is "33% plus whatever you can get from grinding, and
you leak the secret key in the process".
Other end of the spectrum: not 1 bit or 5 bytes but say 20 bytes represent
an actual message, and let's say the rest of the 256 bit k-string is zero.
Now clearly one can't grind that, if it's random. Which brings us to your
point about weakness: let's say the 20 bytes of message comes from a space
of possible messages, known to all potential readers, whose size is
actually 40 bits. Because they can grind 40 bits, they can retrieve the
message, but that message is only 40 bits of information. E.g. most crude
idea; a table of 2^40 messages, you are picking one .. notice it doesn't
matter if the length of each message is 40 bits or 160 bits or 256 bits;
you are only conveying 40 bits of *information* if you do this.
From this point of view it's pretty clear that we haven't changed the
general conclusion: you only get 33% (say 32 bytes), *plus* whatever you
can get from grinding, and since that's exponential work, it's never going
to be very big, say 5 bytes or possibly 6? And you leak the key of course.
I do agree with you that there could be scenarios where this "mode" of
publication/embedding might be the preferable one, because we're gliding
over that line between "pure publication" and "publication with
sidechannels". As I argued here and elsewhere, if there is a proper,
viable, sidechannel, then most of this analysis doesn't apply but a sort of
mixup where "if you know information X you can grind out more information Y
from the onchain data" is possible.
But no, as per the above, you are definitely not conveying 66% (that is to
say, 64 bytes out of 96) in the P, R, s tuple using this method. That'd
only be true in the same sense as if the space of possible messages were just
"hello world\n\n" and "goodbye world", and you then claimed you were sending
13 bytes because a reader can find the message.
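A minimal sketch of that last example, assuming the nonce k is simply the message read as a big-endian integer and the reader already knows the two-element message space. The helper names are illustrative and the secp256k1 arithmetic is an unoptimized pure-Python version, not constant-time.

```python
import math

# secp256k1 field prime and generator.
P = 2**256 - 2**32 - 977
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Affine point addition; None is the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)


def point_mul(k, pt):
    """Double-and-add scalar multiplication."""
    acc = None
    while k:
        if k & 1:
            acc = point_add(acc, pt)
        pt = point_add(pt, pt)
        k >>= 1
    return acc


def nonce_point(message: bytes):
    """R = k*G, with the nonce k taken directly from the message bytes."""
    return point_mul(int.from_bytes(message, "big"), G)


def recover(R, message_space):
    """Reader side: grind through the known message space until R matches."""
    for candidate in message_space:
        if nonce_point(candidate) == R:
            return candidate
    return None


def bits_conveyed(message_space) -> float:
    """Information actually transmitted by picking one message from the space."""
    return math.log2(len(message_space))


SPACE = [b"hello world\n\n", b"goodbye world"]
R = nonce_point(SPACE[1])
print(recover(R, SPACE), bits_conveyed(SPACE))
```

Each message is 13 bytes long, but the reader's work and the information received both scale with the size of the message space, not with the message length.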
Cheers,
AdamISZ/waxwing
--
To view this discussion visit https://groups.google.com/d/msgid/bitcoindev/31d18bd9-62e0-4035-b04f-f70ff4253257n%40googlegroups.com.
[-- Attachment #1.2: Type: text/html, Size: 4720 bytes --]
^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2025-11-02 13:33 UTC | newest]
Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-10-01 14:24 [bitcoindev] On (in)ability to embed data into Schnorr waxwing/ AdamISZ
2025-10-01 22:10 ` Greg Maxwell
2025-10-01 23:11 ` Andrew Poelstra
2025-10-02 0:25 ` waxwing/ AdamISZ
2025-10-02 15:56 ` waxwing/ AdamISZ
2025-10-02 19:49 ` Greg Maxwell
2025-10-06 13:04 ` waxwing/ AdamISZ
2025-10-03 13:24 ` Peter Todd
2025-10-04 2:39 ` waxwing/ AdamISZ
2025-10-07 8:22 ` Anthony Towns
2025-10-07 12:05 ` waxwing/ AdamISZ
2025-10-08 5:12 ` Anthony Towns
2025-10-08 12:55 ` waxwing/ AdamISZ
2025-10-31 9:10 ` Tim Ruffing
2025-10-31 13:09 ` waxwing/ AdamISZ
2025-10-31 13:19 ` Garlo Nicon
2025-11-01 14:49 ` waxwing/ AdamISZ
2025-11-02 9:11 ` Garlo Nicon
2025-11-02 13:30 ` waxwing/ AdamISZ
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox