public inbox for bitcoindev@googlegroups.com
* [bitcoin-dev] Transaction Input/Output Sorting
From: rhavar @ 2018-10-21 19:00 UTC (permalink / raw)
  To: Bitcoin Protocol Discussion

Right now it's just *way* too easy to spot the boundaries between different wallets. There's a lot of things that contribute to that, but the one that concerns me the most is the way wallets sort transaction inputs and outputs.

Some wallets and protocols (especially HW wallets) have  a strong preference for deterministic sorting (i.e. using bip69), while other wallets have a lot of objections to this.

I'm not sure I fully understand the objections, but I think they can be summarized as "during the transition period there will be a lot of privacy loss" and "if in the future someone wants to use bitcoin in a way that's not compatible with bip69 their transactions will stick out heavily".

I wonder if this impasse could be solved with deterministic sorting, but based on a semi-secret. Like `sortingSecret = hmac(walletSeed, "sortingSecret")` and then there's a standardized sort order based on the sortingSecret, e.g. sort inputs/outputs by `hash(data || sortingSecret)`. Wallets could come up with their own way of computing (or storing) the "sortingSecret" but from there it's standardized.

It has the advantages of deterministic sorting (as long as you know the sortingSecret, you can verify it's done correctly), and externally it looks totally randomized.
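A minimal sketch of the idea, under assumptions that the proposal does not pin down: HMAC-SHA256 and SHA-256 stand in for `hmac` and `hash`, and `data` is taken to be some fixed byte serialization of each input or output. The function names are hypothetical.

```python
import hashlib
import hmac


def derive_sorting_secret(wallet_seed: bytes) -> bytes:
    # sortingSecret = hmac(walletSeed, "sortingSecret"); SHA-256 is an assumed digest.
    return hmac.new(wallet_seed, b"sortingSecret", hashlib.sha256).digest()


def sort_key(data: bytes, sorting_secret: bytes) -> bytes:
    # hash(data || sortingSecret); SHA-256 is again an assumption.
    return hashlib.sha256(data + sorting_secret).digest()


def secret_sort(items: list[bytes], sorting_secret: bytes) -> list[bytes]:
    # Order serialized inputs (or outputs) by their keyed hash.
    return sorted(items, key=lambda item: sort_key(item, sorting_secret))


def is_secret_sorted(items: list[bytes], sorting_secret: bytes) -> bool:
    # Anyone holding the secret can verify the order; without it the
    # order should be indistinguishable from a random shuffle.
    return list(items) == secret_sort(list(items), sorting_secret)
```

A device that knows the seed can recompute the secret and call `is_secret_sorted` on a proposed transaction, which is the verification property described above.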

Am I missing something, or could this be the way forward?

-Ryan

* Re: [bitcoin-dev] Transaction Input/Output Sorting
From: Pavol Rusnak @ 2018-10-21 21:54 UTC (permalink / raw)
  To: rhavar, Bitcoin Protocol Discussion

Your solution in the second part of the email does not solve the problem
you indicated in the first part of your email.

On Sun, Oct 21, 2018, 23:41 Ryan Havar via bitcoin-dev <
bitcoin-dev@lists•linuxfoundation.org> wrote:

> Right now it's just *way* too easy to spot the boundaries between
> different wallets. There's a lot of things that contribute to that, but the
> one that concerns me the most is the way wallets sort transaction inputs
> and outputs.
>
> Some wallets and protocols (especially HW wallets) have a strong
> preference for deterministic sorting (i.e. using bip69), while other
> wallets have a lot of objections to this.
>
> I'm not sure I fully understand the objections, but I think they can be
> summarized as "during the transition period there will be a lot of privacy
> loss" and "if in the future someone wants to use bitcoin in a way that's
> not compatible with bip69 their transactions will stick out heavily".
>
> I wonder if this impasse could be solved with deterministic sorting, but
> based on a semi-secret. Like `sortingSecret = hmac(walletSeed,
> "sortingSecret")` and then there's a standardized sort order based on the
> sortingSecret, e.g. sort inputs/outputs by `hash(data ||
> sortingSecret)`. Wallets could come up with their own way of computing
> (or storing) the "sortingSecret" but from there it's standardized.
>
> It has the advantages of deterministic sorting (as long as you know the
> sortingSecret, you can verify it's done correctly), and externally it looks
> totally randomized.
>
> Am I missing something, or could this be the way forward?
>
> -Ryan

* Re: [bitcoin-dev] Transaction Input/Output Sorting
From: rhavar @ 2018-10-22  1:54 UTC (permalink / raw)
  To: Pavol Rusnak; +Cc: Bitcoin Protocol Discussion

On Sunday, October 21, 2018 2:54 PM, Pavol Rusnak <stick@satoshilabs•com> wrote:

> Your solution in the second part of the email does not solve the problem you indicated in the first part of your email.

Sorry, I'm not quite sure what parts you are referring to. I assume you might mean my first paragraph, so I'll try to explain a bit more clearly how this makes it harder to find wallet boundaries.

Right now you can generally tell if a transaction is using bip69 or not (as long as you account for the probability that a randomly sorted transaction happens to be bip69 by accident). And wallets are generally consistent about whether they use bip69.
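For reference, a sketch of such a check. It assumes the BIP69 ordering rules (inputs ascending by previous txid as displayed, then by output index; outputs ascending by amount, then by scriptPubKey bytes); the `TxInput` and `TxOutput` types are hypothetical stand-ins for whatever a real parser provides.

```python
from typing import NamedTuple, Sequence


class TxInput(NamedTuple):
    txid: str  # previous txid as hex, in the byte order explorers display
    vout: int  # index of the output being spent


class TxOutput(NamedTuple):
    amount: int           # value in satoshis
    script_pubkey: bytes  # raw scriptPubKey


def bip69_input_key(txin: TxInput) -> tuple[bytes, int]:
    # Compare txids as bytes so hex case cannot affect the order.
    return (bytes.fromhex(txin.txid), txin.vout)


def bip69_output_key(txout: TxOutput) -> tuple[int, bytes]:
    return (txout.amount, txout.script_pubkey)


def is_bip69(inputs: Sequence[TxInput], outputs: Sequence[TxOutput]) -> bool:
    # True if both lists are already in BIP69 order, whether or not
    # the wallet sorted them on purpose.
    return (
        list(inputs) == sorted(inputs, key=bip69_input_key)
        and list(outputs) == sorted(outputs, key=bip69_output_key)
    )
```

Note that a transaction with one input and one output always passes, which is why the accidental-match rate has to be accounted for.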

This can often make it massively easier to detect which output is change. Let's say I'm clustering a wallet and know they're using a wallet that always uses bip69, and I'm looking at a transaction in that cluster and trying to guess which output is the change. There are a lot of things you can use to assign a probability. The most obvious is the number of significant digits in the output amounts (if they vary a lot, change tends to be the one with more), but a much more powerful one is looking at how the outputs are spent (and whether they end up spend-linking back into the original cluster).
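The significant-digits heuristic can be sketched as follows. This is only an illustration with hypothetical function names, taking amounts in satoshis; a real analysis would combine it with other signals rather than rely on it alone.

```python
from typing import Optional, Sequence


def significant_digits(amount_sats: int) -> int:
    # Digits left after stripping trailing zeros,
    # e.g. 1_000_000 -> 1 and 1_234_500 -> 5.
    digits = str(abs(amount_sats)).rstrip("0")
    return len(digits) or 1


def guess_change_index(output_amounts: Sequence[int]) -> Optional[int]:
    # Guess the change output as the one with strictly the most
    # significant digits; return None when there is no clear winner.
    counts = [significant_digits(amount) for amount in output_amounts]
    if not counts:
        return None
    best = max(counts)
    if counts.count(best) != 1:
        return None
    return counts.index(best)
```

For a payment of 5,000,000 satoshis with 1,837,291 satoshis of change, the second output is flagged, since round payment amounts are typically chosen by a person and change is whatever is left over.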

So let's say that the transaction output is spent by a non-bip69 transaction -- I right away know that it's going to (almost certainly) be a different wallet (e.g. the destination).

My (shower-thoughty) "solution" fixes this problem, because an outside observer has no way of knowing whether a transaction is using deterministic sorting, so cannot use this information to establish wallet boundaries.

--
On somewhat of a tangent I was actually fortunate enough to have someone with access to the biggest(?) bitcoin analysis service help me with a few experiments. While I was genuinely taken aback by how accurate some of their analysis can be, I also found it pretty easy to trick -- implying it relies heavily on some fragile heuristics.

I don't like to be alarmist, but I worry a lot about the fungibility of bitcoin when we have such effective blockchain analysis and a *LOT* of the ecosystem using a centralized analytics service. And in fact, we're already starting to see some minor effects of this (e.g. people already know that if they gamble their funds, they'll probably have trouble using an exchange later). And I don't think we're too far from the point where any "unidentified" bitcoin is instantly flagged as "suspicious" (and, for instance, requires more explaining to exchanges), potentially seriously harming bitcoin's fungibility, with a coin's value also determined by its history.

* Re: [bitcoin-dev] Transaction Input/Output Sorting
From: Chris Belcher @ 2018-10-23 14:29 UTC (permalink / raw)
  To: Ryan Havar via bitcoin-dev

Thanks for bringing our attention to this important topic.

According to (https://p2sh.info/dashboard/db/bip-69-stats) around 60% of
transactions follow bip69 (possibly just by chance).

If it's useful, a bitcoin wiki page that tracks wallets which use bip69
can be created. A similar page exists for bech32
(https://en.bitcoin.it/wiki/Bech32_adoption). If we had this at least
we'd know which open source wallets we can write code for or which
closed source wallets we can bug about bip69.


On 22/10/2018 02:54, Ryan Havar via bitcoin-dev wrote:
> On Sunday, October 21, 2018 2:54 PM, Pavol Rusnak <stick@satoshilabs•com> wrote:
> 
>> Your solution in the second part of the email does not solve the problem you indicated in the first part of your email.
> 
> Sorry, I'm not quite sure what parts you are referring to. I assume you might mean my first paragraph, so I'll try to explain a bit more clearly how this makes it harder to find wallet boundaries.
> 
> Right now you can generally tell if a transaction is using bip69 or not (as long as you account for the probability that a randomly sorted transaction happens to be bip69 by accident). And wallets are generally consistent about whether they use bip69.
> 
> This can often make it massively easier to detect which output is change. Let's say I'm clustering a wallet and know they're using a wallet that always uses bip69, and I'm looking at a transaction in that cluster and trying to guess which output is the change. There are a lot of things you can use to assign a probability. The most obvious is the number of significant digits in the output amounts (if they vary a lot, change tends to be the one with more), but a much more powerful one is looking at how the outputs are spent (and whether they end up spend-linking back into the original cluster).
> 
> So let's say that the transaction output is spent by a non-bip69 transaction -- I right away know that it's going to (almost certainly) be a different wallet (e.g. the destination).
> 
> My (shower-thoughty) "solution" fixes this problem, because an outside observer has no way of knowing whether a transaction is using deterministic sorting, so cannot use this information to establish wallet boundaries.
> 
> --
> On somewhat of a tangent I was actually fortunate enough to have someone with access to the biggest(?) bitcoin analysis service help me with a few experiments. While I was genuinely taken aback by how accurate some of their analysis can be, I also found it pretty easy to trick -- implying it relies heavily on some fragile heuristics.
> 
> I don't like to be alarmist, but I worry a lot about the fungibility of bitcoin when we have such effective blockchain analysis and a *LOT* of the ecosystem using a centralized analytics service. And in fact, we're already starting to see some minor effects of this (e.g. people already know that if they gamble their funds, they'll probably have trouble using an exchange later). And I don't think we're too far from the point where any "unidentified" bitcoin is instantly flagged as "suspicious" (and, for instance, requires more explaining to exchanges), potentially seriously harming bitcoin's fungibility, with a coin's value also determined by its history.

* Re: [bitcoin-dev] Transaction Input/Output Sorting
From: Gregory Maxwell @ 2018-10-24 16:12 UTC (permalink / raw)
  To: belcher, Bitcoin Dev

On Wed, Oct 24, 2018 at 3:52 PM Chris Belcher via bitcoin-dev
<bitcoin-dev@lists•linuxfoundation.org> wrote:
>
> Thanks for bringing our attention to this important topic.
>
> According to (https://p2sh.info/dashboard/db/bip-69-stats) around 60% of
> transactions follow bip69 (possibly just by chance).

A two input randomly ordered transaction has a 50% chance of
'following' bip-69.  So 60% sounds like a small minority.

* Re: [bitcoin-dev] Transaction Input/Output Sorting
From: rhavar @ 2018-10-24 17:52 UTC (permalink / raw)
  To: Gregory Maxwell, Bitcoin Protocol Discussion

That's pretty easy to quantify. I wrote a quick script to grab the last few blocks, and then shuffle the inputs/outputs before testing if each transaction is bip69 or not.

The result was that 42% of all transactions would accidentally be bip69 when randomized.
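The original script is not shown, so the following is only a sketch of how such an experiment might look. It assumes each transaction is given as a pair of lists whose elements already compare in bip69 order, e.g. inputs as `(txid_bytes, vout)` tuples and outputs as `(amount, script_pubkey)` tuples; the `trials` and `seed` parameters are additions for repeatability.

```python
import random
from typing import Sequence, Tuple

Transaction = Tuple[Sequence[tuple], Sequence[tuple]]


def is_sorted(items: Sequence[tuple]) -> bool:
    # True if items are in non-decreasing order.
    return all(a <= b for a, b in zip(items, items[1:]))


def accidental_bip69_rate(
    transactions: Sequence[Transaction], trials: int = 1000, seed: int = 0
) -> float:
    # Shuffle every transaction's inputs and outputs `trials` times and
    # report the fraction of shuffles that happen to land in sorted order.
    rng = random.Random(seed)
    hits = 0
    total = 0
    for inputs, outputs in transactions:
        for _ in range(trials):
            shuffled_inputs = rng.sample(list(inputs), len(inputs))
            shuffled_outputs = rng.sample(list(outputs), len(outputs))
            hits += is_sorted(shuffled_inputs) and is_sorted(shuffled_outputs)
            total += 1
    return hits / total if total else 0.0
```

Feeding it the transactions from recent blocks would estimate the baseline that any observed bip69 percentage has to be compared against.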

So clearly randomization is a lot more popular than bip69 at the moment, but I'm not sure that it matters much. As soon as you have more than a few inputs/outputs, you can tell with high confidence if the transaction is bip69 or not.

And of course if you're clustering a wallet, you can figure out extremely easily how that wallet behaves wrt bip69.


-Ryan

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Wednesday, October 24, 2018 9:12 AM, Gregory Maxwell via bitcoin-dev <bitcoin-dev@lists•linuxfoundation.org> wrote:

> On Wed, Oct 24, 2018 at 3:52 PM Chris Belcher via bitcoin-dev
> bitcoin-dev@lists•linuxfoundation.org wrote:
>
> > Thanks for bringing our attention to this important topic.
> > According to (https://p2sh.info/dashboard/db/bip-69-stats) around 60% of
> > transactions follow bip69 (possibly just by chance).
>
> A two input randomly ordered transaction has a 50% chance of
> 'following' bip-69. So 60% sounds like a small minority.

* Re: [bitcoin-dev] Transaction Input/Output Sorting
From: rhavar @ 2018-10-24 18:21 UTC (permalink / raw)
  To: Gregory Maxwell, Bitcoin Protocol Discussion

Actually, I think it can be calculated a bit smarter using maths (which unfortunately I'm not very good at...). But I assume it's something like:

```
falsePositiveChances := 0.0

foreach (transaction of transactions) {
	// chance that a random ordering of n distinct items is already sorted: 1 / n!
	falsePositiveChances += (1 / factorial(transaction.Inputs.length)) * (1 / factorial(transaction.Outputs.length))
}

totalFalsePositives := falsePositiveChances / transactions.length
```

If so, I get a 42.4% false positive rate. So clearly bip69 is getting used a fair bit, but not nearly as much as randomization.
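A runnable sketch of the same estimate, assuming each transaction is reduced to its (input count, output count) pair and that all inputs and all outputs within a transaction are distinct (ties would make an accidental match slightly more likely). The function names are hypothetical.

```python
from math import factorial
from typing import Iterable, Tuple


def accidental_bip69_probability(n_inputs: int, n_outputs: int) -> float:
    # A uniformly random ordering of n distinct items is sorted with
    # probability 1 / n!; inputs and outputs are shuffled independently.
    return 1 / (factorial(n_inputs) * factorial(n_outputs))


def expected_false_positive_rate(shapes: Iterable[Tuple[int, int]]) -> float:
    # Average the per-transaction probability over all transactions.
    shapes = list(shapes)
    if not shapes:
        return 0.0
    total = sum(accidental_bip69_probability(i, o) for i, o in shapes)
    return total / len(shapes)
```

A transaction with two inputs and one output gives 1/2, matching the 50% figure quoted earlier in the thread, while five inputs and two outputs already give 1/240, which is why larger transactions can be classified with high confidence.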


-Ryan

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Wednesday, October 24, 2018 10:52 AM, <rhavar@protonmail•com> wrote:

> That's pretty easy to quantify. I wrote a quick script to grab the last few blocks, and then shuffle the inputs/outputs before testing if each transaction is bip69 or not.
>
> The result was that 42% of all transactions would accidentally be bip69 when randomized.
>
> So clearly randomization is a lot more popular than bip69 at the moment, but I'm not sure that it matters much. As soon as you have more than a few inputs/outputs, you can tell with high confidence if the transaction is bip69 or not.
>
> And of course if you're clustering a wallet, you can figure out extremely easily how that wallet behaves wrt bip69.
>
> -Ryan
>
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> On Wednesday, October 24, 2018 9:12 AM, Gregory Maxwell via bitcoin-dev bitcoin-dev@lists•linuxfoundation.org wrote:
>
> > On Wed, Oct 24, 2018 at 3:52 PM Chris Belcher via bitcoin-dev
> > bitcoin-dev@lists•linuxfoundation.org wrote:
> >
> > > Thanks for bringing our attention to this important topic.
> > > According to (https://p2sh.info/dashboard/db/bip-69-stats) around 60% of
> > > transactions follow bip69 (possibly just by chance).
> >
> > A two input randomly ordered transaction has a 50% chance of
> > 'following' bip-69. So 60% sounds like a small minority.
