Re: [bitcoin-dev] Tainting, CoinJoin, PayJoin, CoinSwap

From: Chris Belcher <belcher@riseup•net>
To: bitcoin-dev@lists•linuxfoundation.org
Subject: Re: [bitcoin-dev] Tainting, CoinJoin, PayJoin, CoinSwap
Date: Wed, 10 Jun 2020 21:10:19 +0100
Message-ID: <e7ab27e5-e235-f6a2-5023-1cdda5c12d0b@riseup.net>
In-Reply-To: <CAEPKjgfbQoXkB=cEp5Jc28ZihRSQe50M2x7k6=AjW+Vo5f=79g@mail.gmail.com>

Hello nopara73,

On 10/06/2020 13:32, nopara73 via bitcoin-dev wrote:
> The problem with CoinJoins is that desire for privacy is explicitly
> signalled by them, so adversaries can consider them "suspicious." PayJoin
> and CoinSwap solve this problem, because they are unnoticeable. I think
> this logic doesn't stand for scrutiny.
> 
> From here on let's use the terminology of a typical adversary: there are 3
> kinds of coin histories: "clean", "dirty" and "suspicious".
> The aftermath of you using a "dirty" coin is knocks on your door. You using
> a "suspicious" coin is uncomfortable questions and you using a "clean" coin
> is seamless transfer.
> 
> In scenario 1, you start out with a "clean" history. By using CoinJoins you
> make your new coin's history "suspicious" so you have no incentive to
> CoinJoin. By using CoinSwap/PayJoin your new coin can be either "clean" or
> "dirty". What would a "clean" coin owner prefer more? Take the risk of
> knocking on the door or answering uncomfortable questions?
> 
> In scenario 2, you start out with a "dirty" history. By using CoinJoins you
> make your new coin's history "suspicious" so you have an incentive to
> CoinJoin. By using CoinSwap/PayJoin your new coin can either be "clean" or
> "dirty". What would a "dirty" coin owner prefer more? And here's an
> insight: you may get knocks on your door for a dirty coin that you have
> nothing to do with. And you can prove this fact to the adversary, but by
> doing so, you'll also expose that you started out with a "dirty" coin to
> begin with and now the adversary becomes interested in you for a different
> reason.
> 
> You can also examine things assuming full adoption of PJ/CS vs full
> adoption of CJ, but you'll see that full adoption of any of these solves
> the tainting issue.
> 
> So my current conclusion is that PJ/CS does not only not solve the taint
> problem, it just alters it and ultimately very similar problems arise for
> the users. Maybe the goal of unobservable privacy is a fallacy in this
> context as it is based on the assumption that desiring privacy is
> suspicious, so you want to hide the fact that you desire privacy. And the
> solution to the taint issue is either protocol change or social change
> (decent adoption.)
> 
> PS.: Please try to keep the conversation to the Taint Issue as this email
> of mine isn't supposed to be discussing general pros and cons of various
> privacy techniques.
> 
> Any thoughts?

There are two concepts here: taint analysis and the detectability of
privacy protocols.

Taint analysis is quite an old technique. I remember seeing the
blockchain.info explorer having a tool for calculating a value for taint
back in 2013, long before any widely-used CoinJoin implementations were
created. I think taint analysis was first devised to attack the privacy
technique of simply sending coins to yourself multiple times. If those
coins were, for example, stolen from an exchange's hot wallet, then the
taint between the exchange addresses and the later addresses would still
be 100% even if the thief sent the coins to himself multiple times.

A very important point is that it's difficult to reason about taint
analysis algorithms because they are often hypothetical, likely
closed-source, not available to the public for review, and changing all
the time. OP talks about the three categories "clean", "dirty" and
"suspicious", which is one possibility. I've read about other taint
analysis algorithms which result in a numerical score out of 100.
Blockchain.info's algorithm calculated taint as a number expressing the
relation between any two addresses, so it wouldn't make sense to say "an
address" is tainted; instead you have to talk about a pair of addresses
being tainted with each other. So even though it's hard to reason about
the exact algorithm, we can still talk about likely situations and
imagine what an adversary could do in the worst case or best case.
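
To make the pairwise idea concrete, here is a minimal Python sketch of one
possible model. It is NOT blockchain.info's actual algorithm (which is not
described here); it assumes a simple proportional rule in which every output
of a transaction inherits the value-weighted average taint of that
transaction's inputs. The function name, the address labels and the amounts
are all made up for illustration.

```python
from collections import defaultdict

def taint_from(source, transactions):
    """Return {address: fraction of its coins traceable to `source`}.

    Assumed (hypothetical) proportional model: each output inherits the
    value-weighted average taint of the transaction's inputs.
    `transactions` is a chronological list of (inputs, outputs), each a
    list of (address, amount) pairs. Fees are ignored.
    """
    balance = defaultdict(float)  # coins currently held per address
    tainted = defaultdict(float)  # part of that balance traceable to source
    for inputs, outputs in transactions:
        total_in = sum(amount for _, amount in inputs)
        tainted_in = 0.0
        for addr, amount in inputs:
            if addr == source:
                tainted_in += amount  # coins leaving source are fully tainted
            elif balance[addr] > 0:
                share = tainted[addr] / balance[addr]
                tainted_in += amount * share
                tainted[addr] -= amount * share
            balance[addr] = max(balance[addr] - amount, 0.0)
        ratio = tainted_in / total_in if total_in else 0.0
        for addr, amount in outputs:
            balance[addr] += amount
            tainted[addr] += amount * ratio
    return {a: tainted[a] / balance[a]
            for a in balance if a != source and balance[a] > 0}

# The thief sends the stolen coins to himself three times.
self_sends = [
    ([("exchange", 10)], [("thief1", 10)]),
    ([("thief1", 10)], [("thief2", 10)]),
    ([("thief2", 10)], [("thief3", 10)]),
]
print(taint_from("exchange", self_sends))  # {'thief3': 1.0}

# The coins are then merged with 30 coins from an unrelated address.
with_merge = self_sends + [
    ([("thief3", 10), ("other", 30)], [("merged", 40)]),
]
print(taint_from("exchange", with_merge))  # {'merged': 0.25}
print(taint_from("other", with_merge))     # {'merged': 0.75}
```

Note that the result is always relative to a chosen source: the same
"merged" address is 25% tainted with respect to "exchange" and 75% with
respect to "other", so asking whether the address alone "is tainted" has no
answer in this model.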

One way to resist a likely taint analysis attack is to involve other
parts of the bitcoin economy in your transactions. For example, our
exchange thief could deposit and then withdraw his stolen coins through
a Bitcoin Casino or other bitcoin service hot wallet. His coins might no
longer be 100% tainted from the exchange hack but perhaps have 5%
exchange hack, 5% bitcoin ATM, 5% mined coins, and so on. The numbers are
made up and they depend on the exact algorithm, but the main point is
that involving the rest of the bitcoin economy in your transaction is
one practical way to stop taint analysis being a useful attack against
you.
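
The dilution effect can be sketched with a toy calculation. The Python below
assumes a hypothetical service whose hot wallet pools all deposits, so that
any withdrawal mirrors the overall mix of origins in the pool. The labels
and amounts are invented to reproduce the made-up 5% figures above, and a
real taint algorithm may well score things differently.

```python
def mix(pool, deposit):
    """Merge a deposit into a pool; both map origin label -> amount."""
    merged = dict(pool)
    for label, amount in deposit.items():
        merged[label] = merged.get(label, 0.0) + amount
    return merged

def withdrawal_composition(pool):
    """Fraction of each origin in a withdrawal, assuming (hypothetically)
    that every withdrawal mirrors the pool's overall mix."""
    total = sum(pool.values())
    return {label: amount / total for label, amount in pool.items()}

# Made-up hot wallet contents before the thief's deposit.
hot_wallet = {
    "bitcoin ATM": 50.0,
    "mined coins": 50.0,
    "other customers": 850.0,
}

# The thief deposits 50 stolen coins, then withdraws from the pooled funds.
after_deposit = mix(hot_wallet, {"exchange hack": 50.0})
shares = withdrawal_composition(after_deposit)

for label, share in shares.items():
    print(f"{label}: {share:.0%}")
```

Under these assumptions the withdrawn coins are only 5% "exchange hack",
with the remainder spread across whatever else passed through the service.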

Another important point is that taint isn't part of bitcoin's code
anywhere. It is an external reality that surveillance companies impose
on users. The only reason taint has any influence is because of
censorship: for example, an exchange which uses the services of a
surveillance company has the power to freeze funds (i.e. censor a
transaction) if it believes the user's deposit transaction is tainted.

Therefore a way to resist the taint analysis attack is to actually use
bitcoin as money, i.e. earn bitcoin, spend it with merchants, who then
spend it with other merchants or pay their employees, where most
entities along those links don't actually use a taint analysis algorithm.
This is a general principle of bitcoin privacy, by the way: if every
entry- and exit-point requires giving up personal information then
privacy is dead, regardless of whether we use
CoinJoin/PayJoin/CoinSwap/whatever in between.
This is a good place to again shill this list of peer-to-peer exchanges:
https://github.com/cointastical/P2P-Trading-Exchanges/

So that's taint.

Now for privacy protocols like CoinJoin. They also involve the rest of
the bitcoin economy, because many different users link their coins
together when using CoinJoin/PayJoin/CoinSwap/etc, so such protocols can
be a way to resist taint analysis too, just like the Bitcoin Casino
mentioned earlier.

However, what I think OP is talking about is the case where taint
algorithms are reprogrammed to not just track exchange hack addresses,
but also track privacy protocol transactions. So for example, if the
hypothetical taint algorithm comes across an Equal-Output CoinJoin, it
will assign it a different taint score even if it's not linked to an
exchange hack or anything like that.
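
As a rough illustration of how such a reprogrammed algorithm might spot an
Equal-Output CoinJoin, here is a Python sketch of one plausible heuristic.
The function name, the threshold of three participants and the example
values (in satoshis) are assumptions made for this example, not a
description of any real surveillance company's rules.

```python
from collections import Counter

def looks_like_equal_output_coinjoin(input_count, output_values,
                                     min_participants=3):
    """Hypothetical heuristic: flag a transaction when at least
    `min_participants` outputs share one value and the transaction has
    at least as many inputs as it has of those equal outputs."""
    if not output_values:
        return False
    _, repeats = Counter(output_values).most_common(1)[0]
    return repeats >= min_participants and input_count >= repeats

# Five inputs, five equal outputs plus three change outputs: flagged.
coinjoin_like = looks_like_equal_output_coinjoin(
    5, [10_000_000] * 5 + [1234, 5678, 91011])

# One input, a payment output and a change output: not flagged.
ordinary = looks_like_equal_output_coinjoin(1, [50_000, 123_456])

# Two inputs, two unequal outputs (the shape a PayJoin can take): not flagged.
payjoin_like = looks_like_equal_output_coinjoin(2, [70_000, 31_000])

print(coinjoin_like, ordinary, payjoin_like)  # True False False
```

A check like this only has something to latch onto because the equal
outputs are visible on the blockchain.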

Such a reprogramming wouldn't be possible with undetectable privacy
protocols like PayJoin and CoinSwap. They will have the economy-mixing
effect of reducing taint (just like the Bitcoin Casino example above),
but as OP writes, that can just lead to the wrong person being under
suspicion. And so such protocols on their own can't resist taint analysis
forever, which is the point OP is making as well.

The only permanent solution to taint analysis, as I've mentioned, is to
use bitcoin as money, away from centralized choke points that can censor
transactions and demand personal information. It's worth pointing out
that using bitcoin as money won't help our exchange hacker much. This
hacker will never be able to buy mansions or sports cars with their
stolen bitcoin, because the authorities already require proof of the
origin of funds before, for example, buying a big mansion.

Nonetheless, unobservable privacy is also useful for other reasons than
resisting taint analysis:

* It improves the privacy of people who do not use it.
* It helps stop censorship of privacy protocols (e.g. miners could one
day refuse to mine equal-output CoinJoin transactions but still mine
regular transactions).
* It typically uses less block space, because information is removed
from the blockchain rather than added to it.

Regards

Chris Belcher