As government scrutiny of cryptocurrencies has intensified over the last several months, concern has grown over the possibility of a two-tier market for cryptocurrencies that have a transparent ledger. On one side, there are "clean" coins that are held and disbursed by centralized exchanges that require identity documents to comply with government-mandated Know-Your-Customer (KYC) regulations. On the other side, there are "dirty" coins of unknown provenance that may come from mixers, CoinJoin transactions, and peer-to-peer trading. The world of "dirty" coins includes thieving hackers but also ordinary people who merely want to maintain their privacy amid increasingly intrusive mass surveillance.
A CoinJoin is a special type of cryptocurrency transaction that obscures the source and destination of payments by bringing together the coins of many different users within a single transaction. Bitcoin Cash (BCH) developers created a CoinJoin implementation called CashFusion, officially released in July 2020. BCH users have participated in over 125,000 CashFusion transactions.
The Financial Action Task Force (FATF)'s "Virtual Assets Red Flag Indicators of Money Laundering and Terrorist Financing" report identifies CoinJoins (referred to in the broader mixer category) as one of several "red flags" that centralized exchanges and payment processors may want to investigate or block:
The various technological features below increase anonymity and add hurdles to the detection of criminal activity by [Law Enforcement Authorities]. These factors make [Virtual Assets] attractive to criminals looking to disguise or store their funds. Nevertheless, the mere presence of these features in an activity does not automatically suggest an illicit transaction...
Transactions making use of mixing and tumbling services, suggesting an intent to obscure the flow of illicit funds between known wallet addresses and darknet marketplaces.
Indeed, users of BCH have recently discussed reports of CoinJoined BTC coins being blocked from established services, wondering if BCH that has passed through CashFusion transactions may one day be blocked. These concerns have only increased with the introduction of new rules requiring Coinbase users in Canada, Singapore, and Japan to provide more identity information about recipients of cryptocurrency transactions and today's vote on stronger KYC requirements in the EU.
These issues raised in my mind the question of how well-integrated CashFusioned coins are with the rest of the coins on the BCH blockchain. Is CashFusion restricted to a small corner of the blockchain -- either due to isolated users or established services segregating the coins -- or are they spread widely through the blockchain?
In other words, if established services wanted to completely avoid dealing with BCH that had ever passed through a CashFusion transaction, would they be able to?
After a couple of months of on-and-off coding and several weeks of computation time, I can finally answer that question: No, not as a practical matter.
Bitcoin transactions create a series of outputs specifying who gets paid and how much. These transaction outputs (TXOs) are then spent by other transactions as inputs. Unspent transaction outputs (UTXOs) form a set representing all coins on the blockchain available for spending.
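To make the TXO/UTXO bookkeeping concrete, here is a minimal Python sketch that replays a toy list of transactions and keeps only the outputs nobody has spent yet. The `Tx` type, the `utxo_set` function, and the two toy transactions are hypothetical names and data made up for illustration; a real node also validates signatures, fees, and much more.

```python
from typing import Dict, List, NamedTuple, Tuple

# An outpoint identifies one output: (transaction id, output index).
Outpoint = Tuple[str, int]


class Tx(NamedTuple):
    txid: str
    inputs: List[Outpoint]  # outputs of earlier transactions being spent
    outputs: List[int]      # value of each newly created output, in satoshis


def utxo_set(txs: List[Tx]) -> Dict[Outpoint, int]:
    """Replay transactions in order and return the outputs left unspent."""
    utxos: Dict[Outpoint, int] = {}
    for tx in txs:
        for spent in tx.inputs:
            utxos.pop(spent, None)  # a spent output leaves the set
        for index, value in enumerate(tx.outputs):
            utxos[(tx.txid, index)] = value  # each new output joins the set
    return utxos


# Toy data (not real transactions): "b" spends the first output of "a".
toy_chain = [
    Tx("a", [], [500, 300]),
    Tx("b", [("a", 0)], [400, 100]),
]
unspent = utxo_set(toy_chain)
```

In this toy chain the output `("a", 0)` has been spent, so the UTXO set holds `("a", 1)`, `("b", 0)`, and `("b", 1)`.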
I have found that 94 percent of the value in the UTXOs created between July 29, 2020 and December 23, 2021 (corresponding to block heights 646085 and 719602) is a direct or indirect descendant of a CashFusion transaction. This represents 10 million of the 19 million BCH that currently exist.
Counting by the proportion of outputs (and not by value), CashFusion's integration with the rest of the blockchain is even higher: 98.6 percent of all outputs in that UTXO set are CashFusion descendants. The total number of outputs in that UTXO set is about 26.7 million.
A transaction that spends the coins (i.e. outputs) of a previous transaction is a child transaction. And, in turn, a transaction that spends the child transaction's coins is the grandchild of the original transaction. If the outputs of a transaction can be traced backward in time to a particular earlier transaction through a series of parent-child relationships, then the later transaction is said to be a descendant of that earlier transaction.
The UTXO set is all of the outputs that have not yet been spent. In other words, the UTXO set is the grand total of all coins that BCH owners possess in their wallets, available for them to spend.
The figure below illustrates these concepts, using a fictional transaction graph (i.e. network relationship) with various scenarios. The red circles are CashFusion transactions. Purple circles are non-CashFusion transactions that are descendants of CashFusion transactions and that have already been spent at this point in time in this fictional scenario. Orange circles are unspent outputs that are descended from CashFusion transactions.
The blue circles, on the other hand, represent spent outputs that do not have any CashFusion ancestors. If an output is unspent but remains untouched by a CashFusion transaction, then it is green. Notice that when blue circles combine with purple circles, their descendants are all purple.
In the figure, the total value of the UTXO set is the sum of all coins contained in both the orange and green circles. When I say that 94 percent of BCH in the UTXO set created in the last year-and-a-half is a descendant of a CashFusion transaction, I mean that the orange circles contain 94 percent of the value of the orange and green circles combined. Another way of describing this UTXO set is that these are coins that are in "active addresses" on the BCH blockchain.
To identify the CashFusion descendants I wrote code in the R statistical programming language to query the BCH blockchain, construct the transaction graph consisting of over 200 million vertices and 280 million edges, and traverse the transaction graph to identify all UTXO vertices that were reachable starting from the CashFusion transactions. Pivotal in the process were the rbch package, which I adapted from the rbtc package with support from my Flipstarter campaign, and the igraph package for fast graph analysis operations.
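The reachability idea can be illustrated on a tiny scale. The following is a minimal Python sketch, not the R/igraph code used for the analysis: it treats each vertex as a transaction, follows spend edges with a breadth-first search, and then reports the CashFusion-descended share of the unspent value and of the unspent vertices. The function name `descendants`, the adjacency-dict layout, and all toy data are assumptions made for illustration.

```python
from collections import deque
from typing import Dict, Iterable, List, Set


def descendants(children: Dict[str, List[str]], sources: Iterable[str]) -> Set[str]:
    """Return the sources plus every vertex reachable from them via spend edges."""
    seen: Set[str] = set(sources)
    queue = deque(seen)
    while queue:
        vertex = queue.popleft()
        for child in children.get(vertex, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


# Toy graph: an edge parent -> child means the child spends an output of the parent.
children = {
    "fusion": ["t1"],
    "t1": ["t2", "t3"],
    "plain": ["t3", "t4"],  # t3 mixes a CashFusion descendant with a plain coin
}
# Value still unspent at each vertex (arbitrary units, toy numbers).
unspent_value = {"t2": 6, "t3": 3, "t4": 1}

reached = descendants(children, ["fusion"])
descended_value = sum(v for k, v in unspent_value.items() if k in reached)
share_by_value = descended_value / sum(unspent_value.values())
share_by_count = sum(1 for k in unspent_value if k in reached) / len(unspent_value)
```

Here `t3` counts as a descendant even though one of its parents is not, which mirrors the rule that blue combined with purple yields purple. At the scale of hundreds of millions of vertices, a dedicated graph library is a far better fit than plain dictionaries.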
94 percent of BCH may be a CashFusion descendant, but how many "generations" of transactions separate a typical UTXO from its closest CashFusion ancestor? I took a random sample of 1,000 CashFusion-descended UTXOs from the UTXO set and computed the distance to the nearest CashFusion transaction.
The median number of transactions separating a UTXO and its nearest CashFusion ancestor is 463. The mean is 1215. The 25th and 75th percentiles are 30 and 1819, respectively.[*] From these statistics we may conclude that most CashFusion descendants are only weakly linked to CashFusion transactions, despite the fact that the vast majority of the UTXO set is a CashFusion descendant.
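One way to define "distance to the nearest CashFusion ancestor" is the length of the shortest chain of spends from any CashFusion transaction to the vertex in question. The Python sketch below computes that with a breadth-first search started from all CashFusion vertices at once. It is an illustration under that assumed definition, with a hypothetical function name and toy graph, and is not the procedure or data behind the figures above.

```python
import statistics
from collections import deque
from typing import Dict, Iterable, List


def distance_to_nearest_source(
    children: Dict[str, List[str]], sources: Iterable[str]
) -> Dict[str, int]:
    """Shortest number of spend edges from any source to each reachable vertex."""
    dist: Dict[str, int] = {s: 0 for s in sources}
    queue = deque(dist)
    while queue:
        vertex = queue.popleft()
        for child in children.get(vertex, []):
            if child not in dist:  # first visit in a BFS is the shortest path
                dist[child] = dist[vertex] + 1
                queue.append(child)
    return dist


# Toy graph: "c" descends from f1 in three steps but from f2 in one step.
children = {"f1": ["a"], "a": ["b"], "b": ["c"], "f2": ["c"]}
dist = distance_to_nearest_source(children, ["f1", "f2"])

# Summary statistic over the descendants only (sources excluded).
typical = statistics.median(d for d in dist.values() if d > 0)
```

In the toy graph the distance for `c` is 1 rather than 3, because the nearest ancestor is what counts.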
Would a chain analysis firm be able and willing to track CashFusioned coins through so many transactions? Yes. Chainalysis reported that it is tracking BTC associated with the PlusToken scam even though those coins have been involved in tens of thousands of transactions: "The scammers have transferred the Bitcoin more than 24,000 times, using more than 71,000 different addresses..."
The large distance between the typical UTXO and its nearest CashFusion ancestor indicates that many transactions must have occurred between the CashFusion transaction and the UTXO -- and therefore, most likely much time had passed. To gain insight into this issue, I re-ran the analysis for just the month of February 2022 (block heights 725290 to 729371).
The total quantity of BCH in the UTXO set created in February 2022 was 1.9 million BCH. 44 percent of the value of that UTXO set was a descendant of a CashFusion transaction that occurred in that month. 60 percent of the number of outputs in the UTXO set (2.2 million outputs in total) were CashFusion descendants.
It is not surprising that restricting the analysis to shorter time frames results in a smaller proportion of the UTXO set becoming a CashFusion descendant. In technical terms, this is because "CashFusion descendant" is an absorbing state. In other words, once a quantity of BCH becomes a CashFusion descendant, it will forever remain a CashFusion descendant, by the definition of this analysis. Therefore, as more time passes, the (unconditional) probability of some quantity of BCH being a CashFusion descendant gradually increases.
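The absorbing-state argument can be seen in a toy simulation. The Python sketch below assumes a fixed pool of equally sized outputs, one of which starts as a CashFusion output; each step, a random pair is spent together and both replacements inherit the flag if either input carried it. The function name, pool size, and pairing rule are all assumptions chosen for simplicity, not a model of real BCH activity.

```python
import random
from typing import List


def simulate_flag_spread(n_outputs: int = 200, n_steps: int = 300, seed: int = 1) -> List[float]:
    """Return the flagged share of the pool after each simulated transaction."""
    rng = random.Random(seed)
    flagged = [False] * n_outputs
    flagged[0] = True  # a single output starts as a CashFusion output
    history = []
    for _ in range(n_steps):
        a, b = rng.sample(range(n_outputs), 2)
        # One transaction spends outputs a and b and creates two new outputs
        # in their place; the flag is inherited and is never cleared.
        inherited = flagged[a] or flagged[b]
        flagged[a] = flagged[b] = inherited
        history.append(sum(flagged) / n_outputs)
    return history


history = simulate_flag_spread()
never_decreases = all(x <= y for x, y in zip(history, history[1:]))
```

Because nothing in the loop ever resets a flag to `False`, the flagged share can only stay flat or rise, which is the absorbing-state property in miniature.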
From my observations, it seems that not all CashFusion transactions contribute equally to the spread of CashFusion descendant status to the UTXO set. I wrote the code so that the process of identifying CashFusion descendants started at the most recent CashFusion transaction and worked its way backward toward the oldest CashFusion transaction.
I noticed that the proportion of the UTXO set that is a CashFusion descendant did not rise steadily. Instead, it would rise very slowly for some time as it worked its way through each CashFusion transaction and then experience sudden jumps of several percentage points once it processed a particularly well-connected CashFusion transaction. For example, in the February 2022 sub-analysis, at one point the proportion of the UTXO set that is a CashFusion descendant jumped from 15 percent to 57 percent due to a single CashFusion transaction. About 57 percent of the UTXO set created in February 2022 is a descendant of that one transaction. It is likely that one of the outputs from that CashFusion transaction eventually found its way into an important wallet like a centralized exchange or service, and the wallet then spread the output far and wide.
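A running tally of this kind can be sketched in a few lines of Python. The example below processes CashFusion vertices newest first, expands each one's descendants while skipping anything already visited, and records the covered share of UTXO vertices (by count, for simplicity) after each step. The function name, data layout, and toy graph are hypothetical; this is not the R code used for the analysis.

```python
from typing import Dict, List, Set


def incremental_coverage(
    children: Dict[str, List[str]],
    sources_newest_first: List[str],
    utxos: Set[str],
) -> List[float]:
    """Covered share of UTXO vertices after each source has been processed."""
    seen: Set[str] = set()
    coverage = []
    for source in sources_newest_first:
        stack = [source]
        while stack:
            vertex = stack.pop()
            if vertex in seen:
                continue  # already reached from an earlier source
            seen.add(vertex)
            stack.extend(children.get(vertex, []))
        coverage.append(len(seen & utxos) / len(utxos))
    return coverage


# Toy graph: the older CashFusion output lands in a "hub" that pays many wallets.
children = {
    "f_new": ["u1"],
    "f_old": ["hub"],
    "hub": ["u2", "u3", "u4"],
}
utxos = {"u1", "u2", "u3", "u4", "u5"}
coverage = incremental_coverage(children, ["f_new", "f_old"], utxos)
# coverage == [0.2, 0.8]
```

The jump from 0.2 to 0.8 comes entirely from `f_old` feeding the hub, a small-scale analogue of one well-connected transaction moving the total by many percentage points.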
Can this analysis be done for BTC? Yes and no. The code that I wrote should in theory be interoperable with the BTC node software. The main barrier to a corresponding analysis of BTC CoinJoins is the effort required to identify all of the BTC CoinJoin transactions.
Identifying BCH's CashFusion transactions was easy. Every CashFusion transaction has "FUZ" in its OP_RETURN field, so I only needed to write a script to gather the transaction IDs of all CashFusion transactions, which I have done for my cashfusion.redteam.cash web app. Identifying all the various CoinJoin implementations on BTC would be much more challenging. If anyone is aware of a reliable database of BTC CoinJoin transaction IDs, please let me know.
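A check along these lines might look like the following Python sketch. It assumes each transaction is available as a decoded dictionary shaped like the verbose JSON that Bitcoin-style nodes return, with a `vout` list whose entries carry `scriptPubKey.hex`; the function name `is_cashfusion` and both toy transactions are made up for illustration, and the exact byte layout of a real CashFusion OP_RETURN is not taken from this article.

```python
OP_RETURN = b"\x6a"  # opcode that marks a data-carrying, unspendable output
MARKER = b"FUZ"      # ASCII bytes 0x46 0x55 0x5a


def is_cashfusion(tx: dict) -> bool:
    """True if any output is an OP_RETURN script that contains the FUZ marker."""
    for out in tx.get("vout", []):
        script = bytes.fromhex(out["scriptPubKey"]["hex"])
        if script[:1] == OP_RETURN and MARKER in script[1:]:
            return True
    return False


# Toy transactions (illustrative scripts, not real on-chain data).
fusion_like = {
    "txid": "toy-fusion",
    "vout": [
        {"scriptPubKey": {"hex": "76a914" + "00" * 20 + "88ac"}},  # ordinary payment
        {"scriptPubKey": {"hex": "6a04" + "46555a00"}},            # OP_RETURN with FUZ
    ],
}
ordinary = {
    "txid": "toy-plain",
    "vout": [{"scriptPubKey": {"hex": "76a914" + "00" * 20 + "88ac"}}],
}

fusion_ids = [tx["txid"] for tx in (fusion_like, ordinary) if is_cashfusion(tx)]
```

Checking only scripts that begin with the OP_RETURN opcode avoids false matches on payment scripts whose hash bytes happen to contain the same three characters.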
Speaking of the cashfusion.redteam.cash web app, could this analysis become one of the metrics that the web app tracks? Perhaps. This analysis that takes July 2020 as the starting point required weeks of computation time. Because of the way that the transaction graph is connected, an "update" to the data would still require a similar time for re-computation. However, since completion of the analysis I have thought of a shortcut that could dramatically reduce the computation time. I have not yet tested the idea, but if successful I could incorporate this metric into cashfusion.redteam.cash.
Frankly, I was a little shocked at my results. I didn't expect that CashFusion would be so well-integrated into the rest of the BCH blockchain. Of course, to some extent the level of integration is high due to how I've defined it: A descendant is a descendant no matter how many transactions separate it from its ancestor CashFusion transaction.
In spite of the fact that CashFusion is so well-integrated, I have seen no credible reports of a centralized exchange rejecting deposit of BCH due to CashFusion ancestry, whereas rejection of BTC due to a recent CoinJoin is a fairly common occurrence at this point in time.
On the other hand, that may be because of -- not in spite of -- the fact that CashFusion is so well-integrated. Perhaps centralized exchanges choose not to ban recent CashFusion descendants because they would be locking themselves out of a substantial proportion of BCH in circulation. Another possibility is that BCH overall has "flown under the radar" of anti-privacy policies since it is a minority-hashpower hardfork of BTC with little reputation of nefarious activities happening on its blockchain.
As use of CashFusion expands and more wallets integrate it as a feature, will BCH become akin to Dash or Decred, which have optional CoinJoins integrated into their protocols?[**] And therefore will centralized exchanges and services be forced to accept CashFusioned coins as a routine part of dealing in BCH? Or will there be a crackdown?
The article also appeared on my personal website.
[*] : Since this is a sample of a population, there is sampling error. In my professional opinion, a nonparametric bootstrap is appropriate for constructing confidence intervals in this setting since graphs can have unusual statistical properties. The 95 percent confidence intervals on median, mean, 25th percentile and 75th percentile are (367, 595), (1128, 1304), (29, 31), and (1648, 2059), respectively.
[**] : Note that Dash has sought to distance itself from its reputation as a privacy coin.
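The nonparametric bootstrap mentioned in footnote [*] can be sketched in Python as follows. This is a generic percentile bootstrap, which is one common variant and not necessarily the exact procedure behind the intervals reported above. The function name `bootstrap_ci`, the number of resamples, and the synthetic sample are all assumptions; the sample is randomly generated stand-in data, not the distances measured in this analysis.

```python
import random
import statistics
from typing import Callable, Sequence, Tuple


def bootstrap_ci(
    sample: Sequence[float],
    stat: Callable[[Sequence[float]], float],
    n_resamples: int = 2000,
    level: float = 0.95,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap confidence interval for an arbitrary statistic."""
    rng = random.Random(seed)
    n = len(sample)
    # Resample with replacement, recompute the statistic, and sort the results.
    estimates = sorted(stat(rng.choices(sample, k=n)) for _ in range(n_resamples))
    lower_index = int((1 - level) / 2 * n_resamples)
    upper_index = int((1 + level) / 2 * n_resamples) - 1
    return estimates[lower_index], estimates[upper_index]


# Synthetic, right-skewed stand-in for a sample of 1,000 distances.
data_rng = random.Random(42)
sample = [int(data_rng.expovariate(1 / 1000)) + 1 for _ in range(1000)]

low, high = bootstrap_ci(sample, statistics.median)
```

Passing `statistics.mean` or a percentile function as `stat` gives intervals for the other summary statistics without any change to the resampling loop.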