Solving the Long Unconfirmed Chains problem


There is currently a hot issue around a piece of functionality that is important to several major players in the Bitcoin Cash ecosystem. The issue, the Maximum Length of Unconfirmed Chains of Transactions (I will call it MLUCT to keep this post shorter), has caused significant distress and conflict in the Bitcoin Cash communities over the last 6 months, because two developer camps have proposed different approaches to dealing with this problem.

Wanting to solve this problem, I first talked with Amaury Séchet and proposed to start and seed a fundraiser aimed at developing the needed code. Amaury quickly talked me out of it because, as it turned out, the issue is a very technical, difficult optimization problem, so just throwing more money at it is not going to help.

So we proceeded to the technical details, and after a week or two of discussion I worked out a solution. I am going to present the issue to you, propose a (hopefully) simple and effective algorithm, and include criticism from Amaury plus my counter-arguments. I hope that together with other developers we will be able to improve my proposal and fix any imperfections, so that the scheme can work and we can have a large MLUCT in Bitcoin Cash, working on all nodes: ABC, BU, bchd, Flowee and others.

The first problem with long unconfirmed chains

The first problem with long unconfirmed chains is that they are very CPU-intensive and, to make it worse, their computation requirements rise not in a linear but in a quadratic fashion. To keep it very simple, I will use an example. Intuitively we would think that if doing certain work on a chain of transactions which is 2 elements long requires the CPU to use 2 watts of power, then processing a 4-elements-long chain would proportionally use 4 watts of power, right?

No, it's not that simple. The example above is called "linear complexity increase". The sad reality is that while processing a 2-elements-long chain takes 4 watts of power, processing a 4-long chain takes 16 watts, a 10-long chain takes 100 watts and a 25-long chain takes 625 watts. This is called quadratic complexity increase. Of course this example is heavily oversimplified, but quadratic complexity increase is exactly what happens during processing of incoming transactions in Bitcoin node software.
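The difference between the two growth modes can be sketched in a few lines. This is a toy model only: the "watt" figures are just the illustration from above, not real measurements from any node.

```python
def linear_cost(chain_length, unit_cost=1):
    # Cost model where each extra transaction adds a constant amount of work
    return chain_length * unit_cost

def quadratic_cost(chain_length, unit_cost=1):
    # Cost model where validating each new link means revisiting the whole chain
    return chain_length ** 2 * unit_cost

for n in (2, 4, 10, 25, 500):
    print(n, linear_cost(n), quadratic_cost(n))
# quadratic_cost reproduces the figures above: 4 -> 16, 10 -> 100, 25 -> 625,
# and already 250000 for a 500-long chain
```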

The above is the reason why Amaury Séchet may be somewhat resistant to the idea of increasing MLUCT. Any good software developer should be, because this is an optimization horror. Every node, whether a relay node or a mining node, would have to do these increasingly complex calculations - imagine: if the complexity of a 25-long chain is already 625, then relative to it, the complexity of a 500-long chain would be 250000. And this applies to every such transaction. This could enable a DDoS attack and break some nodes. Of course, code optimizations that lower the complexity are theoretically possible, but they are sometimes extremely difficult and time-consuming to achieve. Such optimization is one of the great achievements of IT; developers and huge companies pay heavy money for it, because it saves them billions.

How to solve the problem... without solving the problem - my idea.

My proposed solution is based on the same concept Bitcoin has been based on since 2009: that users have to pay for transactions proportionally to their cost - meaning their burden on the network. Currently this is realised by paying per transaction size. So we need to devise a way for wallets to pay for the length of unconfirmed transaction chains, depending on their computational cost.

How would that work?

Just like currently paying "satoshis per byte", wallets would pay extra "satoshis per level" for each level over the "free" level. Currently the "free" level is a 25-long chain. Each level over that needs to be paid for extra.

Each relaying and mining node would set its own pricing scheme (or just leave them at the default from basic configuration).

Nodes that have weak CPUs or low processing power will set high or prohibitive prices, so that a massive influx of long-chained TXs won't cause problems for them.

Nodes that require very long chains for viability of their business models (likely SatoshiDice, SLP-related companies, Memo & Member and others) can set the pricing model to zero/flat level. Meaning they will mine/relay transactions with chain length over [25] for no extra cost.

Businesses will be able to mine & relay these long-chained transactions themselves or pay a miner to do it for them, so even if most of the network sets prohibitive prices, long chains will still be viable and their business model will keep working.

My hypothesis is that, after some time, a natural, market-induced equilibrium should form between the real cost of mining & relaying transactions and the current & future potential gains of all market participants, which would allow seamless usage of very long transaction chains without putting too much stress on the network.
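As a sketch of how a wallet might compute such a per-level surcharge: the free level of 25 comes from the proposal above, while the satoshis-per-level rate is a made-up placeholder that each node would set for itself.

```python
FREE_CHAIN_LEVEL = 25   # chain depth already covered by the normal size-based fee
SATS_PER_LEVEL = 2      # hypothetical surcharge; every node sets its own

def extra_fee(chain_length, free_level=FREE_CHAIN_LEVEL, rate=SATS_PER_LEVEL):
    # Charge only for the levels above the free level; zero otherwise
    billable = max(0, chain_length - free_level)
    return billable * rate

print(extra_fee(25))   # 0  - still within the free level
print(extra_fee(30))   # 10 - five billable levels at 2 sat each
```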

How do we know how much should be paid?

We don't know yet. Today's operating systems and CPUs come with added optimizations and complexity, so it is not immediately clear just how many extra resources will be used for each length of unconfirmed chain; this is the reason why a simple benchmark needs to be done.

By processing thousands of sample transactions with different chain lengths, the benchmark will measure the increase in CPU processing time required for each consecutive level of unconfirmed chain depth, as in this oversimplified example [note: linear complexity increase, unlike real life]:

uConfLength26 = 1 millisecond

uConfLength27 = 2 milliseconds

(...)

uConfLength100 = 75 milliseconds

(...)

uConfLength500 = 475 milliseconds

This benchmark can of course be run multiple times on different AMD/Intel/ARM CPUs, so average data can be found for each TX chain length. An additional benchmark for mempool/RAM usage can be created too.
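A minimal version of such a benchmark could look like this. The chain-processing step is a stand-in stub with deliberately quadratic work, since there is no real node code to call here; a real benchmark would invoke actual mempool acceptance instead.

```python
import time

def process_chain(length):
    # Stub standing in for mempool acceptance of a `length`-long chain;
    # the nested loop makes the work grow quadratically, as described above
    total = 0
    for i in range(length):
        for j in range(i):
            total += 1
    return total

def benchmark(lengths, repeats=50):
    # Average wall-clock time per chain length, in milliseconds
    results = {}
    for n in lengths:
        start = time.perf_counter()
        for _ in range(repeats):
            process_chain(n)
        results[n] = (time.perf_counter() - start) * 1000 / repeats
    return results

timings = benchmark([26, 100, 500])
for n, ms in sorted(timings.items()):
    print(f"uConfLength{n} = {ms:.3f} milliseconds")
```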

Based on the data produced by such benchmarks, we can devise rational pricing schemes that will be acceptable to all network participants.

Then we create an array in which we assign a minimum additional TX fee for each amount of CPU time taken; the values can increase in a quadratic, linear or logarithmic way - this is a matter for discussion. Miners & relays could even be allowed to choose their preferred pricing algorithm (linear, logarithmic or quadratic), or define the price for each level separately in configuration files, like:

extraPriceOfChainLength[26] = 1 satoshi

extraPriceOfChainLength[27] = 2 satoshi

(...)

extraPriceOfChainLength[100] = 75 satoshi

(...)

extraPriceOfChainLength[500] = 475 satoshi
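For illustration, a table like the one above can be generated programmatically. Note that the example values happen to follow a linear rule (price = level - 25); the quadratic and logarithmic modes are placeholders for the other curves mentioned, not part of the original example.

```python
import math

FREE_LEVEL = 25

def build_price_table(max_level=500, mode="linear", base=1):
    # One configurable pricing curve per node: linear, quadratic or logarithmic
    table = {}
    for level in range(FREE_LEVEL + 1, max_level + 1):
        depth = level - FREE_LEVEL          # billable levels above the free limit
        if mode == "linear":
            table[level] = base * depth
        elif mode == "quadratic":
            table[level] = base * depth ** 2
        elif mode == "logarithmic":
            table[level] = base * math.ceil(math.log2(depth + 1))
    return table

prices = build_price_table(mode="linear")
print(prices[26], prices[27], prices[100], prices[500])   # 1 2 75 475
```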

And now, the big problem

So why hasn't anyone thought of this before? This is easy, right? Oh, there is a little problem with this approach. Unlike transaction size, which is absolute and objectively known by the entire network on receipt, the length of an unconfirmed chain of transactions is relative. Meaning it may differ from node to node. Each node, miner and wallet on the network may see it differently.

A length of 26 as seen by the wallet node broadcasting a transaction may be seen as 27 or 24 by another relaying or mining node. There are multiple reasons for this, and the problem cannot be solved and the underlying mechanics cannot be changed, because this is how Nakamoto consensus works - it's written in stone. It's the basic construct.

The solution I devised is to create a buffer that will "swallow" the difference between chain tips. The buffer will be derived from the maximum difference between relative chain tips spotted on the network - in my example I use [5].

Warning: This will get a little more technical now.

The old MLUCT hard limit of 25 gets increased to 500, like:

int MAX_UCONF_LEN_HARDLIMIT = 500

A new, "soft" limit is added,

int MAX_UCONF_LEN_SOFTLIMIT = 30

Then, a "buffer" variable is added

int MAX_UCONF_LEN_VARIATION = 5

When a wallet sends a transaction, it checks the relative length of the chain from its own point of view. When the relative chain length is greater than MAX_UCONF_LEN_SOFTLIMIT [30] - MAX_UCONF_LEN_VARIATION [5] (= 25), an extra fee must be added to the new transaction in order to pay for the increased computational complexity of the chain of transactions, like:

if (NEW_TX_CURRENT_CHAIN_LENGTH > (MAX_UCONF_LEN_SOFTLIMIT - MAX_UCONF_LEN_VARIATION)) {

    addExtraFee(TX, NEW_TX_CURRENT_CHAIN_LENGTH);

}

Note that the extra fee pays for extra computation above the limit of MAX_UCONF_LEN_SOFTLIMIT [30], but it starts being added at a lower chain length, to make sure that if the relative chain length varies on other nodes, the fees will still be enough to cover those nodes' expenses in CPU power and memory used. The difference is the buffer (MAX_UCONF_LEN_VARIATION [5]).

Now, when another node relays, mines or simply puts the received transaction into its mempool, it will use MAX_UCONF_LEN_SOFTLIMIT [30] instead of the current old limit of 25. When the depth of the chain exceeds [30] and enough fee was not provided, it will stop processing the chain of unconfirmed transactions - to save its resources.

The difference between 30 and 25 is the new "safety buffer", which makes sure that even when the relative chain of unconfirmed TXs ends up longer on another node (one having a different chain tip), enough fees are still paid for all the extra necessary computations. How much fee needs to be paid for each level over MAX_UCONF_LEN_SOFTLIMIT - MAX_UCONF_LEN_VARIATION [=25] will be determined by the CPU/resource benchmarks I proposed earlier.
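Putting the two sides together, here is a sketch of the sender-side and node-side checks. The constants come from the example above; the fee amounts are placeholders, and the function names are mine, not from any real node codebase.

```python
MAX_UCONF_LEN_HARDLIMIT = 500
MAX_UCONF_LEN_SOFTLIMIT = 30
MAX_UCONF_LEN_VARIATION = 5   # buffer absorbing chain-tip differences between nodes

def sender_must_add_fee(local_chain_length):
    # Wallet side: start paying MAX_UCONF_LEN_VARIATION levels below the
    # soft limit, so nodes that see a longer relative chain are still covered
    return local_chain_length > MAX_UCONF_LEN_SOFTLIMIT - MAX_UCONF_LEN_VARIATION

def node_accepts(local_chain_length, fee_paid, required_fee):
    # Relay/miner side: free below the soft limit, fee-gated between the
    # soft and hard limit, rejected outright past the hard limit
    if local_chain_length > MAX_UCONF_LEN_HARDLIMIT:
        return False
    if local_chain_length <= MAX_UCONF_LEN_SOFTLIMIT:
        return True
    return fee_paid >= required_fee

print(sender_must_add_fee(26))   # True  - fee starts at length 26
print(node_accepts(28, 0, 0))    # True  - inside the buffer, no fee needed yet
print(node_accepts(31, 0, 5))    # False - past the soft limit, underpaid
```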

There are obvious upsides and downsides that are immediately visible after understanding this scheme.

Obvious upsides:

  • The implementation should be pretty straightforward, and the code should be short and low-maintenance

  • Thus: Low technical debt, low work-hours cost

  • It is compatible with the current philosophy and inner workings of Bitcoin

  • It is a market-based, not "I know better what the limit should be"-based solution

  • Allows longer unconfirmed chain lengths - up to 500 and more - while still being fair and not burdening the network

  • Allows network participants to choose their own pricing scheme, allowing businesses in critical need of large MLUCTs to increase the limit themselves, without asking other participants for help or compliance

  • This proposal is future-proof: if the transaction-processing code is some day optimized to a lower complexity, so that longer chains can be processed at the same cost, this functionality will still be useful, because it allows an immediate lowering of the pricing level to reflect that change

Obvious downsides:

  • A wallet or node broadcasting a long-uconf-chained transaction has to overpay a little when the chain length is over [25]. How much it overpays depends on how different the chain tips are on different nodes. The maximum overpay is [5] levels, assuming a relative MLUCT difference of [5] between nodes.

  • This scheme may temporarily stop working properly when there are too many orphaned blocks on the network, causing the relative difference between chain tips to widen beyond the buffer

Additional caveats:

  • I am not sure [5] is the right number for the buffer; it is just an example based on my mental picture of the situation. Devising the actual proper number will require some testing in real-life scenarios on actual nodes, or statistical data that may already exist somewhere.

Discussion, criticism and improvements

I am open to any criticism, improvements and even a complete overhaul of the concept presented above. In this spirit, I shall include the existing criticism raised by Amaury Séchet during our week-long conversation:

"It is not possible to pay a fee, that's going to be the same network wide, that is charged based on an information that isn't the same network wide."

  • The relative difference in information (the length of an unconfirmed chain of transactions) is not infinite; it is a small random number, dependent on various network conditions. Once we determine the average maximum value of this number during normal network operation, we can handle this problem by overpaying a little, to cover the possible differences. This is achieved by the "buffer" variable

"There is also no market because these who pay the cost (namely every miner and business) are not these who pocket the fee (just one miner). This is a tragedy of the common type of setup"

  • Payment operators and every other non-mining entity that runs nodes these days also pay only their electricity costs while not receiving miners' fees. But they still run the nodes, because they have other sources of income. So this situation will not change at all after the implementation of my proposal. As already said, the proposal is based on the same economic and social principles that Bitcoin has been built on since its inception.

  • The nodes that really matter are miner nodes anyway. When Satoshi wrote his whitepaper, every node was a miner node. So we should only calculate whether accepting these long unconfirmed chains is profitable for mining nodes, because miner nodes are the ones who will include these transactions in a block, so they bear the ultimate responsibility. The rest of the nodes will decide for themselves whether it is acceptable to process longer chains, just as today they decide whether it is profitable to accept transactions based on size alone.

  • By implementing the proposed idea we are not really making everybody accept long chains. Instead, we are giving them choice. Nodes are not forced to accept these long chains. But they can, because they have a choice. I find it wiser to at least try to give people choice rather than immediately force them to do as I say.

OK, who will code this, if it works?

So far in the Bitcoin Cash community I have mostly been a guy who finds holes and problems and points them out - so, despite the best intentions, it has largely been a divisive and contention-producing activity. What can I say? This is apparently my natural talent - finding problems and pointing them out. But maybe it is time I tried a new approach? Maybe this is the time to try the reverse - to help accommodate and join two camps at odds with each other?

I have also been a software developer in multiple languages for about 21 years, so if it comes to it, I could code it. I am not a C/C++ dev specifically, and I am not sure I really have the 2 months needed to learn it and set up a dev environment right now, so most probably I will write incomplete code containing all the logic and executing all critical functionality, and just crowdfund the fixing of the small semantic / language-related errors that come up.

The concept is extremely simple and can probably be coded by any current Bitcoin developer on a lunch break, so if anybody wants to do it right away, feel free.

Waiting for your input

So, having said this, I await your input, your suggestions and your criticisms so we can solve this issue for the betterment of Bitcoin Cash.


Comments

This is an interesting new thought direction for a solution: changing the fee structure instead of changing the algorithm. But I think this is a poor idea, because it's essentially a temporary quick fix instead of an actually desirable long-term solution. It's likely more efficient to work directly towards a good long-term solution without detours - and working efficiently should be preferred given the limited resources for development.


Thanks for your input.

I agree with you on that we should work on optimization and efficiency-related solutions.

But saying this is a temporary quick fix does not seem entirely accurate. This solution could be extended to create a market for transactions with higher-than-usual computational complexity.

It is more about creating a market for "difficult" transactions rather than quickly fixing the current problem.

And it is not a dirty solution. Because of its simplicity, it can be implemented cleanly, without creating complex code or adding much technical debt.


Related Reddit thread, for people to read what's already been said:

https://www.reddit.com/r/btc/comments/epijvo/solving_the_long_unconfirmed_chains_problem/

btcfork

I was reading when I stumbled on this part, which for some reason I'm not able to parse:

A length 26 as seen by the wallet node which is broadcasting transaction
may be seen as 27 or 24 by other relaying or mining node.
There are multiple reasons for this and it problem cannot be
solved and the underlying mechanics cannot be changed,
because this is how Nakamoto consensus works - it's written
in stone. It's the basic construct.

how is this even possible?

If TX1 belongs to a chain of unconfirmed txs, it will be admitted to the mempool only if all its ancestors are already admitted to the mempool; otherwise it will be staged in the orphan pool.

If the above is true, there's no way that one node sees the 26th tx as the 24th whereas another sees it as the 27th. The first two nodes would stage the tx in the orphan pool until all its ancestors come in, then move it to the mempool, and then all nodes would see the tx as the 27th.

Am I missing something obvious here?


Thanks for your input.

This is how I understood it when it was explained by Amaury Séchet, and from re-reading the Bitcoin whitepaper. He said that mining nodes may choose different chain tips to work on, and the different chain tips may or may not contain the previous transactions in the chain, which would probably result in the received transaction being perceived as having more or less unconfirmed depth.

This is how I understood it; I apologize upfront if I am wrong. In my defense, Amaury was not very talkative about the issue, despite my multiple repeated questions. Also, I am not a Bitcoin developer (yet). My mental model of how transactions are admitted into the mempool is incomplete.


Wow, I found this article awesome! Thanks for sharing this; I got a lot of information and now I know how this works. I love using Bitcoin Cash so much ❤️ I also followed you. I hope you publish more so I can read and learn a lot from you 😇
