When millions of people use a central service like Reddit, Google or Facebook, we see how important quality of service is for the brand and for the user experience. Every now and then those big services fail, sometimes for a whole day, sometimes for just an hour.
You can talk to your colleagues or fellow students and they all recognize the problem, because everyone is affected. Quality of service is pretty important for those big companies.
When it comes to a decentralized system like Bitcoin Cash, the effect is different. Most of the time a server suddenly going offline or failing to answer is a minor issue, because the nature of decentralization means your friend is likely using a different server than you.
That doesn't mean quality of service is not applicable. It is, just in more subtle ways. The common thread here is that nobody is ultimately responsible.
Scenario 1: A transaction is lost by the 'network'
This is rare, but not impossible, much like Facebook being down is rare but not impossible. For instance, the service you delivered your transaction to for mining may go down immediately after it accepted your transaction. It could have crashed, someone did something to the poor machine's Internet connection, or it was just bedtime.
The bottom line here is that your wallet or your favorite website uses some full node to accept the transaction. We are not talking to some company, and the node operators have no agreement with you on quality of service. It will probably be OK, but it may fail in 10 years, or tomorrow.
Scenario 2: A Tx sent to you may fail node-local policies.
A friend just sent you some funds to pay you back for that time they didn't bring their wallet. You saw them press "send" for the transaction and you feel everything is OK. But it never arrives!
What may have happened here is that your friend's transaction ran into the limits of some node policy. The best known is the limit on transaction chain depth. But the mempool may simply have been full, so the transaction was evicted. Or they paid a fee that some nodes didn't think was enough.
Now, when a wallet talks to a node to submit the transaction, it waits for a rejection message in case any of the above happens. So a simple view makes you think this is an impossible situation.
Until you realize that there are thousands of nodes on the network and not all of them have to be configured the same. The nodes that reject your transaction may not be the one you talked to. This is the nature of decentralization. Just like the first scenario, this will be fine 10,000 times until it fails without warning or reason.
Scenario 3: Your favourite node has a bug!
There really is a wide array of issues that may occur in such a case. A fun one is when you regard money sent to you as invalid because your favorite node got stuck and provides a view that is several hours old.
How would you like to apologize to the customer you treated unfairly when some service you rely on, but do not have a quality of service contract with, gave you wrong info?
How to provide QoS
The overall thinking in centralized systems has been that we fix each scenario one at a time and continuously improve our quality of service. But this approach cannot work in a decentralized system, if only because there is no central company in control. People can connect to nodes that just don't care about QoS. People can try to hurt the network by spinning up thousands of badly performing nodes!
We have to let go of the notion that Facebook, Reddit and Google have created, where the "service" needs to take care of the quality of service. We consider Reddit to be failing if it shows a picture of a cat eating through the cables instead of the content we want.
What if instead the browser automatically switches to another server and we just wait a couple more seconds? Not possible for Reddit, quite easy for Bitcoin Cash.
We are onto something here. The concept of quality of service is directed at the end user, while we have been thinking it was directed at the network. We want the user to not see any service interruptions! We can do this by giving the software on their phone some of the responsibility for providing that service, even if some full node servers fail.
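To make the idea concrete, here is a minimal sketch in Python of client software that simply moves on to the next server when one fails. The function name `first_answer`, the list of servers and the `request` callback are made up for illustration; a real wallet would plug in its own networking code.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def first_answer(servers: Iterable[str], request: Callable[[str], T]) -> T:
    """Ask each server in turn and return the first answer we get."""
    failures: List[Tuple[str, Exception]] = []
    for server in servers:
        try:
            return request(server)
        except Exception as exc:  # a best-effort server may fail in any way
            failures.append((server, exc))
    # Only when every single server failed does the user get to see an error.
    raise RuntimeError(f"no server answered: {failures}")
```

With a handful of independent servers in the list, one of them being offline costs the user a couple of seconds instead of an outage.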
High QoS through shared responsibilities.
Much of our network, and commercial companies too, put the responsibility for quality on some central servers. As we discovered above, this works out pretty badly for decentralized systems.
We propose instead a couple of simple rules about responsibility, and we demonstrate that this creates anti-fragile systems by merely having best-effort services, but lots of them.
Rule 1: The responsibility for data you care about stays with you.
This sounds pretty obvious and is easy to implement. If you want to send a transaction, or you care about one sent to you getting mined, you store a copy of that transaction. You then send it to the network and repeat sending it after a block is mined that did not include it. Preferably sending the transaction to a different node.
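A rough sketch of that loop in Python could look like the following. The class `Rebroadcaster`, its method names and the `send` callback are hypothetical and not taken from any existing wallet; the only thing the sketch tries to show is keeping your own copy and re-sending it to a different node after every block that did not include it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class PendingTx:
    raw_hex: str       # the copy of the transaction that we keep ourselves
    attempts: int = 0  # how many times we have sent it so far


class Rebroadcaster:
    def __init__(self, nodes: List[str], send: Callable[[str, str], None]) -> None:
        if not nodes:
            raise ValueError("need at least one node")
        self.nodes = nodes
        self.send = send  # assumed callback: send(node, raw_hex)
        self.pending: Dict[str, PendingTx] = {}

    def _send(self, tx: PendingTx) -> None:
        # Rotate through the nodes so every retry goes to a different one.
        node = self.nodes[tx.attempts % len(self.nodes)]
        tx.attempts += 1
        try:
            self.send(node, tx.raw_hex)
        except Exception:
            pass  # best effort: we simply try again after the next block

    def submit(self, txid: str, raw_hex: str) -> None:
        tx = PendingTx(raw_hex)
        self.pending[txid] = tx
        self._send(tx)

    def on_block(self, mined_txids: Iterable[str]) -> None:
        # Forget what got mined, re-send everything that did not.
        for txid in mined_txids:
            self.pending.pop(txid, None)
        for tx in self.pending.values():
            self._send(tx)
```

The wallet calls `submit` once and `on_block` for every new block it learns about. No single node has to be reliable for this to work.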
If you want information from the network, like block updates, you are responsible for asking multiple nodes and verifying that they agree with each other, and for verifying that the data they sent you is sane.
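As an illustration, the Python sketch below compares the chain tip reported by several nodes. The function names, the `min_agree` and `max_lag` thresholds and the very basic sanity check are assumptions chosen for the example, not an existing API or recommended values.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

Tip = Tuple[int, str]  # (block height, block hash as hex)


def is_sane(tip: Tip) -> bool:
    """Very basic sanity check: non-negative height, 32-byte hex hash."""
    height, block_hash = tip
    if height < 0 or len(block_hash) != 64:
        return False
    try:
        int(block_hash, 16)
    except ValueError:
        return False
    return True


def agreed_tip(answers: Dict[str, Tip], min_agree: int = 2) -> Optional[Tip]:
    """Return the tip that at least `min_agree` nodes report, else None."""
    sane = [tip for tip in answers.values() if is_sane(tip)]
    if not sane:
        return None
    tip, votes = Counter(sane).most_common(1)[0]
    return tip if votes >= min_agree else None


def lagging_nodes(answers: Dict[str, Tip], max_lag: int = 3) -> List[str]:
    """Names of nodes that are more than `max_lag` blocks behind the best one."""
    if not answers:
        return []
    best = max(height for height, _ in answers.values())
    return sorted(n for n, (h, _) in answers.items() if best - h > max_lag)
```

A node that got stuck several hours ago shows up in `lagging_nodes`, and the wallet can stop trusting its view until it catches up.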
Rule 2: Zero-conf transactions are your risk to accept.
Again, not too earth-shattering, but it needs to be said. We encourage people to accept instant transactions, even though they have no (read: zero) confirmations. But you may feel it's too much of a risk to take. And that's fine. Please do be up front about it.
What Bitcoin Cash does provide is double-spend proofs. Those will severely lower the risk of accepting instant transactions. A double-spend proof is essentially the network telling your wallet that something bad happened. Not getting that signal is thus a pretty big deal.
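One way a merchant wallet could use that signal is sketched below in Python. Everything here is an assumption made for the example: the class `ZeroConfGate`, its method names and the three-second quiet period are not part of any specification. The idea is only that you wait a short while, and if no double-spend proof showed up in that time you accept, knowing the remaining risk is yours.

```python
from enum import Enum
from typing import Dict, Set


class Decision(Enum):
    WAIT = "wait"
    ACCEPT = "accept"
    REJECT = "reject"


class ZeroConfGate:
    def __init__(self, quiet_seconds: float = 3.0) -> None:
        self.quiet_seconds = quiet_seconds  # illustrative value, tune to taste
        self.first_seen: Dict[str, float] = {}
        self.flagged: Set[str] = set()

    def on_transaction(self, txid: str, now: float) -> None:
        # Remember when we first heard about the payment.
        self.first_seen.setdefault(txid, now)

    def on_double_spend_proof(self, txid: str) -> None:
        # The network told us something bad happened to this payment.
        self.flagged.add(txid)

    def decide(self, txid: str, now: float) -> Decision:
        if txid in self.flagged:
            return Decision.REJECT
        seen = self.first_seen.get(txid)
        if seen is None or now - seen < self.quiet_seconds:
            return Decision.WAIT
        return Decision.ACCEPT  # no proof arrived during the quiet period
```

Passing `now` in explicitly (for instance from `time.monotonic()`) keeps the logic easy to test.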
And that is really all there is to it!
Clear blue skies.
The above was mostly written as a rant. It was written as push-back against the self-imposed rules that we inherited from Bitcoin Core. If I wrote this correctly, then it reads like a much less emotional and even obvious article.
What I want to see happening in Bitcoin Cash is that we address several stupid designs we inherited from Bitcoin Core as well. This would severely simplify our ecosystem, it would make it much more resilient to attacks, and the clouds will part and angels will sing. I might not be able to guarantee all of those.
Wallets and online services will get the clear message that they should keep the transaction and retain responsibility for it until it has been mined. Some older wallets (like the Satoshi one) already did this. New ones may need to be reminded.
Full nodes should severely limit the amount of time a transaction stays in the mempool. Bitcoin Core changed this several times, from no limit to 3 days to 14 days. Let's bring that down to 4 or 5 hours.
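To show how little is needed for this, here is a toy mempool in Python that drops anything older than five hours. It is a sketch, not the code of any real full node; the class `ExpiringMempool` and its methods are invented for the example.

```python
from typing import Dict, List

MAX_AGE_SECONDS = 5 * 60 * 60  # the upper end of the "4 or 5 hours" proposed here


class ExpiringMempool:
    def __init__(self, max_age: float = MAX_AGE_SECONDS) -> None:
        self.max_age = max_age
        self.entered: Dict[str, float] = {}  # txid -> time it entered the pool

    def add(self, txid: str, now: float) -> None:
        # Re-adding a known transaction does not reset its age.
        self.entered.setdefault(txid, now)

    def expire(self, now: float) -> List[str]:
        """Drop and return every transaction older than max_age."""
        old = [t for t, ts in self.entered.items() if now - ts > self.max_age]
        for txid in old:
            del self.entered[txid]
        return old
```

A wallet that keeps its own copy of the transaction simply sends it again after it was dropped, so a short lifetime in the node costs the user nothing.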
Make sure that infrastructure (like nodes and REST services) is easy to set up and run before you start to build on it. You can build your new project on some free server from another company, but if it grows you should be able to run your own (or pay someone to run one for you). Hold infrastructure responsible for its uptime!
This one is definitely not obvious (yet - it will be common knowledge eventually). Lots of developers depend on their connected node/server to provide ground truth, even for their own transactions. If there ever ends up being an actual conflict, then of course the network "wins" but until then yeah - you are responsible for your own data.