corgi is a novel framework for testing emergent properties of bitcoind networks


4 years ago

Hi all, I'm @mtrycz and I try to be an evidence-oriented person. Recently I joined Bitcoin Cash Node as a contributor, based on this inclination. corgi is a result of my contribution and it's a project under the BCHN umbrella. Find it here.

Just look at its smug, self-confident smile. Isn't it adorable? Logo by @Leandrodimarco

What is corgi then? It is a framework for testing emergent properties of bitcoind networks. What that means is that node software has properties that enable complex interactions once the nodes form a network. The properties of these networks are not always readily apparent - these are called "emergent" properties, and their study is a fascinating topic. I really recommend that wiki page, but the tl;dr is that it is difficult to infer the emergent properties of a system by observing the properties of its parts - it's often the complex interactions of the parts that yield the emergent properties.

The origin of the name is pretty simple: a little known fact about corgis is that they're actually a shepherd breed, and since we'll be managing a sizeable herd of bitcoind nodes, corgi will make things simpler for us.

bitcoind networks can be simulated, up to a degree, and there already is effort in that direction. There are a couple more things that I wanted, besides regular simulation:

evidence is worth something only if it's verifiable/reproducible by peers
properties of the live Bitcoin Cash mainnet can be gauged

So how does it work?

corgi automates the tedious work of setting up a network of bitcoind nodes around the world. It uses Amazon Web Services, which lets people rent servers on-demand. I've chosen AWS over other services because it has the most locations around the globe (16 by default, up to 20 available as of July '20). Since running corgi will transfer your money to them, I think a disclaimer is in order: I have no association with AWS other than as a customer.

Automating AWS deployment is kinda tedious, though. corgi will take care of that so you can focus on what matters to you: the actual tests.

What can be done with it

My first test case is testing block and transaction propagation. Bitcoin Cash Node has inherited the relay code from Bitcoin Core and Bitcoin ABC, and it's actually unfit for Bitcoin Cash: there is a long random delay with a mean of 5 seconds. The reasoning is that the randomness can conceal the origin of a transaction and the delay can help with batching transactions.

On BTC this is not a problem because it is designed for just a few transactions per second, while BCH aspires to do a whole lot more than that. Also, the privacy concern is moot on BCH, because people aren't expected to run nodes to be able to use it.

So how does it do in practice? Really poorly, actually:

Transaction propagation on a network of 256 BCHN nodes with 8 outgoing connections each. x-axis is time in seconds, y-axis is the number of samples, each sample represents the time for a transaction to reach all nodes

On a medium-sized network of 256 nodes, the mean propagation time is some 9 seconds. This would only increase with the number of nodes. It is quite slow, considering that blocks propagate across the whole network in less than 0.5 seconds. (Read up on the methodology here).
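To get a feel for how per-hop delays add up to a whole-network propagation time, here is a minimal toy simulation. It is not corgi's methodology and not bitcoind's relay code: it assumes connections relay in both directions, that every hop waits an exponentially distributed delay (the post only states a random delay with a mean of 5 seconds), and it ignores real network latency entirely. The function names are made up for illustration, and the numbers it prints should not be compared with the measurements above.

```python
import heapq
import random


def build_topology(n_nodes, n_outgoing, rng):
    """Each node opens n_outgoing connections to distinct random peers.

    Assumption: a connection relays in both directions.
    """
    neighbours = {node: set() for node in range(n_nodes)}
    for node in range(n_nodes):
        others = [p for p in range(n_nodes) if p != node]
        for peer in rng.sample(others, n_outgoing):
            neighbours[node].add(peer)
            neighbours[peer].add(node)
    return neighbours


def time_to_reach_all(neighbours, origin, mean_delay, rng):
    """Time at which the last reachable node first hears about a transaction.

    Assumption: each node waits an independent exponential delay per link
    before relaying. This is a shortest-path search with random edge weights.
    """
    arrival = {origin: 0.0}
    queue = [(0.0, origin)]
    while queue:
        now, node = heapq.heappop(queue)
        if now > arrival[node]:
            continue  # stale queue entry, a faster path was already found
        for peer in neighbours[node]:
            heard_at = now + rng.expovariate(1.0 / mean_delay)
            if heard_at < arrival.get(peer, float("inf")):
                arrival[peer] = heard_at
                heapq.heappush(queue, (heard_at, peer))
    return max(arrival.values())


if __name__ == "__main__":
    rng = random.Random(1)
    net = build_topology(256, 8, rng)
    samples = [
        time_to_reach_all(net, rng.randrange(256), 5.0, rng) for _ in range(200)
    ]
    print(f"toy mean time to reach all nodes: {sum(samples) / len(samples):.2f} s")
```

Changing `mean_delay` or the number of nodes is a cheap way to build intuition locally before paying for a real run.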

Visualisation of a network of 126 nodes with 3 outgoing connections each. The color of the node indicates its in_degree (darker is more heavily connected)
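The in_degree mentioned in the caption is simply the number of peers that connected to a node. A small sketch of how it could be computed, assuming each node picks its outgoing peers uniformly at random (corgi's actual peer selection may differ, and the function names here are hypothetical):

```python
import random
from collections import Counter


def outgoing_connections(n_nodes, n_outgoing, rng):
    """Map each node to the distinct random peers it connects to."""
    return {
        node: rng.sample([p for p in range(n_nodes) if p != node], n_outgoing)
        for node in range(n_nodes)
    }


def in_degrees(outgoing):
    """Count how many peers connected to each node."""
    counts = Counter({node: 0 for node in outgoing})
    for peers in outgoing.values():
        counts.update(peers)
    return counts


if __name__ == "__main__":
    rng = random.Random(42)
    degrees = in_degrees(outgoing_connections(126, 3, rng))
    busiest, count = degrees.most_common(1)[0]
    print(f"node {busiest} has the highest in_degree: {count}")
```

Every outgoing connection is someone else's incoming one, so the in_degrees always sum to the number of nodes times the outgoing connections per node.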

This doesn't match the actual experience on BCH mainnet, mainly because other implementations (like Bitcoin Unlimited, Flowee and possibly others) relay transactions much faster, so the network (and user experience!) benefits as a whole.

BUT! There is a tradeoff to be made between reducing the delay and increasing bandwidth usage. BCHN is looking at removing, or at least reducing, this delay, and the tradeoff with bandwidth is the next test case we'll be studying.
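Why would a shorter delay cost bandwidth? One intuition is batching: the longer a node waits, the more transactions it can announce in a single message. The toy model below illustrates only that intuition. It assumes transactions arrive as a Poisson process and that the node flushes its queue whenever an exponential timer fires. Both are assumptions for illustration, and real bandwidth depends on message sizes and protocol details this sketch ignores.

```python
import random


def announcements_sent(tx_rate, mean_delay, duration, rng):
    """Count non-empty announcement batches one node sends to one peer.

    tx_rate is in transactions per second, mean_delay and duration in seconds.
    """
    sent = 0
    now = 0.0
    next_tx = rng.expovariate(tx_rate)
    while True:
        now += rng.expovariate(1.0 / mean_delay)  # time of the next flush
        if now > duration:
            return sent
        queued = 0
        while next_tx <= now:  # collect everything that arrived before the flush
            queued += 1
            next_tx += rng.expovariate(tx_rate)
        if queued:
            sent += 1


if __name__ == "__main__":
    rng = random.Random(0)
    for delay in (5.0, 1.0, 0.2):
        count = announcements_sent(tx_rate=10.0, mean_delay=delay, duration=1000.0, rng=rng)
        print(f"mean delay {delay} s: {count} announcements")
```

In this model a shorter delay means more, smaller batches for the same number of transactions, which is the bandwidth side of the tradeoff.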

This is just one test case, and a whole load of other properties could be tested, among others: behavior under high congestion, the performance of block compression algorithms, pre-consensus strategies, and so on.

Cool, how do I use it?

As of now, it's sufficient to clone the repository, satisfy the prerequisites (mainly, have your AWS credentials ready) and run the main.py script. Some useful logs as to what is happening will be provided, and a cool chart like the one above will be generated at the end - the default test case is to test block and transaction propagation with 256 nodes. You might want to lower that a bit if you just want to test things out!

Be aware that running corgi costs money, so it might make sense to use a different framework to run local simulations first to refine your test case, and only then transfer that to AWS when you're actually ready. AWS costs do add up!

When not to use it: when you're doing iterative research by approximation and operating on a small-to-medium number of nodes.
When to use it: when you need bigger scale; when you need to test something on the mainnet; when you're done testing and want to produce verifiable results.

For now the framework is meant to be edited freely; think of it as an elaborate script that you'd need to tailor to your needs. The way it abstracts the AWS shenanigans is what's cool about it, but you still need to provide the actual test case. On the bright side, I can help with that (find me in the BCHN Slack) and am actively on the lookout for further test cases.

The documentation provides some overview and a tutorial is in progress.

Let me know what you think, and reach out if you want any help! And have a great time creating reproducible evidence.