Many cryptocurrencies (most notably Bitcoin) rely on special functions called cryptographic hash functions to produce security and decentralization. They achieve this by providing a reliable and fair Proof-of-Work. In this article, I'm going to break down these concepts!
A hash function is a deterministic function that maps an arbitrary size input to a fixed size output. The input data is known as the message, while the output is known as the hash, or digest. Hash functions have a wide variety of uses. One such use is to verify that data was transferred without errors. Both the sender and receiver can compute a hash of the data, and exchange hashes. If they match, the data was sent successfully.
A cryptographic hash function takes things one step further: it is specifically engineered such that a user cannot invert the hash (that is, find the input that gives a given hash). This is known as a preimage attack.
The cryptographic hash function used by Bitcoin (among others) is SHA-256, which was designed by the National Security Agency. Here's a sample input/output:
> sha256("My name is Ken.")
Notice that if we modify the input string ever so slightly, we get a wildly different output! This is a desirable property of hash functions.
> sha256("My name is Ken?")
The output is 64 hexadecimal digits, which encodes a 256-bit number. SHA-256, as the name suggests, always produces a 256-bit output. In addition, SHA-256 hashes have all the statistical properties of random numbers. If you computed the hashes of 10,000 random inputs, the outputs would be evenly distributed, each bit independent from the rest.
If we have a solid cryptographic hash function at our disposal, we can use it to make a system where people can "prove" that they did a certain amount of computational work. I mentioned earlier that quickly inverting a SHA-256 hash is thought to be computationally intractable - an attacker would be forced to use a brute-force search to find the input that produces a given hash. This is known as a full hash inversion. What if we made the problem slightly easier?
Instead of finding an input that produces an arbitrary hash, what if we asked for the input that produces any hash that starts a zero? Due to the statistical properties of SHA-256, approximately 1/2 of the hashes start with 0, while 1/2 of the hashes start with 1. Therefore, you would have to compute the hashes of approximately 2 random inputs to find the target hash. What if we asked for a hash that starts with two zeroes? Since this occurs approximately 1/4 of the time, we need to compute approximately 4 hashes. In general, you need to compute 2^n hashes, where n is the number of zeroes in the target hash.
This easier problem is known as the partial hash inversion, and is the foundation of Bitcoin mining. We want transactions to be validated by independent, decentralized volunteers, rather than a centralized entity. If we choose a random Bitcoin node to validate the next block, what stops an attacker from running thousands of malicious nodes? If we give everyone a chance to participate in the network, but only if they prove that they did a certain amount of computational work, we create a system that cannot be controlled by a single individual.
Hash functions simply provide a fair game for everyone to play, to ensure decentralization. I hope you enjoyed this article!