Wednesday, June 7, 2017

blockchain in a nutshell

Cryptocurrencies are all the rage today, and if recent price movements are any indication, will be in the news for some years to come yet. Whether they can mount a true challenge to fiat currencies remains to be seen, though it is heartening to see the gradual uptake in countries like Japan.

In this post, we look at the underlying technologies for cryptocurrencies - things like blockchain and bitcoin - by looking at ideas from the ground up. So let's get to it...

There are ten simple ideas we need to understand first before we get into any of these, so we can use them as building blocks for what is to come. These ideas are:

  1. hashes or cryptographic message digests - these are one-way functions like SHA that when given any length string of input, generate a unique 128 or 160 bit string as output. The output string may be regarded a digital fingerprint of the document. Changing even one bit in the input document will on average change half the number of bits in the output, and it is computationally very expensive for anyone to try to find another document that has the same hash as a given document (for a 160 bit hash, they will on average have to look through 2^160/2 documents before they find a match - computationally infeasible), though it is trivial for anyone to compute the hash of a document they are given (hence the term "trapdoor function" or "one-way function").
  2. public key cryptography - symmetric key encryption is widely understood. A and B have a shared secret key S. A encrypts plaintext into ciphertext using S. B uses S to decrypt whatever ciphertext A sends her. But asymmetric or public key encryption is much more interesting. This uses ideas from the difficulty of factoring the products of large prime numbers, or from elliptic curve functions (more efficient to implement on smaller devices, and more resistant to attack), to support the notion of a key-pair in encryption. Both A and B have their own pair of keys called their public and private keys respectively - they both publish their public keys, and guard their private keys as secrets, with their lives. Anything that is encrypted with the public key can only be decrypted with the corresponding private key (only from the same key pair) and vice-versa. Now if A wants to send B a secret message, A can optionally sign the message with his private key, then encrypt the signed message with B's public key, and send it to B. Then only B can read the message (needs B's private key to decrypt it), and B can check A's signature on the message - anyone with A's public key can check that A applied his private key to the message, so B can know with confidence that A generated that message, and that only B can read it.
  3. Digital wallet - every user in the system is identified by a wallet ID which is a unique, long string of alphanumeric characters. All transactions are carried out using this ID. This ID is called the user's digital wallet. A digital wallet holds digital cash.
  4. Pseudonymity - Only the user knows that the digital wallet ID links to her. Network nodes only know the value this ID carries. Thus transactions are not completely anonymous as with cash. Here the network knows how much each ID carries. But the network does not know who owns the ID. If Alice ever loses her wallet ID, she loses all her money - it cannot be recovered.
  5. Transaction idempotence - imagine you are talking to an ATM over the network. You want to send John $10. You tell the ATM "Debit self, credit John, $10". Let's say after you send the message, but before you receive confirmation, your network link dies. Many possibilities arise - a. your message was never received by the ATM, b. your message was received by the ATM, was processed successfully, and you never got the confirmation, c. your message was received by the ATM, there was a problem, and the ATM's reply never reached you. If you re-send the message, your account may see a net debit of $20, which is not what you want. But if you don't resend the message, John will be unhappy if (a) or (c) happened. So what to do? Let's say your account has $100 at the start. If you send the ATM a message saying "<timestamp>,my account, $100, $90, John's account, $10", even if you resend this to the network 1000 times, the message leaves your account in a predictable state. This is called transaction idempotence.
  6. No double spending - the number of bitcoins available is capped at 21 million. But how to ensure Alice cannot copy her money (since it is digital after all), and spend the same digital cash with Bob and then with Charlie? cryptocurrencies do this by using the idea of blockchain. Every transaction is recorded in blockchain. So somewhere in blockchain there is a record of Alice getting (say) $10 when she opened her account. Thereafter, all transactions into her account also show up in blockchain. So tracing her wallet ID backwards through processed blocks gives network nodes a clear view of how much money Alice's wallet ID has at the current time - even though they might not be able to associate that wallet ID with Alice's real world identity given pseudonymity.
  7. Merkle trees - if you have (say) 32 hashes, you can arrange them into a tree structure - take hashes of 2 hashes each to start with (gives 16 hashes from the original 32), then hash those 16 two at a time to get 8 hashes, then 4 from the 8, 2 from the 4, and then 1 from the two. In other words, you can take 32 independent items, hash them, then melt the hashes together to get a single hash of a fixed length that ensures that if even one bit of any of the original documents changes, the top-level hash is off. This construct is called a Merkle tree.
  8. Proof of work - we said earlier that it was very difficult, if not impossible for anyone to find a document that has the same hash as another one. What if there were a large number of nodes in a network, and they were all tasked to solving this problem? If they each concentrated in a different random area of the search space with minimal overlaps, they might be able to find a match. In fact, blockchain relies on this idea. A hash is given to all nodes (how this comes about is covered later) along with some data, and they are asked to find a secret of fixed length, which when concatenated with the given data and hashed, will result in the same hash as the one given, but with a pre-specified number of leading zeros. Since trapdoor functions are easy to compute one way, but hard the other way, once any node stumbles upon a solution, when it shares it with other nodes, they can easily verify the solution's correctness. In providing a solution to everyone else, the node provides a "proof of work" - since there are leading zeros in the published hash, no node is at an advantage when the problem is posed to the network.
  9. Consensus algorithm - in a network with 2N nodes, when there are conflicting opinions or pieces of data, what N+1 or more nodes believe to be true, goes. Simple majority wins. Always.
  10. Trustless entities - every element connected to the network believes only in maximizing their own gains and trusts no one else.... at all.


With the above background, we are now able to look at how cryptocurrencies and blockchain more broadly, work.

Transactions are organized into blocks - each of which has a long alphanumeric string as an ID. Each block also has a number that every node in the network knows. Every (say) ten minutes or so, nodes in the network (we will see later which nodes) gather up all as yet unprocessed transactions, verify them to see that the transactions are legitimate (e.g. signed by the spending party authorizing payment, and that the cash is not being double-spent), and then build a Merkle tree with them. Then they concatenate the block ID with the Merkle tree hash of collected transactions, and a secret they generate, compute the hash of these things together, and publish it out there to everyone on the network.

Let's say Alice who has $10 in bitcoin buys some coffee with it for $1.50. Alice's transaction then shows in the network with a timestamp, her wallet ID (a pseudonymous identifier for her wallet), and the fact that she pays $1.50 to the coffee vendor wallet ID, and pays $8.45 (say) to herself (the $0.05 difference is the fee she is paying the blockchain network for processing her transaction). This transaction is idempotent since even if it were reissued to the network, it goes by the total amount of money in Alice's account, and that can only be debited once from its value of $8.45 with the given timestamp.

Nodes can verify that Alice signed the transaction by checking the signature with Alice's public key that is widely known since it is published. Nodes can verify that Alice has $10 in bitcoin to spend, by looking through all the old accepted blocks in the blockchain till they find enough evidence that Alice has $10 to her name (rather her pseudonymous wallet ID).

Once verification is complete, her transaction is bundled with other transactions that happened since the last block was published to blockchain, a Merkle tree hash is computed with it, this is concatenated with the blockID of the last block in the blockchain, and a "salt" - a secret the creator of the current block generates, and a hash computed and sent through the network along with the block of transactions used in this computation. 

All nodes in the network now scramble to find a "proof of work" - a secret that will generate the given hash but with a number of leading zeros (this number is calibrated by the number of blocks in the blockchain at any point in time, so as more nodes join and the technology gets more mature, and there is more competition to mine the search space for the solution, it keeps the degree of difficulty of solving the proof of work roughly the same). The first node (X say) to find this secret that meets the leading zeros criteria then publishes this to all other nodes in the network. 

Every node that hears of this, first checks the hash to ensure the proof of work is satisfied. The node that solved the problem earns 25 BTC (bitcoin) as well as the right to build the next block in the chain, for its trouble. All nodes that hear this solution first add the disseminated new block to their view of the chain. X then gathers up the as yet unprocessed transactions, constructs a new block with a new proof of work challenge, and sends it out to the network. And the race starts again - every node competes to solve the challenge to further its own interests to make more money. The process of trying to solve the challenge is called mining. Bitcoin miners solve the proof of work challenge to get paid 25 BTC for each challenge they solve, and to collect payment for each transaction that is processed (5 cents for Alice's transaction above, for example).

If two nodes solve the challenge at the exact same time, the one that propagates faster into the network gets accepted, because the sooner it is seen, the sooner the nodes that see it start working on solving the challenge from the next block, which means the longer the accepted chain, the more likely it is accepted as true. This results in a consensus protocol and also means any attacker who wants to subvert fair processing of transactions has to take over at least N+1 of the 2N nodes in the network to exert any influence.

This also means as the chain gets longer, any records of ownership in the blockchain get more ingrained into the system. This ownership cannot be easily disputed, unless a node can show a longer chain of blocks from that block in the past, whose hashes are all correct looking forward - a task that gets incrementally more monumental given a new block is being produced by the network of nodes every ten minutes. This nearly eliminates the possibility of cheating and fraud.

That, in a nutshell, is how blockchain technology works. Of course, there are constructs like Smart Contracts that can be layered atop it to make it even more useful, but we leave those for another day.