1:3 Other Ways to Make the Sausage
This chapter works to further the ideas presented in the previous brute force consensus chapter by showcasing the three main ways transactions can be validated to reach a consensus about what is placed into a shared ledger database.
Treat this chapter as a very high level reference guide of sorts to refer back to in subsequent chapters. Many of the topics introduced here will be discussed in much more detail in future chapters. Keep in mind this “three buckets” approach is a cheap short cut to try and create an easier mental map of a very complex topic.
By "miner" we mean the computer(s) that place transactions into the shared ledger commons.
By “transactions” we mean any piece of data sent to the shared ledger. Data can be anything from a simple peer-to-peer transaction all the way to uncompressed 4K video feeds.
“They Are”, “You Elect”, and “You Are” refer to who (e.g. which computers) are performing the computations.
"They"-are-the-miner: are various bitcoin like strategies where miners race to find winning random numbers either in proof-of-work like setups illustrated in the previous chapter, or other "sacrifice" based approaches such as proof-of-capacity where something more useful like hard drive storage space is used to tie validation to the real world.
The key element is anyone with a computer can join the race and have a random chance of being selected to validate transactions. As we will see, anyone is a bit of a misnomer due to economies of scale favoring larger and larger pools of miners.
Examples include Bitcoin, Litecoin, Monero, etc.
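To make the "race to find winning random numbers" concrete, here is a toy sketch in Python. It assumes a simplified puzzle: keep trying nonces until the SHA-256 hash of the block data plus the nonce starts with a chosen number of zero hex digits. The function names and the sample block text are made up for illustration. Real Bitcoin mining hashes a binary block header twice and compares it against a numeric target, so treat this as a mental model rather than the actual protocol.

```python
import hashlib

def mine(block_data, difficulty):
    """Try nonces one by one until the hash starts with `difficulty` zero hex digits."""
    target_prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(target_prefix):
            return nonce, digest
        nonce += 1

def is_valid(block_data, nonce, difficulty):
    """Checking a claimed winner takes one hash, no matter how long the search took."""
    digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)

# Hypothetical block contents, used only for this example.
nonce, digest = mine("alice pays bob 1 coin", difficulty=4)
print(nonce, digest)
print(is_valid("alice pays bob 1 coin", nonce, 4))
```

Each extra zero digit makes the search about 16 times longer on average, while checking a winner stays a single hash. That asymmetry is what lets anyone verify the race without rerunning it.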
"You"-elect-the-miner: work by token holders in the system electing who they want to validate transactions, rather than through purely random chance competition. Voters cast their ballots most often with a one-token-per-vote type system, and expect to receive a portion of the network fees back for casting their vote for a particular computer that maintains a full copy of the ledger called a validator.
Within the elected group of validators, an element of randomness must be present for who is chosen to submit new transactions, or the system would work no differently from a centralized database. (E.g., the validators must not be able to consistently win or the ledger risks being co-opted)
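The two steps described above (token holders vote, then a validator is picked at random from the winners) can be sketched roughly as follows. This is an illustration under simple assumptions: one token equals one vote, the top few candidates win a seat, and a seeded pseudo-random generator stands in for whatever randomness source a real network would use. The node names, vote amounts, and function names are all hypothetical.

```python
import random
from collections import Counter

def elect_validators(votes, seats):
    """votes is a list of (candidate, tokens) pairs; return the top `seats` candidates."""
    tally = Counter()
    for candidate, tokens in votes:
        tally[candidate] += tokens
    # Rank by token total (highest first), then by name so ties break predictably.
    ranked = sorted(tally.items(), key=lambda item: (-item[1], item[0]))
    return [candidate for candidate, _ in ranked[:seats]]

def pick_block_producer(validators, rng):
    """Randomly choose which elected validator submits the next batch of transactions."""
    return rng.choice(validators)

votes = [
    ("node-a", 500), ("node-b", 120), ("node-c", 300),
    ("node-a", 50), ("node-d", 90), ("node-b", 200),
]
validators = elect_validators(votes, seats=3)
producer = pick_block_producer(validators, random.Random(42))
print(validators)  # ['node-a', 'node-b', 'node-c']
print(producer)
```

A local pseudo-random generator is only a placeholder here. In a live network the random choice has to be something every participant can check, otherwise a validator could simply claim it won.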
Examples include publicly elected networks like NEO, Ark, EOS, and Lisk, as well as private implementations like Hyperledger, R3's Corda, and Libra, where validators are not publicly elected but rather chosen by a closed consortium.
"You"-are-the-miner: A catchall term for systems that do not fall neatly into the previous two paradigms. In these systems, there are flavors of storing your own local chain that matches up with some subset of a global chain of transactions to verify consensus. In effect "you" do some of the work of a miner to push out transactions. “You” is a loose term which could be anything from running your node on an old laptop or smartphone, to a cloud-based virtual machine, to specially designed networking hardware.
Examples include frameworks like Holochain and Radix, which allow developers to launch their own encrypted networks and also provide hosting for those who do not want to operate their own cloud hardware.
There are no hard and fast rules separating these three main approaches, and any distributed ledger network is free to fundamentally alter the way it validates transactions by either:
The majority choosing to support an updated version of the ledger (no split)
The minority choosing to support an updated version of the ledger (split into two projects - legacy version & updated version)
Starting a brand new ledger from scratch with new rules and new owners
Due to the decentralized nature of distributed ledgers, gaining consensus to change critical network parameters is a difficult and often contentious process. Depending on which upgrade path is chosen, there will be different winners and losers, which makes this a fundamentally human rather than computer science problem.
"They" Are the (Un-elected) Miner
Bitcoin was the first true distributed ledger, setting the precedent for a sacrifice based approach to ensure transactions were honestly processed in a provable, unmodifiable way.
Within this they-are-the-miner electricity burning race approach, there are three primary sub approaches which seek to improve the scalability and security of the network.
Off-chain: as the name implies, this method allows transactions to be sent without recording each one onto the ledger, instead referencing the ledger only when there is a conflict.
Big Blocks: simply increase the block size to increase the number of transactions the network can process. While Bitcoin has chosen to stay with 1 megabyte blocks, forks of Bitcoin such as Bitcoin Cash allow 32 megabyte blocks to be posted every 10 minutes, with plans to increase the block size further as needed.
New Sacrifice: approaches replace the SHA-256 algorithm bitcoin uses with a different type of sacrifice using a different random number finder, or another type of potentially more useful sacrifice such as sacrificing hard drive space, or searching for large prime numbers.
The dominant version of Bitcoin (as of this writing) chose to implement a multi-year development called the lightning network which creates a second layer of transaction processing on top of the primary bitcoin blockchain.
The lightning network system works by opening a channel of transactions where parties can make many transactions with each other without needing to send each transaction to the ledger. Instead, they send transactions through open lightning channels which only use the base bitcoin blockchain as an arbitrator if there is a conflict within the channel.
This approach can be hugely beneficial for:
privacy: as there is no blockchain record of the transaction, and
scalability: as the 1 megabyte of transactions bitcoin can process every 10 minutes is not filled up with off-chain lightning transactions
When a channel between parties is opened or closed, a single transaction is registered with the blockchain. Once established, a lightning channel removes the need to register every intermediate transaction that takes place in the channel with the underlying blockchain. Like a traditional network router, transactions “hop” between “relay nodes”. (E.g., each individual running a lightning node can forward transactions without being able to steal, co-opt, or modify them.)
In theory, a system like the lightning network will work well for someone like a large retailer with many thousands or even millions of transactions per day. The retailer might pay your one time blockchain fee to open up a channel on your behalf, then be able to credit and debit an unlimited number of transactions with you for zero miner fee, as none of the intermediate transactions are posted to the underlying bitcoin blockchain. Only if there is a dispute between parties does bitcoin get involved as the ultimate arbiter who can move the underlying bitcoin collateral. By “bitcoin getting involved” we really just mean you can revoke a transaction and get your bitcoin back from the lightning network as long as you have your private key.
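The retailer scenario can be sketched as a simple bookkeeping model. This is a heavily simplified illustration, not how lightning is actually implemented: it leaves out signatures, revocation keys, routing, and dispute handling, and only counts how many transactions would touch the base blockchain. The class and attribute names are invented for the example.

```python
class PaymentChannel:
    """Toy two-party channel: one on-chain transaction to open, one to close."""

    def __init__(self, deposit_a, deposit_b):
        self.balances = {"a": deposit_a, "b": deposit_b}
        self.on_chain_txs = 1        # the funding transaction that opens the channel
        self.off_chain_updates = 0   # balance updates that never touch the blockchain
        self.is_open = True

    def pay(self, sender, receiver, amount):
        if not self.is_open:
            raise RuntimeError("channel is closed")
        if amount <= 0 or self.balances[sender] < amount:
            # A channel can never move more than the collateral locked inside it.
            raise ValueError("insufficient channel balance")
        self.balances[sender] -= amount
        self.balances[receiver] += amount
        self.off_chain_updates += 1

    def close(self):
        """Settle the final balances with a single closing transaction."""
        self.is_open = False
        self.on_chain_txs += 1
        return dict(self.balances)

channel = PaymentChannel(deposit_a=100_000, deposit_b=0)
for _ in range(1000):
    channel.pay("a", "b", 50)    # customer "a" pays retailer "b" a thousand times
final = channel.close()
print(final)                     # {'a': 50000, 'b': 50000}
print(channel.on_chain_txs)      # 2
print(channel.off_chain_updates) # 1000
```

A thousand payments cost two blockchain transactions in this model, which is the whole appeal. The balance check in `pay` also shows why channels are capped by the collateral locked into them.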
As off-chain solutions effectively create multiple layers in which to route transactions, they can be seen as the evolution of distributed ledger systems into something that more closely resembles the current internet. Over the decades, the internet has added many layers of architecture, such as TCP/IP, HTTP, and FTP, that act in coordination to run a massive interconnected global ecosystem.
Building a new internet based on shared ledgers and data integrity through hashing is not an easy task. There are many challenges left to be solved before lightning networks can become mainstream including:
The need for both parties to remain online for transactions to process. The base Bitcoin protocol conversely is “asynchronous” in that you can send “Real Bitcoin” to a public key address that is offline, as the record of the event is stored in the shared ledger that can be synced up to at any time in the future by providing the correct private key.
More insidiously, lightning networks quasi recreate the existing banking system by giving lightning channel operators certain power over transactions passing through their network such as what fees to charge, and if they want to ban certain addresses.
Lightning channels are also limited to actual bitcoin collateral backing them (which is a good thing lest we accidentally recreate the fractional reserve system). This however favors routing transactions through lightning hubs with large amounts of collateral, which can be seen as further centralization of the protocol.
The alternate version of Bitcoin called "Bitcoin Cash" removed the ability to perform these off-chain transactions in favor of increasing the block size and retaining the simplicity of the original Bitcoin protocol. As of May 2018 each Bitcoin Cash block can process 32 megabytes worth of transactions every 10 minutes, though in practice most blocks are very far from being full.
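To get a feel for what the block size means in practice, the back-of-the-envelope calculation below converts block size into transactions per second. The 250-byte average transaction size is an assumption picked for illustration (real transactions vary widely), and a megabyte is taken as 1,000,000 bytes.

```python
BLOCK_INTERVAL_SECONDS = 600   # one block roughly every 10 minutes
AVG_TX_BYTES = 250             # assumed average transaction size, for illustration only

def tx_per_second(block_size_mb, avg_tx_bytes=AVG_TX_BYTES):
    """Upper bound on throughput if every block is completely full."""
    txs_per_block = block_size_mb * 1_000_000 / avg_tx_bytes
    return txs_per_block / BLOCK_INTERVAL_SECONDS

print(round(tx_per_second(1), 1))    # 6.7   (1 megabyte blocks)
print(round(tx_per_second(32), 1))   # 213.3 (32 megabyte blocks)
```

Under these assumptions, a 32x larger block gives a 32x higher ceiling, with no other change to the protocol.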
This approach increases the storage requirements to host a full node (or full copy of the ledger). Larger storage requirements to maintain a full copy of the ledger can limit the number of distributed ledger “full nodes” people are willing to run. If you need to store 32 megabytes of new information every 10 minutes just to send a payment, most users will not be willing to purchase a new hard drive every year just to use the protocol.
At 3.2 megabytes per minute × 525,600 minutes in a year, if all blocks are full, roughly 1.7 terabytes of new blockchain data will be created each year.
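The arithmetic behind that yearly figure is easy to check. The sketch below assumes every block is completely full and counts a terabyte as 1,000,000 megabytes.

```python
BLOCK_SIZE_MB = 32
BLOCK_INTERVAL_MINUTES = 10
MINUTES_PER_YEAR = 365 * 24 * 60   # 525,600

blocks_per_year = MINUTES_PER_YEAR // BLOCK_INTERVAL_MINUTES   # 52,560 blocks
mb_per_year = blocks_per_year * BLOCK_SIZE_MB                  # 1,681,920 megabytes
tb_per_year = mb_per_year / 1_000_000

print(blocks_per_year)                    # 52560
print(f"{tb_per_year:.2f} TB per year")   # 1.68 TB per year
```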
Thus most users will not download the full copy of the ledger (called a full node) and instead rely on a lite wallet: software that does not need a full copy of the ledger to work, but instead references someone else's full copy. A technique called simplified payment verification (SPV) lets the lite wallet check a short chain of hashes linking its transaction to a block, rather than recomputing the entire blockchain, so that lite wallet transactions can be trusted without storing the full ledger.
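The hash checking a lite wallet relies on can be sketched with a small Merkle tree. In this illustration the full node builds the tree and hands over a short "proof" (one sibling hash per level), and the lite wallet only needs that proof plus the root hash to confirm its transaction is included. For brevity it uses a single SHA-256 and plain byte strings as transactions, while Bitcoin itself uses a double SHA-256 over serialized transactions. The function names are made up.

```python
import hashlib

def h(data):
    return hashlib.sha256(data).digest()

def merkle_root_and_proof(leaves, index):
    """Full-node side: return the root plus the sibling hashes needed to prove leaves[index]."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])          # duplicate the last hash on odd-sized levels
        sibling = index ^ 1                  # the neighbour that shares the same parent
        proof.append((level[sibling], sibling < index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof

def verify(leaf, proof, root):
    """Lite-wallet side: rebuild the path to the root using only the proof."""
    current = h(leaf)
    for sibling, sibling_is_left in proof:
        current = h(sibling + current) if sibling_is_left else h(current + sibling)
    return current == root

txs = [b"tx-a", b"tx-b", b"tx-c", b"tx-d", b"tx-e"]
root, proof = merkle_root_and_proof(txs, index=2)
print(len(proof))                       # 3 hashes are enough for 5 transactions
print(verify(b"tx-c", proof, root))     # True
print(verify(b"tx-x", proof, root))     # False
```

The proof grows with the logarithm of the number of transactions, so even a block holding thousands of transactions needs only a dozen or so hashes to check one of them.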
The debate between using small blocks to keep a full copy of the blockchain accessible for users to download, and using big blocks to grow the capacity of the network in a simple way, is far from over. A thought provoking blog post imagines a world with many terabyte blocks that can record every waking moment of life. In such an extreme example, the hosting of the nodes would by necessity become centralized as storing such vast amounts of information could only be done by large scale operations.
The Bitcoin protocol does not provide any incentives to host full copies of the ledger, and instead only rewards the miners that process the transactions. For a distributed ledger system to succeed at scale, creating incentives to continue hosting robust copies of the ledger long after transactions are first sent will become ever more crucial.
As mentioned in the previous chapter, Bitcoin uses a very specific math puzzle called SHA2-256 to race for random numbers that will win each new block. Since the introduction of SHA2-256, many newer variants of finding random numbers have entered the market. These newer approaches provide higher levels of encryption with varying hardware requirements to effectively solve these newer types of math problems.
These newer algorithms try to improve on SHA2-256 in three main ways:
Make things harder for big guys to encourage smaller users to mine
Make things more secure to safeguard against quantum computers
Find something new to sacrifice that is more energy efficient or even "useful"
Harder for the big guys
If the key strength of distributed ledgers is in being decentralized, having only a few main groups doing all of the mining is not ideal. For this reason, projects like Litecoin have replaced SHA2-256 with a different algorithm called Scrypt that requires much more computer memory to search for random numbers. Because the memory requirements are higher, mining for Litecoin favors smaller users over larger users, as large users have difficulty finding economies of scale when building out server farms with high memory requirements.
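The memory difference can be seen with Python's built-in `hashlib.scrypt`, which is available when Python is built against a recent OpenSSL. The sketch below uses the parameters commonly cited for Litecoin (N=1024, r=1, p=1), and feeding the header in as both password and salt is an assumption about how a Litecoin-style miner would call it. The header bytes are a placeholder, not a real block header. The working memory estimate of 128 × N × r bytes comes from how the scrypt algorithm is defined.

```python
import hashlib

def scrypt_memory_bytes(n, r):
    """Approximate working memory scrypt needs for one hash attempt."""
    return 128 * n * r

def scrypt_hash(header, n=1024, r=1, p=1):
    # Assumption: the header doubles as the salt, as in Litecoin-style mining.
    return hashlib.scrypt(header, salt=header, n=n, r=r, p=p, dklen=32)

header = b"example block header"        # placeholder data, not a real header
digest = scrypt_hash(header)
print(digest.hex())
print(scrypt_memory_bytes(1024, 1))     # 131072 bytes, about 128 KiB per attempt
```

A plain SHA-256 attempt needs only a few hundred bytes of state, so every scrypt attempt ties up far more memory. That is the property meant to blunt the advantage of specialized hardware.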
Like any competitive activity though, with a high enough price creating enough incentive, ASICs (specialized chips that only do one thing) will be developed to more efficiently mine for any successful protocol. These economic incentives led to the creation of Ethereum specific miners being released by ASIC manufacturer Bitmain in 2018 which ran the Ethash algorithm very efficiently, at the expense of all other functionality. Prior to the release of the Bitmain ASIC Ethereum miner, the only way to mine for Ethereum was with dedicated GPUs which were designed primarily for more general purpose gaming and video processing, and thus operated at fewer hashes per watt than a dedicated Ethash ASIC.
Instead of using 256 bit keys, other distributed ledger projects have decided to use a combination of longer keys and new peer reviewed encryption techniques. In 2015, the successor to SHA-2 (aptly called SHA-3) was standardized, and it has since been adopted along with 512 or even 1024 bit encryption keys by projects such as NEM and Stellar Lumens.
The actual encryption mechanism is mostly a secondary concern, as any credible evidence of an algorithm like SHA-2-256 being broken would force a split in the network to a newer encryption technique.
The downside for such a move for Bitcoin however, is that the specialized ASIC computers which miners have invested billions into can only crack SHA-2-256, and entirely new ASICs would need to be developed if the math puzzle was updated.
The goal of many in the distributed ledger community is to find a more energy efficient or useful task to perform with all of the computing power available.
The only "useful" work performed by Bitcoin miners is to slowly weaken the security of SHA-2-256 by continually guessing random numbers until eventually (decades hopefully) the security no longer works.
One twist on finding random numbers is to try and find enormous prime numbers instead. While this approach helps accomplish something scientifically useful, it still uses large amounts of electricity to process transactions.
A concept called "proof-of-capacity" or "proof-of-space" mentioned briefly at the top of the chapter requires a large amount of hard drive space as "sacrifice" which can be used to create a decentralized storage network that provides useful storage capacity.
The holy grail of useful sacrifice is a system where general computation can be performed by miners in a manner that rivals the efficiency of centralized computation.
Turing complete systems (those that can run any arbitrary code), such as smart contracting platforms like Ethereum, can perform any general computation, but are currently 40,000x less efficient in terms of computation cost than a centralized competitor such as Amazon Web Services.
The first major paradigm shift away from sacrifice based validation of blocks is to replace this process with a series of full nodes (computers with a full copy of the ledger) that will validate blocks on your behalf. This approach solves the issue of incentivizing people to keep full copies of the ledger running. In this system, users of all sizes from individuals, to large companies, or even government scale operations maintain full ledgers in exchange for the fee income received when transactions are validated. As the computational resources can be as little as a single Virtual Machine with a few gigabytes of hard drive space, the barriers to entry for node operators vs Bitcoin miners are drastically lower.
While vastly more efficient in terms of energy spent per transaction validated, these "proof-of-stake" type systems have the extremely difficult task of ensuring the nodes validating the transactions will behave honestly.
Thus a random lottery system of some kind needs to be used to ensure a malicious group of nodes cannot take over the network, then process transactions however they see fit (including stealing tokens, refusing to process certain transactions, etc). A primary concern in a "you-elect-the-miner" system is when there is not a sufficient penalty for creating a network of many accounts controlled by one central party. This type of attack is known as a Sybil attack.
To deter Sybil attacks, many novel solutions have been developed that rely on mathematics to determine how trustworthy each node in the system is, and attach a cost to spamming the network with fake accounts. The cost is used to deter the Nothing at Stake problem, which is often illustrated using the spam email example:
If your email system was running on a distributed ledger system, every time someone wanted to send you an email it would cost a tiny fraction of a penny. The cost is low enough that it does not interfere with the normal course of sending and receiving emails, but deters bad actors from sending millions of spam emails.
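A quick calculation shows why such a tiny fee barely touches normal users but hurts spammers. The fee of one hundredth of a cent per email is a hypothetical number chosen for the example.

```python
from fractions import Fraction

# Hypothetical fee: one hundredth of a cent per email, stored exactly as a fraction.
FEE_PER_EMAIL_USD = Fraction(1, 10_000)

def sending_cost(num_emails, fee=FEE_PER_EMAIL_USD):
    """Total cost in dollars of sending `num_emails` messages."""
    return float(num_emails * fee)

print(sending_cost(50))           # 0.005  -> a normal user's day costs half a cent
print(sending_cost(10_000_000))   # 1000.0 -> a spam run of ten million costs $1,000
```

The same logic applies to fake accounts. If each one carries a cost, flooding the network with them stops being free.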
Popular flavors of "you-elect-the-miner" are listed below. Common elements between all of these different variations involve how validator nodes are chosen and maintain consensus with each other to continually update a global shared ledger of transactions.
*Note not all “you-elect-the-miner” approaches have an election mechanism. Confusing right? Some flavors of this system are simply pay-to-play, where you can buy access to become a validator rather than being freely elected. Just like politics!
Delegated-proof-of-stake (DPOS): A system of voting your tokens to a select group of full nodes who perform the block validation. This system was invented by Dan Larimer, and was first implemented on the Bitshares platform, before subtle variations were used to create the Ark, Lisk, and EOS networks, among others. In these systems, only the top X number of full nodes with the most votes are allowed to validate transactions. This system incentivizes full nodes to process transactions honestly as they are rewarded with fees for doing so. However, these systems by design are as centralized as the number of elected validators.
Masternodes: A system similar to DPOS, though Masternodes are not elected directly. Instead, anyone that can pay the required amount to have the minimum number of tokens needed to create a masternode becomes a validator. Examples include Dash, PIVX, and other variations of the masternode architecture.
Leased proof of stake: A hybrid of DPOS and Masternode models where anyone with the minimum number of tokens can validate blocks, but owners with less than the required minimum threshold can pool their resources by "leasing" their tokens to a full node that has the required number of tokens. The Waves platform uses such a system that allows anyone with a pooled amount of 10,000 or more Waves tokens to validate blocks, with plans to lower the threshold requirement over time by voting in new network rules.
Proof of Importance: A validation system used by the NEM platform that uses an implementation of Eigentrust to deter sybil attacks by analyzing relationships between nodes to determine if users are trying to maliciously co-opt the ledger. This system currently employs "supernodes" with 3 million NEM as collateral to validate all transactions on the network.
Slot Leader: A validation system used by Cardano which employs a variation of game theory mathematics to ensure bad actors cannot co-opt the ledger by making the cost of co-opting the ledger higher than the benefits gained.
Practical Byzantine Fault Tolerance: The consensus mechanism governing the IBM Hyperledger, NEO, Stellar, and Ripple projects where interconnected networks of full nodes can be chosen by participants to validate transactions. Variations on solving the Byzantine Generals Problem are employed by many other projects, all with the same goal of choosing a validator that can be trusted to process the transactions honestly.
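For the last item in the list, the core arithmetic of classic Practical Byzantine Fault Tolerance is simple enough to sketch: a network of n = 3f + 1 validators can tolerate up to f faulty or malicious ones, and a decision needs 2f + 1 matching votes. The snippet below assumes that textbook setup. The projects named above each use their own variations, so the exact thresholds may differ.

```python
def max_faulty(n):
    """Largest number of faulty validators a classic PBFT network of n nodes tolerates."""
    return (n - 1) // 3

def quorum_size(n):
    """Matching votes needed to commit, assuming n = 3f + 1 validators."""
    return 2 * max_faulty(n) + 1

for n in (4, 7, 10):
    print(n, max_faulty(n), quorum_size(n))
# 4 validators tolerate 1 faulty node and need 3 matching votes
# 7 validators tolerate 2 faulty nodes and need 5 matching votes
# 10 validators tolerate 3 faulty nodes and need 7 matching votes
```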
Each of these validation systems deserves a chapter of its own to explain in full detail, which would involve dissecting each project's individual "whitepaper" (or blueprint) for how they implement their protocol.
In an ideal world, the project with the most efficient and secure consensus mechanism would ultimately win out over projects with less efficient, and thus costlier mechanisms. However, in the real world we know:
Any project can upgrade to any other project's consensus mechanism with enough network support.
While Betamax had the superior image quality, it did not win out over the technically inferior VHS. While urban legend holds that the porn industry chose VHS over Betamax, spurring consumer demand for VHS players, the real reason had more to do with tape manufacturers choosing VHS for its lower licensing fees and longer run time than Betamax. Read into this cautionary metaphor what you will.
For the last two chapters, we have built on the notion that there is an inherent inefficiency in distributed ledgers due to a monolithic single source of truth that needs to continuously grow as new transactions are added to the ledger. But what if you could create your own distributed ledger that did not rely on a single monolithic source of truth to work?
If you follow the DLT space for long enough, you'll notice how often projects tout "transactions-per-second": how many raw transactions the network is capable of processing.
Interestingly, you never hear the existing centralized internet discussed in terms of how many transactions per second it can process (because it's an absurd notion). The current internet has a maximum throughput limited only by the number of new web servers that can be spun up to meet demand.
This is the heart of the scaling debate: how to gain the benefits of honest and immutable data records, with the capacity to run the entire existing internet plus the upcoming massive expansion of internet-of-things (IOT) connected devices.
Public chains vs private chains
So far we have only discussed public blockchains like bitcoin. Because bitcoin is an open source protocol, anyone is free to copy-paste the bitcoin source code (or any other open source protocol for that matter) onto their own private server network and create their own independent ledger.
This solution is excellent for keeping bloated transaction data off a universal main public chain, but is terrible for immutability. If a corporation employs a private blockchain purely inside of their network with no outside validation, there is no way to know if transactions are honest, which is the entire point of a distributed ledger in the first place. We will devote an entire chapter later on in the book to this point.
This segues nicely into why sidechains are excellent scalability solutions. In effect, a sidechain is just a private chain that decides to write transactions to a main public chain at regular intervals. As hashes are so efficient at turning data of an arbitrary size into a small fixed size, hashes of the private chain can be stored to the main chain instead of the entire private chain. This can increase the scalability of the protocol by orders of magnitude, as each full node does not need to store a copy of every transaction on each sidechain, but only needs to maintain a copy of the more streamlined main chain.
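The "store a hash instead of the whole chain" idea can be illustrated in a few lines. The sketch assumes the sidechain's transactions are simple dictionaries serialized as JSON, and the function name and sample batch are hypothetical. The point is only that ten thousand transactions collapse into one fixed-size fingerprint, and that changing any one of them changes the fingerprint.

```python
import hashlib
import json

def anchor_hash(sidechain_transactions):
    """Collapse a whole batch of sidechain transactions into one 32-byte fingerprint."""
    payload = json.dumps(sidechain_transactions, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

# Hypothetical batch of private-chain transactions.
batch = [{"from": "alice", "to": "bob", "amount": i} for i in range(10_000)]
anchor = anchor_hash(batch)

payload_size = len(json.dumps(batch, sort_keys=True).encode())
print(payload_size, "bytes of sidechain data")
print(len(bytes.fromhex(anchor)), "bytes written to the main chain")   # 32

# Tampering with a single transaction produces a completely different anchor.
tampered = [dict(tx) for tx in batch]
tampered[5000]["amount"] = 999_999
print(anchor_hash(tampered) == anchor)   # False
```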
From sidechains, we can move one step further by creating systems where the main chain can be broken into many smaller parts, where nodes only need to store a portion (or shard) of the main chain, rather than the entire thing. The mechanics of sharding are complex and are still in the early development stages as there are significant security issues involved with sharding that are fundamentally at odds with universal consensus.
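One common way to picture sharding is to assign each account to a shard based on a hash of its name, so that every node can work out which shard holds which account without asking anyone. The sketch below shows only that assignment step, under the assumption of a fixed number of shards. It says nothing about the hard part, which is keeping shards secure and consistent with each other. The account names are made up.

```python
import hashlib
from collections import defaultdict

def shard_for(account, num_shards):
    """Deterministically map an account name to a shard number."""
    digest = hashlib.sha256(account.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

accounts = [f"account-{i}" for i in range(1000)]   # hypothetical account names
shards = defaultdict(list)
for account in accounts:
    shards[shard_for(account, 4)].append(account)

for shard_id in sorted(shards):
    print(shard_id, len(shards[shard_id]))   # each shard holds roughly a quarter
```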
The concept of sharding is so important it will get its own chapter, stay tuned!
What if there was a system where each end user hosted only his or her local copies of transactions relevant to them, plus a partial copy of their neighbors' transactions to create redundancy in the system? This is the heart of implementations such as "hashgraphs", "block lattices" and many other deep jargon terms.
At a high level, these systems want any user to be able to create their own private encrypted network that does not need to communicate with a single main public chain to work. In such systems effectively "you" are the miner that validates your own transactions, or optionally enters into relationships with other nodes on the network to validate transactions for you. (Eg. users that do not wish to run their own software, can still interact with the system by paying a node operator to run networking software and hardware on their behalf)
Such a system does not necessarily need to batch transactions into distinct blocks that must be validated by a single monolithic chain.
Instead a user running a node posts an asynchronous (not at the same time) transaction to their local ledger.
Anyone they wish to share this transaction with can receive the transaction.
No universal copy needs to exist where everyone on the network agrees that the transaction happened. Whether hashes proving the transaction exists to all ledgers should be in the architecture will be explored in chapter 5.
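A minimal sketch of such a local ledger is below. Each user keeps their own append-only list of entries, where every entry includes the hash of the one before it, so anyone who receives a copy can check that nothing was altered. This is an illustration of the general idea only. It is not the API of Holochain, Radix, or any other project, it leaves out digital signatures entirely, and the class and field names are invented.

```python
import hashlib
import json

def entry_hash(owner, prev, payload):
    body = {"owner": owner, "prev": prev, "payload": payload}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

class LocalLedger:
    """Toy per-user ledger: each entry is chained to the previous one by its hash."""

    GENESIS = "0" * 64

    def __init__(self, owner):
        self.owner = owner
        self.entries = []

    def append(self, payload):
        prev = self.entries[-1]["hash"] if self.entries else self.GENESIS
        entry = {"owner": self.owner, "prev": prev, "payload": payload,
                 "hash": entry_hash(self.owner, prev, payload)}
        self.entries.append(entry)
        return entry   # this is what would be shared with the counterparty

    def verify(self):
        """Recompute every hash and link; any edit to history breaks the chain."""
        prev = self.GENESIS
        for entry in self.entries:
            expected = entry_hash(entry["owner"], entry["prev"], entry["payload"])
            if entry["prev"] != prev or entry["hash"] != expected:
                return False
            prev = entry["hash"]
        return True

alice = LocalLedger("alice")
alice.append("pay bob 5")
alice.append("pay carol 2")
print(alice.verify())                          # True
alice.entries[0]["payload"] = "pay bob 500"    # try to rewrite history
print(alice.verify())                          # False
```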
If approached correctly, these new flavors of "multiple consensus" are at the vanguard of the distributed ledger space, and can potentially disrupt "monolithic consensus" ledger based systems. As we will find out further into the book, tying transactions from one ledger to another becomes crucial when trying to prevent someone from spending the same funds twice.
To Flood or Not to Flood
Distributed ledgers in their current incarnation are flood networks which require all nodes to update their ledgers to work. As transactions can occur asynchronously, it takes time for the latest transactions to propagate through the network. Unfortunately, the overhead of this process grows steeply with the size of the network, making distributed flood networks orders of magnitude less efficient than centralized networks, as each node must redundantly sync itself with all other nodes for the system to function. (E.g., sending bitcoin is like sending an email to every person on earth with an email address just to get one email to the right person.)
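A rough counting exercise shows the gap. The sketch assumes the most naive possible flood, where every node passes each transaction to every other node once, against a centralized service that receives each transaction exactly once. Real networks limit how many peers each node talks to and avoid resending what a peer already has, so actual numbers are lower, but the redundancy remains.

```python
def centralized_messages(num_transactions):
    """A client sends each transaction to the central server once."""
    return num_transactions

def flood_messages(num_nodes, num_transactions):
    """Naive flood: every node sends each transaction to every other node."""
    return num_transactions * num_nodes * (num_nodes - 1)

for nodes in (10, 100, 1000):
    print(nodes, centralized_messages(1), flood_messages(nodes, 1))
# 10 nodes   -> 1 message vs 90
# 100 nodes  -> 1 message vs 9,900
# 1000 nodes -> 1 message vs 999,000
```

Under this assumption the message count grows roughly with the square of the number of nodes, so adding nodes makes each transaction more expensive to spread rather than cheaper.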
At a high level, we have established three possible ways distributed ledgers can reach consensus:
They-are-the-miner based proof-of-work systems maintain their relevance as users agree that "sacrifice" is important to maintain the integrity of the ledger. Rather than transacting directly on the main chain, side chains and off chain solutions could develop that occasionally write transactions to the main chain for events important enough to spend the electricity on.
You-elect-the-miner systems are chosen over they-are-the-miner systems with the same logic. E.g., sidechain/off chain solutions will use a main "you-elect-the-miner" based chain to prove globally that a transaction happened.
You-are-the-miner multi-consensus private chains that operate independently of main chains in overlapping networks of trust. "They-are" and "You-elect" based systems could still exist and be referenced by such "you-are" based systems if there is enough economic incentive to keep monolithic ledgers operating.
Keep in mind there is also nothing stopping an existing protocol from voting in a new system of rules to abide by. Developers can either choose to keep existing tokens relevant via transfers to a new chain, or go out on their own and issue new tokens in a new system in hopes of gaining traction.
Now that we have thoroughly exhausted the different flavors of how these base level consensus protocols work, we can expand to the many other layers of architecture needed to create a more trustworthy internet.