1:3 Other ways to make the sausage
This chapter works to further the ideas presented in the previous brute force consensus chapter by showcasing the three main ways transactions can be validated to reach a consensus about what is placed into the shared ledger database.
Keep in mind that by "miner" we mean the computer(s) that place transactions into the ledger.
"They"-are-the-miner: Various bitcoin like strategies where miners race to find winning random numbers either in proof-of-work like setups illustrated in the previous chapter, or other "sacrifice" based approaches such as proof-of-capacity where hard drive storage space is used for sacrificial purposes. The key element is anyone with a computer can join the race and have a random chance of being selected to validate transactions. As we will see, anyone is a bit of a misnomer due to economies of scale favoring larger and larger pools of miners.
"You"-elect-the-miner: Conversely works by token holders in the system electing who they want to validate transactions, rather than through purely random chance competition. Voters cast their ballots most often with a one-token-per-vote type system, and expect to receive a portion of the network fees back for casting their vote for a particular miner. Within the elected group, an element of randomness must still be present or the system would work no differently from a centralized database. (E.g., the validators must not be able to consistently win or the ledger risks being co-opted)
"You"-are-the-miner: A catchall term for all systems that do not fall neatly into the previous two paradigms. In these systems there are flavors of storing your own local chain that matches up with some subset of a global chain of transactions to verify consensus. In effect "you" (really your local computer) does some of the work of a miner to push out transactions.
There are no hard and fast rules separating these three main approaches, and any distributed ledger network is free to fundamentally alter the way it validates transactions by either:
The majority choosing to support an updated version of the ledger (no split)
The minority choosing to support an updated version of the ledger (split into two projects - legacy version & updated version)
Starting a brand new ledger from scratch with new rules and new owners
Due to the decentralized nature of distributed ledgers, gaining consensus to change critical network parameters is a difficult and often contentious process.
"They" Are the (Un-elected) Miner
Bitcoin was the first true distributed ledger, setting the precedent for a sacrifice based approach to ensure transactions were honestly processed in a provable, unmodifiable way.
Within this "they-are-the-miner", electricity-burning server farm approach, there are three primary sub-approaches which seek to improve the scalability and security of the network.
Off-chain: as the name implies, this method allows for transactions that are not recorded onto the ledger each time, but rather reference the ledger only when there is a conflict.
Big Blocks: increasing the block size to increase the number of transactions each block can hold
New Sacrifice: replacing the SHA-256 algorithm Bitcoin uses with a different random number finder, or with another type of sacrifice altogether.
The dominant version of Bitcoin (as of May 2018) chose to implement a multi-year development called the Lightning Network, which creates a second layer of transaction processing on top of the primary Bitcoin blockchain.
The idea works by opening a channel of transactions where parties can make many transactions with each other without needing to send each transaction to the ledger.
This approach can be hugely beneficial for both privacy (as there is no blockchain record of the intermediate transactions) and scalability (as the 1 megabyte of transactions Bitcoin can process every 10 minutes is not filled up with off-chain Lightning transactions).
When a channel (or special system for guaranteeing transactions without using the blockchain) between parties is opened or closed, a single transaction is registered with the blockchain removing the need to register every intermediate transaction on the chain.
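The open, transact, close flow described above can be sketched in a few lines of Python. This is a toy model under loud assumptions: the Ledger and Channel classes, the party names, and the balances are all hypothetical, and the signatures, timeouts, and dispute handling that a real Lightning channel depends on are left out entirely. It only illustrates that many payments can result in just two ledger entries.

```python
from dataclasses import dataclass, field


@dataclass
class Ledger:
    """Stand-in for the main chain: simply a list of recorded entries."""
    records: list = field(default_factory=list)

    def record(self, entry: str) -> None:
        self.records.append(entry)


class Channel:
    """Toy two-party channel: the ledger only sees the open and the close."""

    def __init__(self, ledger: Ledger, alice: int, bob: int) -> None:
        self.ledger = ledger
        self.balances = {"alice": alice, "bob": bob}
        self.updates = 0
        ledger.record(f"open alice={alice} bob={bob}")  # first on-chain entry

    def pay(self, sender: str, receiver: str, amount: int) -> None:
        if amount <= 0 or self.balances[sender] < amount:
            raise ValueError("invalid amount or insufficient channel balance")
        self.balances[sender] -= amount
        self.balances[receiver] += amount
        self.updates += 1  # off-chain: nothing is written to the ledger

    def close(self) -> None:
        a, b = self.balances["alice"], self.balances["bob"]
        self.ledger.record(f"close alice={a} bob={b}")  # second on-chain entry


ledger = Ledger()
channel = Channel(ledger, alice=100, bob=50)
for _ in range(10):
    channel.pay("alice", "bob", 5)
for _ in range(3):
    channel.pay("bob", "alice", 10)
channel.close()

print(channel.updates, "payments,", len(ledger.records), "ledger entries")
```

Thirteen payments move between the two parties, but the ledger holds only the opening and closing entries.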
This however can be problematic as both parties need to remain online for the system to work. This need for "always on" can tax user resources to maintain an internet connection, as well as become a potential security risk as you cannot air gap (store bitcoin offline) if the lightning wallet needs to be online to function.
Off-chain developments are a much larger topic that will be discussed in more detail in the next chapter. As off-chain solutions effectively create multiple layers in which to route transactions, off-chain solutions can be seen as the evolution of distributed ledger systems into something that more closely resembles the current internet where many layers of architecture act in coordination to run a massive interconnected global ecosystem.
The alternate version of Bitcoin called "Bitcoin Cash" removed the ability to perform these off-chain transactions in favor of increasing the block size and retaining the simplicity of the original Bitcoin protocol. As of May 2018 each Bitcoin Cash block can process 32 megabytes worth of transactions every 10 minutes.
This approach increases the storage requirements to host a full node (or full copy of the ledger). Larger storage requirements can limit the number of people willing to run a full node. If you need to store up to 32 megabytes of new information every 10 minutes just to send a payment, most users will not be willing to purchase new hard drive space just to use the system.
Thus most users will not download the full copy of the ledger (called a full node) and instead rely on a lite wallet (or software that does not need a full copy of the ledger to work, but instead just references someone else's full copy).
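To put rough numbers on the storage burden, the arithmetic can be worked through directly. The sketch below assumes the worst case where every block is completely full, one block every 10 minutes, and 1 GB = 1000 MB; the function name is made up for illustration.

```python
MINUTES_PER_BLOCK = 10
BLOCKS_PER_DAY = 24 * 60 // MINUTES_PER_BLOCK  # 144 blocks per day


def ledger_growth_gb(block_size_mb: float, days: int) -> float:
    """Worst-case ledger growth in GB, assuming every block is full."""
    return block_size_mb * BLOCKS_PER_DAY * days / 1000


for size_mb in (1, 32):
    per_day = ledger_growth_gb(size_mb, 1)
    per_year = ledger_growth_gb(size_mb, 365)
    print(f"{size_mb} MB blocks: {per_day:.3f} GB/day, {per_year:.2f} GB/year")
```

Under these assumptions, full 1 megabyte blocks add about 53 GB per year, while full 32 megabyte blocks add roughly 4.6 GB per day, or about 1.7 TB per year.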
A thought provoking blog post imagines a world with many terabyte blocks that can record every waking moment of life. In such an extreme example, the hosting of the nodes would by necessity become centralized as storing such vast amounts of information could only be done by large scale operations.
The Bitcoin protocol does not provide any incentives to host full copies of the ledger, and instead only rewards the miners that process the transactions. For a distributed ledger system to succeed at scale, creating incentives to host robust full node copies of the ledger will become ever more crucial.
As mentioned in the previous chapter, Bitcoin uses a very specific math puzzle called SHA2-256 to race for random numbers that will win each new block. Since the introduction of SHA2-256, many newer algorithms for finding a random number have entered the market, some with stronger security properties, each bringing new strengths and drawbacks to the ecosystem.
These newer algorithms try to improve on SHA2-256 in three main ways:
Make things harder for big guys to encourage smaller users to mine
Make things more secure to safeguard against quantum computers
Find something new to sacrifice that is more energy efficient or even "useful"
Harder for the big guys
If the key strength of distributed ledgers is in being decentralized, having only a few main groups doing all of the mining is not ideal.
Thus projects like Litecoin have replaced SHA2-256 with a different algorithm called Scrypt that requires much more computer memory to search for random numbers.
Because the memory requirements are higher, mining for litecoin favors smaller users over larger users, as large users have difficulty finding economies of scale when building out server farms with high memory requirements.
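The difference in memory appetite can be seen with Python's standard hashlib module. This is a sketch, not mining code: the header bytes are a placeholder, and the scrypt parameters (N=1024, r=1, p=1, with the header reused as the salt) are the values commonly cited for Litecoin, taken here as an assumption.

```python
import hashlib

header = b"example block header"  # placeholder bytes, not a real block header

# SHA-256 applied twice, as Bitcoin does. The working state is tiny:
# a few hundred bytes at most.
sha_digest = hashlib.sha256(hashlib.sha256(header).digest()).digest()

# scrypt with the parameters commonly cited for Litecoin (an assumption here).
N, R, P = 1024, 1, 1
scrypt_digest = hashlib.scrypt(header, salt=header, n=N, r=R, p=P, dklen=32)

# scrypt's main scratch buffer is roughly 128 * N * r bytes per hash attempt.
scratch_bytes = 128 * N * R

print("SHA-256 digest bytes:", len(sha_digest))
print("scrypt digest bytes: ", len(scrypt_digest))
print("scrypt scratch memory per attempt (bytes):", scratch_bytes)
```

Each scrypt attempt needs its own scratch buffer of about 128 KiB, so a chip that wants to try many guesses in parallel has to pay for memory as well as arithmetic.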
Like any competitive activity though, with a high enough price creating enough incentive, ASICs (specialized chips that only do one thing) will be developed to mine more efficiently. This process happened to Ethereum in 2018 with the release of Bitmain ASIC miners that run only the Ethash algorithm, in place of the general purpose GPUs previously used to mine Ethereum, which could also do many other tasks such as playing video games and training AIs.
Instead of using 256 bit keys, other systems simply use a combination of longer keys and new encryption techniques.
In 2015, the successor to SHA-2 (aptly called SHA-3) was released as a standard, and it has been adopted, along with 512 or even 1024 bit encryption keys, by projects such as NEM and Stellar Lumens.
The specific mechanism matters less than it might seem, as any credible evidence of an algorithm like SHA-2-256 being broken would force a split in the network to a newer technique.
The downside of such a move for Bitcoin, however, is that the specialized ASIC computers which miners have invested billions into can only compute SHA-2-256, and entirely new ASICs would need to be developed if the math puzzle was updated.
The goal of many in the blockchain community is to find a more energy efficient or useful task to perform with all of the computing power available.
The only "useful" work performed by Bitcoin miners is to slowly weaken the security of SHA-2-256 by continually guessing random numbers until eventually the security no longer works.
One twist on finding random numbers is to try and find enormous prime numbers instead. While this approach helps accomplish something scientifically useful, it still uses large amounts of electricity to process transactions.
A concept called "proof-of-capacity" or "proof-of-space" requires a large amount of hard drive space as "sacrifice", which can be used to create a decentralized storage network that provides useful storage capacity.
The holy grail of useful sacrifice is a system where general computation can be performed by miners in a manner that rivals the efficiency of centralized computation.
Turing complete (systems that can run any arbitrary code) smart contracting platforms like Ethereum can perform this general computation, but are currently 40,000x less efficient in terms of computation cost than a centralized competitor such as Amazon Web Services.
The first major paradigm shift away from sacrifice based validation of blocks is to replace this process with a series of full nodes (computers with a full copy of the ledger) that will validate blocks on your behalf. This approach solves the issue of how full nodes are incentivized to keep full copies of the ledger running.
While drastically more efficient in terms of energy spent per transaction validated, these "proof-of-stake" systems have the extremely difficult task of ensuring the nodes validating the transactions will behave honestly.
Thus a random lottery system of some kind needs to be used to ensure a malicious group of nodes controlled by a central party cannot take over the network and process transactions however they see fit (including stealing tokens, refusing to process certain transactions, etc).
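One simple form such a lottery could take is a draw weighted by stake, sketched below. The node names and balances are hypothetical, and the seeded local random generator is an assumption made purely so the run is repeatable; a real network needs a source of randomness that every participant can verify and that no single node controls.

```python
import random
from collections import Counter


def pick_validator(stakes: dict, rng: random.Random) -> str:
    """Choose one validator with probability proportional to its stake."""
    names = list(stakes)
    weights = [stakes[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


stakes = {"node_a": 60, "node_b": 30, "node_c": 10}  # hypothetical balances
rng = random.Random(42)  # fixed seed for repeatability only

wins = Counter(pick_validator(stakes, rng) for _ in range(10_000))
for name in stakes:
    print(name, wins[name])
```

Over many rounds each node wins roughly in proportion to its stake, but no node can predict or guarantee that it wins any particular round.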
A primary concern in a "you-elect-the-miner" system is when there is not a sufficient penalty for creating a network of many accounts controlled by one central party (known as a Sybil attack).
To deter Sybil attacks, many novel solutions have been developed that rely on mathematics to determine how trustworthy each node in the system is, and attach a cost to spamming the network with fake accounts. The cost is used to deter the Nothing at Stake problem, which is often illustrated using the spam email example:
If your email system was running on a distributed ledger system, every time someone wanted to send you an email it would cost a tiny fraction of a penny. The cost is low enough it does not interfere with the normal course of interactions, but deters bad actors from sending millions of undirected marketing emails.
Popular flavors of "you-elect-the-miner" are listed below. Common elements between all of these different variations involve how validator nodes are chosen and maintain consensus with each other to continually update a global shared ledger of transactions.
Delegated-proof-of-stake (DPOS): A system of voting your tokens to a select group of full nodes who perform the block validation. This system was invented by Dan Larimer, and was first implemented on the Bitshares platform, before subtle variations were used to create the Ark, Lisk, and upcoming EOS networks. In these systems only the top X number of full nodes with the most votes are allowed to validate transactions. This system incentivizes the full nodes to process transactions honestly as they are rewarded with fees for doing so; however, these systems by design are as centralized as the number of elected validators.
Masternodes: A system similar to DPOS, though Masternodes are not elected directly. Instead anyone that can pay the required amount to have the minimum number of tokens needed to create a masternode becomes a validator.
Leased proof of stake: A hybrid of DPOS and Masternode models where anyone with the minimum number of tokens can validate blocks, but owners with less than the required minimum threshold can pool their resources by "leasing" their tokens to a full node that has the required number of tokens. The Waves platform uses such a system, which allows anyone with a pooled amount of 10,000 or more Waves tokens to validate blocks.
Proof of Importance: A validation system used by the NEM platform that uses an implementation of Eigentrust to deter sybil attacks by analyzing relationships between nodes to determine if users are trying to maliciously co-opt the ledger. This system currently employs "supernodes" with 3 million NEM as collateral to validate all transactions on the network.
Slot Leader: A validation system used by Cardano which employs a variation of game theory mathematics to ensure bad actors cannot co-opt the ledger by making the cost of co-opting the ledger higher than the benefits gained.
Practical Byzantine Fault Tolerance: The consensus mechanism governing the IBM Hyperledger, NEO, Stellar, and Ripple projects where interconnected networks of full nodes can be chosen by participants to validate transactions. Variations on solving the Byzantine Generals' Problem are employed by many other projects, all with the same goal of choosing a validator that can be trusted to process the transactions honestly.
Byzantine Generals' Problem: Each ‘general’ maintains an internal state (ongoing specific information or status). When a ‘general’ receives a message, they use the message in conjunction with their internal state to run a computation or operation. This computation in turn tells that individual ‘general’ what to think about the message in question. Then, after reaching an individual decision about the new message, that ‘general’ shares that decision with all the other ‘generals’ in the system. A consensus decision is determined based on the total decisions submitted by all generals.
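The final counting step of the generals' exchange can be sketched as follows. The thresholds are the ones used by Practical Byzantine Fault Tolerance, taken here as an assumption: with n generals the system tolerates at most f faulty ones where n >= 3f + 1, and a value is accepted once 2f + 1 generals back it. The function names and the attack/retreat values are illustrative only, and the multi-round message passing that precedes the count is omitted.

```python
from collections import Counter
from typing import Optional


def max_faulty(n: int) -> int:
    """Largest number of faulty generals tolerable among n (n >= 3f + 1)."""
    return (n - 1) // 3


def decide(votes: list) -> Optional[str]:
    """Return the value backed by a quorum of 2f + 1 votes, else None."""
    if not votes:
        return None
    quorum = 2 * max_faulty(len(votes)) + 1
    value, count = Counter(votes).most_common(1)[0]
    return value if count >= quorum else None


# Four generals tolerate one faulty general and need three matching votes.
agreed = decide(["attack", "attack", "attack", "retreat"])
split = decide(["attack", "attack", "retreat", "retreat"])
print(agreed, split)
```

With one dissenting general out of four the group still reaches a decision, while an even split reaches none.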
Each of these validation systems deserves a chapter of its own to explain in full detail, which would involve dissecting each project's individual "whitepaper" (or blueprint) for how they implement their protocol.
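To make one of the entries above slightly more concrete, the election step of a delegated-proof-of-stake system can be sketched as a token-weighted tally. The voter names, candidate names, token amounts, and number of seats are all hypothetical, and real implementations differ in how votes are cast, weighted, and refreshed.

```python
from collections import defaultdict


def elect_delegates(votes: list, seats: int) -> list:
    """Tally token-weighted votes and return the top `seats` candidates.

    Each vote is a (voter, candidate, tokens) tuple; one token is one vote.
    """
    tally = defaultdict(int)
    for _voter, candidate, tokens in votes:
        tally[candidate] += tokens
    # Most votes first; ties are broken by name so the result is deterministic.
    ranked = sorted(tally, key=lambda c: (-tally[c], c))
    return ranked[:seats]


votes = [
    ("alice", "node_1", 500),
    ("bob", "node_2", 300),
    ("carol", "node_3", 300),
    ("dave", "node_1", 50),
    ("erin", "node_4", 100),
]
delegates = elect_delegates(votes, seats=3)
print(delegates)
```

Only the elected delegates would go on to validate blocks, which is why such a system is only as decentralized as the number of seats.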
In an ideal world, the project with the most efficient and secure consensus mechanism would ultimately win out over projects with less efficient and thus costlier mechanisms. However in the real world we know:
Any project can upgrade to any other project's consensus mechanism with enough network support.
While Betamax had the superior image quality, it did not win out over the technically inferior VHS. This was partially due to the porn industry choosing VHS which spurred consumer demand for VHS players, but more likely because VHS tape manufacturers geared up production for the lower licensing fees VHS offered over the proprietary Betamax, and VHS being capable of recording longer videos. Read into this cautionary metaphor what you will.
(Onchain) Scaling Debate
For the last two chapters, we have built on the notion that there is an inherent inefficiency in distributed ledgers due to a monolithic single source of truth that needs to continuously grow as new transactions are added to the ledger.
But what if you could create your own distributed ledger that did not rely on a single monolithic source of truth to work?
If you follow the DLT space for long enough you'll notice how often projects tout "transactions-per-second" or how many raw transactions the network is capable of processing.
Interestingly, you never hear about the existing centralized internet discussed in terms of how many transactions per second it can process (because it's an absurd notion). The current internet has a maximum throughput limited only by the number of new web servers that can be spun up if there is too much demand.
This is the heart of the scaling debate: how to gain the benefits of honest and immutable data records, with the capacity to run the entire existing internet plus the upcoming massive expansion of internet-of-things connected devices.
Public chains vs private chains:
So far we have only discussed public blockchains like bitcoin. Because bitcoin is an open source protocol, anyone is free to copy-paste the bitcoin source code (or any other open source protocol for that matter) onto their own private server network and create their own independent ledger.
This solution is excellent for keeping bloated transaction data off a universal main public chain, but is terrible for immutability. If a corporation employs a private blockchain purely inside of their network with no outside validation, there is no way to know if transactions are honest, which is the entire point of a distributed ledger in the first place.
This segues nicely into why sidechains are excellent scalability solutions. In effect, a sidechain is just a private chain that decides to write transactions to a main public chain at regular intervals. As hashes are so efficient at turning data of an arbitrary size into a small fixed size, hashes of the private chain can be stored to the main chain instead of the entire private chain. This can increase the scalability of the protocol by orders of magnitude, as each full node does not need to store a copy of every transaction on each sidechain, but only needs to maintain a copy of the more streamlined main chain.
From sidechains, we can move one step further by creating systems where the main chain can be broken into many smaller parts, where nodes only need to store a portion (or shard) of the main chain, rather than the entire thing. The mechanics of sharding are complex and still not available for mainstream usage in the distributed ledger space, as there are significant security issues involved with sharding that are fundamentally at odds with universal consensus.
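The anchoring idea can be sketched with a single hash. In this illustration the transaction format, the use of JSON for serialization, and the list standing in for the main chain are all assumptions; real sidechain designs typically use more structured commitments.

```python
import hashlib
import json


def block_digest(transactions: list) -> str:
    """Hash a batch of sidechain transactions into a fixed-size hex digest."""
    payload = json.dumps(transactions, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Ten thousand private transactions stay on the sidechain...
sidechain_block = [{"from": "a", "to": "b", "amount": i} for i in range(10_000)]

# ...and only a 64-character digest is written to the main chain.
anchor = block_digest(sidechain_block)
main_chain = [anchor]

# Anyone holding the sidechain data can later check it against the anchor.
tampered = list(sidechain_block)
tampered[42] = {"from": "a", "to": "b", "amount": 999_999}

print("original matches anchor:", block_digest(sidechain_block) == main_chain[0])
print("tampered matches anchor:", block_digest(tampered) == main_chain[0])
```

The main chain stores the same small digest whether the sidechain batch holds ten transactions or ten thousand, and changing any one of them changes the digest.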
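The storage half of sharding can be illustrated by assigning each account to a shard based on a hash of its identifier. This is a deliberately naive sketch: the account names, the shard count, and the assignment rule are assumptions, and it says nothing about cross-shard transactions or the security issues mentioned above, which are the hard part.

```python
import hashlib


def shard_for(account: str, num_shards: int) -> int:
    """Map an account to a shard by hashing its identifier."""
    digest = hashlib.sha256(account.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


NUM_SHARDS = 4
accounts = [f"account_{i}" for i in range(1000)]

shards = {shard: [] for shard in range(NUM_SHARDS)}
for account in accounts:
    shards[shard_for(account, NUM_SHARDS)].append(account)

for shard, members in shards.items():
    print(f"shard {shard}: {len(members)} accounts")
```

A node responsible for one shard stores only the accounts that land there, rather than all 1000.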
It is unclear if and how sharding can securely be accomplished inside of existing monolithic ledger paradigms be it elected proof of stake systems or non elected proof of work systems. Thus, the final sausage making paradigm we will explore does away with the notion of universal consensus to open up possibilities of sharding, and thus theoretically infinite scaling.
What if there was a system where each end user hosted only his or her local copies of transactions relevant to them, plus a partial copy of their neighbors' transactions to create redundancy in the system?
This is the heart of implementations such as "hashgraphs", "block lattices" and many other deep jargon terms.
At a high level, these systems want any user to be able to create their own private encrypted network that does not need to communicate with a single main public chain to work.
In such systems effectively "you" are the miner that validates your own transactions, or optionally enters into relationships with other nodes on the network to validate transactions for you. (E.g., users that do not wish to run their own software can still interact with the system by paying a node operator to run a fuller copy of the ledger on their behalf.)
Such a system does not necessarily need to batch transactions into distinct blocks that must be validated by a single monolithic chain.
Instead, a user running a node posts an asynchronous transaction to their local ledger.
Anyone they wish to share this transaction with can receive the transaction.
No universal copy needs to exist where everyone on the network agrees that the transaction happened.
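The three steps above can be sketched as a per-user hash chain that is shared selectively. This is not how hashgraphs or block lattices actually work; the LocalLedger class, its methods, and the user names are hypothetical, and digital signatures are omitted to keep the example short.

```python
import hashlib
from dataclasses import dataclass, field


def entry_hash(prev: str, payload: str) -> str:
    return hashlib.sha256(f"{prev}|{payload}".encode("utf-8")).hexdigest()


@dataclass
class LocalLedger:
    owner: str
    entries: list = field(default_factory=list)   # own (prev, payload, hash) tuples
    received: list = field(default_factory=list)  # entries shared by other users

    def append(self, payload: str) -> tuple:
        """Post a transaction to this user's own chain."""
        prev = self.entries[-1][2] if self.entries else "genesis"
        entry = (prev, payload, entry_hash(prev, payload))
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Check that every own entry links to the one before it."""
        prev = "genesis"
        for stored_prev, payload, digest in self.entries:
            if stored_prev != prev or digest != entry_hash(prev, payload):
                return False
            prev = digest
        return True

    def accept(self, sender: str, entry: tuple) -> bool:
        """Keep a copy of a shared entry if its hash checks out."""
        prev, payload, digest = entry
        valid = entry_hash(prev, payload) == digest
        if valid:
            self.received.append((sender, entry))
        return valid


alice, bob, carol = LocalLedger("alice"), LocalLedger("bob"), LocalLedger("carol")
tx = alice.append("alice pays bob 5")
bob.accept("alice", tx)  # only bob is told; carol never sees the transaction

print(alice.verify(), len(bob.received), len(carol.received))
```

Alice and Bob agree that the transaction happened, and Carol's ledger is untouched, so no universal copy is ever needed.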
These "multiple consensus" systems are at the vanguard of the distributed ledger space, and can potentially disrupt "monolithic consensus" ledger based systems.
Distributed ledgers in their current incarnation are flood networks which require all nodes to update their ledgers to work. As transactions can occur asynchronously, it takes time for the latest transactions to propagate through the network. Unfortunately the mathematics of this process are exponential making distributed flood networks orders of magnitude less efficient than centralized networks, as each node must sync itself with all other nodes over time for the system to function.
At a high level, we have established three possible ways distributed ledgers can reach consensus:
They-are-the-miner based proof-of-work systems maintain their relevance as users agree that "sacrifice" is important to maintain the integrity of the ledger. Rather than transacting directly on the main chain, sidechains and off-chain solutions could develop that occasionally write transactions to the main chain for events important enough to spend the electricity on.
You-elect-the-miner systems are chosen over they-are-the-miner systems with the same logic. E.g., sidechains will store hashes onto the main "you-elect-the-miner" based chain to prove globally that a transaction happened.
You-are-the-miner multi-consensus private chains will operate independently of main chains in overlapping networks of trust. "They-are" and "You-elect" based systems could still exist and be referenced by such "you-are" based systems if there is enough economic incentive to keep monolithic ledgers operating.
Keep in mind there is also nothing stopping an existing protocol from voting in a new system of rules to abide by. Developers can either choose to keep existing tokens relevant via transfers to a new chain, or go out on their own and issue new tokens in a new system in hopes of gaining traction.
Now that we have thoroughly exhausted the different flavors of how these base level protocols work, we can expand to the many other layers of architecture needed to create a new decentralized internet beyond the core consensus mechanism.