History
The idea for Ethereum was born at a time when Bitcoin was just beginning to gain traction and attract the interest of many developers who, understanding the innovative scope of Bitcoin and the blockchain, sought to extend its functionality, taking it beyond the simple idea of a cryptocurrency and a digital payments system.
To this end, developers were faced with two options:
- build on the infrastructure offered by Bitcoin, going off-chain, on subsequent layers, which, however, led to a loss of some of the advantages given by the blockchain;
- build a new blockchain trying to overcome Bitcoin’s limitations that prevented extending its functionality (block size, block creation time, non-Turing-complete scripting language, etc.).
An example of the implementation of the first option was Mastercoin, often regarded as the first alt-coin, a protocol built on top of Bitcoin (for those who know more about computing, it is comparable to the relationship between HTTP and TCP/IP) that added new features to those existing in Bitcoin, including a rudimentary version of smart contracts.
In this context, in late 2013, Vitalik Buterin, a Russian-born programmer and Bitcoin enthusiast, then 19 years old, thought of a way to extend the capabilities of Bitcoin and Mastercoin, and shared a whitepaper.
The whitepaper attracted the attention of many developers, including Gavin Wood (one of the authors of the book Mastering Ethereum) who began working with Vitalik, becoming co-founder, co-designer and CTO of the project.
Combining existing technologies with innovative ones, Vitalik and his collaborators thus developed Ethereum, and on July 30, 2015, the first block was mined.
How it works
We have seen the reasons why Ethereum was born but now let’s try to understand what it is.
Ethereum is a decentralized, permissionless computational system that runs programs called smart contracts.
Ethereum uses a general-purpose blockchain to record state changes, brought about by an Ethereum Virtual Machine (EVM).
Ethereum also includes a built-in cryptocurrency, Ether, which serves two main purposes:
- to provide a first layer of liquidity to enable exchanges between various types of digital assets;
- to provide a mechanism for paying transaction fees (which we will see later).
Ethereum presents itself as a kind of distributed supercomputer on which decentralized applications (DApps) can be “installed” and run, often but not exclusively with economic functions.
It is thus clear that Ethereum has at the same time many commonalities with Bitcoin but also many differences.
It shares with the first cryptocurrency the fact that it is a peer-to-peer, permissionless and trustless network, as well as the fact that it bases the system state change on a Byzantine fault-tolerant consensus algorithm. Central also is the use of cryptography, just as in Bitcoin.
In addition, Bitcoin can be seen as a distributed machine that has a state (representing possession of the cryptocurrency), updated through distributed consensus, where transactions cause an update of this state.
Ethereum conceptually is very similar, with the difference that not only the state of possession of the cryptocurrency is tracked but also arbitrary data.
Here we come to the differences: Bitcoin was born with the purpose of creating a cryptocurrency and a digital payment network saved in a shared and synchronized ledger, whereas Ethereum turns its focus to the creation of a decentralized computer that allows for multiple operations. This is reflected in the use of a Turing-complete language for writing smart contracts, where Bitcoin instead has a more essential, non-Turing-complete scripting language.
We can therefore say that the great innovation proposed by Ethereum is to combine the architecture of a program saved in a general-purpose computer with a decentralized blockchain.
A number of concepts emerge from this brief and summary description of Ethereum that will be explored individually throughout this overview.
Table of Contents:
- Turing completeness and gas
- Ether and accounts
- Transactions
- Network
- Consensus protocol
- Smart contracts, DApps and Web 3.0
- Ethereum Virtual Machine (EVM)
- Tokens
- Oracles
- Development process, forks and governance
- Ethereum 2.0
Turing completeness and gas
We have been talking about Turing Completeness, a concept that is fundamental to Ethereum, and to computer science in general, but which may not be so straightforward.
Alan Turing was an English mathematician, considered the father of computer science.
Among the many things he did, he came up with the concept of the Turing machine, an abstract model defining a machine that can perform algorithms by reading and writing symbols.
In computer science, a system is recognized as Turing-complete if it can implement any Turing machine, that is, if it can solve any problem that admits a solution (regardless of the time and memory it will have to take to solve it).
However, there are cases in which problems are unsolvable, and one of these is the halting problem, which we can summarize as the impossibility of creating an algorithm capable of determining whether a program written in a Turing-complete language will come to an end with any input or whether it will instead continue its execution indefinitely.
Given the existence of this problem, the idea of a Turing-complete system, which initially may seem exclusively positive, since it allows for any problem to be solved that has a solution, also shows an obvious criticality.
To give a practical example, a program can lead to an infinite loop from which it will never exit.
For this very reason, although Turing completeness is very easy to achieve, the risks involved are often considered so high that non-Turing-complete systems are preferred.
This becomes even more evident when talking about an open system as in the case of a permissionless blockchain.
This explains Satoshi Nakamoto’s choice to integrate a non-Turing-complete scripting language into Bitcoin, limiting functionality but promoting system security. Among the missing features, the absence of loops can be highlighted, not surprisingly.
In the Bitcoin overview, we explained how this scripting language is used to lock and unlock UTXOs (Unspent Transaction Outputs), establishing conditions under which someone can spend bitcoins.
We also know that every transaction must be validated by every node in the network.
The danger of Turing completeness is obvious. Think of the possibility of an infinite loop in one of these scripts that unlock UTXOs: a node that were to execute it would find itself trapped in an infinite loop that would consume its computational resources, effectively resulting in a DoS attack.
Given this risk, Ethereum still opts for Turing completeness, introducing a mechanism, called gas, aimed at avoiding problems like the one just described.
When the EVM (Ethereum Virtual Machine) executes a smart contract, it keeps track of each instruction, each of which has a cost calculated in gas.
The EVM stops the execution of a smart contract if the cost in gas is greater than the gas available to the person who requested the execution.
In this way, a script containing an infinite loop would need an infinite amount of gas to run to completion; in practice it simply exhausts the gas supplied by the caller and is halted.
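The metering idea described above can be sketched with a toy interpreter. This is only an illustration: the three opcodes, their prices in `GAS_COST` and the `run` function are hypothetical and do not correspond to the real EVM instruction set or gas schedule.

```python
class OutOfGas(Exception):
    """Raised when execution needs more gas than the caller supplied."""


# Hypothetical per-instruction prices for this toy machine (not real EVM costs).
GAS_COST = {"PUSH": 3, "ADD": 3, "JUMP": 8}


def run(program, gas_limit):
    """Execute a list of (opcode, argument) pairs, charging gas at every step.

    Returns (stack, gas_used). Raises OutOfGas as soon as the next
    instruction costs more than the gas that is left.
    """
    stack, pc, gas_left = [], 0, gas_limit
    while pc < len(program):
        op, arg = program[pc]
        cost = GAS_COST[op]
        if cost > gas_left:
            raise OutOfGas(f"stopped at instruction {pc}")
        gas_left -= cost
        if op == "PUSH":
            stack.append(arg)
            pc += 1
        elif op == "ADD":
            stack.append(stack.pop() + stack.pop())
            pc += 1
        elif op == "JUMP":
            pc = arg  # unconditional jump to instruction index `arg`
    return stack, gas_limit - gas_left


finite = [("PUSH", 2), ("PUSH", 3), ("ADD", None)]
infinite_loop = [("PUSH", 1), ("JUMP", 0)]  # jumps back to the start forever

print(run(finite, gas_limit=100))
try:
    run(infinite_loop, gas_limit=100)
except OutOfGas as err:
    print("halted:", err)
```

The looping program never terminates on its own, yet the call returns after a bounded number of steps because every step consumes part of a finite budget.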
Ether and accounts
So we have seen that interacting with the programs contained in this distributed computer costs gas.
But how does one come into possession of gas?
You cannot buy gas externally on Ethereum; it is not for sale on exchanges or other platforms.
You can only buy gas internally in Ethereum, specifically internally in a transaction, as we will see more about later.
For now, the key thing to understand is that in order to buy gas, you need to own the monetary unit of Ethereum: Ether.
Let’s quickly correct a common mistake: it is wrong to refer to the cryptocurrency as “Ethereum,” a term that refers to the system. The name of the cryptocurrency is Ether (ETH).
1 ETH can be divided into smaller units, down to the absolute smallest, called wei, which is equivalent to one quintillionth of Ether.
To be precise, in the Ethereum system we do not reason in Ether but in wei, so within transactions we will see the figures expressed in wei.
A table with their respective names and the various units follows:
Value (in wei) | Exponent | Name | Nickname |
---|---|---|---|
1 | 10^0 | Wei | Wei |
1,000 | 10^3 | Kilowei or femtoether | Babbage |
1,000,000 | 10^6 | Megawei or picoether | Lovelace |
1,000,000,000 | 10^9 | Gigawei or nanoether | Shannon |
1,000,000,000,000 | 10^12 | Microether or micro | Szabo |
1,000,000,000,000,000 | 10^15 | Milliether or milli | Finney |
1,000,000,000,000,000,000 | 10^18 | Ether | Ether |
1,000,000,000,000,000,000,000 | 10^21 | Kiloether | Grand |
1,000,000,000,000,000,000,000,000 | 10^24 | Megaether | |
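Since amounts are handled as whole numbers of wei, conversions are best done with integers rather than floating point. The helper names `to_wei` and `from_wei` below are illustrative choices for this sketch, not part of any particular library, and only a few of the units from the table are included.

```python
from decimal import Decimal

# A subset of the units from the table, expressed in wei.
WEI_PER_UNIT = {
    "wei": 10**0,
    "gwei": 10**9,     # gigawei, nicknamed Shannon
    "finney": 10**15,  # milliether
    "ether": 10**18,
}


def to_wei(amount, unit):
    """Convert a decimal string such as '0.1' into an integer number of wei."""
    wei = Decimal(amount) * WEI_PER_UNIT[unit]
    if wei != wei.to_integral_value():
        raise ValueError("amount is finer than 1 wei")
    return int(wei)


def from_wei(wei, unit):
    """Express an integer amount of wei in the given unit."""
    return Decimal(wei) / WEI_PER_UNIT[unit]


print(to_wei("0.1", "ether"))
print(from_wei(21000 * 10**9, "ether"))
```

Passing amounts as strings avoids the rounding that binary floats would introduce; very large amounts may need a higher `decimal` precision than the default.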
As we mentioned, Ether is used to pay transaction fees but also to provide an initial level of liquidity to enable economic exchanges within the system.
Finally, it should be mentioned that Ether differs from bitcoin in that the former has an inflationary supply while the latter has a deflationary supply.
This means that bitcoin has a fixed maximum supply (about 21 million BTC in total), while Ether has no fixed total supply; its issuance is instead capped at 18 million ETH per year.
In addition, with Ethereum’s genesis block 60 million Ether were created and another 12 million were allocated to the development fund and then distributed among early contributors and the Ethereum Foundation.
In addition to these first Ether releases, to understand Ethereum’s monetary distribution policy we need to talk about the concept of mining, as the two are very closely related. Therefore, this topic will be explored in more detail in the chapter on Ethereum’s consensus algorithm.
It therefore becomes necessary to have a tool to identify ownership within the system, which is accomplished through the use of accounts, created with cryptographic functions, as in the case of Bitcoin.
There are two types of accounts in Ethereum:
- Externally Owned Account (EOA) = identified by a private key that grants control over funds and contracts. These are the accounts that participants in the Ethereum system create;
- Contract accounts = the accounts of smart contracts, identified by the contract’s code, which EOAs do not have. Contract accounts do not possess a private key.
In either case, an account has the following 4 fields:
- nonce = a counter indicating the number of transactions sent by an account (in the case of an EOA) or the number of contracts created by an account (in the case of a contract account);
- balance = the Ether balance of the account, denoted in wei;
- codeHash = the hash of the account’s code: the hash of an empty string in EOAs, and the hash of the contract code in contract accounts. This is the code of the contract that will be executed by the EVM, as we will see later;
- storageRoot = the hash of the root of the account’s permanent data store, used only by contract accounts and empty in EOAs.
As with Bitcoin, one of the key technologies in Ethereum is cryptography.
We have mentioned private keys and addresses but it is necessary to go into more detail, as they are one of the foundations of the blockchain and one of the reasons for its security.
Cryptography is not only used to encrypt information but also to be able to prove the authenticity of data and knowledge about it, without revealing it.
In blockchain, public key cryptography (PKC) is crucial, and Ethereum is no exception.
PKC, which came into the public domain during the 1970s, is based on the use of a key pair: private key and public key.
The public key, comparable to a bank account number, is useful for publicly identifying an account (so, for example, to receive funds).
The private key, comparable to a bank account PIN, is needed to use the funds one holds.
One of the peculiarities of PKC is that these key pairs are based on mathematical functions that are easy to compute in one direction but difficult (if not practically impossible) to compute in the opposite direction. Such functions are referred to as one-way functions; because the two keys play different roles, PKC is also called asymmetric cryptography.
In particular, there is a set of functions based on elliptic curve multiplication, whose security rests on the discrete logarithm problem.
These types of functions are particularly secure and form the basis of how the blockchain works.
In Ethereum, as in Bitcoin, we start with a private key, from which we derive a public key, from which we derive an address.
The address is usually used within transactions to indicate the recipients of payments.
The private key is used to authorize payments, through the creation of digital signatures, the sort of fingerprints that certify that we actually authorized a payment, because they can only be generated by those who know the private key of an account. Of course, the private key remains secret, because the knowledge of an address and a digital signature attached to it will suffice to guarantee the authenticity of a transaction, that is, to guarantee that it was actually the owner of a certain amount of money who wanted to pay with his or her own money.
So how does one go about creating an EOA (Externally Owned Account)?
First you have to choose a private key, which is a number between 1 and 2^256.
At this stage the important thing is that the choice of the number is made with an appropriate level of entropy and randomness. It is recommended, for example, to flip a coin 256 times and mark the result (0 or 1) each time. Thus the private key is obtained in binary system.
Using a wallet, this is done automatically but one must always make sure that the level of entropy used in generating the number is adequate.
At first glance it might seem risky to choose a number randomly, as one might think that someone else might arrive at the same number as us.
With the appropriate level of entropy this is practically impossible. The number of possible keys is on the order of 10^77, while the visible universe is estimated to contain 10^80 atoms.
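As a rough sketch of this step, a key can be drawn from the operating system’s cryptographically secure random source, or assembled from recorded coin flips. One detail not spelled out above is assumed here: on the secp256k1 curve a valid key must be smaller than the group order `N`, which is slightly below 2^256. The function names are illustrative.

```python
import secrets

# Order of the secp256k1 group; valid private keys are 1 .. N-1.
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def generate_private_key():
    """Return a private key drawn from the operating system's CSPRNG."""
    return 1 + secrets.randbelow(N - 1)


def private_key_from_coin_flips(flips):
    """Build a key from 256 coin flips given as a string of '0' and '1'."""
    if len(flips) != 256 or set(flips) - {"0", "1"}:
        raise ValueError("expected exactly 256 binary digits")
    key = int(flips, 2)
    if not 1 <= key < N:
        raise ValueError("flip again: value is outside the valid range")
    return key


key = generate_private_key()
print(key.bit_length() <= 256)
```

The `random` module is deliberately avoided: it is predictable and unsuitable for key material.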
Once the private key is chosen, the public key is obtained by elliptic curve multiplication, a very special, irreversible multiplication (because the opposite operation, division, does not exist), represented by the expression:
K = k * G
Where K is the public key, k is the private key, and G is a constant point on the elliptic curve.
Ethereum uses the same elliptic curve as Bitcoin (secp256k1).
Multiplication will yield the coordinates of a new point on the curve, which, serialized, will result in the public key.
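To make K = k * G concrete, here is a minimal pure-Python sketch of point multiplication on secp256k1 using the double-and-add method. The constants are the published curve parameters; the function names are illustrative, and the code is not constant-time, so it is suitable for study only and not for handling real keys.

```python
# secp256k1 parameters: curve y^2 = x^3 + 7 over the prime field of size P.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)


def point_add(a, b):
    """Add two curve points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a point plus its mirror image
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P)  # tangent (doubling)
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P)  # chord (addition)
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def scalar_mult(k, point=G):
    """Compute k * point with the double-and-add method."""
    result, addend = None, point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


def public_key_bytes(k):
    """Serialize K = k * G as 64 bytes: x followed by y, both big-endian."""
    x, y = scalar_mult(k)
    return x.to_bytes(32, "big") + y.to_bytes(32, "big")


print(public_key_bytes(1).hex())
```

Going from `k` to `K` takes a few hundred point operations, while recovering `k` from `K` has no known efficient method, which is the one-way property the text relies on. The modular inverse via `pow(x, -1, P)` requires Python 3.8 or later.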
Instead, a cryptographic hash function is used to obtain the address.
These are one-way functions that take data of arbitrary size as input and return strings of fixed size as output.
Being one-way, these functions are also irreversible, in addition to having other key features. For example, given the same input, the same output will always be produced; changing the input by even one character will totally change the output.
Ethereum uses the Keccak-256 hash function for many things.
Among the many uses is address generation from a public key.
Ethereum addresses are unique identifiers derived from a public key (in the case of EOA) or from contracts (in the case of Contract accounts) using the Keccak-256 hash function.
Indeed, it should be remembered that Contract accounts do not possess the private-public key pair but only the address.
Specifically, given the public key as input to Keccak-256, a hash will be produced of which only the last 20 bytes will be taken, which will be the Ethereum address. Often these 20 bytes are preceded by “0x” to identify that they are encoded in hexadecimal.
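The derivation can be sketched as follows. Python’s `hashlib.sha3_256` implements the later SHA-3 standard, which pads its input differently from the original Keccak-256 used by Ethereum, so this sketch carries its own compact Keccak implementation. It is written for readability rather than speed, and the example public key is the curve generator point (the public key of the private key 1), chosen purely as a test value.

```python
MASK = (1 << 64) - 1


def _rol(x, n):
    """Rotate a 64-bit lane left by n bits (0 <= n < 64)."""
    return ((x << n) | (x >> (64 - n))) & MASK


def _keccak_f1600(a):
    """Apply the 24-round Keccak permutation to 25 lanes (index = x + 5*y)."""
    r = 1
    for _ in range(24):
        # theta: mix each lane with the parity of two neighbouring columns
        c = [a[x] ^ a[x + 5] ^ a[x + 10] ^ a[x + 15] ^ a[x + 20] for x in range(5)]
        d = [c[(x - 1) % 5] ^ _rol(c[(x + 1) % 5], 1) for x in range(5)]
        a = [a[i] ^ d[i % 5] for i in range(25)]
        # rho and pi: rotate lanes and move them to new positions
        x, y = 1, 0
        cur = a[1]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            idx = x + 5 * y
            cur, a[idx] = a[idx], _rol(cur, ((t + 1) * (t + 2) // 2) % 64)
        # chi: non-linear step applied row by row
        for row_start in range(0, 25, 5):
            row = a[row_start:row_start + 5]
            for x in range(5):
                a[row_start + x] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota: round constant generated by a small LFSR
        for j in range(7):
            r = ((r << 1) ^ ((r >> 7) * 0x71)) % 256
            if r & 2:
                a[0] ^= 1 << ((1 << j) - 1)
    return a


def keccak256(data: bytes) -> bytes:
    """Original Keccak-256 (padding byte 0x01), not NIST SHA3-256."""
    rate = 136  # bytes absorbed per permutation call
    padded = bytearray(data)
    pad_len = rate - len(padded) % rate
    padded += b"\x01" + b"\x00" * (pad_len - 1)
    padded[-1] ^= 0x80
    state = [0] * 25
    for off in range(0, len(padded), rate):
        for i in range(rate // 8):
            state[i] ^= int.from_bytes(padded[off + 8 * i:off + 8 * i + 8], "little")
        state = _keccak_f1600(state)
    return b"".join(lane.to_bytes(8, "little") for lane in state[:4])


def address_from_public_key(public_key: bytes) -> str:
    """Take the last 20 bytes of Keccak-256 over the 64-byte public key."""
    if len(public_key) != 64:
        raise ValueError("expected the 64-byte x||y public key")
    return "0x" + keccak256(public_key)[-20:].hex()


# Test value only: the generator point, i.e. the public key of private key 1.
GX = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798
GY = 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8
address = address_from_public_key(GX.to_bytes(32, "big") + GY.to_bytes(32, "big"))
print(address)
```

In practice a vetted library would be used for hashing; the point of the sketch is only that the address is a truncated hash of the public key, so it can be computed entirely offline.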
We specify that the entire process of creating an EOA does not have to take place online; rather, it is a mathematical process that can be done offline.
Once you have created an EOA, you have to choose a wallet, an application that helps you manage your Ethereum account by providing a way to communicate with the Ethereum system.
There are several wallets, the most popular being MetaMask, which installs as a browser extension.
Without going into the details, which you can find in the lessons dedicated to wallets, it should be emphasized that one must choose secure devices in which to use wallets, because these hold the keys that identify our account.
Ownership in Ethereum, as in Bitcoin, is established by having these keys, so it is essential to store them in the most secure way to avoid on the one hand the risk of losing them and on the other the risk of them being stolen.
Transactions
At this point we have a clear idea of what types of accounts exist in Ethereum.
Once one has an account, we are interested in how ownership is transferred in Ethereum, that is, how transactions take place.
We can start by making a comparison with Bitcoin, since, in this aspect, the two blockchains totally diverge.
In detail, Bitcoin is based on Unspent Transaction Outputs (UTXO), which are the outputs of transactions made that have not yet been spent.
The state of Bitcoin at any given time can be said to be given by the list of all UTXOs.
From the UTXOs the balance of individual accounts is derived.
In Ethereum, on the other hand, you have an account-based model, in which the global state at a given time is given by the balance of all accounts.
If Bitcoin can be compared to how cash works (where the various UTXOs represent coins and bills), Ethereum’s model can be likened to a bank account.
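The difference between the two models can be illustrated with two tiny data structures. The names and amounts are invented for the example, and the sketch ignores signatures, fees and everything else a real ledger needs.

```python
# Bitcoin-style state: a list of unspent outputs, each owned by one address.
utxos = [
    {"owner": "alice", "amount": 5},
    {"owner": "alice", "amount": 2},
    {"owner": "bob", "amount": 4},
]


def utxo_balance(owner):
    """A balance is not stored anywhere: it is derived by summing outputs."""
    return sum(u["amount"] for u in utxos if u["owner"] == owner)


# Ethereum-style state: one stored balance per account.
accounts = {"alice": 7, "bob": 4}


def transfer(sender, recipient, amount):
    """Move value by updating two balances in place."""
    if accounts.get(sender, 0) < amount:
        raise ValueError("insufficient balance")
    accounts[sender] -= amount
    accounts[recipient] = accounts.get(recipient, 0) + amount


print(utxo_balance("alice"))
transfer("alice", "bob", 3)
print(accounts)
```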
With that clarified, let us look at how Ethereum’s transactions work. Transactions form the basis of the system: they are what create smart contracts, get them executed by the EVM and, most importantly, cause the global state of the system to change.
rawTx = {
  nonce: '0x00',
  gasPrice: '0x09184e72a000',
  gasLimit: '0x2710',
  to: '0x0000000000000000000000000000000000000000',
  value: '0x00',
  data: '0x7f7465737432000000000000000000000000000000000000000000000000000000600057',
  r: '0x09ebb6ca057a0535d6186462bc0b465b561c94a295bdb0621fc19208ab149a9c',
  s: '0x440ffd775ce91a833ab410777204d5341a6f9fa91216a6f3ee2c051fea6a0428',
  v: '0x25'
}
A transaction in Ethereum has a basic structure that must contain certain data. This data is serialized and transmitted to the Ethereum network. The serialized form transmitted over the network is the only standard representation of a transaction.
A transaction is transmitted to the network as a binary message that contains:
- nonce = a scalar value equal to the number of transactions sent by the transaction creator;
- gasPrice = the price of gas that the transaction creator wants to pay;
- gasLimit = the maximum amount of gas the transaction creator is willing to buy;
- recipient = the address of the recipient;
- value = the amount of Ether to be sent to the recipient;
- data = a payload of binary data of variable length;
- v,r,s = the three components of the ECDSA digital signature applied by the transaction creator.
The transaction is serialized using a Recursive Length Prefix (RLP) encoding scheme, so the fields just listed are not visible in the serialized transaction because they are encoded in RLP.
It can also be seen that the address of the person creating the transaction is not present, because from v,r,s we get the public key from which we get the address.
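RLP itself is a small scheme: byte strings and lists each receive a prefix that states their kind and length. The encoder below is a minimal sketch of those rules; decoding and input validation are left out, and the example transaction reuses some values from the sample above while leaving the data field empty for brevity.

```python
def _length_prefix(length, offset):
    """Prefix for a payload of `length` bytes (0x80 = string, 0xC0 = list)."""
    if length < 56:
        return bytes([offset + length])
    length_bytes = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([offset + 55 + len(length_bytes)]) + length_bytes


def rlp_encode(item):
    """Encode an int, a byte string, or a (nested) list of those."""
    if isinstance(item, int):
        # Integers are big-endian byte strings without leading zeros (0 -> empty).
        item = item.to_bytes((item.bit_length() + 7) // 8, "big")
    if isinstance(item, (bytes, bytearray)):
        if len(item) == 1 and item[0] < 0x80:
            return bytes(item)  # a single small byte is its own encoding
        return _length_prefix(len(item), 0x80) + bytes(item)
    if isinstance(item, (list, tuple)):
        payload = b"".join(rlp_encode(x) for x in item)
        return _length_prefix(len(payload), 0xC0) + payload
    raise TypeError(f"cannot RLP-encode {type(item).__name__}")


# Fields in order: nonce, gasPrice, gasLimit, to, value, data.
unsigned_tx = [0, 0x09184E72A000, 0x2710, bytes(20), 0, b""]
encoded = rlp_encode(unsigned_tx)
print(encoded.hex())
```

The output contains no field names at all, only lengths and values in a fixed order, which is why the fields cannot be read off the serialized transaction directly.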
Let us now look at the above fields in detail.
The nonce is an attribute of the creator of the transaction and is meaningful only in relation to its address. It is a dynamically calculated value that indicates the number of confirmed (on-chain) transactions originated from the address of the transaction creator.
If you create transactions manually, it is important to keep track of the nonce, because it is critical that transactions are processed sequentially, based on the nonce.
Let’s take a practical example: let’s say the case where I create a transaction with nonce 0 and transmit it to the network, which will validate it and place it in a block. At this point the transaction will be confirmed. If I then transmit to the network a transaction with nonce 2, it will not be confirmed but it will remain in the node mempool, that is, the place where the still unconfirmed transactions are, because there is a transaction coming from my address that has nonce 1 missing.
The moment I transmit a transaction with nonce 1 to the network, it will be confirmed first and then later the one with nonce 2 that was left in the mempool.
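The ordering rule in this example can be modelled in a few lines. The function below is a simplification invented for illustration: it tracks only nonces for a single sender and ignores fees, block space and everything else a real node considers.

```python
def confirm_in_order(next_nonce, pending):
    """Confirm pending nonces for one sender while there is no gap.

    Returns (confirmed, still_pending, new_next_nonce).
    """
    confirmed = []
    pending = set(pending)
    while next_nonce in pending:
        pending.remove(next_nonce)
        confirmed.append(next_nonce)
        next_nonce += 1
    return confirmed, sorted(pending), next_nonce


# Nonce 0 is already confirmed, so the next expected nonce is 1.
print(confirm_in_order(1, [2]))     # nonce 2 waits: nonce 1 is missing
print(confirm_in_order(1, [2, 1]))  # once nonce 1 arrives, both go through
```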
Then there are the two fields related to gas, which we can see as the gasoline that feeds the EVM, as seen earlier.
With gasPrice we specify the price at which the transaction creator wants to buy gas. The price is measured in wei per unit of gas. The higher the gasPrice, the sooner the transaction is likely to be confirmed, because miners tend to prioritize the transactions that pay more.
During periods when there is little congestion in the network, however, it is possible for transactions to be placed in a block even if the gasPrice is set to zero, a case in which it is called a fee-free transaction.
The gasLimit, on the other hand, specifies the maximum number of gas units the transaction creator wants to buy to complete the transaction.
For a simple Ether transfer to an EOA, the gas required is fixed at 21,000 units. For transactions directed toward a Contract account, the gas required cannot be determined precisely in advance, because it depends on the code that is executed.
The price to be paid for a transaction can therefore be calculated by multiplying the gas actually used by the gasPrice; the most the creator can be charged is gasLimit * gasPrice. If this maximum, added to the value being sent, is greater than the Ether held by the account creating the transaction, the transaction is invalid.
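A short worked example, using the gasPrice and gasLimit values from the sample transaction shown earlier. The function names are illustrative, and the affordability check assumes, as is standard, that the balance must cover the value sent plus the maximum possible fee.

```python
GAS_PRICE = int("0x09184e72a000", 16)  # wei per unit of gas
GAS_LIMIT = int("0x2710", 16)          # maximum units of gas to buy
WEI_PER_ETHER = 10**18


def max_fee(gas_limit, gas_price):
    """Upper bound on what the sender can be charged, in wei."""
    return gas_limit * gas_price


def actual_fee(gas_used, gas_price):
    """Fee for the gas really consumed, in wei."""
    return gas_used * gas_price


def can_afford(balance, value, gas_limit, gas_price):
    """The balance must cover the value sent plus the maximum fee."""
    return balance >= value + max_fee(gas_limit, gas_price)


print(GAS_PRICE, GAS_LIMIT)
print(max_fee(GAS_LIMIT, GAS_PRICE) / WEI_PER_ETHER, "ETH at most")
print(actual_fee(21000, GAS_PRICE) / WEI_PER_ETHER, "ETH for a simple transfer")
```

At this gas price a simple 21,000-gas transfer would cost 0.21 ETH, more than the sample transaction’s limit of 10,000 gas would allow it to spend.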
The recipient field is nothing but the 20-byte Ethereum address of the recipient of the transaction, whether it is an EOA or a Contract address.
This field is not validated by the Ethereum protocol, so it must be validated at the level of the user interface that is used to create the transaction.
This means that if this verification does not take place and you enter an address that does not exist (as long as it is 20 bytes), the transaction will still take place and Ether will be lost, because Ethereum has no way of knowing whether the address is being used or not.
The value field contains the amount of ether that the creator of the transaction wants to send to the recipient.
The data field, on the other hand, contains data that is usually used to specify how to interact with a smart contract.
Neither of these last two fields is mandatory, so we can have 4 combinations:
Value | Data | Transaction Name | Recipient | Effect |
---|---|---|---|---|
Yes | No | Payment | EOA, Contract account | It is usually used to send Ether to an EOA. If sending to a Contract account, it will first be checked if the contract has any default actions to perform, otherwise it will just increase the account balance. |
No | Yes | Invocation | Contract account | Usually the data sent is addressed to a Contract account and specifies a function of the contract to be executed and the arguments to be passed to the function. |
Yes | Yes | Payment & invocation | Contract account | Union of the previous two items. |
No | No | Gas waste | | |
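The four combinations depend only on whether the value and data fields are filled in, which can be written as a small function. The name `classify` and the returned labels are choices made for this sketch.

```python
def classify(value: int, data: bytes) -> str:
    """Name a transaction by which of its optional fields are present."""
    if value and data:
        return "payment & invocation"
    if value:
        return "payment"
    if data:
        return "invocation"
    return "gas waste"


print(classify(10**18, b""))
print(classify(0, bytes.fromhex("a9059cbb")))
print(classify(0, b""))
```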
A special case of transaction, that of creating a new smart contract on the blockchain, should then be indicated.
This type of transaction has the special feature of always having the same recipient, called zero address because the address is 0x0.
This address does not belong to anyone and cannot spend Ether or create transactions, it only serves to indicate that the transaction is one of creating a smart contract.
This transaction needs only the data field, not the value field. In case it also possesses the value field, the Ether sent will form the initial balance of the new contract.
Once the raw transaction has been created with the fields seen so far, a process that can be done offline, it is necessary to create the digital signature, which is used to certify the authenticity of the transaction, as well as its provenance (without, of course, specifying the private key).
The digital signature, in Ethereum, is created using Elliptic Curve Digital Signature Algorithm (ECDSA).
The transaction built so far is RLP-encoded, and the result is passed as input to the Keccak-256 hash function, which returns a hash.
The ECDSA signature is then computed by signing this hash with the transaction creator’s private key. The result is the v,r,s values seen earlier, which are added to the transaction and from which the transaction creator’s public key can be recovered. With the signature and the recovered public key, the authenticity of the transaction can be verified.
Everything seen so far takes place offline. The Ethereum network comes into play at the point when you need to transmit the transaction to the network.
In order to do so, one simply needs to send it to a node on the network, which will verify it and, if the transaction turns out to be valid, will propagate it to the nodes with which it is in communication.
Before delving into this step, however, it is necessary to explain how the Ethereum network works.
Network
The Ethereum network is a peer-to-peer, decentralized, permissionless network consisting of nodes communicating with each other through a gossip protocol.
Actually, there are several networks based on the Ethereum protocol, all of which conform to the formal specification defined in the Ethereum Yellow Paper.
These networks may or may not communicate with each other.
There are also different Ethereum clients, and not necessarily every client works for every Ethereum-based blockchain.
To join one of the Ethereum-based networks, a node of any type must be installed on a device.
In fact, there are different types of nodes, which perform different functions.
There are full nodes, which are those that contribute to the robustness, health and security of the blockchain on which they depend, because they own a full copy of the ledger, so they can help other nodes obtain data and can also independently verify all transactions and contracts.
A full node also allows its operator to interact directly with contracts published on the blockchain, and to publish new contracts, without the need for intermediaries.
Obviously, however, installing a full node involves onerous hardware requirements, which are continually growing.
Currently, the recommended requirements are:
- CPU with 4 cores
- 16GB RAM
- SSD of at least 1TB
- 25 Mbit/sec internet download speed
Once the node is installed, it must synchronize with the rest of the network, a process that takes a long time because every transaction must be verified.
Then there are Remote Ethereum Clients, lighter versions that have smaller hardware requirements but can only perform some of the functions of a full client and obviously do not possess a copy of the transaction log.
It should be specified that Remote Ethereum Clients are not like Bitcoin’s light clients. In fact, light clients validate block headers using Merkle proofs. Remote Ethereum Clients, on the other hand, validate neither block headers nor transactions but rely completely on full nodes that give access to blockchain data.
One can guess, therefore, that they lose a lot in security and anonymity.
Remote Clients can be of various types and have more or fewer functions. They range from smartphone wallets that can essentially manage key pairs, create, sign and transmit transactions to the network, to more complex browser wallets, such as MetaMask, that allow users to interact with DApps and offer external services such as block explorers.
So let us return to the transaction seen earlier. The most secure option is to possess a full node, so that it autonomously transmits the transaction to the network.
In the absence of a full node, one will need to use a Remote Ethereum Client that will send the transaction to a full node.
In each of the two cases, the full node will validate the transaction and, if it is valid, save it locally and propagate it to the nodes with which it is in communication, called neighbors (on average at least 13).
These will in turn validate the transaction, save it locally and propagate it to their neighbors. Within seconds, the transaction will reach the entire network via this propagation mechanism, called gossip protocol, based on the exchange of messages between nodes.
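The propagation just described can be simulated on a small graph. The five-node topology is invented for the example, and validation is omitted: every node simply forwards a transaction the first time it sees it.

```python
from collections import deque


def gossip(peers, origin):
    """Flood a message from `origin`.

    Returns a dict mapping each reached node to the number of hops
    after which it first received the message.
    """
    seen = {origin: 0}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for neighbor in peers[node]:
            if neighbor not in seen:  # forward only what is new
                seen[neighbor] = seen[node] + 1
                queue.append(neighbor)
    return seen


# Hypothetical network: each node lists the neighbors it talks to.
peers = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
print(gossip(peers, "A"))
```

Because each node forwards to all of its neighbors, the number of hops needed grows slowly with the size of the network, which is why propagation completes within seconds.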
Among the nodes in the network that receive the transaction, however, there are particular nodes, called mining nodes, which brings us to the next topic: consensus in Ethereum.
Consensus protocol
We have reached the point where the whole network has received the transaction, which has been validated by all the nodes.
Among the various nodes, there are some, called mining nodes, that perform the mining activity. Basically, each miner competes with the others, and the challenge is to solve a cryptographic puzzle. Miners select valid but as-yet unconfirmed transactions, place them in a new block, and start solving the cryptographic puzzle. When a miner arrives at the solution, it places the solution in its block and broadcasts the block to the network.
To better understand this mechanism, however, one must first understand the concept of a consensus algorithm, which we will explain superficially here. For further discussion, we refer to the lecture on consensus algorithms.
Consensus algorithm refers to a set of rules that all participants in a blockchain’s network must abide by, because they are what makes the system trustless, that is, secure and reliable without the need for a central authority to rely on.
The problem of consensus is not exclusive to blockchain but affects the whole world of distributed systems, and the fundamental issue is to get all the actors in a system to come to an agreement (consensus, precisely) about a change in the overall state of the system.
One must therefore arrive at a common state, while maintaining decentralization. In this process one must consider the possible presence of malevolent actors who, out of self-interest or a desire to destroy the system, may act against the system.
Various solutions to this problem have been proposed and the one proposed by Satoshi Nakamoto with Bitcoin, the famous proof of work (PoW), can be said to be the biggest innovation brought by the first blockchain.
We described how PoW works at a high level a few lines above: it is a competition to solve a cryptographic puzzle that requires a large expenditure of energy, with a system of rewards to incentivize that expenditure. Participation carries a risk, however: a miner who loses the competition is not compensated for the energy used and suffers an economic loss.
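The shape of the puzzle can be shown with a toy miner. This sketch uses SHA-256 from the standard library as a stand-in hash and a simple "hash below target" rule; it is not Ethash, and `difficulty_bits` is a parameter invented for the example.

```python
import hashlib


def pow_hash(header: bytes, nonce: int) -> int:
    """Hash the header together with a candidate nonce."""
    digest = hashlib.sha256(header + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big")


def mine(header: bytes, difficulty_bits: int) -> int:
    """Try nonces until the hash falls below the target (costly)."""
    target = 1 << (256 - difficulty_bits)
    nonce = 0
    while pow_hash(header, nonce) >= target:
        nonce += 1
    return nonce


def verify(header: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Check a claimed solution with a single hash (cheap)."""
    return pow_hash(header, nonce) < (1 << (256 - difficulty_bits))


nonce = mine(b"example block header", difficulty_bits=12)
print(nonce, verify(b"example block header", nonce, 12))
```

Finding the nonce takes thousands of attempts on average at this difficulty, while checking it takes one hash. That asymmetry is what lets every node verify a miner’s work at negligible cost.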
PoW is not the only consensus algorithm in existence and actually originated as an alternative to Proof of Stake (PoS), an algorithm based on a financial stake, i.e., blocking funds to become validators.
In PoS in each new round, validators propose and vote on new blocks, and the weight of the vote depends on the amount of funds staked.
In this case, the risk of validators is that the block voted is rejected by the majority of validators, a case in which the funds put in stakes are lost.
The incentive, on the other hand, is to earn with each new block a reward, proportional to the amount of funds staked.
In both cases, the mechanisms are designed to balance risk and reward in order to incentivize actors to act honestly and do the work that, in effect, allows a blockchain to stay alive and move forward.
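For comparison, the stake-weighted voting and proportional rewards of PoS can be sketched as below. The two-thirds acceptance threshold and the function names are assumptions made for the illustration; they are not taken from the description above or from any specific protocol.

```python
def block_accepted(stakes, votes, threshold=(2, 3)):
    """Accept a block if the stake voting for it reaches the threshold.

    stakes: validator -> staked amount (in wei)
    votes:  validator -> True if the validator approves the block
    """
    total = sum(stakes.values())
    in_favor = sum(stakes[v] for v, approves in votes.items() if approves)
    numerator, denominator = threshold
    return in_favor * denominator >= total * numerator


def split_reward(stakes, reward):
    """Share a block reward in proportion to stake (integer wei, rounded down)."""
    total = sum(stakes.values())
    return {v: reward * stake // total for v, stake in stakes.items()}


stakes = {"a": 50, "b": 30, "c": 20}
print(block_accepted(stakes, {"a": True, "b": True, "c": False}))
print(block_accepted(stakes, {"a": True, "b": False, "c": False}))
print(split_reward(stakes, 10))
```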
Ethereum was born using a PoW algorithm called Ethash although, from the beginning, a transition to a PoS algorithm was planned, which is currently not quite ready and which Buterin and other developers have been working on for years.
In this overview we will start by first delving into Ethash, and then give some information regarding the future PoS algorithm.
But before that it is necessary to explain how blocks are made.
As we have said, miners group a number of transactions waiting to be confirmed and put them into a block. The transaction data, however, is not the only information in it.
First, one must think of the block as what really changes the overall state of the system. Transactions are a change of state but to be effective they must be placed within a block.
Let us now look at the data structure of a block:
The block consists of a Header and a Body.
The Header contains a lot of data about the block:
- parentHash = the unique identifier of the previous block, which is essential for cryptographically binding blocks together, hence the term “blockchain” (for more on this we refer to the lesson on the definition of blockchain);
- ommersHash = the Keccak256 hash of the list of ommer blocks included in this block;
- beneficiary = the address of the person who will receive the rewards for mining the block (thus the address of a miner’s account);
- stateRoot = the hash of the root node of the state trie, which commits to the entire state of the system: account balances, contracts with their code, and account nonces;
- receiptsRoot = hash of the root node of the transaction receipt trie (the receipts of all transactions included in the block);
- transactionsRoot = hash of the root node of the transaction trie (representation of the list of transactions included in the block);
- logsBloom = a filter for transaction logs;
- difficulty = the difficulty required to mine the block;
- blockNumber = the number of ancestor blocks, i.e., the position of the block in the chain (the genesis block has number 0);
- gasLimit = the gas consumption limit of the block;
- gasUsed = the total gas consumed by the block’s transactions;
- timestamp = the timestamp of when the block was mined;
- extraData = contains arbitrary data about the block;
- mixHash = a 256-bit hash computed during mining which, combined with the nonce, proves that enough computation was performed on this block;
- nonce = the 64-bit value found by the miner when solving the cryptographic puzzle which, combined with the mixHash, proves that the block is valid because a valid proof-of-work was provided.
    {
      "difficulty": 134290,
      "extraData": "0xd783010506846765746887676f312e372e33856c696e7578",
      "gasLimit": 4038569465,
      "gasUsed": 23114,
      "hash": "0xa5d727b111f123e11b6dc5d271697b82a6238f83ab342088e0a1538afce7862d",
      "logsBloom": "0x00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000008000000000000000000004000000000000200000000000000000000000000000000000000000000001000000000000000000000000000000000000000000000000000000000000000000000000008000000000000000000000000000000000000000000000000000000000000000000000000000020000000000000000000000000000000000000000000000000000000000000000000000",
      "miner": "0x4c558858289c4180d0d2b994a4e009e078731191",
      "mixHash": "0xba14e2b605aeacfa68f8345a9e30dd2d37dc4f3f4eba9e8ac8e744ded90ae566",
      "nonce": "0x38de415ea086671a",
      "number": "63",
      "parentHash": "0xa7186a94afe92f0c0fedd1b8aa96e9aa92321e63a0b79d3f68ffe888bfb0239a",
      "receiptsRoot": "0x303d0444bf28744722fd53a46144ead61f3515a82b9640603028ebf91f212126",
      "sha3Uncles": "0x3633e886ef643c067d6fb7bfc1d2a6bf3eba939bf3cbb522f3894779ac1dd090",
      "size": 1709,
      "stateRoot": "0xcd4abaafec19113743df0235f06482f3a0b49e546e708055aa7a2382b232601e",
      "timestamp": 1484750319,
      "totalDifficulty": 8362288,
      "transactions": ["0x1bd5825eac201f0f0e1e2c4a9ed5de026edc3f7fb02c4b912ca55c0f50a021fc"],
      "transactionsRoot": "0xbaf6819c91ed625a97e974bbc4efd5808970b3b4991b1e2aca0dd537750aafc7",
      "uncles": [
        "0x6f0a9ab0e468112fdcbbecd81ebebf929ca9a38e6a4c864e8164309d3bed42c6",
        "0xb4dd60c91ab18de3f72ba27ab5934bec66ab01798d18b599d1d298bd29e56204"
      ]
    }
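To get a feel for what these fields contain, two of them can be decoded by hand. The sketch below reads the timestamp as seconds since the Unix epoch and decodes extraData on the assumption that it follows the convention of the Geth client, which stores an RLP-encoded list with its version, name, Go version and operating system. The helper `decode_short_rlp_list` is our own and only handles the short items found in this example.

```python
from datetime import datetime, timezone

# Values copied from the sample block above.
EXTRA_DATA = "0xd783010506846765746887676f312e372e33856c696e7578"
TIMESTAMP = 1484750319


def decode_short_rlp_list(data: bytes) -> list[bytes]:
    """Decode an RLP list with a payload under 56 bytes made of short byte strings."""
    if not 0xC0 <= data[0] <= 0xF7:
        raise ValueError("not a short RLP list")
    payload = data[1:1 + (data[0] - 0xC0)]
    items, i = [], 0
    while i < len(payload):
        prefix = payload[i]
        if prefix < 0x80:                 # a single small byte encodes itself
            items.append(payload[i:i + 1])
            i += 1
        elif prefix <= 0xB7:              # short string: prefix is 0x80 + length
            length = prefix - 0x80
            items.append(payload[i + 1:i + 1 + length])
            i += 1 + length
        else:
            raise ValueError("item type not supported by this sketch")
    return items


mined_at = datetime.fromtimestamp(TIMESTAMP, tz=timezone.utc)
fields = decode_short_rlp_list(bytes.fromhex(EXTRA_DATA[2:]))
version = ".".join(str(b) for b in fields[0])

print(mined_at.isoformat())                        # 2017-01-18T14:38:39+00:00
print(version, [f.decode() for f in fields[1:]])   # 1.5.6 ['geth', 'go1.7.3', 'linux']
```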
Instead, the Body of the block contains:
- list of transactions entered by the miner within the block;
- list of ommers = ommers in Ethereum are the equivalent of orphan blocks in Bitcoin. Two miners can solve the puzzle for the same block at nearly the same time, creating a temporary fork in the chain. The longest fork (or rather, the one with the highest total computational cost, which often coincides with the longest) will win, and the block produced by the losing miner will be removed from the main chain and become an ommer. These ommer blocks can be referenced within a later winning block, so that the miner who mined the ommer block receives a partial reward.[7]
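The rule for choosing between two competing forks can be illustrated with a minimal sketch. The dictionaries and the difficulty values below are invented for the example; the only idea taken from the text is that the fork with the highest total computational cost wins, even when it is not the longest.

```python
def total_difficulty(fork: list[dict]) -> int:
    """Sum of the difficulty of every block in the fork."""
    return sum(block["difficulty"] for block in fork)


def choose_fork(forks: list[list[dict]]) -> list[dict]:
    """Pick the fork that required the most work overall."""
    return max(forks, key=total_difficulty)


# Hypothetical forks: A is longer, B contains fewer but harder blocks.
fork_a = [{"miner": "A", "difficulty": 100} for _ in range(3)]
fork_b = [{"miner": "B", "difficulty": 180} for _ in range(2)]

winner = choose_fork([fork_a, fork_b])
print(len(winner), total_difficulty(winner))   # 2 360
```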
A miner will therefore select unconfirmed transactions that are present in the local mempool, merge them into a transaction list, fill in the other fields of the block, and start trying to solve the cryptographic puzzle.
To do this, it will have to use a technique called brute force, that is, it will have to proceed by trial and error.
Once it has filled in the block, what it must do is find a nonce for which the hash computed over the block meets the difficulty target. To do this, it inserts a candidate nonce in the block and passes the block through the hash function, which returns the result.
It will have to do this n times, until the resulting hash satisfies the target.
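The trial-and-error search can be sketched as follows. This is not Ethash: as a stand-in we use the SHA3-256 function from Python's standard library, we try nonces in sequence rather than at random so that the example is reproducible, and the header is an arbitrary byte string. The names `pow_hash`, `mine` and `is_valid` are our own.

```python
import hashlib


def pow_hash(header: bytes, nonce: int) -> int:
    """Hash the header together with the nonce and read the digest as an integer."""
    digest = hashlib.sha3_256(header + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big")


def mine(header: bytes, difficulty: int, max_tries: int = 1_000_000):
    """Brute force: try nonces one by one until the hash meets the target."""
    target = 2**256 // difficulty          # higher difficulty means a smaller target
    for nonce in range(max_tries):
        if pow_hash(header, nonce) <= target:
            return nonce
    return None                            # gave up without finding a solution


def is_valid(header: bytes, nonce: int, difficulty: int) -> bool:
    """Checking a solution takes a single hash, unlike finding one."""
    return pow_hash(header, nonce) <= 2**256 // difficulty


header = b"parentHash|transactionsRoot|timestamp"   # placeholder header bytes
nonce = mine(header, difficulty=1_000)
print(nonce, is_valid(header, nonce, 1_000))
```

With a difficulty of 1,000 a solution is found after about a thousand attempts on average; raising the difficulty makes the search proportionally longer, while verification always costs one hash.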
When a block is added to the chain, a new round will start in which the miners will compete for the next block.
This process occurs on average once every 14 seconds.
It is a computationally onerous process that constitutes a proof-of-work for the miner: the costs it had to pay (for electricity and hardware) give it an incentive to act honestly.
Buterin has been pointing out, ever since the Ethereum whitepaper, the fact that centralizations of computational power have been created in Bitcoin, mainly due to the spread of ASIC (application-specific integrated circuits) hardware dedicated specifically to mining and much more performant than the GPUs (graphics cards) that were previously used.
The advent of this type of specific hardware has created a limitation for non-professional miners, posing a risk for the decentralization of the system.
Buterin’s proposed solution with Ethash is an “ASIC resistant” algorithm, which means that it is more difficult to create specific hardware to solve cryptographic puzzles, so as to favor the use of GPUs that are more accessible to everyone.
In addition, as mentioned above, a switch to a PoS algorithm has been planned from the beginning, which will erase the need for mining from Ethereum.
In this sense, specific hardware manufacturers were initially less incentivized to invest in the development of dedicated hardware for Ethereum, knowing that it would become unnecessary in the future.
However, with the emergence of other cryptocurrencies and Ethereum-based tokens, ASIC hardware manufacturers began to invest more in Ethereum, and over time the same concentrations of power akin to Bitcoin’s mining pools came into being.
One more detail should be added regarding mining. We mentioned that the miners, in return for their work, receive an economic incentive. As with Bitcoin, it is through this incentive that the distribution of new Ether takes place: it is created out of thin air (mined, precisely) each time a new block is added to the chain.
Unlike Bitcoin, however, as explained earlier, there is no max supply for Ether, at least currently.
The current monetary distribution of Ether proceeds at 4.5% per year, with 2 new Ether per block plus an additional 1.75 Ether per ommer block.
The reward per block has already been changed several times, following Ethereum’s governance processes, which we will see below:
- Blocks 0 to 4369999 = 5 Ether reward;
- Blocks 4370000 to 7279999 = 3 Ether;
- Blocks 7280000 to now = 2 Ether.
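The schedule above can be expressed as a small function. It is a sketch that covers only the base reward per block (ommer rewards are ignored) and it assumes that block 7280000 is the first block with the 2 Ether reward.

```python
def block_reward(block_number: int) -> int:
    """Base mining reward in Ether for a given block number."""
    if block_number < 0:
        raise ValueError("block numbers start at 0")
    if block_number < 4_370_000:
        return 5
    if block_number < 7_280_000:
        return 3
    return 2


for number in (0, 4_369_999, 4_370_000, 7_279_999, 7_280_000):
    print(number, block_reward(number))
```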
Although it is not yet known what exactly will happen in the future, it is expected, in accordance with the specifications for Eth 2.0, that there will be a large decrease in rewards with the advent of the PoS algorithm, with the distribution tied to the total network stake:
ETH validating | Max annual issuance | Max annual network issuance % | Max annual return rate (for validators) |
---|---|---|---|
1,000,000 | 181,019 | 0.17% | 18.10% |
3,000,000 | 313,534 | 0.30% | 10.45% |
10,000,000 | 572,433 | 0.54% | 5.72% |
30,000,000 | 991,483 | 0.94% | 3.30% |
100,000,000 | 1,810,193 | 1.71% | 1.81% |
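The relationship between the columns can be checked with a short calculation. The return rate for validators is simply the issuance divided by the ETH validating. In addition, the figures in the table are consistent with an issuance that grows with the square root of the ETH validating; the function `scaled_issuance` below tests this observation by scaling the last row of the table, and is our own illustration rather than a formula taken from the specification.

```python
import math

# Rows copied from the table above: (ETH validating, max annual issuance).
TABLE = [
    (1_000_000, 181_019),
    (3_000_000, 313_534),
    (10_000_000, 572_433),
    (30_000_000, 991_483),
    (100_000_000, 1_810_193),
]


def annual_return(staked: int, issuance: int) -> float:
    """Maximum annual return rate for validators, as a fraction."""
    return issuance / staked


def scaled_issuance(staked: int, ref_staked: int = 100_000_000,
                    ref_issuance: int = 1_810_193) -> float:
    """Issuance obtained by scaling a reference row with the square root of the stake."""
    return ref_issuance * math.sqrt(staked / ref_staked)


for staked, issuance in TABLE:
    print(f"{staked:>11,}  {annual_return(staked, issuance):6.2%}  {scaled_issuance(staked):>11,.0f}")
```

This also explains why the return rate falls as more ETH is staked: the total issuance grows more slowly than the stake it is divided among.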
Specifically, 2 events are planned in relation to Eth 2.0:
- Phase 0, in which there will be an increase in the annual distribution percentage;
- Phase 1.5, in which PoW-related rewards will be removed.
Regarding the transition to PoS, which will take place in the near future, it is not the purpose of this overview to discuss the advantages and disadvantages of PoS and PoW, so we simply say that the proposed PoS algorithm is called Casper, has been in research and development for years, and has been developed in two directions:
- Casper FFG = proposed as a hybrid algorithm between PoW and PoS, to be implemented in a transition phase;
- Casper CBC = exclusively PoS algorithm.
Buterin stated that CBC seems to have better theoretical properties while FFG seems to be easier to implement.
Smart contracts, DApps, and Web 3.0
As anticipated in the section on accounts, there are two types of accounts. One of them is contracts, also called smart contracts. Let us try to better understand what this is.
The expression “smart contracts” did not originate with Ethereum but was coined in the 1990s by cryptographer Nick Szabo, who defined them as “a set of promises, in digital form, that include protocols within which parties fulfill other promises.”
With the emergence and development of blockchains, the concept of smart contracts has evolved a great deal. In Ethereum the meaning has changed almost completely: Ethereum smart contracts are neither contracts nor smart.
Rather, they are actual programs that are executed deterministically by the EVM, on the global, decentralized computer formed by the Ethereum network.
Ethereum smart contracts are immutable, being saved on the blockchain. This means that once the code is written and saved on the blockchain, it cannot be changed.
The EVM runs programs written in a low-level bytecode called EVM bytecode. Although it is possible to write smart contracts directly in EVM bytecode, it is a very complicated language that is difficult for humans to understand, which is why several high-level languages have sprung up to make it easier to write.
Indeed, it was preferred to invent new languages rather than use existing languages, because smart contracts operate in a minimalistic and limited way, so it was easier to create a language on purpose than to adapt a general-purpose language.
Among the various languages that have emerged, Solidity, LLL, Serpent, Vyper, and Bamboo should be mentioned. Of these, the most widely used is undoubtedly Solidity, created by Gavin Wood, co-founder of Ethereum, which has become the language of choice for writing smart contracts in Ethereum but also in other similar blockchains.
Solidity is developed and maintained on GitHub.
Of course, it is necessary to compile the Solidity code into EVM bytecode in order for the EVM to be able to execute it.
An example of a simple smart contract written in Solidity follows:
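    pragma solidity >=0.4.0 <0.6.0;

    contract SimpleStorage {
        // Value kept in the contract's permanent storage
        uint storedData;

        // Overwrite the stored value
        function set(uint x) public {
            storedData = x;
        }

        // Read the stored value without modifying the state
        function get() public view returns (uint) {
            return storedData;
        }
    }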
Suppose we have an EOA account and have written this smart contract. How is it saved to the blockchain and what will happen once it is saved?
The first question has already been answered: there is a special type of transaction, called a contract creation transaction, which requires a special address in the recipient field (0x0).
Once the transaction to create the smart contract is sent to the network, a new address will be created, the contract address of our smart contract. This will be identified by an Ethereum address, which can be used as the recipient of a transaction.
The contract account will have an address but will not have the private-public key pair.
It should also be specified that the contract creator will not have privileges on the contract itself, at least not at the Ethereum protocol level. To have privileges, these will have to be specified within the contract.
At this point we have created the contract, which resides on the blockchain. A few points need to be clarified:
- The contract will never run on its own: it must always be called by a transaction, which will either invoke one of its functions or send funds to the contract;
- Contracts can call, within the code, other contracts, but starting the call chain will always be a transaction created by an EOA;
- The contract account has a balance, so it can receive funds;
- Calling a contract costs gas, and the more operations the contract performs, the more gas it costs to execute. From this it follows that it is preferable to write code that is as simple as possible to reduce execution costs;
- If there is an error in the execution of a contract, all the changes it made will be rolled back, that is, cancelled, as if the transaction had never begun. However, the transaction will still be recorded as attempted and the gas spent will still be subtracted from the balance of the account that created the transaction;
- The code of a contract, once entered into a block, can no longer be changed. The contract can, however, be deleted if its author has programmed it to allow this, using an EVM opcode called SELFDESTRUCT. The deletion operation has a negative gas cost, i.e., it gives rise to a gas refund, as an incentive for developers to remove unused contracts. It should be specified, however, that the contract account will not be deleted: only the code and storage within it will be removed;
- As mentioned, running a contract costs gas. If the cost is greater than the gas limit specified in the transaction, contract execution will be stopped and the gas spent up to that point will be subtracted from the balance of the account that created the transaction.
During the creation of Ethereum, Buterin’s initial smart contract idea was gradually expanded to the decentralized application (DApp) concept.
Underlying the DApp concept is a radical change in the conception of the Web, one might say a reversal of the way it has been understood so far.
The fundamental idea is to decentralize the way applications work, from all points of view.
This affects all aspects of an application and thus:
- Backend
- Frontend
- Database
- Message communications
- Name resolution
We often hear the term DApp used loosely, referring to applications that communicate with smart contracts but are still tied to the traditional logic of centralized applications.
Let us now look in detail at the various aspects of an application and how they should be developed to achieve a true DApp.
Backend, in computing, refers to the business logic and state of the application. It usually takes the form of a server-side program. In the case of DApps, this program is replaced by one or more smart contracts.
Frontend refers to the interface displayed to users, which allows them to interact with the backend. Usually web technologies such as HTML, CSS and JavaScript are used. In a DApp, this does not change. There are special libraries (the most famous is web3 for JavaScript) that allow the frontend of an application to be integrated with a backend consisting of smart contracts, typically through a browser wallet extension such as MetaMask.
Regarding the database, i.e., where the data is stored, it is clear that smart contracts are not suited to fulfill this function, given the high cost of gas.
Usually, in traditional applications, centralized databases are used.
A DApp should prefer decentralized databases, P2P platforms such as IPFS.
Message communication protocols are application components aimed at exchanging messages between applications or different instances of the same application. Centralized servers are usually used, but again, a DApp should prefer decentralized alternatives, such as Whisper.
Finally, there is the problem of name resolution, traditionally carried out by the Domain Name System (DNS). This is the process of associating a name with an IP address. When we go to a website, we specify its name, behind which are hidden the numbers that constitute its IP address.
DNS is a centralized solution; the decentralized alternative is offered by the Ethereum Name Service (ENS).
Currently, there are few real DApps and many applications that use smart contracts but are centralized in other aspects.
The idea of DApp as a totally decentralized application brings us to the concept of Web 3.0, a term coined by Gavin Wood, a name we are now very familiar with.
We are currently in a phase of transition from the second generation of the web, Web 2.0, to the third generation.
With the spread of the Internet during the 1990s, a type of web known as Web 1.0 emerged, dominated by static sites in which users could only consult information offered by servers (central entities), in a unidirectional, top-down communication flow.
The transition to Web 2.0 has occurred with sites in which the communicative flow has become bidirectional, dynamic sites in which users can interact and participate. The emblematic example is social networks. However, we are still within a highly centralized paradigm.
Web 3.0 envisioned by Gavin Wood is a decentralized web, dominated by the DApps we have just explained, in which the client-server paradigm will be progressively replaced by the peer-to-peer paradigm.
Ethereum stands at the base of this paradigm shift.
Ethereum Virtual Machine (EVM)
We have mentioned the Ethereum Virtual Machine (EVM) several times; let’s try to understand what it is.
The EVM is the heart of Ethereum, the computational engine, the part of Ethereum that handles the deployment and execution of smart contracts.
It is the EVM that is responsible for Ethereum’s global state updates, with the exception of updates resulting from transactions that merely transfer Ether between two EOAs, for which the EVM is not needed.
The EVM is thus the decentralized computer we mentioned, containing all the contracts (executable objects), each with its own permanent data store.
The EVM is similar in some ways to the Java Virtual Machine.
The EVM also runs as a single-threaded computer, like, for example, JavaScript.
This means that it must finish executing one instruction before moving on to the next.
This is precisely why a clarification must be made to what has been said so far. We have talked about Ethereum as Turing-complete; in fact it is almost Turing-complete.
A Turing-complete system can, in principle, execute any computable program; the problem is that there is no general way of knowing in advance whether a given program will ever stop. As mentioned earlier, the halting problem consists precisely in the impossibility of determining, for an arbitrary program, whether its execution terminates.
Think of a smart contract containing an infinite loop: since the EVM is single-threaded, it would be stuck on the execution of this smart contract forever, which would make it susceptible to DoS attacks.
The EVM is therefore almost Turing-complete because of the gas mechanism, explained earlier, which limits the number of executable instructions by applying an execution cost.
When the execution of a smart contract is called, with a transaction, the EVM is instantiated, loading storage from the contract account and the necessary environment variables.
The EVM begins executing the code, keeping track of the gas cost of each instruction. A special field in the transaction specifies how much gas the user who created the transaction is willing to spend.
If the execution cost exceeds the amount specified by the user, the EVM immediately stops executing the code, otherwise it continues it until the transaction is complete.
In the first case, every change made by the transaction is discarded and the state of Ethereum changes only with regard to the account of the user who initiated the transaction, because the Ether needed to pay for the gas consumed up to the interruption point is subtracted from its balance.
In the second case, on the other hand, once the transaction is completed, the global state of Ethereum is updated taking into account the changes made.
We can think of the EVM as an engine that performs these operations in a sandbox copy of the global Ethereum state. Changes are made in this sandbox and only if the operation completes correctly are they also carried over to the real Ethereum state.
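The sandbox idea, together with the gas mechanism, can be sketched with a toy interpreter. Nothing here corresponds to real EVM opcodes or prices: the two operations, their costs in `GAS_COST` and the layout of the `state` dictionary are assumptions made for the example, and one unit of gas is charged as one unit of balance.

```python
import copy

# Hypothetical gas costs per operation, chosen only for this sketch.
GAS_COST = {"store": 5, "add": 1}


def execute(state: dict, sender: str, program: list, gas_limit: int):
    """Run `program` on a sandbox copy of `state`; return (new_state, success, gas_used)."""
    sandbox = copy.deepcopy(state)
    gas_used = 0
    success = True
    for op, key, value in program:
        cost = GAS_COST[op]
        if gas_used + cost > gas_limit:
            success = False                 # out of gas: stop immediately
            break
        gas_used += cost
        if op == "store":
            sandbox["storage"][key] = value
        else:                               # "add"
            sandbox["storage"][key] = sandbox["storage"].get(key, 0) + value
    # Keep the sandbox only on success; otherwise start again from the old state.
    new_state = sandbox if success else copy.deepcopy(state)
    new_state["balances"][sender] -= gas_used   # the gas consumed is paid in both cases
    return new_state, success, gas_used


state = {"balances": {"alice": 100}, "storage": {}}
program = [("store", "x", 1), ("add", "x", 2), ("store", "y", 7)]   # 5 + 1 + 5 = 11 gas

ok_state, ok, used = execute(state, "alice", program, gas_limit=20)
print(ok, used, ok_state["storage"], ok_state["balances"])     # True 11 {'x': 3, 'y': 7} {'alice': 89}

bad_state, ok2, used2 = execute(state, "alice", program, gas_limit=8)
print(ok2, used2, bad_state["storage"], bad_state["balances"])  # False 6 {} {'alice': 94}
```

The same limit is what stops a program that would otherwise never end: a very long sequence of `add` operations run with `gas_limit=50` halts after 50 steps, leaves the storage untouched and costs the sender 50.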
Tokens
Among the various features offered by Ethereum’s global computer, one of the most important is the ability to create tokens.
The word “token” means “sign” or “symbol,” but in the blockchain context it has taken on an increasingly specific meaning.
Tokens in a blockchain are abstractions that can be owned and represent coins, assets or access rights.
There are many functions that tokens can perform, and we will only do a high-level overview here to give an idea of one of the most successful aspects of Ethereum ever.
Referring as always to the book Mastering Ethereum, we can identify several functions that can be fulfilled by a token:
- Coin = can be used to create new coins on a blockchain;
- Resources = can represent resources produced by an economy;
- Assets = can represent an intrinsic or extrinsic, tangible or intangible asset;
- Access rights = can represent access rights to a digital or physical property;
- Equity = can represent shares in a digital organization;
- Voting = can represent voting rights in a digital system;
- Collectables = can represent objects (physical or digital) for collection;
- Identities = can represent digital identities;
- Attestations = can represent the attestation of a fact issued by a certain authority;
- Utilities = can be used to access a service.
Hundreds of tokens with different functions and purposes have sprung up in recent years, and often a single token performs more than one of the listed functions simultaneously.
There are key aspects to be explored:
- Fungibility and non-fungibility of tokens;
- Intrinsicity or extrinsicity of tokens;
- Legal validity of tokens;
- Utility and equity.
An asset is defined as fungible when its individual units are interchangeable with each other.
Tokens can be either fungible, such as a token that fulfills the role of a currency, in which the various units are interchangeable and equal to each other, or non-fungible, in the case of non-fungible tokens (NFTs).
A great deal of interest has arisen around NFTs in recent years, with famous cases such as CryptoPunks. The fact that an asset can be associated with a code that uniquely identifies it makes NFTs perfect for various applications, from the arts (such as uniquely identifying a work of art) to gaming.
NFTs are becoming increasingly popular and are being linked to other concepts such as the metaverse, to which we devote an entire section of our site. For this reason we refer interested parties to the introductory lecture on NFTs to approach this world.
Another important aspect of a token is its intrinsicity or extrinsicity.
A token is defined as intrinsic when it represents a digital asset internal to the blockchain, and is thus governed by the consensus rules of that blockchain.
Conversely, a token is extrinsic when it represents something external to the blockchain.
This second case leads to the concept of counterparty risk, that is, the risk that one of the parties involved in a contract will not comply with its terms. In the case of intrinsic tokens, this risk is avoided by the consensus rules of the blockchain.
The legal validity of a token should also be considered. There is an increasing tendency to tokenize anything, be it a digital asset or a physical one. One must consider, however, that the laws that apply in a blockchain are not currently recognized by any legal system.
Simply put, one has to deal with existing laws regarding ownership in a given state or legal system, which, at present, do not cover blockchain tokens.
This is a hot topic that is sure to ignite much discussion in the future and could lead to a revision of the concept of ownership.
At the moment, one can understand the enthusiasm for tokens but one must be very cautious because their validity is not necessarily absolute.
A final aspect worthy of attention is that of utility and equity tokens.
Most projects on Ethereum are in fact launched with tokens linked to the project itself. Usually these tokens are of two types:
- Utility tokens = are tokens required to access a service or resource;
- Equity tokens = are tokens that represent shares or ownership, for example of a company.
The former case is not problematic, but the latter often clashes with the laws of many states, which regulate the distribution of equity.
For this reason, equity tokens are often passed off as utility tokens in order to circumvent these regulations.
Again, one must be very careful and well informed about the projects one chooses to invest in.
Let us now come to the more technical aspects of tokens. Once again it must be said that tokens did not originate with Ethereum and that the roots can be found in Bitcoin.
The Bitcoin cryptocurrency itself can be seen as a token, and moreover many token-based platforms have been developed on Bitcoin.
Nonetheless, tokens exploded with Ethereum, in which real standards for making tokens became widespread.
Here we point out only the two best-known standards, although there are many others:
- ERC20
- ERC721
First, let us explain what a standard is. It is an interface that can be implemented by a smart contract and obliges it to implement a number of features.
It is not mandatory to create a token based on an existing standard; it is possible, although more difficult, to create a token from scratch.
Standards offer solutions that are already widely used and tested and make it easier to create a token.
There are also particularly popular implementations of these standards that make it even easier to create a token.
Another key thing to understand is that tokens work differently than Ether, because Ether works at the level of the Ethereum protocol, while tokens work at the level of smart contracts.
Let’s take an example: the Ether balance of an account is managed at the protocol level, the balance of a token on the other hand is managed by a smart contract.
Many considerations follow from this, two particularly important ones:
- When making a transaction to exchange tokens, the recipient’s address will not be that of the actual recipient, but that of the smart contract that manages the token;
- In order to make a transaction in which tokens are exchanged, it is still necessary to own Ether because, currently, gas can only be paid for in Ether and gas is required for any transaction.
Having made these clarifications, let us turn to the two most widely used standards.
ERC20 was introduced in 2015; ERC stands for Ethereum Request for Comments. The proposal was posted on GitHub and assigned issue number 20, hence the name.
ERC20 is the most widely used standard for making fungible tokens, such as new coins.
It is an interface that must be implemented by a contract:
    contract ERC20 {
        // Read-only functions
        function totalSupply() constant returns (uint theTotalSupply);
        function balanceOf(address _owner) constant returns (uint balance);
        function allowance(address _owner, address _spender) constant returns (uint remaining);

        // Functions that move tokens or authorize a third party to move them
        function transfer(address _to, uint _value) returns (bool success);
        function transferFrom(address _from, address _to, uint _value) returns (bool success);
        function approve(address _spender, uint _value) returns (bool success);

        // Events emitted when tokens move or an authorization is granted
        event Transfer(address indexed _from, address indexed _to, uint _value);
        event Approval(address indexed _owner, address indexed _spender, uint _value);
    }
In essence, a contract that implements this interface will be required to have the methods specified in the interface, for example, a “transfer” method that is used to transfer tokens from one address to another.
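To make the meaning of these methods more concrete, here is an in-memory model written in Python. It is only a sketch of the bookkeeping that a token contract performs, not a real implementation: the class name `Token` is our own, events are omitted, and the `sender` argument stands in for the address that signs the transaction (what Solidity exposes as `msg.sender`).

```python
class Token:
    """In-memory model of a fungible token ledger kept by a single contract."""

    def __init__(self, owner: str, initial_supply: int):
        self.total_supply = initial_supply
        self.balances = {owner: initial_supply}
        self.allowances = {}                      # (owner, spender) -> remaining amount

    def balance_of(self, owner: str) -> int:
        return self.balances.get(owner, 0)

    def transfer(self, sender: str, to: str, value: int) -> bool:
        if value < 0 or self.balance_of(sender) < value:
            return False
        self.balances[sender] = self.balance_of(sender) - value
        self.balances[to] = self.balance_of(to) + value
        return True

    def approve(self, sender: str, spender: str, value: int) -> bool:
        self.allowances[(sender, spender)] = value
        return True

    def allowance(self, owner: str, spender: str) -> int:
        return self.allowances.get((owner, spender), 0)

    def transfer_from(self, sender: str, from_: str, to: str, value: int) -> bool:
        """Move tokens on behalf of `from_`, within the amount approved for `sender`."""
        if self.allowance(from_, sender) < value:
            return False
        if not self.transfer(from_, to, value):
            return False
        self.allowances[(from_, sender)] = self.allowance(from_, sender) - value
        return True


token = Token("alice", 1000)
token.transfer("alice", "bob", 250)
token.approve("alice", "carol", 100)
token.transfer_from("carol", "alice", "dave", 60)
print(token.balances, token.allowance("alice", "carol"))
```

All balances live inside the `Token` object, which mirrors the point made earlier: token balances are managed by the contract and not by the accounts involved.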
Despite the widespread use of ERC20, this standard has several problems, for example, the fact that when tokens are transferred, the state of the affected accounts is not really updated, but rather the state of the contract.
Without going into detail, these limitations and critical issues have led to a search for new standards and new proposals continue to be made.
ERC721 is the corresponding standard for non-fungible tokens (NFTs). Its interface is the following:
    interface ERC721 /* is ERC165 */ {
        event Transfer(address indexed _from, address indexed _to, uint256 _deedId);
        event Approval(address indexed _owner, address indexed _approved, uint256 _deedId);
        event ApprovalForAll(address indexed _owner, address indexed _operator, bool _approved);

        // Each deed (token) has its own identifier and exactly one owner
        function balanceOf(address _owner) external view returns (uint256 _balance);
        function ownerOf(uint256 _deedId) external view returns (address _owner);
        function transfer(address _to, uint256 _deedId) external payable;
        function transferFrom(address _from, address _to, uint256 _deedId) external payable;
        function approve(address _approved, uint256 _deedId) external payable;
        function setApprovalForAll(address _operator, bool _approved) external payable;
        function supportsInterface(bytes4 interfaceID) external view returns (bool);
    }
Again, it should be specified that nothing prevents one from not following a standard and making an NFT from scratch.
Also, standards are minimum specifications for an implementation, that is, the implementation will have to have methods specified by the interface but nothing prevents it from later implementing other methods not in the interface.
This is to ensure interoperability between token contracts that will act in a standard and predictable manner.
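The difference from a fungible token can be seen in a minimal Python model of the ownership record kept by such a contract. What is stored is no longer a quantity per account but an owner per deed identifier. The class `DeedRegistry` and its `mint` method are our own additions for the sake of the example; `mint` in particular is not part of the interface above, and approvals are left out.

```python
class DeedRegistry:
    """In-memory model of non-fungible token ownership."""

    def __init__(self):
        self.owners = {}                       # deed id -> owner address

    def mint(self, to: str, deed_id: int) -> None:
        """Hypothetical helper that creates a new, unique deed."""
        if deed_id in self.owners:
            raise ValueError("deed already exists")
        self.owners[deed_id] = to

    def owner_of(self, deed_id: int) -> str:
        return self.owners[deed_id]

    def balance_of(self, owner: str) -> int:
        """Number of deeds held by `owner`."""
        return sum(1 for holder in self.owners.values() if holder == owner)

    def transfer(self, sender: str, to: str, deed_id: int) -> None:
        if self.owners.get(deed_id) != sender:
            raise PermissionError("only the current owner can transfer a deed")
        self.owners[deed_id] = to


registry = DeedRegistry()
registry.mint("alice", 1)
registry.mint("alice", 2)
registry.transfer("alice", "bob", 2)
print(registry.owner_of(1), registry.owner_of(2), registry.balance_of("alice"))   # alice bob 1
```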
Oracles
Smart contracts may need access to specific data in order to function. Imagine, for example, a contract that is based on a bet on the price of gold. It will then be necessary to have a way to access data about the price of gold.
This is done by oracles, systems that provide data from sources outside the blockchain to smart contracts.
The importance of the veracity of this data, which ideally should come from trustless, decentralized oracles, is obvious.
We can think of oracles as a bridge that connects the blockchain with real-world off-chain data.
The topic is very complex and we will explore it in more detail in dedicated lectures; here we just introduce this fundamental issue.
The practical applications for which an oracle is needed are potentially limitless.
Oracles can serve to provide real-world data (asset price trends, data from other blockchains, static data, weather data, data regarding political events, etc.) and can also serve to perform computationally onerous operations, give inputs, and return outputs.
In the latter case, this involves taking off-chain computations that are too complex to be performed by an on-chain smart contract, which would incur a disproportionate cost in gas.
What an oracle should do therefore is:
- Collect data from an off-chain source;
- Transfer the data on-chain with a signed message;
- Make the data available by putting it into a smart contract.
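The three steps can be sketched as a small Python simulation. Everything in it is illustrative: the gold price is a made-up value, `PriceFeed` is an ordinary object standing in for a smart contract, and the message is authenticated with an HMAC over a shared secret only because that is available in the standard library. A real oracle would sign with a private key, so that the contract needs to know only the oracle's public address.

```python
import hashlib
import hmac
import json

ORACLE_KEY = b"demo-oracle-key"        # hypothetical secret, for this sketch only


def collect() -> dict:
    """Step 1: stand-in for a query to an off-chain data source (invented value)."""
    return {"asset": "gold", "price_usd": 1850.25}


def sign(payload: dict, key: bytes) -> dict:
    """Step 2: serialize the data and attach an authentication tag."""
    body = json.dumps(payload, sort_keys=True)
    tag = hmac.new(key, body.encode(), hashlib.sha256).hexdigest()
    return {"body": body, "tag": tag}


class PriceFeed:
    """Step 3: stand-in for the contract that stores the data for other contracts."""

    def __init__(self, key: bytes):
        self.key = key
        self.latest = None

    def submit(self, message: dict) -> bool:
        expected = hmac.new(self.key, message["body"].encode(), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, message["tag"]):
            return False                # reject data that was not signed by the oracle
        self.latest = json.loads(message["body"])
        return True


feed = PriceFeed(ORACLE_KEY)
message = sign(collect(), ORACLE_KEY)
print(feed.submit(message), feed.latest)
```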
Several types of oracles can be identified, based on the type of data they need to provide:
- Immediate-read = provide data needed for an immediate decision. The required information is sought when it is requested and may not be needed again;
- Publish-subscribe = provide information that needs to be updated frequently, such as an RSS feed;
- Request-response = provide so much information that it is impossible to store in a smart contract, thus necessitating an off-chain infrastructure.
These three types are in order of complexity, from easiest to most complex.
Then there are even more complicated cases where, as we said, oracles not only have to provide data but are used to perform computations and return results.
There are various projects that have tried to solve these problems, either in a centralized or decentralized way.
Here we mention the cases of Provable and ChainLink.
ChainLink, a decentralized oracle, is becoming increasingly popular and will most likely become fundamental to the blockchain world; we will certainly explore it separately.
Development process, forks and governance
As with all decentralized blockchains, it is important to understand how changes to protocol rules occur, how the system can evolve and change over time.
Ethereum works, in this respect, quite differently from Bitcoin.
Bitcoin, over the years, has earned its standing as the leading blockchain and cryptocurrency in part because it has demonstrated great stability and security, the result of a predominantly conservative course of action.
Every change is studied very carefully and we rarely see hard forks, that is, drastic changes that split the chain into two branches, each continuing on its own path.
Instead, soft forks, that is, changes to the protocol that are backward compatible, are preferred. This means that nodes that do not adopt these changes can continue to be part of the network and communicate with other nodes.
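The distinction between the two kinds of fork can be illustrated with a toy validity rule. In the Python sketch below, which is only an illustration, a block is valid if its size is below a limit; the limits and the `Block` type are hypothetical and unrelated to the real rules of Bitcoin or Ethereum. Tightening the rule behaves like a soft fork, loosening it like a hard fork.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Block:
    size: int  # arbitrary units, hypothetical


def make_rule(max_size: int) -> Callable[[Block], bool]:
    """Return a validity check that accepts blocks up to max_size."""
    def is_valid(block: Block) -> bool:
        return block.size <= max_size
    return is_valid


old_rule = make_rule(100)        # rule enforced by nodes that did not upgrade
soft_fork_rule = make_rule(80)   # stricter: every new block still passes the old rule
hard_fork_rule = make_rule(150)  # looser: some new blocks fail the old rule

soft_block = Block(size=75)   # produced under the soft-fork rule
hard_block = Block(size=140)  # produced under the hard-fork rule

old_accepts_soft = old_rule(soft_block)  # old nodes stay on the same chain
old_accepts_hard = old_rule(hard_block)  # old nodes reject it: the chain splits
```

In this model, non-upgraded nodes keep following the chain after the soft fork because everything the new rule allows was already allowed, while after the hard fork they see blocks they consider invalid.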
Ethereum, on the other hand, has a much less conservative course of action. Changes are implemented more easily and with less discussion, even when they require clients to be updated to the new consensus rules, resulting in a hard fork for those who do not accept the new rules.
This increased flexibility and tendency to change has advantages and disadvantages.
It is a problem, for example, for those developing smart contracts, because they may be forced to abandon the contracts they have created and have to create new ones in accordance with the new rules.
In short, Ethereum lacks the stability that distinguishes Bitcoin.
Let us now try to understand how decisions are made in Ethereum.
We can identify two main types of governance in blockchain:
- On-chain governance
- Off-chain governance
The difference is immediate. In the first case, decisions are made through a voting process that takes place on-chain (an emblematic case is Algorand’s new experiment in on-chain governance, to which we devoted a lecture). In the second, decisions are made outside the blockchain; often no direct vote takes place at all, only an “informal” discussion, as Vitalik Buterin himself says in a post he dedicated to the topic.
Ethereum, like Bitcoin, chose this second approach.
Obviously, being in a decentralized system, power must be distributed and not centralized in the hands of one group in the Ethereum ecosystem.
Thus we have different groups of actors, involved in the governance process, who have their own, often conflicting interests:
- Ether holders;
- The users of applications on the Ethereum blockchain;
- The developers of applications running on the Ethereum blockchain;
- The full nodes;
- The authors of Ethereum Improvement Proposals (EIPs);
- Miners;
- The developers of the protocol (core developers).
Anyone can propose changes to the Ethereum protocol at any time. The difficult part is gaining a following and convincing the various groups of the usefulness of the proposed change.
A very important tool is the Ethereum Improvement Proposals (EIPs), which are standards that specify new features or processes in Ethereum. They contain technical specifications for proposed changes and function as a “source of truth” for the community.
All Ethereum network upgrades and standards are discussed and developed following the EIP process.
The formal process consists of 5 successive steps:
- Proposal of a Core EIP = a change to the Ethereum protocol is formally proposed, laid out in detail in a Core EIP. This specification represents what the protocol developers will implement if the EIP is accepted;
- Presenting the EIP to protocol developers = one must present one’s EIP to developers, through the various channels used. Developers can consider the changes for the future, deem them important right away, or reject them;
- Iterating the previous steps to a final proposal = after receiving feedback from various stakeholders, changes will probably need to be applied to the EIP, until a final proposal is reached;
- Inclusion of the EIP in a new Network Upgrade = this stage is reached if the EIP is approved and implemented, then scheduled as part of a new upgrade. Usually several EIPs are combined into a single upgrade;
- Network upgrade activation = the upgrade is implemented and the EIP becomes active. Usually upgrades are first tried on the Testnet and then on the Mainnet.
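The five steps can be read as a small state machine in which an EIP may bounce back for revision before moving forward. The Python sketch below is one possible way to model this; the stage names are labels chosen here for the steps listed above, not the official EIP status values, and the `Eip` class is hypothetical.

```python
from enum import Enum
from typing import Dict, List, Set


class Stage(Enum):
    PROPOSED = "Core EIP proposed"
    PRESENTED = "presented to protocol developers"
    FINALIZED = "final proposal reached"
    SCHEDULED = "included in a network upgrade"
    ACTIVE = "upgrade activated"
    REJECTED = "rejected"


# Allowed transitions; PRESENTED -> PROPOSED models the feedback loop of step 3.
ALLOWED: Dict[Stage, Set[Stage]] = {
    Stage.PROPOSED: {Stage.PRESENTED},
    Stage.PRESENTED: {Stage.PROPOSED, Stage.FINALIZED, Stage.REJECTED},
    Stage.FINALIZED: {Stage.SCHEDULED},
    Stage.SCHEDULED: {Stage.ACTIVE},
    Stage.ACTIVE: set(),
    Stage.REJECTED: set(),
}


class Eip:
    def __init__(self, title: str) -> None:
        self.title = title
        self.stage = Stage.PROPOSED
        self.history: List[Stage] = [Stage.PROPOSED]

    def advance(self, target: Stage) -> None:
        if target not in ALLOWED[self.stage]:
            raise ValueError(f"cannot go from {self.stage.name} to {target.name}")
        self.stage = target
        self.history.append(target)


eip = Eip("Example change")
eip.advance(Stage.PRESENTED)
eip.advance(Stage.PROPOSED)   # feedback received, the proposal is revised
eip.advance(Stage.PRESENTED)
eip.advance(Stage.FINALIZED)
eip.advance(Stage.SCHEDULED)
eip.advance(Stage.ACTIVE)
```

Skipping a stage, for example going straight from `PROPOSED` to `ACTIVE`, raises a `ValueError`, which mirrors the fact that an EIP cannot be activated without first going through discussion and scheduling.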
When this process comes to an end, the two cases explained above can occur: that of a soft fork, if the changes are backward compatible, or that of a hard fork, if the changes are not backward compatible.
One of the most famous hard forks in Ethereum’s history was the one that followed the DAO Hack, which we will cover in a dedicated lesson.
Finally, the decision-making power of the miners is important, because they can choose whether to accept protocol changes, installing them on their nodes, or to reject them.
Their power, however, is tempered by the fact that they depend on the other groups participating in governance.
For example, even if most of the miners were to agree to reject a change and initiate a hard fork, there would not necessarily be transactions on their chain. In that case, it would be economic suicide for them.
As in the case of Bitcoin, governance rests precisely on this delicate balance between the parties involved.
In any case, as can easily be seen, Ethereum’s tendency to be less conservative than Bitcoin has led to a large number of forks during its short life.
Ethereum 2.0
We close this overview with a paragraph devoted to Ethereum 2.0.
Ethereum 2.0 (or Eth2) refers to a set of interconnected updates aimed at improving Ethereum by making it more scalable, secure, and sustainable.
Originally this set of updates was called Serenity, and work on it began in 2014.
The biggest change is the move from Proof of Work to Proof of Stake, as mentioned many times before; however, many other changes support this radical one.
Here we will try to take a top-down view, without going into too much detail. We will certainly devote a lecture to the topic later.
Ethereum, with the passage of time, has faced many problems, some of which Vitalik Buterin attributed to Bitcoin and which later occurred on Ethereum as well.
Trying to list some of them we can point out:
- Network congestion resulting in increased transaction fees, which have become unaffordable for most users;
- Considerable growth in the size of the chain, making it difficult to run a full node, with the resulting risk of centralization;
- Very high power consumption, due to the Proof of Work consensus algorithm.
To explain how the upgrades were designed and developed, one has to take into account a fundamental concept when talking about blockchain, namely the famous blockchain trilemma, according to which it would be impossible to obtain a system that is at the same time:
- Decentralized;
- Secure;
- Scalable.
With Ethereum 2.0, the developers have tried to take these three points into account, making changes intended to improve all three at once, plus further improvements in environmental sustainability.
Quoting from Ethereum’s website, “The updates to Eth2 will make Ethereum scalable, secure and decentralized. Sharding will make it more scalable, increasing the number of transactions per second while decreasing the amount of energy required to run a node and validate the chain. The beacon chain will make Ethereum secure by coordinating the validators of different shards. Staking will break down barriers to participation, creating a larger and more decentralized network.”
The basic concepts are those of:
- Beacon chain;
- Sharding;
- Staking.
The three concepts are related to each other and cooperate to achieve the ends specified above.
The Beacon Chain is a new proof-of-stake blockchain and will be the backbone of Ethereum 2.0. It has already been implemented, and the data of this new blockchain can be explored online.
Of course, it has not replaced Ethereum’s main blockchain; at the moment the two chains coexist.
As we were saying, the Beacon Chain is based on a Proof of Stake protocol, in which mining is replaced with staking, which we have already mentioned. Staking should, on the one hand, make the system more environmentally sustainable by drastically reducing energy consumption. On the other hand, it should make the system safer, because those who act maliciously risk losing all the funds they have put at stake, the equivalent of destroying mining equipment in the case of PoW.
In short, more severe penalties for offenders should disincentivize malicious actions.
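The deterrent described above can be sketched as a simple ledger of stakes. The Python example below is a toy model, not the actual penalty logic of the Beacon Chain: the `StakeRegistry` class, the deposit amounts, and the `fraction` parameter are all assumptions made for illustration.

```python
from typing import Dict


class StakeRegistry:
    """Toy ledger of the funds each validator has locked as stake."""

    def __init__(self) -> None:
        self.stakes: Dict[str, float] = {}

    def deposit(self, validator: str, amount: float) -> None:
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.stakes[validator] = self.stakes.get(validator, 0.0) + amount

    def slash(self, validator: str, fraction: float = 1.0) -> float:
        """Destroy a fraction of a validator's stake and return the amount lost.

        The default of 1.0 models the worst case mentioned in the text,
        where a malicious validator loses everything.
        """
        if not 0 < fraction <= 1:
            raise ValueError("fraction must be in (0, 1]")
        penalty = self.stakes[validator] * fraction
        self.stakes[validator] -= penalty
        return penalty


registry = StakeRegistry()
registry.deposit("honest-validator", 32.0)     # illustrative amounts
registry.deposit("malicious-validator", 32.0)

lost = registry.slash("malicious-validator")   # misbehavior detected
```

After the call, the malicious validator's balance is zero while the honest one is untouched, which is the economic asymmetry that is meant to discourage attacks.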
Sharding, on the other hand, consists of dividing Ethereum’s load across multiple chains, 64 to be precise.
Each chain will have a load equal to that of current Ethereum and will have its own validators. Running a node will become simpler and less hardware-intensive, since only one shard needs to be downloaded.
Lightening the burden of nodes should benefit decentralization, allowing anyone to install a node without major problems.
In addition, sharding also benefits scalability by increasing the number of transactions that can be processed per second, currently one of Ethereum’s big bottlenecks.
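The general idea of dividing load across 64 chains can be illustrated with a hash-based partition. The Python sketch below is only a toy: it is not how the Ethereum 2.0 design assigns data or validators to shards, and the `shard_for` function and the transaction identifiers are hypothetical.

```python
import hashlib
from collections import Counter

SHARD_COUNT = 64  # number of shards given in the text


def shard_for(key: str) -> int:
    """Map a key to a shard index in a deterministic way."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % SHARD_COUNT


# Hypothetical workload: 6400 transaction identifiers.
transactions = [f"tx-{i}" for i in range(6400)]

# Count how many transactions each shard would have to process.
load = Counter(shard_for(tx) for tx in transactions)
busiest_shard, busiest_load = load.most_common(1)[0]
```

A node following a single shard only has to process the transactions counted under that shard's index, rather than the whole list, which is the intuition behind both the lighter hardware requirements and the higher total throughput.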
Currently, the target date for the full transition to Ethereum 2.0 is 2023.
The topics deserve much more in-depth discussion, and dedicated lectures will be devoted to all aspects of Ethereum 2.0.
Notes
[8] https://ethereum.org/en/glossary/#ommer