From the blockchain’s data structure to decentralization, from hashrate to mining, from consensus to forks, our readers should by now have a general understanding of the blockchain.
However, there is one topic we always hoped to write about but hesitated over, because we worried that our understanding was not thorough enough and that we might mislead readers.
After a month of study, we finally feel confident enough to share it. The topic is inconspicuous but especially important in the blockchain field. It lays the foundation for decentralized applications (DApps), now and in the future: the transaction model of the blockchain.
Address in the Blockchain
An account identifier is indispensable for transactions, for both the payer and the payee. Just as a PayPal transfer needs an account, the blockchain needs a specific identifier to represent an account, which is known as the wallet address. Miners and users who want to take part first create wallet addresses to represent their accounts. An address is a short string, typically 34 characters long, such as:
1CK6KHY6MHgYvmRQ4PAafKYDrg1ejbH1cE
The string above is derived from the user’s public key by hashing and encoding it. Public and private keys are explained later. Users can use this address to check balances and transfer funds.
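As a rough illustration of the encoding step, the sketch below turns a 20-byte public key hash into an address-like string using Base58Check, the encoding used for classic bitcoin addresses. The function names and the example hash are made up for this illustration, and the step that hashes the public key itself is left out.

```python
import hashlib

BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def double_sha256(data: bytes) -> bytes:
    """SHA256 applied twice, used here for the 4-byte checksum."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def base58check_encode(version: bytes, payload: bytes) -> str:
    """Encode version byte + payload + checksum in Base58."""
    raw = version + payload
    raw += double_sha256(raw)[:4]
    number = int.from_bytes(raw, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = BASE58_ALPHABET[remainder] + encoded
    # Each leading zero byte is written as a leading "1".
    leading_zeros = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


def has_valid_checksum(address: str) -> bool:
    """Decode the string and recompute the checksum."""
    number = 0
    for char in address:
        number = number * 58 + BASE58_ALPHABET.index(char)
    body = number.to_bytes((number.bit_length() + 7) // 8, "big")
    leading_ones = len(address) - len(address.lstrip("1"))
    raw = b"\x00" * leading_ones + body
    return len(raw) > 4 and double_sha256(raw[:-4])[:4] == raw[-4:]


# ASSUMPTION: a made-up 20-byte public key hash, not a real wallet.
example_hash = bytes(range(1, 21))
address = base58check_encode(b"\x00", example_hash)
print(address, has_valid_checksum(address))
```

The built-in checksum is why a mistyped address is almost always rejected by wallet software instead of sending funds to the wrong place.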
Many-to-Many Mapping
When we think of transactions, we usually think of those in everyday life, and the most typical one is one person transferring money to another. To transfer money to someone, you must first know his or her account number, so a transaction record usually consists of a sender account and a receiver account. This one-to-one model is the generally accepted transaction model, shown as follows:
However, such a model is very inefficient in the blockchain.
First, to have each transaction confirmed, you need to pay the miners. A transfer therefore carries a fee for the miner in addition to the payment itself, so the funds usually flow to more than one recipient.
Second, the transaction history of the blockchain must be traceable, so each transaction needs a link to past transactions through which the source of funds can be traced. However, the funds in an account may have come from multiple addresses.
For convenience, we call the source of funds the transaction input (TxIn) and the destination of funds the transaction output (TxOut). The relationship between TxIn and TxOut is then many-to-many, with each TxIn pointing to one TxOut of a previous transaction. It is shown as follows:
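The many-to-many relationship can be sketched with a few small data classes. The class names, field names, addresses and amounts below are hypothetical and only illustrate the idea: each input points at one output of an earlier transaction, and whatever the outputs do not claim is left over as the miner’s fee.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class TxOut:
    address: str  # receiving wallet address
    value: int    # amount, in the smallest currency unit


@dataclass(frozen=True)
class TxIn:
    prev_tx_hash: str  # hash of the earlier transaction being spent
    prev_index: int    # which output of that transaction


@dataclass
class Transaction:
    tx_hash: str
    inputs: List[TxIn]
    outputs: List[TxOut]


def total_input(tx: Transaction, ledger: Dict[str, Transaction]) -> int:
    """Follow every input back to the output it spends and add up the values."""
    return sum(ledger[i.prev_tx_hash].outputs[i.prev_index].value for i in tx.inputs)


def total_output(tx: Transaction) -> int:
    return sum(o.value for o in tx.outputs)


# ASSUMPTION: toy hashes and addresses; two earlier transactions paid "alice".
ledger = {
    "tx-a": Transaction("tx-a", [], [TxOut("alice", 600)]),
    "tx-b": Transaction("tx-b", [], [TxOut("carol", 100), TxOut("alice", 500)]),
}
# Alice combines both earlier outputs, pays Bob, and sends change back to herself.
payment = Transaction(
    "tx-c",
    inputs=[TxIn("tx-a", 0), TxIn("tx-b", 1)],
    outputs=[TxOut("bob", 1000), TxOut("alice", 90)],
)
fee = total_input(payment, ledger) - total_output(payment)
print(fee)  # 10
```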
Based on this structure, let’s look at what information the transaction data in actual blocks contains, taking bitcoin as an example.
Apart from the TxIn and TxOut, it also includes a timestamp (the time of the transaction), the hash value of the transaction (tx hash), and the version number. It is shown as follows:
The tx hash is the hexadecimal output of SHA256, a 64-character string (256 bits), such as:
8152e25169fde1cd2a81ca794b94a144ff07ae7061d4afe72ae9282a4c654081
This value is calculated by the following formula:
Tx hash = SHA256(Timestamp + Version + Payload)
The Payload is the content of the specific transaction, that is, the sequence of TxIn and TxOut. Timestamp, Version and Payload are stored in actual blocks as sequences of bytes, so the “+” in the formula means concatenating these sequences. SHA256 should be familiar by now: it has no practical inverse, and collisions are so rare that any change to the input will, in practice, change the output.
The tx hash should not be underestimated; it is vital. Including the Timestamp in the SHA256 input is a clever move by Nakamoto Satoshi toward solving the double-spend problem.
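The formula can be tried out directly with Python’s hashlib. This is only a sketch of the formula as written in this article: the 4-byte little-endian field widths and the example payload are assumptions, not the exact byte layout of a real transaction.

```python
import hashlib
import struct


def tx_hash(timestamp: int, version: int, payload: bytes) -> str:
    """SHA256 over the concatenated bytes of timestamp, version and payload."""
    # ASSUMPTION: timestamp and version are packed as 4-byte little-endian integers.
    header = struct.pack("<I", timestamp) + struct.pack("<I", version)
    return hashlib.sha256(header + payload).hexdigest()


# ASSUMPTION: a placeholder payload standing in for the serialized TxIn/TxOut list.
payload = b"txin-list|txout-list"
original = tx_hash(1700000000, 1, payload)
shifted = tx_hash(1700000001, 1, payload)

print(len(original))        # 64 hexadecimal characters
print(original == shifted)  # False: one second of difference changes the hash
```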
What’s Double Spend?
Put simply, it means the same banknote is spent twice.
Legal banknotes in real life cannot be spent twice, because a note changes hands when it is spent and counterfeiting one is still difficult.
But bitcoin and other cryptocurrencies are just data, defined by their TxIn, so they can be copied easily: anyone can copy the input data and direct the output to a different wallet address. The original solution to this problem was a central mint, a node through which all coins are issued and checked. However, that would bring us straight back to centralization.
How to Solve the Problem of Double Spend?
To achieve decentralization, Nakamoto Satoshi used a method called the Timestamp Server: only one transaction is allowed to reference a given input and send its value to other addresses. Any further transaction that references the same input is regarded as a “double spend.” When that happens, miners record the first transaction and reject the others. The Timestamp is used to judge the order of the two transactions.
But why does the Timestamp need to be included in the calculation of the hash?
Here we make use of the tamper resistance of SHA256. If someone maliciously modifies the timestamp, the transaction hash changes, which in turn changes the hash of the whole block so that it almost certainly no longer falls within the target range, and other miners will naturally reject such a block.
Therefore, any data fed into SHA256 is protected: it cannot be tampered with unnoticed.
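The “first one wins” rule can be sketched in a few lines. The dictionary layout and the transaction names are hypothetical, and ordering by timestamp follows the simplified description in this article.

```python
def filter_double_spends(transactions):
    """Accept the earliest transaction that spends an output; reject later ones."""
    spent = set()  # (prev_tx_hash, index) pairs that are already used
    accepted, rejected = [], []
    for tx in sorted(transactions, key=lambda t: t["timestamp"]):
        outpoints = set(tx["inputs"])
        if outpoints & spent:
            rejected.append(tx["id"])  # references an input that is already spent
        else:
            spent |= outpoints
            accepted.append(tx["id"])
    return accepted, rejected


# ASSUMPTION: toy data; "pay-bob" and "pay-mallory" both spend output 0 of "tx-a".
pending = [
    {"id": "pay-mallory", "timestamp": 1700000050, "inputs": [("tx-a", 0)]},
    {"id": "pay-bob", "timestamp": 1700000010, "inputs": [("tx-a", 0)]},
    {"id": "pay-carol", "timestamp": 1700000030, "inputs": [("tx-b", 1)]},
]
accepted, rejected = filter_double_spends(pending)
print(accepted)  # ['pay-bob', 'pay-carol']
print(rejected)  # ['pay-mallory']
```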
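To see why tampering gets noticed, here is a toy sketch of a block whose hash must fall below a target. The target value, the string-based block layout and the transaction strings are all assumptions chosen so that the example runs in well under a second; real block headers look different.

```python
import hashlib

TARGET = 1 << 244  # ASSUMPTION: easy toy target, the hash needs 12 leading zero bits


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode()).hexdigest()


def block_hash(prev_hash: str, tx_hashes: list, nonce: int) -> str:
    """Toy block hash over the previous hash, all tx hashes and a nonce."""
    return sha256_hex(prev_hash + "".join(tx_hashes) + str(nonce))


def mine(prev_hash: str, tx_hashes: list) -> int:
    """Try nonces until the block hash falls below the target."""
    nonce = 0
    while int(block_hash(prev_hash, tx_hashes, nonce), 16) >= TARGET:
        nonce += 1
    return nonce


prev_hash = "00" * 32
tx_hashes = [sha256_hex("1700000000|alice->bob|5"),
             sha256_hex("1700000007|carol->dave|2")]
nonce = mine(prev_hash, tx_hashes)
original = block_hash(prev_hash, tx_hashes, nonce)

# Someone rewrites the timestamp of the first transaction after mining.
tampered_hashes = [sha256_hex("1600000000|alice->bob|5"), tx_hashes[1]]
tampered = block_hash(prev_hash, tampered_hashes, nonce)

print(int(original, 16) < TARGET)  # the mined block meets the target
print(tampered == original)        # the tampered block no longer has the same hash
```

With this toy target, the tampered block would still meet the target only about once in 4096 attempts, so the attacker would have to redo the mining work.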
Structure of Transaction Input
Now that the data structure of a transaction has been introduced, let’s look at its inputs and outputs, again taking bitcoin as an example. First, the transaction input:
The Prev Tx Hash is the hash of the previous transaction that the current input refers to, which is what makes the blockchain traceable. That is, anyone can find out which previous transaction the funds in the current transfer came from.
According to the transaction model above, each transaction can have multiple outputs that go to different addresses, so we need an index value to pick out a specific output. That is the role of the Tx Index.
The SigScript is the signature script. Here I’d like to introduce the concept of Script.
Script of Bitcoin Protocol
I believe the “script” is one of Nakamoto Satoshi’s most forward-looking innovations for bitcoin. We can even say it laid the foundation for the later development of decentralized applications. Representatives of blockchain 2.0/3.0, such as Ethereum and EOS, build on this idea.
So what is a script?
A script is a sequence of instructions that operate on the elements of a stack. Let’s look at the instructions Nakamoto Satoshi defined:
This article is too short to explain every instruction one by one. The full list can be viewed on the Bitcoin Wiki: https://en.bitcoin.it/wiki/Script
Here, a few representative ones are introduced.
First, all of these instructions operate on the elements of a “stack”, a first-in, last-out data structure.
Some instructions:
- OP_PUSHDATA1: The next byte contains the number of bytes to be pushed onto the stack.
- OP_DUP: Duplicate the top element of the stack and push the copy onto the stack.
- OP_HASH160: Hash the top element of the stack twice (first with SHA256, then with RIPEMD-160) and push the result onto the stack.
- OP_EQUALVERIFY: Check whether the top two elements of the stack are equal, and fail the script if they are not.
- OP_CHECKSIG: Verify the signature against the public key, both taken from the stack.
With these instructions we can define complex logic that tells miners how the money may be spent and whether a user has the right to spend it.
Now that we understand what a “script” is, let’s look at the signature script. It consists of two elements, the signature and the public key, and its role is to prove that the spender is authorized.
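On the wire, a script is just a sequence of bytes. The sketch below reads such bytes back into instruction names for the five opcodes listed above, plus the short direct pushes (one byte giving a length from 1 to 75, followed by the data). The function name and the example data are hypothetical, and every other opcode is deliberately left unhandled.

```python
OPCODES = {
    0x4C: "OP_PUSHDATA1",
    0x76: "OP_DUP",
    0x88: "OP_EQUALVERIFY",
    0xA9: "OP_HASH160",
    0xAC: "OP_CHECKSIG",
}


def parse_script(script: bytes) -> list:
    """Split raw script bytes into opcode names and hex-encoded data pushes."""
    tokens = []
    i = 0
    while i < len(script):
        opcode = script[i]
        i += 1
        if 0x01 <= opcode <= 0x4B:
            # Direct push: the opcode itself is the number of data bytes.
            tokens.append(script[i:i + opcode].hex())
            i += opcode
        elif opcode == 0x4C:
            # OP_PUSHDATA1: the next byte holds the number of data bytes.
            length = script[i]
            i += 1
            tokens.append(script[i:i + length].hex())
            i += length
        elif opcode in OPCODES:
            tokens.append(OPCODES[opcode])
        else:
            raise ValueError(f"opcode 0x{opcode:02x} is not handled in this sketch")
    return tokens


# ASSUMPTION: a made-up 20-byte hash (0x11 repeated) inside a typical script layout.
raw_script = bytes.fromhex("76a914" + "11" * 20 + "88ac")
print(parse_script(raw_script))

# The same kind of push written with OP_PUSHDATA1 and a 3-byte payload.
print(parse_script(bytes([0x4C, 3]) + b"abc"))  # ['616263']
```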
Public Key and Private Key
One of the biggest features of blockchain applications is security: the coins in your wallet cannot be spent by anyone who does not hold your private key.
Everyone has experience with credit cards. You sign the receipt to confirm the transaction when you pay by card. If someone steals your card and his signature does not match yours, the bank can cancel the transaction. The blockchain adopts a similar principle: each transaction must be signed for confirmation.
Then how is the signature verified?
It involves another important cryptographic algorithm in the blockchain.
The specific steps are as follows:
When you open an account with a wallet for the first time, the wallet automatically generates a private key for you, which is a large random number. It is very important, because it is equivalent to the key to your safe and must never be shared with others.
From the private key, the wallet derives another string, the public key, using the same elliptic-curve mathematics that underlies the Elliptic Curve Digital Signature Algorithm (ECDSA). The mathematics is complicated, so we skip the details here.
The public key, as the name suggests, is available to the public and can be broadcast to the entire network. Hashing and encoding the public key gives the wallet address.
Next, to transfer money to another wallet address, we sign the transaction with the private key, which produces another string: our signature.
The signature is also produced with ECDSA. With both the signature and the public key, miners can verify that the signer holds the private key that unlocks the bitcoin at that address. The advantages of this algorithm include:
- The derivation is one-way: you can get the public key and a signature from the private key, but it is computationally infeasible to recover the private key from either.
- The private key need not be exposed at any point in the unlocking process.
These two features ensure that your private key is not revealed during the transaction. The entire unlocking process is shown as follows:
For these reasons, ECDSA is one of the most widely used digital signature schemes.
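For readers who want to see the steps in code, here is a minimal pure-Python sketch of key derivation, signing and verification on secp256k1, the curve bitcoin uses. It is for illustration only: it hashes the message with a single SHA256, is slow, and is not hardened in any way, so it should never be used to protect real funds. The function names and the message are made up.

```python
import hashlib
import secrets

# secp256k1 parameters: field prime P, group order N, base point G (curve y^2 = x^3 + 7).
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    return (x3, (slope * (x1 - x3) - y1) % P)


def scalar_mult(k, point):
    """Compute k * point with double-and-add."""
    result, addend = None, point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


def message_digest(message: bytes) -> int:
    return int.from_bytes(hashlib.sha256(message).digest(), "big")


def sign(private_key: int, message: bytes):
    """Return an ECDSA signature (r, s); only the private key holder can do this."""
    z = message_digest(message)
    while True:
        k = secrets.randbelow(N - 1) + 1  # fresh random nonce for every signature
        r = scalar_mult(k, G)[0] % N
        s = pow(k, -1, N) * (z + r * private_key) % N
        if r and s:
            return (r, s)


def verify(public_key, message: bytes, signature) -> bool:
    """Check the signature using only the public key."""
    r, s = signature
    if not (1 <= r < N and 1 <= s < N):
        return False
    z = message_digest(message)
    w = pow(s, -1, N)
    point = point_add(scalar_mult(z * w % N, G), scalar_mult(r * w % N, public_key))
    return point is not None and point[0] % N == r


private_key = secrets.randbelow(N - 1) + 1  # the secret random number
public_key = scalar_mult(private_key, G)    # safe to publish
message = b"pay 5 coins to bob"             # hypothetical transaction content
signature = sign(private_key, message)

print(verify(public_key, message, signature))                 # True
print(verify(public_key, b"pay 500 coins to bob", signature)) # False
```

Note that `verify` never sees `private_key`, which is exactly the second advantage listed above.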
Structure of Transaction Output
After the introduction of transaction input, let’s talk about the transaction output:
The transaction output contains the elements To Address, Value and PkScript (also called ScriptPubKey).
To Address is the target address of the transfer, Value is the amount to be transferred, and PkScript is the public key script. The first two are easy to understand; the public key script needs more explanation.
The public key script, as the name suggests, is a script containing the public key hash. In the bitcoin protocol, the signature script of the TxIn and the public key script of the TxOut it spends are combined during transaction verification. The most common form is called P2PKH (Pay-to-Public-Key-Hash). There are other script types as well, such as P2SH (Pay-to-Script-Hash), which are not covered here. Taking P2PKH as an example:
- Public key script: OP_DUP OP_HASH160 <public key hash> OP_EQUALVERIFY OP_CHECKSIG
- Signature script: <signature> <public key>
Let’s analyze how the script is executed step by step:
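The walk-through can be reproduced with a tiny stack machine. Two parts are stand-ins and are assumptions of this sketch: `hash160_stand_in` truncates SHA256 to 20 bytes instead of computing RIPEMD-160 over SHA256 (RIPEMD-160 is not available in every Python build), and `toy_check_sig` replaces real ECDSA verification with a simple hash comparison.

```python
import hashlib


def hash160_stand_in(data: bytes) -> bytes:
    # ASSUMPTION: stand-in for the real HASH160, which is RIPEMD160(SHA256(data)).
    return hashlib.sha256(data).digest()[:20]


def toy_check_sig(signature: bytes, public_key: bytes, tx_digest: bytes) -> bool:
    # ASSUMPTION: placeholder for ECDSA verification, NOT a real signature scheme.
    return signature == hashlib.sha256(public_key + tx_digest).digest()


def run_p2pkh(sig_script: list, pubkey_script: list, tx_digest: bytes) -> bool:
    """Run the signature script, then the public key script, on one shared stack."""
    stack = []
    for item in sig_script + pubkey_script:
        if isinstance(item, bytes):
            stack.append(item)                         # data is simply pushed
        elif item == "OP_DUP":
            stack.append(stack[-1])                    # copy the top element
        elif item == "OP_HASH160":
            stack.append(hash160_stand_in(stack.pop()))
        elif item == "OP_EQUALVERIFY":
            if stack.pop() != stack.pop():             # hashes differ: stop here
                return False
        elif item == "OP_CHECKSIG":
            public_key = stack.pop()
            signature = stack.pop()
            stack.append(toy_check_sig(signature, public_key, tx_digest))
        else:
            raise ValueError(f"instruction {item!r} is not handled in this sketch")
    return bool(stack) and stack[-1] is True


# ASSUMPTION: toy key, digest and signature values, chosen only for illustration.
public_key = b"alice-public-key"
tx_digest = b"digest-of-the-spending-transaction"
signature = hashlib.sha256(public_key + tx_digest).digest()

pubkey_script = ["OP_DUP", "OP_HASH160", hash160_stand_in(public_key),
                 "OP_EQUALVERIFY", "OP_CHECKSIG"]
sig_script = [signature, public_key]

print(run_p2pkh(sig_script, pubkey_script, tx_digest))                     # True
print(run_p2pkh([signature, b"mallory-key"], pubkey_script, tx_digest))    # False
```

The second call fails at OP_EQUALVERIFY, because the hash of the wrong public key does not match the hash stored in the public key script.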
This is the classic and most commonly used script in bitcoin, and it implements transfers between addresses. Different combinations of instructions can realize other, more complicated functions, such as locking funds.
Wait, isn’t that what a smart contract does?
Yes. In fact, Nakamoto Satoshi created “script” with future “smart contract” applications in mind. Unlike a simple transfer, a contract involves signature verification by multiple parties. A basic form of “smart contract” therefore only requires extending the script from a single signature to multiple signatures.
However, the script of the bitcoin protocol is far from perfect. First, the instruction set is rather limited; second, it does not support loops, so it is not Turing-complete; third, its readability is very poor. Ethereum therefore built a Turing-complete, high-level contract language on the same idea, so that users can easily write their own contracts. Either way, the entire framework is still built on this transaction model.
Summary
The transaction model seems very ordinary, yet it contains quite complicated elements. With important components such as scripts, smart contracts, timestamp servers, and public-key and private-key cryptography, it serves as the cornerstone of all blockchain protocols.