How bitcoin works

From Bitcoin Wiki
Revision as of 04:04, 11 August 2011 by Sgornick (talk | contribs) (Remove external link bringing user to a non-existent guide on mining.)

This page explains the basic framework of Bitcoin's functionality.

Cryptography

There are several cryptographic technologies that make up the essence of Bitcoin.

First is public key cryptography. Each coin is associated with its current owner's public ECDSA key. When you send some bitcoins to someone, you create a message (transaction), attaching the new owner's public key to this amount of coins, and sign it with your private key. When this transaction is broadcast to the bitcoin network, this lets everyone know that the new owner of these coins is the owner of the new key. Your signature on the message verifies for everyone that the message is authentic. The complete history of transactions is kept by everyone, so anyone can verify who is the current owner of any particular group of coins.

This complete record of transactions is kept in the block chain, which is a sequence of records called blocks. All computers in the network have a copy of the block chain, which they keep updated by passing along new blocks to each other. Each block contains a group of transactions that have been sent since the previous block. In order to preserve the integrity of the block chain, each block in the chain confirms the integrity of the previous one, all the way back to the first one, the genesis block. Record insertion is costly because each block must meet certain requirements that make it difficult to generate a valid block. This way, no party can overwrite previous records by just forking the chain.
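The chaining described above can be sketched in a few lines. This is only an illustration: the JSON encoding, the field names `prev_hash` and `transactions`, and the helper names are assumptions made for the example, not Bitcoin's actual block format.

```python
import hashlib
import json


def block_hash(block):
    """Hash a block's contents (toy JSON encoding, not Bitcoin's real format)."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def make_block(prev_hash, transactions):
    """Build a toy block that records the hash of the block before it."""
    return {"prev_hash": prev_hash, "transactions": transactions}


def chain_is_consistent(chain):
    """Check that every block names the hash of its predecessor."""
    for prev, block in zip(chain, chain[1:]):
        if block["prev_hash"] != block_hash(prev):
            return False
    return True


# A three-block toy chain starting from a placeholder genesis block.
genesis = make_block("0" * 64, ["genesis"])
block1 = make_block(block_hash(genesis), ["alice pays bob 5"])
block2 = make_block(block_hash(block1), ["bob pays carol 2"])
chain = [genesis, block1, block2]
```

Editing the transactions in `block1` after the fact changes its hash, so `block2` no longer points at it and `chain_is_consistent(chain)` returns False. To hide the change, every later block would have to be regenerated.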

Both the chaining and the difficulty are achieved via the SHA256 cryptographic hash function. The hash function essentially takes a block of data and transforms it, in a way that is effectively impossible to reverse or predict, into a large integer. Making the slightest change to a block of data changes its hash unpredictably, so it is computationally infeasible to create a different block of data that gives exactly the same hash. Therefore, given a short hash, you can confirm that a particular long block of data matches it. This way, Bitcoin blocks don't have to contain serial numbers, as blocks can be identified by their hash, which serves the dual purpose of identification and integrity verification.
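As a small illustration of treating a hash as a large integer, the sketch below uses Python's standard `hashlib`. The helper name `hash_as_int` and the sample inputs are made up for the example, and it applies SHA256 once for brevity (Bitcoin itself hashes block headers twice).

```python
import hashlib


def hash_as_int(data: bytes) -> int:
    """Interpret the 32-byte SHA256 digest of data as a 256-bit integer."""
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest, "big")


# Two inputs that differ only in their final character.
a = hash_as_int(b"block data")
b = hash_as_int(b"block datb")

# Count how many of the 256 output bits differ between the two hashes.
differing_bits = bin(a ^ b).count("1")
```

Hashing the same input again always gives the same integer, while the one-character change produces an unrelated value.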

The difficulty factor is achieved by requiring that this integer is below a certain threshold - the data in the block is perturbed by a nonce value until the block hashes to an integer below the threshold - which takes a lot of processing power. This low hash value for the block serves as an easily-verifiable proof of work - every node on the network can verify almost instantly that the block meets the required criteria, because checking a given nonce requires no search.
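The nonce search can be sketched as a simple loop. The names `mine`, `verify` and `TOY_TARGET` are hypothetical, the data layout is not Bitcoin's real block header, and the threshold is deliberately far easier than anything the network would use so that the example finishes quickly.

```python
import hashlib


def hash_value(data: bytes, nonce: int) -> int:
    """Hash the data together with a 4-byte nonce and return it as an integer."""
    candidate = data + nonce.to_bytes(4, "little")
    return int.from_bytes(hashlib.sha256(candidate).digest(), "big")


def mine(data: bytes, target: int, max_nonce: int = 2**32):
    """Try nonces in order until the hash falls below target (the costly part)."""
    for nonce in range(max_nonce):
        if hash_value(data, nonce) < target:
            return nonce
    return None  # no nonce in range worked


def verify(data: bytes, nonce: int, target: int) -> bool:
    """Checking a claimed nonce takes one hash, however long the search took."""
    return hash_value(data, nonce) < target


# Toy threshold: roughly one hash in 4,096 falls below it.
TOY_TARGET = 1 << 244
nonce = mine(b"toy block", TOY_TARGET)
```

Lowering `TOY_TARGET` makes `mine` take proportionally longer on average, while `verify` stays equally cheap. That asymmetry is what makes the low hash usable as proof of work.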

With this framework, we are able to achieve the essential functions of the Bitcoin system. We have verifiable ownership of bitcoins, and a distributed database of all transactions, which prevents double spending.

Bitcoin mining

We have mentioned in the previous section that adding a block to the block chain is difficult, requiring time and processing power to accomplish. So what incentive does anyone have to spend the effort to produce a block, if it takes up all these resources? The answer is that the person who manages to produce a block gets a reward. This reward is two-fold. First, the block producer gets a bounty of some number of bitcoins, which is agreed upon by the network. (Currently this bounty is 50 bitcoins; this value will halve every 210,000 blocks.) Second, any transaction fees present in the transactions included in the block are claimed by the block producer.
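The two parts of the reward can be written out as a short calculation. The 50-bitcoin bounty and the 210,000-block halving interval come from the text above; the function names and the use of plain floating-point bitcoin amounts are simplifications for illustration.

```python
HALVING_INTERVAL = 210_000  # blocks between halvings
INITIAL_BOUNTY = 50.0       # bitcoins per block at the start


def block_bounty(height: int) -> float:
    """Bounty for the block at the given height, halved once per interval."""
    halvings = height // HALVING_INTERVAL
    return INITIAL_BOUNTY / (2 ** halvings)


def block_reward(height: int, fees) -> float:
    """Total claimed by the block producer: bounty plus all transaction fees."""
    return block_bounty(height) + sum(fees)
```

Under these assumptions, `block_bounty(0)` is 50, `block_bounty(210_000)` is 25, and a block at height 0 whose transactions carry fees of 0.01 and 0.02 bitcoins pays its producer about 50.03.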

This gives rise to the activity known as "bitcoin mining" - using processing power to try to produce a valid block, and as a result 'mine' some bitcoins. The network rules are such that the difficulty is adjusted to keep block production to approximately 1 block per 10 minutes. Thus, the more miners engage in the mining activity, the more difficult it becomes for each individual miner to produce a block. The higher the total difficulty, the harder it is for an attacker to overwrite the tip of the block chain with his own blocks, which would enable him to double-spend his coins (see the weaknesses page for more details).
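A rough sketch of how such an adjustment can work is shown below. The 10-minute spacing comes from the text above. The 2016-block adjustment window and the factor-of-four limit per adjustment are details of the Bitcoin protocol not stated on this page, and the function name and floating-point difficulty are simplifications.

```python
TARGET_SPACING_SECONDS = 10 * 60  # one block per 10 minutes
ADJUSTMENT_WINDOW = 2016          # blocks per adjustment (not stated in the text)
MAX_STEP = 4.0                    # largest change allowed in one adjustment


def adjust_difficulty(old_difficulty: float, actual_seconds: float) -> float:
    """Scale difficulty by how fast the last window of blocks was produced."""
    expected_seconds = TARGET_SPACING_SECONDS * ADJUSTMENT_WINDOW
    ratio = expected_seconds / actual_seconds
    # Clamp the change so a single adjustment cannot swing too far.
    ratio = max(1.0 / MAX_STEP, min(MAX_STEP, ratio))
    return old_difficulty * ratio
```

If the last window took half the expected time (for example because the number of miners doubled), the difficulty doubles. If it took twice as long, the difficulty halves.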

Besides being important for maintaining the transaction database, mining is also the mechanism by which bitcoins get created and distributed among the people in the bitcoin economy. The network rules are such that over the next hundred years, give or take a few decades, a total of 21 million bitcoins will be created. Rather than dropping money out of a helicopter, the bitcoins are awarded to those who contribute to the network by creating blocks in the block chain.
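The 21 million figure follows from the halving schedule: each period of 210,000 blocks issues half as many coins as the one before, so the total approaches 210,000 x 50 x 2. The sketch below checks this with exact fractions; the function name is hypothetical, and it ignores the rounding of very small amounts that a real implementation has to deal with.

```python
from fractions import Fraction

BLOCKS_PER_PERIOD = 210_000
INITIAL_BOUNTY = 50


def supply_after_periods(periods: int) -> Fraction:
    """Total bitcoins created after the given number of halving periods."""
    total = Fraction(0)
    for k in range(periods):
        # Period k pays INITIAL_BOUNTY / 2**k per block.
        total += Fraction(INITIAL_BOUNTY, 2 ** k) * BLOCKS_PER_PERIOD
    return total
```

The first period creates 10.5 million coins and the first two create 15.75 million. At 10 minutes per block a period lasts about four years, so the running total gets very close to 21 million within the time span mentioned above without ever exceeding it.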

Double spending

The block chain which is constantly being generated by the Bitcoin network works as a common agreement about the order in which bitcoin transactions have taken place. Placing the transactions in a sequence is necessary so that things like the double-spending of coins and negative balances can be avoided. Simply announcing each transaction is not enough to establish the order of the transactions, because different computers may receive the announcement at different times.
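To see why the order matters, consider two transactions that try to spend the same coin. The sketch below uses an invented data model (the `id`, `coin` and `to` fields and the function name are assumptions for the example): whichever transaction comes first in the agreed sequence is accepted and the other is rejected.

```python
def accept_in_order(transactions):
    """Walk the transactions in sequence, rejecting any re-spend of a coin."""
    spent = set()
    accepted = []
    for tx in transactions:
        if tx["coin"] in spent:
            continue  # this coin was already spent earlier in the sequence
        spent.add(tx["coin"])
        accepted.append(tx["id"])
    return accepted


# Two conflicting transactions spending the same coin.
pay_bob = {"id": "tx1", "coin": "coin-a", "to": "bob"}
pay_carol = {"id": "tx2", "coin": "coin-a", "to": "carol"}
```

A computer that hears `pay_bob` first accepts only `tx1`, while one that hears `pay_carol` first accepts only `tx2`. Without a shared ordering the two would disagree about who owns the coin.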

Unlike conventional banking systems, there is no central place where this "log" of transactions is stored. What happens is the broadcasting of small pieces (the blocks), each stating that it is a continuation of a previous block. Thus, it is possible for many proposed continuations to the block chain to exist at the same time. There can also be multiple continuations to each of these continuations, forming many branches and sub-branches, like a tree. When this happens, each computer in the network must decide for itself which branch is the "correct" one that should be accepted and extended further.

The rule in this case is to accept the "longest" valid branch: from the branches of blocks that you have received, choose the path whose total "difficulty" is the highest. This is the sequence of blocks that is assumed to have required the most work (CPU time) to generate. For Bitcoin, this will be the "true" order of events, and this is what it will take into account when calculating the balance to show to the user. "Valid" means that Bitcoin will reject branches which don't contain a strong enough proof of work or which contain invalid transactions (e.g. calculating a wrong balance, giving more money than you have, generating more bitcoins than is allowed at the time, etc).
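The selection rule can be sketched as follows. The block representation (a dictionary with `difficulty` and `valid` fields) and the function names are invented for the example. In a real node, validity is established by checking the proof of work and every transaction, not by reading a flag.

```python
def branch_work(branch):
    """Total difficulty of all blocks in a branch."""
    return sum(block["difficulty"] for block in branch)


def branch_is_valid(branch):
    """Stand-in validity check: every block must be marked valid."""
    return all(block["valid"] for block in branch)


def best_branch(branches):
    """Pick the valid branch with the highest total difficulty, if any."""
    candidates = [b for b in branches if branch_is_valid(b)]
    if not candidates:
        return None
    return max(candidates, key=branch_work)


# Branch A: three easy blocks. Branch B: two harder blocks.
# Branch C: the most blocks, but one of them is invalid.
branch_a = [{"difficulty": 1, "valid": True}] * 3
branch_b = [{"difficulty": 2, "valid": True}] * 2
branch_c = [{"difficulty": 2, "valid": True}] * 4 + [{"difficulty": 2, "valid": False}]
```

Here `best_branch([branch_a, branch_b, branch_c])` selects `branch_b`. It has fewer blocks than `branch_a` but more total difficulty, and `branch_c` is rejected outright, which is why "longest" is in quotation marks.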

It is still possible that, as new blocks are constantly being generated, some other branch will become the longest branch at a later time. However, it takes significant effort to extend a branch, and nodes work to extend the branch that they have received and accepted (which is normally the longest one). So, the longer this branch becomes compared to the second-longest branch, the more effort it will take for the second-longest branch to catch up and overcome the first in length. Also, the more nodes in the network hear about the longest branch, the less likely it becomes for other branches to be extended the next time a block is generated, since the nodes will accept the longest chain.

Therefore, the more time a transaction has been part of the longest block chain, the more likely it is to remain part of the chain indefinitely. This is what makes transactions non-reversible and this is what prevents people from double-spending their coins. After money has supposedly been transferred, the receiver of a transaction checks how long the block chain following that transaction has become, because the more blocks are added to the longest branch after the transaction, the less likely it is that some other branch will overcome it.
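One way to put rough numbers on this is the simplified gambler's-ruin model used in the original Bitcoin paper. It is an idealisation and does not model network behaviour exactly: if an attacker controls a share q of the total processing power and the honest nodes the remaining p = 1 - q, the chance that the attacker ever catches up from z blocks behind is (q/p)^z when q < p. The function name below is hypothetical.

```python
def catch_up_probability(attacker_share: float, blocks_behind: int) -> float:
    """Chance that an attacker ever catches up from blocks_behind blocks back."""
    q = attacker_share
    p = 1.0 - q
    if q >= p:
        return 1.0  # with half the power or more, catching up is certain
    return (q / p) ** blocks_behind


# How the chance shrinks as blocks pile up, for an attacker with 10% of the power.
chances = [catch_up_probability(0.10, z) for z in range(7)]
```

Under this model the probability falls by a constant factor with every additional block, so an attacker with a tenth of the processing power has less than a one-in-100,000 chance of overtaking a branch that is six blocks ahead.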

When the block chain after the transaction has become long enough, it becomes near-impossible for another branch to overcome it, and so people can start accepting the transaction as true. This is why 'blocks' also serve as 'confirmations' for a transaction. Even if another branch does overcome the one with the transaction, most of its blocks will have been generated by people who have no affiliation with the sender of the coins, as a large number of people are working to generate blocks. Since transactions are broadcast to all nodes in the network, these blocks are just as likely to contain the transaction as the blocks in the previously-accepted branch.

Bitcoin relies on the fact that no single entity can control most of the CPU power on the network for any significant length of time. If an entity could, it would be able to extend any branch of the tree it chose faster than any other branch can be extended, making it the longest branch, and would then permanently control which transactions appear in it.
