Bitcoin Wiki - User contributions [en]

Bitcoin Core 0.11 (ch 5): Initial Block Download

2016-01-21T20:32:44Z

Mrbandrews: edit footer

This page explains how Bitcoin Core downloads the blockchain when your node first joins the network.

==Background==

Once a new node joins the network, its first order of business is to download and validate the entire blockchain. This is an integral step to the distributed nature of bitcoin because only by doing this can a node claim that it has independently validated all transactions.

As the blockchain grows in size, the time required for IBD increases unless optimizations are made to the code. Various optimizations have been made since Satoshi's original client was released, but as of 2014, with increasing transaction volume, initial download on laptop hardware with an average connection could still take up to 24 hours. Developers agreed that this was unacceptable and a new approach was developed called "headers first" mode. This approach resulted in a substantial speedup.

=="Headers First" mode==

With "headers-first" mode, a new node downloads all of the block headers first; these are very small (80 bytes each, whereas a block can be up to 1MB). Once the node has all of the headers, from the genesis block up to the current tip of the blockchain (block height 380,000 as of October 2015), only then does it begin downloading the full blocks.

Now that it has the headers, the node downloads blocks in parallel from multiple peers. (It downloads headers from only one peer, but that's no big deal since headers are small.) The node will download from up to 8 peers at once and will disconnect any peer that stalls for more than 2 seconds, attempting to connect to a faster peer.

Headers-first IBD was merged in 2014 in: [https://github.com/bitcoin/bitcoin/pull/4468 Pull Request 4468].

As summarized in the PR comment, some of its main features are [comment is edited slightly here]:

* Do not use 'getblocks', but 'getheaders', and use it to build a headers tree.
* Blocks are fetched in parallel from all available outbound peers, using a limited moving window. When one peer stalls the movement of the window, it is disconnected.
* No more orphan blocks. At all. We only ever request a block for which we have verified the headers, and store it to disk immediately. This means that a disk-fill attack would require Proof of Work.
* We sync from everyone we can, though limited to 1 during initial headers sync.

==Checkpoints==

The Bitcoin Core initial block download code makes sure that the block headers you are downloading (from a single peer) pass certain hard-coded "checkpoints."

Checkpoints are block hashes, each corresponding to a long-ago block that everyone (where "everyone" means the network participants as recognized by the Bitcoin Core commit-access developers) recognizes as being on the longest chain. The checkpoints are set far enough in the past so as to be non-controversial.

The IBD checkpoints check is at main.cpp:2784 (version 0.11).

The hardcoded figures are in chainparams.cpp. (As of 0.11, there were 13 checkpoints, the first at block 11,111, the last at block 295,000.)

From chainparams.cpp:

:: /**
:: * What makes a good checkpoint block?
:: * Is surrounded by blocks with reasonable timestamps (no blocks before with a timestamp after, none after with timestamp before)
:: * Contains no strange transactions
:: */
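The idea of a checkpoint test can be sketched in a few lines. This is a minimal illustration, not the code from Bitcoin Core: the names CheckpointMap, PassesCheckpoint and ExampleCheckpoints are hypothetical, and the hash strings are placeholders rather than real block hashes.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical checkpoint table: block height -> expected block hash.
// The hash strings used below are placeholders, not real Bitcoin block hashes.
typedef std::map<int, std::string> CheckpointMap;

// Returns false only when `height` is a checkpoint height and `hash` differs
// from the hard-coded value. Heights without a checkpoint always pass.
bool PassesCheckpoint(const CheckpointMap& checkpoints, int height, const std::string& hash)
{
    CheckpointMap::const_iterator it = checkpoints.find(height);
    if (it == checkpoints.end())
        return true;
    return it->second == hash;
}

// Two of the checkpoint heights mentioned above, with placeholder hashes.
CheckpointMap ExampleCheckpoints()
{
    CheckpointMap m;
    m[11111] = "placeholder-hash-a";
    m[295000] = "placeholder-hash-b";
    return m;
}
```

A header at a checkpoint height with the wrong hash is rejected, while headers at all other heights are unaffected by this test.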

'''Purpose of Checkpoints'''

The purpose of checkpoints is primarily DoS protection. As Greg Maxwell explained on [https://bitcointalk.org/index.php?topic=194078.0 bitcointalk in 2013]:

: User: They [checkpoints] are there so some quantum computing farm (doesn't exist, but...) can't come out of nowhere and roll back blocks by mining a long chain from far in the past.

: gmaxwell: That's not really what they're for— though they have that effect too. Most of their usefulness is that they prevent a dos attacker from filling up bitcoin node's disk space with long runs of low difficulty blocks forked off low in the chain. e.g. you start off with difficulty 1 blocks at block 0, now mine-able by the millions by a single asic— _MAYBE_ a chain that starts off that way could eventually turn out to be the longest so absent the checkpoints a node would happily follow an endlessly long chain of them. They also make is so that an attacker who has complete control of your network (and thus can prevent you from hearing the longest chain from the honest bitcoin network) from putting you on a fake (low difficulty) isolated chain unless they can also trick you into running replaced software. With the checkpoints such an attacker hast to have a ton of mining power in order to continue the chain.

'''Origin of Checkpoints'''

According to Gavin Andresen, checkpoints were originally introduced in response to the "overflow" bug which would permit anyone to spend anyone's bitcoins. [https://bitcointalk.org/index.php?topic=1647.40 see here, page 3]

'''More info on Checkpoints:'''

: First checkpoint introduced by Satoshi in July 2010 (v0.3.2): [https://bitcointalk.org/index.php?topic=437 here]
: Bitcoin talk thread on checkpoints (Nov 2010): [https://bitcointalk.org/index.php?topic=1647 here]
: Bitcoin talk thread (2013): [https://bitcointalk.org/index.php?topic=194078.0 here]

==Bitcoin Core code implementing IBD==

Most of the code is in main.h/cpp.

'''Block Status'''

The concept of block status is important (see chapter 2 and/or chain.cpp). Your node assigns a newly received block a status and updates that status as it learns more about the block. (The block status could also change in the future, for example if there is a re-organization and the block is no longer on the longest blockchain.)

'''Downloading Block Headers'''

The code starts by downloading headers from a single peer.

The download begins in SendMessages (main.cpp), which is called periodically by the message-handling thread.

If the code sees that we are not caught up and we haven't started syncing from anyone:

* We send: "getheaders"
* Peer replies: "headers" (a chunk of 2000 headers)
* We send: "getheaders"
* Peer replies: "headers" (a chunk of 2000 headers)
* <continue...>
* Peer replies: "headers" (a chunk of LESS THAN 2000 headers)

When the peer sends fewer than 2000 headers, we conclude that we have reached the tip of the peer's blockchain.
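The request/reply loop above can be modelled with a small simulation. This is only a sketch under stated assumptions: PeerHeadersReply is a hypothetical stand-in for a remote peer, heights are used in place of real headers, and no networking is involved.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Limit described above: at most 2000 headers per "headers" reply.
static const std::size_t kMaxHeadersPerMessage = 2000;

// Hypothetical stand-in for a peer that knows `tipHeight` headers. It answers
// a request that starts after `fromHeight` with at most 2000 headers and
// returns how many it sent.
std::size_t PeerHeadersReply(std::size_t tipHeight, std::size_t fromHeight)
{
    std::size_t remaining = tipHeight > fromHeight ? tipHeight - fromHeight : 0;
    return std::min(remaining, kMaxHeadersPerMessage);
}

struct SyncResult {
    std::size_t height;     // headers we hold once the loop ends
    std::size_t roundTrips; // number of getheaders/headers exchanges
};

// Keep asking for headers until a reply carries fewer than 2000 of them.
SyncResult SyncHeaders(std::size_t peerTipHeight)
{
    SyncResult r = {0, 0};
    for (;;) {
        std::size_t received = PeerHeadersReply(peerTipHeight, r.height);
        r.height += received;
        ++r.roundTrips;
        if (received < kMaxHeadersPerMessage)
            break; // a short reply means we reached the peer's tip
    }
    return r;
}
```

Note that when the peer's chain length is an exact multiple of 2000, one extra exchange is needed: the final reply is empty, and that is what signals the tip.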

'''Block Locators'''

A Block Locator is used to find a fork point between two nodes, which is where the nodes should start exchanging block headers.

Here is the definition (in primitives/block.h):

 /* Describes a place in the block chain to another node such that if the
  * other node doesn't have the same branch, it can find a recent common trunk.
  * The further back it is, the further before the fork it may be.
  */
 struct CBlockLocator
 {
     std::vector<uint256> vHave;
     ...
     (a few basic constructor & serialization methods)
     ...
 };

So, a Locator is basically a vector of block hashes. The vector is populated with about 32 hashes (the exact count grows logarithmically with the chain height), which are chosen to maximize the likelihood of quickly finding a common block with a peer.

In normal operation, we would generally expect to be within a few blocks of our peers. Thus, the vector starts with the 10 most recent block hashes before "jumping back" exponentially. Hence the comment "the further back it is, the further before the fork it may be": if you and your peer are within 10 blocks of one another, the fork point will be included in the list of hashes. If you and your peer are 15 blocks away from one another, you'll surely find a common block that's within a few blocks of the fork point, and only one further iteration of the Locator will be required to find the fork point. However, if you and your peer diverge by 10,000 blocks, the Locator may only find a common trunk that's a few hundred (or more) blocks before the fork point, so another two or three Locator objects will need to be exchanged to zero in on the fork point.

See CChain::GetLocator and CBlockIndex::GetAncestor [both in chain.cpp].
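To make the "ten recent hashes, then exponential jumps" pattern concrete, here is a sketch that lists the block heights a locator would reference. It is a simplified model, assuming a single chain addressed by height; the function name LocatorHeights is hypothetical, and the real code stores block hashes rather than heights.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Returns the heights a locator built at `tipHeight` would reference.
// The step back stays at one block until more than ten entries have been
// collected, then doubles after every entry. The genesis block (height 0)
// is always the last entry.
std::vector<int> LocatorHeights(int tipHeight)
{
    std::vector<int> heights;
    int step = 1;
    int height = tipHeight;
    for (;;) {
        heights.push_back(height);
        if (height == 0)
            break; // stop once the genesis block has been added
        height = std::max(height - step, 0);
        if (heights.size() > 10)
            step *= 2; // exponentially larger steps back
    }
    return heights;
}
```

For a tip at height 100 this yields 18 heights: 100 down to 89 one block at a time, then 87, 83, 75, 59, 27 and finally 0.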

'''Block Skiplist Pointer'''

Every block index contains the following attribute:

 // pointer to the index of some further predecessor of this block
 CBlockIndex* pskip;

The algorithm for choosing a given block's "skip pointer" is in chain.cpp:

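 // Turn the lowest '1' bit in the binary representation of a number into a '0'.
 int static inline InvertLowestOne(int n) { return n & (n - 1); }
 // Compute what height to jump back to with the CBlockIndex::pskip pointer.
 int static inline GetSkipHeight(int height) {
     if (height < 2) return 0;
     // Determine which height to jump back to. Any number strictly lower than height is acceptable,
     // but the following expression seems to perform well in simulations (max 110 steps to go back
     // up to 2**18 blocks).
     return (height & 1) ? InvertLowestOne(InvertLowestOne(height - 1)) + 1 : InvertLowestOne(height);
 }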

The skip pointer is used to move through the blockchain in O(log n) rather than walking the chain in O(n).

The idea behind a "skip list" is described here: [https://en.wikipedia.org/wiki/Skip_list Wikipedia page for skiplist]
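The following sketch shows how such skip heights shorten a walk back through the chain. The skip-height expression is the one quoted above; CountSteps is a hypothetical helper that uses a simplified greedy rule (take the skip pointer whenever it does not overshoot the target), which is not the exact rule used by CBlockIndex::GetAncestor.

```cpp
#include <cassert>

// Turn the lowest '1' bit in the binary representation of n into a '0'.
inline int InvertLowestOne(int n) { return n & (n - 1); }

// Height that the skip pointer of a block at `height` refers to.
inline int GetSkipHeight(int height)
{
    if (height < 2)
        return 0;
    return (height & 1) ? InvertLowestOne(InvertLowestOne(height - 1)) + 1
                        : InvertLowestOne(height);
}

// Counts the hops needed to walk from height `from` down to height `target`
// (0 <= target <= from). Each hop follows the skip pointer when it lands at
// or above the target, and otherwise moves to the direct parent.
int CountSteps(int from, int target)
{
    int steps = 0;
    int h = from;
    while (h > target) {
        int skip = GetSkipHeight(h);
        h = (skip >= target) ? skip : h - 1;
        ++steps;
    }
    return steps;
}
```

For example, the block at height 16 skips straight to the genesis block, so walking from 16 to 0 takes a single hop instead of 16.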

'''Processing the Block Headers'''

Code path:
: ProcessMessage (entry point)
:: AcceptBlockHeader
::: CheckBlockHeader (non-contextual validation checks)
::: ContextualCheckBlockHeader (blockchain-aware validation checks)

''Non-Contextual Checks''

These checks are "non-contextual" because they do not require any knowledge of the state of our blockchain.

1) Proof-of-Work Meets Claimed Requirement: Here, the code checks that the block has the POW that the miner claims was necessary when constructing the block. Later, we'll re-check it against our blockchain. An honest miner should always pass both checks. Only a miner who lies about the required POW (block.nBits) would pass this check but fail the contextual check.

2) Timestamp not-too-late: The code checks that the timestamp is no more than 2 hours in the future. This is context-independent because the code just compares the block's timestamp against the node's own (network-adjusted) clock.

''Contextual Checks''

These checks require some knowledge of our node's blockchain.

1) Re-check Proof of Work: Here, we re-check the POW based on our own knowledge of the difficulty target (rather than trusting the difficulty the miner placed in the block's nBits).

2) Checkpoints (if enabled): If this block's height is a checkpoint height, the code checks that the block hash matches the hash stored in the checkpoint map (see checkpoints.cpp).

3) Timestamp not-too-early: Here, we check that the block's timestamp is later than the median timestamp of the previous 11 blocks. This guarantees that the median time continues to advance from block to block. This check is context-dependent because it requires knowledge of prior blocks in the active chain, which is obtained by retrieving the block from mapBlockIndex. For more info, see [BIP 113].

4) Version number
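The "timestamp not-too-early" rule from check 3 can be sketched as follows. This is an illustration only: MedianTimePast and TimestampNotTooEarly are hypothetical names, and the previous blocks are represented by a plain vector of timestamps (oldest first) instead of block index entries.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Median of the timestamps of (up to) the last 11 blocks.
// `timestamps` is ordered oldest to newest; an empty history yields 0.
int64_t MedianTimePast(const std::vector<int64_t>& timestamps)
{
    if (timestamps.empty())
        return 0;
    const std::size_t kSpan = 11;
    const std::size_t n = std::min(timestamps.size(), kSpan);
    std::vector<int64_t> recent(timestamps.end() - static_cast<std::ptrdiff_t>(n),
                                timestamps.end());
    std::sort(recent.begin(), recent.end());
    return recent[recent.size() / 2];
}

// A new block passes only if its timestamp is strictly later than the
// median time of the blocks before it.
bool TimestampNotTooEarly(int64_t blockTime, const std::vector<int64_t>& previousTimestamps)
{
    return blockTime > MedianTimePast(previousTimestamps);
}
```

Because the median is taken over sorted values, individual block timestamps may be out of order while the median itself still moves forward.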

''Add the Block''

Finally, add the block to mapBlockIndex and update pIndexBestHeader (see AddToBlockIndex()).

At this point, the block indexes are VALID_TREE, since we know they are on the main chain, but we haven't yet received the transactions.

'''Downloading Blocks'''

Your node will start downloading blocks right away from the peer that it is downloading headers from.

This is so because:
* The headers-peer is surely a preferred-download peer, set when you received its "version" message (main.cpp:4056).
* Thus, your node will call FindNextBlocksToDownload for your headers-peer (main.cpp:5143).
* Your headers-peer will have a valid pIndexBestKnownBlock (main.cpp:421), since that was set in UpdateBlockAvailability() when you received the peer's headers (main.cpp:4532).
* FindNextBlocksToDownload will return a vector of block hashes, which you'll then request (see main.cpp:5145 [add block to the request vector] and main.cpp:5178 [send getdata msg to your headers-peer]).

However, you will not start downloading blocks from the other peers until IBD is complete.

This is so because:
* Your other peers are probably also preferred-download peers, so you'll call FindNextBlocksToDownload just like with your headers-peer.
* However, that function call will return immediately because it won't be able to find any available blocks.
* When IBD concludes (see discussion below), you'll start exchanging headers with the other peers.
* Once you've exchanged headers with your other peers, you'll start requesting blocks.

Blocks are downloaded in chunks of 16 blocks at a time from each peer. With a block size of 1MB, if blocks are full we are talking about 16MB chunks. (See main.h - "MAX_BLOCKS_IN_TRANSIT_PER_PEER".)

''Moving Window''

The download code uses a "moving window" of 1024 blocks. The idea is that although you are downloading different chunks from multiple peers, at any given time the blocks you are downloading are fairly close together. The main purpose of this is so that blocks that are near one another on the blockchain are most likely contained in the same .dat file (where the raw block data is stored on disk). One advantage of having a correlation between blockchain location and block file location is that if the node chooses to "prune" block data at a later date, it's easier to delete older block files.

''Disconnecting a peer''

The code disconnects slow peers in order to keep the download moving. A peer can be disconnected under two scenarios.

First, any block requested from the peer must be delivered within a certain timeframe.

Usually, this timeframe is 20 minutes.

The following code is in main.cpp:
 GetBlockTimeout(...) { return nTime + 500000 * consensusParams.nPowTargetSpacing * (4 + nValidatedQueuedBefore); }

nValidatedQueuedBefore is the number of blocks that were already in flight to us, and whose headers we had validated, when this block was requested. Let's call that N.

If N = 0, the timeout added to nTime evaluates as:
: = 500,000 microseconds (1/2 second) * 600 (the target block interval in seconds) * 4
: = 0.5 * 600 seconds * 4
: = 1,200 seconds
: = 20 minutes

The formula can be simplified as:
: = 0.5 * (4 + N) * block_interval
: = (2 + 0.5 * N) * block_interval

Thus:
: N=0, timeout = 2 * block_interval = 20 minutes.
: N=10, timeout = 7 * block_interval = 70 minutes.
: N=30, timeout = 17 * block_interval = 170 minutes.
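The arithmetic above can be checked with a short sketch. It mirrors the formula quoted from main.cpp, but the function names BlockTimeoutMicros and TimeoutMinutes are hypothetical, and the 600-second block interval is hard-coded here as an assumption (the real code reads it from the consensus parameters).

```cpp
#include <cassert>
#include <cstdint>

// Target block interval in seconds (assumed to be 600, as on the main network).
static const int64_t kPowTargetSpacing = 600;

// Deadline in microseconds for a block requested at `nowMicros`, given the
// number of validated blocks that were already in flight at that time.
int64_t BlockTimeoutMicros(int64_t nowMicros, int validatedQueuedBefore)
{
    return nowMicros + 500000 * kPowTargetSpacing * (4 + validatedQueuedBefore);
}

// Convenience helper: length of the timeout in whole minutes.
int64_t TimeoutMinutes(int validatedQueuedBefore)
{
    return BlockTimeoutMicros(0, validatedQueuedBefore) / (60 * 1000000LL);
}
```

TimeoutMinutes(0), TimeoutMinutes(10) and TimeoutMinutes(30) give 20, 70 and 170 minutes, matching the figures listed above.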

The second circumstance in which we will disconnect a peer is when the peer manages to stall your entire download by preventing the moving window from progressing. Imagine your moving window is blocks 1000 to 2023 and you've downloaded everything from 1016 to 2023, but are waiting for Alice to serve up 1000 to 1015, which you may have requested a few minutes earlier. In this situation, if Alice continues to stall the moving window for 2 seconds, you will drop Alice and replace her with a more reliable peer. In the meantime, your node will request blocks 1000 to 1015 from one of your other peers so your moving window can start moving again.

'''Requesting Blocks'''

Requesting blocks is handled by sending out a "getdata" message (see main.cpp:5137 in v0.11). A "getdata" message packs a vector of inventory ("inv") entries and sends them as a group. Thus, we can request 16 blocks from the peer with one network message, as opposed to sending 16 separate requests.

What blocks do we request? The code to figure that out is in FindNextBlocksToDownload; it uses a '''block locator''' to figure out the last block we have in common with this peer, and starts from there. A block locator is a homemade algorithm that efficiently finds the fork point between our node and a given peer. The locator will pack a list of 32 blocks, starting with the last 10 (since in steady-state we are usually within 10 blocks of our peers), and then jumping back exponentially. (chain.cpp:25).

This algorithm has proven to be effective at locating the fork point.

Once we've prepared the "getdata" message, we mark the blocks that we've requested as being "in flight" - with MarkBlocksAsInFlight (main.cpp).

'''Receiving Blocks'''

The entry point for receiving blocks is in ProcessMessage, when our node receives a "block" message from a peer.

The block is processed with ProcessNewBlock (main.cpp):

: 1) Checks on the block
:: a) Basic, context-independent checks on the block (merkle root, block size, etc)
:: b) Basic, context-independent checks on transactions within the block (transaction size, etc.)
:: c) Check that we requested this block (if unrequested, this may be a DOS attack - though not necessarily)
:: d) Context-dependent checks on the block

: 2) Store the block to disk

: 3) General check on the integrity of our blockchain (this may sound expensive, but is very fast)

: 4) Connect the block to the blockchain - if appropriate (see ActivateBestChain)

Note that the block's transactions are not validated (checked for a double-spend, etc.) until the last step, when the block is connected as the new tip of the blockchain. For more info, see below "Connecting a Block".

==Monitoring the Download==

You can monitor the download with the RPC call getpeerinfo, which shows (for each peer that you're connected to):

* "synced_headers": The last header we have in common with this peer
* "synced_blocks": The last block we have in common with this peer
* "inflight": The heights of blocks we're currently requesting from this peer

==See also==
[https://en.bitcoin.it/wiki/Bitcoin_Core_0.11_(ch_1):_Overview Bitcoin Core 0.11 (Ch 1): Overview]
 
[https://en.bitcoin.it/wiki/Bitcoin_Core_0.11_(ch_2):_Data_Storage Bitcoin Core 0.11 (Ch 2): Data Storage]
 
[https://en.bitcoin.it/wiki/Bitcoin_Core_0.11_(ch_3):_Initialization_and_Startup Bitcoin Core 0.11 (Ch 3): Initialization and Startup]
 
[https://en.bitcoin.it/wiki/Bitcoin_Core_0.11_(ch_4):_P2P_Network Bitcoin Core 0.11 (Ch 4): P2P Network]
 
[https://en.bitcoin.it/wiki/Bitcoin_Core_0.11_(ch_6):_The_Blockchain Bitcoin Core 0.11 (Ch 6): The Blockchain]

[[Category:Technical]]
[[Category:Developer]]

Bitcoin Core 0.11 (ch 3): Initialization and Startup

2016-01-21T20:32:03Z

Mrbandrews: edit footer

This page describes the Bitcoin Core code that manages startup and initialization.

==Program entry point==

The program's entry point can be found in bitcoind.cpp.

main() is three lines of code:
* SetupEnvironment() (all this does is set the program's locale)
* Connect signal handlers
* AppInit() (this function loops for the life of the program)

AppInit: this is located nearby in bitcoind.cpp:
* Parses the command line
* Opens the data directory
* Reads the config file
* Forks a process (if running as a daemon)
* Passes control to AppInit2(), found in init.cpp.

==Initialization steps (init.cpp)==

AppInit2() initializes the bitcoin system.

It contains about 800 lines of code, which are broken into 12 steps.

Where each step begins is documented in the code. Init.cpp has a few functions at the top of the file, but for the most part it consists of AppInit2().

The following table summarizes the steps:

{| class="wikitable"
|-
! Initialization Step !! Short Description !! Longer Description
|-
| 1 || OS-specific setup tasks || These tasks are not particularly interesting. For more info, see the code.
|-
| 2 || Parameter Interactions || Certain command-line options require other options to be set in a certain way. For example, -zapwallettxes implies a -rescan, thus the code will set the -rescan flag=true if it isn't already.
|-
| 3 || Internal flags / Parameter sanity-check || Sets global variables for certain parameters. For the wallet, it sanity-checks transaction fee levels (makes sure your fee is high enough to qualify for relay [error]; but not absurdly high [warning]).
|-
| 4 || Application init. / RPC Server ||
Locks the data directory. (If unable, print error and quit.) 
Spawn X threads for the Script-checking engine. (Default=0, meaning use all available processors; boost::thread::hardware_concurrency). 
Start RPC server in "warmup" mode.
|-
| 5 || Verify wallet database integrity || If wallet is enabled, try to open it. 
If the user knows that the wallet has been corrupted (-salvagewallet), try to recover the private keys.
|-
| 6 || Network Initialization ||
The node registers for certain signals. 
Checks whether the user wants to interact only with peers on a certain network (ip4, ip6, tor). 
Checks whether to use onion routing (tor). 
Checks whether the user wants to whitelist any specific peers. 
Attempts to listen on the bitcoin port (exits on failure). 
If user specified a certain peer to seed connections, attempt to connect.
|-
| 7 || Load the block chain. ||
Load the blockchain into memory and initialize the UTXO caches. 
 
Calculate cache sizes. 
There is a total cache size, which is divided amongst three specific caches. 
Default total cache size = 100MB (Max: 4 GB, min: 4 MB). 
1) Blockchain cache: 1/8 of the total cache, but no larger than 2 MiB unless -txindex is enabled. 
2) UTXO database cache: 25-50% of the remaining cache space. This is the LevelDB cache. 
This stores uncompressed blocks of LevelDB data and is managed by LevelDB, as described in [http://leveldb.googlecode.com/git-history/1.17/doc/index.html the LevelDB documentation.] 
3) UTXO in-memory cache: all of the remaining cache space. 
This value caps the memory used by the cacheCoins object (a protected member of CCoinsViewCache). 
 
Load the blockchain into mapBlockIndex. 
By "blockchain" this means the entire block tree (all known blocks, not just those in the active chain.) 
What is loaded into memory are the CBlockIndex objects, which contain metadata about the block. 
Verifies the last 288 blocks (VerifyDB). 
Note: The program takes less than 1 second from startup until this point; this step takes about 10-20 seconds. 
 
The UTXO set. 
The UTXO set is not loaded into memory; instead, the cache will be filled as coins are accessed from the database. 
Note that as of May 2015, storing the entire UTXO set in memory would require about 3.6 GB. 
As of Jan. 2016, the compressed data on disk is about 1.2 GB.
|-
| 8 || Load the wallet. || If this is the first time the program has been run, it creates a wallet and gives you an initial key (address).
|-
| 9 || Datadir maintenance || If the user is block-pruning, unset NODE_NETWORK and call the pruning function.
|-
| 10 || Import blocks || Scan for better chains in the block chain database, that are not yet connected as the active best chain.
|-
| 11 || Start node / RPC server ||
Calls StartNode in net.cpp. 
This starts up the networking thread group, including ThreadMessageHandler, which is the program's main thread (see below). 
Transition RPC server from "warmup" mode to normal mode.
|-
| 12 || Finished
|}

When AppInit2 finishes, control returns to AppInit() in bitcoind.cpp.

There, the code's top-level thread loops indefinitely in a function called WaitForShutdown(). It sleeps for 200 milliseconds at a time and then checks whether a shutdown has been requested (for example, because the user pressed ctrl-C). If so, it calls Shutdown() back in init.cpp.

Shutdown() shuts down the RPC server, stops the node, unregisters the signal handlers, etc., and then the program completes.

==Cache Sizes==

Step 7 initialized the cache sizes. There are 3 caches contemplated in step 7. Two are LevelDB database caches and the other is the coins cache, whose size is managed by the flushing code in main.cpp.

The user can allocate a total cache size with -dbcache. The user cannot pick and choose how much space to allocate to each specific cache. The default total cache size = 100MB (Max: 4 GB, min: 4 MB).

'''1) Block index cache'''

This cache stores uncompressed chunks of the /blocks/index LevelDB data and is managed by LevelDB, as described in [http://leveldb.googlecode.com/git-history/1.17/doc/index.html the LevelDB documentation.] 

If the user enables a full transaction index (-txindex=1) it can be up to 1/8 of the total cache size. If -txindex is not enabled then only 2 MiB is needed.

'''2) UTXO database cache'''

This is the LevelDB cache for the /chainstate database.

This cache is allocated 25-50% of the remaining cache space, depending on the total cache size.

'''3) UTXO in-memory cache'''

This is the coins cache that is managed by the main.cpp code. (see FlushStateToDisk and related functions)

The variable (nCoinCacheUsage) is declared as extern in main.h. In main.cpp, it is initialized to 5000 * 300 (in-memory coins are about 300 bytes, so this means 5000 coins); however, it is re-assigned in Step 7.

This cache is given all of the remaining cache space.

This cache is not loaded during initialization, rather it is filled as coins are accessed. (This can be verified by the CCoinsViewCache constructor, which sets cachedCoinsUsage=0.)
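The division of -dbcache among the three caches can be sketched as below. This is a minimal model of the rules described in this section, not a copy of init.cpp: the names CacheSizes and SplitDbCache are hypothetical, and the 4 MiB / 4096 MiB limits are the figures quoted above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Sizes in bytes of the three caches described in this section.
struct CacheSizes {
    int64_t blockIndexDb;  // 1) LevelDB cache for /blocks/index
    int64_t utxoDb;        // 2) LevelDB cache for /chainstate
    int64_t coinsInMemory; // 3) in-memory coins cache
};

// Splits a -dbcache value (in MiB) into the three caches.
CacheSizes SplitDbCache(int64_t dbcacheMiB, bool txindex)
{
    const int64_t minMiB = 4, maxMiB = 4096; // limits quoted in the text
    int64_t total = std::min(std::max(dbcacheMiB, minMiB), maxMiB) << 20;

    // 1/8 of the total, capped at 2 MiB unless a full transaction index is kept.
    int64_t blockIndex = total / 8;
    if (blockIndex > (1 << 21) && !txindex)
        blockIndex = 1 << 21;
    total -= blockIndex;

    // Between 25% and 50% of the remainder, depending on its size.
    int64_t utxoDb = std::min(total / 2, total / 4 + (1 << 23));
    total -= utxoDb;

    // Everything left over goes to the in-memory coins cache.
    CacheSizes sizes = {blockIndex, utxoDb, total};
    return sizes;
}
```

With the default of 100 MiB and no -txindex, this gives 2 MiB for the block index cache, about 32.5 MiB for the UTXO database cache, and about 65.5 MiB for the in-memory coins cache.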

==Thread Startup==

The code uses boost::thread_group to manage the various threads.

It should be noted that although Bitcoin Core is a multi-threaded program, "the reference Satoshi client is largely single-threaded." [https://github.com/bitcoin/bips/blob/master/bip-0031.mediawiki Comment by Mike Hearn in BIP 31 (2012)]

What is meant is that the vast majority of the program's activity takes place in the messaging thread (ThreadMessageHandler - see below.)

Almost all of the threads are part of a single, master thread group that is created on the stack at program startup (see bitcoind.cpp). This thread group is passed to init.cpp which creates a few child threads (including a number of script-checking threads, but these are all part of the master thread group, not a separate group.)

The thread group is passed to net.cpp, which creates the networking threads, including the message-processing thread.

The two other thread groups are task-specific:
* rpc server thread group (see rpcserver.h/cpp)
* miner thread group

Naturally, the node will only create the RPC server thread group if the RPC server is activated, and will only create the miner thread group if it is mining. If both are disabled, then Bitcoin Core only has a single thread group.

'''Child Threads'''

The parent thread (meaning the thread in which the program begins operating) delegates almost all of the program's work to child threads. After spawning threads in init.cpp and net.cpp, the parent thread simply listens for a shutdown command, at which time the parent thread needs only to interrupt the threads in its thread group and proceed with shutdown.

The child threads are summarized in this table, listed in the order in which they are created:

{| class="wikitable"
|-
! Thread !! When / Where Created !! Description
|-
| Script-checking || Step 4 init.cpp || This is a set of threads; by default the number is chosen from the available processor cores (see Step 4). Script-checking (including signature checking) is expensive, so it is handled in separate threads.
|-
| Scheduler || Step 4 init.cpp || Scheduler thread. (TODO: describe)
|-
| RPC Threads || Step 4 rpcserver.cpp || If RPC server enabled, start a group of threads to handle RPC calls.
|-
| Import || Step 10 init.cpp || Imports blocks. Three scenarios: 1) Reindex (rescan all known blocks from blk???.dat files). 2) Bootstrap (use bootstrap.dat as an alternative to full IBD from the network.) 3) -loadblock (scan a specific blk???.dat file) If none of those apply, this thread does nothing.
|-
| DNSAddressSeed || Step 11 net.cpp || Attempts to build a vector of IP addresses based on the dns seeds, stores the vector and the thread exits. In a test in June 2014, this took about 4 seconds and found 158 addresses.
|-
| Plug & Play || Step 11 net.cpp || UPNP (Universal Plug & Play) Deals with port mapping for UPNP.
|-
| SocketHandler || Step 11 net.cpp || This thread services the sockets: Waits for I/O on all the relevant sockets with a 50ms timeout. Processes new incoming connections on listening socket and creates a CNode for the new peer. Receives and sends data streams. Sets sockets that have not done anything to a disconnected state.
|-
| OpenAddedConnections || Step 11 net.cpp || Initiates outbound connections specified by the user with the -addnode parameter. If it can't connect, it sleeps for 2 minutes each cycle.
|-
| OpenConnections || Step 11 net.cpp || Initiates other outbound connections from DNS seeds (if that fails, find nodes based on fixed seeds) If can't connect, sleeps for 500 milliseconds each cycle.
|-
| MessageHandler || Step 11 net.cpp || This is the program's main thread. This thread runs a while(true) loop, receiving and sending messages. (See net.cpp:1049) The code uses boost::signals2 to call the ProcessMessages and SendMessages functions in main.cpp. (The code introducing signals is in PR 2154 - see the next-to-last commit in that pull.) ProcessMessages and SendMessages run in this thread. So, most of the code in main.cpp runs in this thread.
|-
| Wallet Flusher || Step 12 init.cpp || If wallet is enabled, this thread flushes the wallet periodically.
|}

==See also==
[https://en.bitcoin.it/wiki/Bitcoin_Core_0.11_(ch_1):_Overview Bitcoin Core 0.11 (Ch 1): Overview]
 
[https://en.bitcoin.it/wiki/Bitcoin_Core_0.11_(ch_2):_Data_Storage Bitcoin Core 0.11 (Ch 2): Data Storage]
 
[https://en.bitcoin.it/wiki/Bitcoin_Core_0.11_(ch_4):_P2P_Network Bitcoin Core 0.11 (Ch 4): P2P Network]
 
[https://en.bitcoin.it/wiki/Bitcoin_Core_0.11_(ch_5):_Initial_Block_Download Bitcoin Core 0.11 (Ch 5): Initial Block Download]
 
[https://en.bitcoin.it/wiki/Bitcoin_Core_0.11_(ch_6):_The_Blockchain Bitcoin Core 0.11 (Ch 6): The Blockchain]

[[Category:Technical]]
[[Category:Developer]]

Bitcoin Core 0.11 (ch 1): Overview

2016-01-21T20:31:32Z

Mrbandrews: edit footer

==Organization & Maintenance of these Pages==

The purpose of this set of Wiki pages is to document the Bitcoin Core C++ source code, in a way that is helpful to the programmer who wants to learn how the program is designed and what the code does.

Ideally, the accuracy of the information on these pages would be checked by developers who are getting up to speed on the Bitcoin Core code base. Additionally, each new release of Bitcoin Core (0.12, 0.13, etc) could have a new set of pages that are modified to match that version.

These pages are loosely based on the set of pages called "Satoshi Client: xxx" (on this Wiki) which were written in 2011 and based on version 0.3.

This set of Wiki pages includes:

* Ch 1: Intro & Overview (this page)
* Ch 2: Data Storage
* Ch 3: Initialization & Startup
* Ch 4: P2P Network
* Ch 5: Initial Block Download (IBD)
* Ch 6: Blockchain
* Ch 7: Transactions & the Memory Pool
* Ch 8: RPC Server

These pages document the "relay node" aspect of Bitcoin Core, meaning a node which validates blocks and transactions and relays them to other nodes. The node has fully validated the blockchain, although it may not necessarily maintain a full copy of it on disk.

These pages do NOT cover:
* Wallet
* GUI (Qt)
* Mining

==Definitions==

A few definitions at the outset:

'''Consensus code'''

Code that validates blocks and transactions.

Consensus code must have bug-for-bug compatibility across versions and implementations (meaning, 0.12 must have the same consensus behaviour as 0.11, even if it is buggy; otherwise, a network fork may result.)

'''Policy code'''

Code that implements a particular node's policy (as opposed to consensus). A node's algorithm for which transactions to store in its transaction pool is an example of policy. For example, a node could refuse to relay or store any transaction that is larger than 200KB. What is important is that if such a transaction is transmitted to the node as part of a newly mined block, the node does not reject the block.

'''P2P code'''

Code relating to communications with other nodes (peers) over the P2P network. Communication includes discovering and connecting to other nodes; exchanging various P2P messages (e.g., messages containing blocks and transactions); occasionally, banning misbehaving peers. The bitcoin network uses a custom set of P2P messages. Most of the P2P code can be found in net.h/net.cpp.

'''Mempool ("memory pool" or "transaction pool")'''

A set of transactions which the node knows about and chooses to store in memory and relay to other nodes, and which have not yet been included in a block. In many cases, this may be the full set of transactions that the node has received and validated. If the node has received transactions that violate its policy, however, the mempool will be a subset. In any event, when the node receives and validates a block, it deletes any transactions in the block from its mempool.

'''Full Node'''

A full node is one that validates blocks and transactions and relays them to other nodes. A full node has validated the blockchain from scratch (although with block file pruning, it may have discarded older parts of the chain to clear up disk space.) The key characteristic of a full node is that it has validated the blockchain and continues to fully validate and relay incoming blocks and transactions. A full node can be differentiated from an SPV node, which trusts another node (or set of nodes) to validate.

'''"Basic Full Node"'''

A "basic full node" is what is documented in these pages. By "basic full node," what is meant is a node that validates and relays blocks and transactions, but does not mine new blocks or perform other optional tasks (RPC server, wallet). Extending these pages to include documentation of these optional aspects of a node is a future project.

==Architecture of a Bitcoin Node==

One developer described the architecture of a basic full node as follows:

: --------------------

: The basic architecture of a bitcoin node is as follows:

: At the core there exist fundamental bitcoin message structures, along with the code necessary for serialization/deserialization. These structures belong in their own source files with minimal dependencies so they can be reused for applications that needn't perform verification and relay - for instance, filtering and notification agents. Unfortunately, these core structures currently reside for the most part in main.h/main.cpp...

: On top of these core structures sits a network component that manages sockets, does peer discovery, and handles queueing and dispatching of messages. This component is clearly dependent on the core message structures but does not depend on the specific logic used to verify blocks and transactions nor to identify misbehaving peers nor sign transactions nor maintain a block chain database.

: Then we have a scripting engine, signature verification component, and a signing component. Historical database applications do not need signature verification/signing functionality at all. Filtering messages and sending alerts generally does not even require a scripting engine and does fine with basic pattern matching.

: The most critical high-level operations needed by a verification/relay node such as the satoshi client are transaction verification; block chain and memory pool management; and detection/management of misbehaving peers. These things are currently primarily implemented in main.h/main.cpp. These are indeed the main operations of the satoshi client - but the core low-level structures should not depend at all on this logic.

: ---------------------

See here: [https://github.com/bitcoin/bitcoin/pull/2154 PR 2154]

In the form of a (crude) picture:

 Validating transactions; Managing blockchain, mempool, and peers.
 |
 Scripting engine / Signatures
 |
 Network layer
 |
 P2P Messages

And here is the same picture, augmented with the definitions above:

 Validating transactions; Managing blockchain, mempool, peers (Consensus and Policy code)
 |
 Scripting engine / Signatures (Consensus code)
 |
 Network layer (P2P code)
 |
 P2P Messages

==Code Modularization / Organization==

As of 0.11, modularization of the Bitcoin Core code is somewhat limited.

Ideally, Bitcoin Core would be modularized so that the consensus code would be separated and made into a library which could be distributed to other implementations. In this way, fears of accidentally forking the network would be mitigated. As of 2016, this is a work in progress (see, e.g., the various "libconsensus" pull requests on GitHub).

In December 2013, a proposal was made for modularizing the code base: [https://github.com/bitcoin/bitcoin/issues/3440 Post-0.9 modularization of Bitcoin Core]

Examples of optional modules would be:
* Miner
* Wallet
* Notifications

As of 0.11, the steps taken toward modularization were primarily separating certain classes into subdirectories, described below.

==Source Code Files==

The C++ code is in the src/ directory of the repository.

Most of the code resides in the top-level directory, although there has been some effort to modularize the code base with subdirectories (wallet, consensus, primitives, etc.). Also, certain components (Qt, LevelDB, etc.) live in subdirectories.

Key files in the src/ directory include: (file.* means the header file [file.h] and the source file [file.cpp])

{| class="wikitable"
|-
! File !! Description / Purpose
|-
| net.* || Manages the network (peer connections, etc.). The while(true) loop in ThreadMessageHandler controls the program's flow, signalling main.cpp when there is work to do. Key dependencies: None.
|-
| init.cpp || Initializes the node, calling functions in main.cpp as necessary. Key dependencies: main.h
|-
| main.* || main.h declares some key global variables (mapBlockIndex, chainActive, mempool, etc.), constants, and functions. main.cpp is the program's longest source file (5,237 lines). main.cpp has most of the key functions for managing the blockchain, such as connecting, disconnecting, validating and storing blocks; identifying a certain block as the tip of the longest chain; and so forth. The "entry point" for most of the code is ProcessMessages (which listens for a signal from the message-handling thread). Some of the code is run during initialization, called directly from init.cpp. Key dependencies: net.h
|-
| chain.* || The header file (chain.h) is the more notable of the two, as it declares the type definitions for the metadata about the block (CBlockIndex) and the longest blockchain (CChain). chain.cpp contains a few handy functions for managing the blockchain (e.g., locating blocks and finding a fork point between two chains.)
|-
| coins.* || The header file declares CCoins, which conceptually represents the "bitcoins" (the unspent outputs) of a single transaction. The source file contains methods for manipulating coins (retrieving, spending, etc.)
|-
| miner.* || Contains the mining code, including block creation and generating new bitcoins.
|-
|}

'''Subdirectories:'''

The subdirectories fall into three categories:
* Well-defined components (in some cases third-party)
* Modularization of code
* Other (unit tests, build files, etc.)

''Subdirectories - Components:''

{| class="wikitable"
|-
! Directory !! Description / Purpose
|-
| leveldb || C++ source code, docs, etc. for the LevelDB build.
|-
| qt || The GUI code (Qt). Qt is an open-source C++ framework for GUI code, first released in 1995.
|-
| secp256k1 || Library implementing ECDSA cryptography. Purpose: This custom C library eliminates reliance on OpenSSL for signature checking. This is important because OpenSSL was susceptible to introducing consensus bugs, since its newly released versions do not guarantee bug-for-bug compatibility. This library was written/released in early 2015.
|-
| zmq || From the ZMQ wiki: ZMQ (or ZeroMQ or 0MQ) is a high-performance asynchronous messaging library. It provides a message queue, but unlike message-oriented middleware, a ZMQ system can run without a dedicated message broker. The library is designed to have a familiar socket-style API.
|}

''Subdirectories - Modularization:''

{| class="wikitable"
|-
! Directory !! Key Files !! Description / Purpose
|-
| consensus || consensus.* merkle.* params.* validation.* || Code implementing (or defining, as the case may be) the block & transaction validation rules. Purpose: Moving this code into a subdirectory is a step towards modularizing the consensus code. The idea is that in a future version of bitcoin, the consensus code should be packaged as a library, so that alternative implementations of the protocol could simply include this library and guarantee validation compatibility. "...[T]he goal is not reimplementing the consensus rules but rather extract them from Bitcoin Core so that nobody needs to re-implement them again. It is not only exposing it but also separating it from Bitcoin Core so that they can be changed without having to also change/take into account non-consensus Bitcoin Core specific things." -- Jorge Timon, on bitcoin-development mailing list, 20 Aug 2015. Discussion on github: [https://github.com/bitcoin/bitcoin/issues/6714 PR6714]
|-
| crypto || ripemd.* sha256.* || Cryptographic hash functions. Both RIPEMD and SHA-256 are used in transforming a bitcoin address to Base-58 encoding.
|-
| policy || policy.* fees.* || Validation code that is a matter of ''policy'' (as opposed to consensus), moved into a separate directory.
|-
| primitives || block.* transaction.* || Definitions of certain basic data types (blocks, transactions, etc.)
|-
| script || interpreter.* script.* standard.* || The script engine. Defines the op_codes (script.h). Parses and evaluates the validation script (interpreter.cpp:EvalScript()). Defines what is a "standard" transaction (standard.h). Purpose: the Script engine validates basic transactions but also makes contracts possible. It could be said that in large part, what a platform like Ethereum does is provide a more robust script engine (and language in which to express script) - so in a sense, deploying such a system consists of replacing this sub-directory with something more powerful.
|-
| wallet || wallet.* || Wallet code.
|}

''Subdirectories - Other:''

{| class="wikitable"
|-
! Directory !! Description / Purpose
|-
| compat || A few minor, low-level files dealing with compatibility details.
|-
| config obj obj-test || These directories relate to the build process.
|-
| test || Unit tests. Uses the Boost unit test framework. A good introduction to the Boost unit test framework is here: [http://www.alittlemadness.com/2009/03/31/c-unit-testing-with-boosttest/]
|-
| univalue || Per the README: "A universal value object, with JSON encoding (output) and decoding (input). Built as a single dynamic RAII C++ object class, and no templates."
|}

==Design Patterns==

'''Object-Oriented Design'''

Naturally, being a C++ program, the code employs object-oriented design.

However, the code's use of object-oriented design is by no means universal. In 0.11, objects are used mainly for defining the data structures which main.cpp uses to manage the blockchain and the UTXO set (the bitcoins). Main.h declares many functions and global variables but almost no classes; main.cpp is over 5000 lines but does not include any class methods.

The code has a relatively flat class structure, with most classes being "stand-alone".

Where inheritance is used, it is often a linear hierarchy (i.e., A <-- B <-- C <-- D).

For example:
 CBlockHeader <-- CBlock
 CTransaction <-- CMerkleTx <-- CWalletTx

There are only a few examples of base classes that have more than one descendant class, and multiple inheritance is only used once in the code (CWallet inherits from 2 classes.)

The program's best example of an elegant class hierarchy relates to the coins caches. It uses an abstract class that is inherited by a few subclasses that demonstrate encapsulation and polymorphism. Chapter 3 has a class diagram as well as a diagram and explanation of how the classes are instantiated and relate to one another.

'''Multithreading'''

The code uses Boost thread groups to implement multithreading. The vast majority of the action takes place in the message-processing thread, and to a lesser extent the socket thread.

'''Observer (Signals and slots)'''

The "observer" pattern uses "signals and slots" to decouple two or more areas of code. Bitcoin Core 0.11 uses the observer pattern in a limited way. Namely, it uses signals to decouple the message-gathering loop from the main.cpp code. In 0.10 and earlier a normal function call was used.
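
The decoupling can be sketched as follows. Bitcoin Core uses a Boost signals library for this; the sketch below substitutes std::function to stay dependency-free, and the Signal, NodeSignals and RegisterHandlers names are hypothetical simplifications rather than the actual declarations.

```cpp
#include <functional>
#include <vector>

// Minimal signal: a list of callbacks ("slots") invoked when the signal fires.
class Signal {
public:
    void connect(std::function<bool(int)> slot) { slots_.push_back(slot); }
    // Call every connected slot; report whether all of them succeeded.
    bool operator()(int peer_id) const {
        bool ok = true;
        for (const auto& slot : slots_) ok = slot(peer_id) && ok;
        return ok;
    }
private:
    std::vector<std::function<bool(int)> > slots_;
};

// The networking side only knows about the signal...
struct NodeSignals { Signal ProcessMessages; };

// ...while the validation side registers its handler at startup.
inline void RegisterHandlers(NodeSignals& signals, std::vector<int>& processed) {
    signals.ProcessMessages.connect([&processed](int peer_id) -> bool {
        processed.push_back(peer_id);  // stand-in for real message processing
        return true;
    });
}
```

The message loop fires signals.ProcessMessages(peer_id) without including or naming any of the validation code, which is the point of the pattern.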

 
 
 
==See also==
[https://en.bitcoin.it/wiki/Bitcoin_Core_0.11_(ch_2):_Data_Storage Bitcoin Core 0.11 (Ch 2): Data Storage]
 
[https://en.bitcoin.it/wiki/Bitcoin_Core_0.11_(ch_3):_Initialization_and_Startup Bitcoin Core 0.11 (Ch 3): Initialization and Startup]
 
[https://en.bitcoin.it/wiki/Bitcoin_Core_0.11_(ch_4):_P2P_Network Bitcoin Core 0.11 (Ch 4): P2P Network]
 
[https://en.bitcoin.it/wiki/Bitcoin_Core_0.11_(ch_5):_Initial_Block_Download Bitcoin Core 0.11 (Ch 5): Initial Block Download]
 
[https://en.bitcoin.it/wiki/Bitcoin_Core_0.11_(ch_6):_The_Blockchain Bitcoin Core 0.11 (Ch 6): The Blockchain]

[[Category:Technical]]
[[Category:Developer]]

Bitcoin Core 0.11 (ch 2): Data Storage

2016-01-21T20:30:42Z

Mrbandrews: edit footer

This page describes how & where Bitcoin Core stores blockchain data.

==Overview==

There are basically four pieces of data that are maintained:

: '''blocks/blk*.dat:''' the actual Bitcoin blocks, in network format, dumped in raw on disk. They are only needed for rescanning missing transactions in a wallet, reorganizing to a different part of the chain, and serving the block data to other nodes that are synchronizing.

: '''blocks/index/*:''' this is a LevelDB database that contains metadata about all known blocks, and where to find them on disk. Without this, finding a block would be very slow.

: '''chainstate/*:''' this is a LevelDB database with a compact representation of all currently unspent transaction outputs and some metadata about the transactions they are from. The data here is necessary for validating new incoming blocks and transactions. It can theoretically be rebuilt from the block data (see the -reindex command line option), but this takes a rather long time. Without it, validation would still be theoretically possible, but it would mean a full scan through the blocks (7 GB as of May 2013) for every output being spent.

: '''blocks/rev*.dat:''' these contain "undo" data. You can see blocks as 'patches' to the chain state (they consume some unspent outputs, and produce new ones), and see the undo data as reverse patches. They are necessary for rolling back the chainstate, which is necessary in case of reorganisations.

Note that the LevelDB databases are redundant in the sense that they can be rebuilt from the block data. But validation and other operations would become intolerably slow without them.

See here: [http://bitcoin.stackexchange.com/questions/11104/what-is-the-database-for?rq=1 StackExchange post by Pieter Wuille (2013)]

==Raw Block data (blk*.dat)==

Block files store the raw blocks as they were received over the network.

Block files are about 128 MB, allocated in 16 MB chunks to prevent excessive fragmentation. As of October 2015, the block chain is stored in about 365 block files, for a total of about 45 GB.

Each block file (blk1234.dat) has a corresponding undo file (rev1234.dat) which contains the data necessary to remove blocks from the blockchain in the event of a reorganization (fork).

Info about the block files is stored in the block index (the LevelDB) in two places:
* General info about the files themselves is held in the "f" records in the block index LevelDB (meaning keys "fxxxx", where "xxxx" is the 4 digit file number), including:
** Number of blocks stored in the file
** File size (and the corresponding undo file size)
** Lowest and highest block in the file
** Timestamps - earliest and latest blocks in the file

* Info about where to find a particular block on disk is in the "b" ("b" = block) record:
** Each block record contains a pointer to where the block is stored on disk (a file number and an offset)

'''Accessing the block data files from the code'''

The block files are accessed through:

1) DiskBlockPos: a struct that is simply a pointer to a block's location on disk (a file number and an offset.)

2) ''vInfoBlockFiles'': a vector of BlockFileInfo objects. This variable is used to perform such tasks as:
* Determine whether new blocks can fit into the current file or a new file needs to be created
* Calculate the total disk usage by block & undo files
* Iterate through the block files and find ones that can be pruned
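
The first two tasks can be sketched as follows. This is a simplified illustration, not the actual code: the struct layouts, the FindBlockPos logic shown here and the helper names are assumptions, and the five-digit file name follows the blkNNNNN.dat pattern.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Simplified stand-ins for the structures described above (layouts are illustrative).
struct DiskBlockPos { int file; unsigned int offset; };
struct BlockFileInfo { unsigned int blocks; unsigned int size; };

const unsigned int kMaxBlockFileSize = 128 * 1024 * 1024;  // about 128 MB per file

// Choose where a block of block_size bytes goes, opening a new file if needed.
DiskBlockPos FindBlockPos(std::vector<BlockFileInfo>& files, unsigned int block_size) {
    if (files.empty()) files.push_back(BlockFileInfo{0, 0});
    if (files.back().size + block_size >= kMaxBlockFileSize)
        files.push_back(BlockFileInfo{0, 0});
    BlockFileInfo& info = files.back();
    DiskBlockPos pos{static_cast<int>(files.size()) - 1, info.size};
    info.size += block_size;
    info.blocks += 1;
    return pos;
}

// Total bytes used by all block files.
std::uint64_t TotalBlockFileBytes(const std::vector<BlockFileInfo>& files) {
    std::uint64_t total = 0;
    for (const BlockFileInfo& f : files) total += f.size;
    return total;
}

// File name for a given file number, e.g. "blk00012.dat".
std::string BlockFileName(int file) {
    char buf[32];
    std::snprintf(buf, sizeof(buf), "blk%05d.dat", file);
    return buf;
}
```

The returned position (file number plus offset) is exactly the kind of pointer that the "b" record keeps for each block.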

Blocks are written to disk as soon as they are received, in AcceptBlock. (The actual disk write operation is in WriteBlockToDisk [main.cpp:1164]). Note that there is some overlap of the code that accesses block files with the code that accesses and writes to the coins database (/chainstate). There is a complex system of when to flush state to disk. None of this code affects block files, which are simply written to disk when received. Once they have been received and stored, the block files are needed mainly for serving blocks to other nodes (and, as noted in the overview, for wallet rescans and reorganizations).

'''More info about block files'''

See here: [https://github.com/sipa/bitcoin/commit/5382bcf8cd23c36a435c29080770a79b5e28af42 the commit that puts multiple blocks in a block file (2012)]

==Block index (leveldb)==

The block index holds metadata about all known blocks, including where the block is stored on disk.

Note that the set of "known blocks" is a superset of the longest chain, because it includes blocks that were received and processed but are not part of the active chain - for example, orphaned blocks that were detached from the active chain in a small reorganization.

'''Terminology'''

The terminology can be a little confusing here, because while people normally think of the "blockchain" as being synonymous with the active chain (an uninterrupted, linear chain of X blocks starting with the genesis block and continuing to the current tip), there are some places in the code where "blockchain" refers to the active chain plus the numerous, mostly short forks off the chain that our node happens to know about.

a) Block Tree

A better term for the set of known blocks stored on disk is "block tree," as this term contemplates a tree structure with numerous branches (albeit small ones) from the main chain. Indeed, the block index LevelDB is accessed through the "CBlockTreeDB" wrapper class, defined in src/txdb.h. Note that it's perfectly fine, indeed it is expected, that different nodes would have slightly different block trees; what matters is that they agree on the active chain.

'''Key-value pairs'''

Inside the actual LevelDB, the used key/value pairs are:

'b' + 32-byte block hash -> block index record. Each record stores:
* The block header
* The height.
* The number of transactions.
* To what extent this block is validated.
* In which file, and where in that file, the block data is stored.
* In which file, and where in that file, the undo data is stored.

'f' + 4-byte file number -> file information record. Each record stores:
* The number of blocks stored in the block file with that number.
* The size of the block file with that number ($DATADIR/blocks/blkNNNNN.dat).
* The size of the undo file with that number ($DATADIR/blocks/revNNNNN.dat).
* The lowest and highest height of blocks stored in the block file with that number.
* The lowest and highest timestamp of blocks stored in the block file with that number.

'l' -> 4-byte file number: the last block file number used.

'R' -> 1-byte boolean ('1' if true): whether we're in the process of reindexing.

'F' + 1-byte flag name length + flag name string -> 1 byte boolean ('1' if true, '0' if false): various flags that can be on or off. Currently defined flags include:
* 'txindex': Whether the transaction index is enabled.

't' + 32-byte transaction hash -> transaction index record. These are optional and only exist if 'txindex' is enabled (see above). Each record stores:
* Which block file number the transaction is stored in.
* Which offset into that file the block the transaction is part of is stored at.
* The offset from the start of that block to the position where that transaction itself is stored.

See here: [http://bitcoin.stackexchange.com/questions/28168/what-are-the-keys-used-in-the-blockchain-leveldb-ie-what-are-the-keyvalue-pair StackExchange post by Pieter Wuille (2014)]
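
As an illustration of the key layouts listed above, the sketch below builds three of the keys as raw byte strings. The function names are hypothetical, and the little-endian encoding of the file number is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Key for a block index record: 'b' followed by the 32-byte block hash.
std::string BlockIndexKey(const std::string& hash32) {
    assert(hash32.size() == 32);
    return std::string(1, 'b') + hash32;
}

// Key for a file information record: 'f' followed by a 4-byte file number.
// Little-endian byte order is an assumption of this sketch.
std::string FileInfoKey(std::uint32_t file_number) {
    std::string key(1, 'f');
    for (int i = 0; i < 4; ++i)
        key.push_back(static_cast<char>((file_number >> (8 * i)) & 0xff));
    return key;
}

// Key for a named flag: 'F' + 1-byte name length + name.
std::string FlagKey(const std::string& name) {
    assert(name.size() < 256);
    return std::string(1, 'F') + static_cast<char>(name.size()) + name;
}
```

Because every key begins with a one-character type prefix, records of different kinds can share a single LevelDB without colliding.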

'''Data Access Layer'''

The database is accessed through the CBlockTreeDB wrapper class. See txdb.h.

The wrapper is instantiated in a global variable called pblocktree, defined in main.cpp.

''CBlockIndex''

Blocks stored in the database are represented in memory as CBlockIndex objects. An object of this type is first created after the ''header'' is received; the code does not wait to receive the full block. When headers are received over the network, they are streamed into a vector of CBlockHeaders, which are then checked. Each header that checks out causes a new CBlockIndex to be created, which is stored to the database.

''CBlock / CBlockHeader''

Note that these objects have little to do with the /blocks LevelDB. A CBlock holds the full set of transactions in the block, the data for which is stored in two places - in full, in raw format, in the blk???.dat files, and in pruned format in the UTXO database. The block index database cares not for such details, since it holds only the metadata for the block.

''Loading the block database into memory''

The entire database is loaded into memory on startup. See LoadBlockIndexGuts (txdb.cpp). This only takes a few seconds.

The blocks ('b' keys) are loaded into the global "mapBlockIndex" variable. "mapBlockIndex" is an unordered_map that holds CBlockIndex for each block in the entire block tree; not just the active chain.

mapBlockIndex is described in more detail in Chapter 6 - The Blockchain.

The block file metadata ('f' keys) is loaded into vInfoBlockFiles.
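
A minimal sketch of such an in-memory block tree follows. The BlockIndex struct, the string hashes and both functions are simplified assumptions for illustration; the real CBlockIndex carries far more metadata.

```cpp
#include <string>
#include <unordered_map>
#include <utility>

// Simplified block index entry: metadata only, linked to its parent.
struct BlockIndex {
    int height;
    const BlockIndex* prev;  // nullptr for the genesis block
};

// Hypothetical stand-in for mapBlockIndex: hash -> metadata for every known block.
typedef std::unordered_map<std::string, BlockIndex> BlockMap;

// Add a header's metadata to the tree. An empty prev_hash marks the genesis block;
// otherwise the parent must already be known.
const BlockIndex* AddToBlockIndex(BlockMap& map, const std::string& hash,
                                  const std::string& prev_hash) {
    const BlockIndex* prev = nullptr;
    if (!prev_hash.empty()) {
        BlockMap::const_iterator it = map.find(prev_hash);
        if (it == map.end()) return nullptr;  // unknown parent: cannot attach
        prev = &it->second;
    }
    BlockIndex entry = {prev ? prev->height + 1 : 0, prev};
    return &map.insert(std::make_pair(hash, entry)).first->second;
}

// Find the last common ancestor of two blocks by walking back along prev pointers.
const BlockIndex* FindFork(const BlockIndex* a, const BlockIndex* b) {
    while (a && b && a != b) {
        if (a->height > b->height) a = a->prev;
        else if (b->height > a->height) b = b->prev;
        else { a = a->prev; b = b->prev; }
    }
    return (a == b) ? a : nullptr;
}
```

Since the map holds every known block, side branches stay reachable, and FindFork can locate the point where a branch left the active chain.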

==The UTXO set (chainstate leveldb)==

The UTXO database was introduced in 2012 in [https://github.com/bitcoin/bitcoin/pull/1677 pull request #1677 - "Ultraprune."]

The idea behind "Ultraprune" is to reduce the size of (prune) the set of past transactions, keeping only those parts of past transactions that are necessary to validate later transactions.

Say you have a transaction T1 which takes two inputs and sends to 3 outputs: O1,O2,O3. Two of those outputs (O1, O2) have been used as inputs in a later transaction, T2. Once T2 has been mined, T1 only has one item of interest (O3). There's no reason to keep T1 around in its entirety. Instead, a slimmed-down version of T1 will suffice, consisting only of O3 (locking script and amount) and certain basic information about T1 (height, whether it is a coinbase, etc.)
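
The T1 example can be sketched in code. The TxOut and Coins structs below are simplified assumptions loosely modeled on the idea described above, not the actual CCoins class.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Simplified output: amount, locking script, and whether it has been spent.
struct TxOut {
    std::int64_t amount;
    std::string script;
    bool spent;
};

// Simplified per-transaction UTXO record.
struct Coins {
    int height;
    bool coinbase;
    std::vector<TxOut> outputs;

    // Mark output n as spent; returns false if it is missing or already spent.
    bool Spend(std::size_t n) {
        if (n >= outputs.size() || outputs[n].spent) return false;
        outputs[n].spent = true;
        return true;
    }
    // Number of outputs still unspent.
    std::size_t Unspent() const {
        std::size_t count = 0;
        for (const TxOut& out : outputs) if (!out.spent) ++count;
        return count;
    }
    // Once nothing is unspent, the whole record can be dropped from the database.
    bool IsPruned() const { return Unspent() == 0; }
};
```

After T2 spends O1 and O2, Unspent() returns 1 and only O3 (plus the height and coinbase flag) still needs to be kept for T1.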

The description of ultraprune is on the specific "ultraprune" commit within the pull:

: -------------

: This switches bitcoin's transaction/block verification logic to use a "coin database", which contains all unredeemed transaction output scripts, amounts and heights.

: The name ultraprune comes from the fact that instead of a full transaction index, we only (need to) keep an index with unspent outputs. For now, the blocks themselves are kept as usual, although they are only necessary for serving, rescanning and reorganizing.

: The basic data structures are CCoins (representing the coins of a single transaction), and CCoinsView (representing a state of the coins database). There are several implementations for CCoinsView. A dummy, one backed by the coins database (coins.dat), one backed by the memory pool, and one that adds a cache on top of it. FetchInputs, ConnectInputs, ConnectBlock, DisconnectBlock, ... now operate on a generic CCoinsView.

: The block switching logic now builds a single cached CCoinsView with changes to be committed to the database before any changes are made. This means no uncommitted changes are ever read from the database, and should ease the transition to another database layer which does not support transactions (but does support atomic writes), like LevelDB.

: For the getrawtransaction() RPC call, access to a txid-to-disk index would be preferable. As this index is not necessary or even useful for any other part of the implementation, it is not provided. Instead, getrawtransaction() uses the coin database to find the block height, and then scans that block to find the requested transaction. This is slow, but should suffice for debug purposes.

: -----------------

See: [https://github.com/sipa/bitcoin/commit/450cbb0944cd20a06ce806e6679a1f4c83c50db2 Ultraprune - July 2012]

'''Terminology'''

"UTXO (Unspent Transaction Out):" An output from a transaction. This is colloquially referred to as a "coin." For this reason, the UTXO db is sometimes referred to as the "coins database."

"UTXO set / coins database / chainstate database:" These terms are more or less synonymous and are used interchangeably.

"Provably Unspendable:" A coin is provably unspendable if its scriptPubKey cannot be satisfied - for example, an OP_RETURN. A provably unspendable coin can be eliminated from the utxo database regardless of its amount.

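The "provably unspendable" test can be sketched as a check on the first byte of the scriptPubKey. This is a minimal illustration that covers only the OP_RETURN case mentioned above; the Script typedef and function name are assumptions of this sketch.

```cpp
#include <vector>

typedef std::vector<unsigned char> Script;

const unsigned char OP_RETURN = 0x6a;  // opcode value as defined in script.h

// A script that starts with OP_RETURN can never be satisfied, so an output
// locked by it need not be stored in the UTXO database.
bool IsProvablyUnspendable(const Script& script_pub_key) {
    return !script_pub_key.empty() && script_pub_key[0] == OP_RETURN;
}
```

The check never looks at the amount, which is why such an output can be dropped regardless of its value.
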
'''Key-value pairs'''

The records in the chainstate levelDB are:

'c' + 32-byte transaction hash -> unspent transaction output record for that transaction. These records are only present for transactions that have at least one unspent output left. Each record stores:
* The version of the transaction.
* Whether the transaction was a coinbase or not.
* Which height block contains the transaction.
* Which outputs of that transaction are unspent.
* The scriptPubKey and amount for those unspent outputs.

'B' -> 32-byte block hash: the block hash up to which the database represents the unspent transaction outputs.

See here: [http://bitcoin.stackexchange.com/questions/28168/what-are-the-keys-used-in-the-blockchain-leveldb-ie-what-are-the-keyvalue-pair StackExchange post by Pieter Wuille (2014)]

'''Data Access Layer and Caching'''

Access to the UTXO database is considerably more complex than the block index. This is because its performance is critical to the overall performance of the Bitcoin system. The block index is not so critical to performance because there are only a few hundred thousand blocks and a node running on decent hardware can retrieve and scroll through them in a few seconds (and does not need to do so very often.) On the other hand, there are millions of coins in the UTXO database and they must be checked and modified for each input of each transaction going into the mempool or included in a block.

As sipa said in the ultraprune commit:
: The basic data structures are CCoins (representing the coins of a single transaction), and CCoinsView (representing a state of the coins database). There are several implementations for CCoinsView. A dummy, one backed by the coins database (coins.dat), one backed by the memory pool, and one that adds a cache on top of it.

This is not stated as clearly as it might have been, however; at least, not for the current state of the code.

In 0.11, the instantiations of CoinsView are:
* dummy
* database
* pcoinsTip (a cache backed by the database)
* "validation cache" (a cache backed by pcoinsTip; in use while connecting a block)

Separate from that chain of caches is the memory pool's CoinsView, which is backed by pcoinsTip and therefore, indirectly, by the database.

The class diagram (data types) for the views is:

           CCoinsView (abstract class)
            /              \
        ViewDB          ViewBacked
      (database)        /        \
                 ViewMempool   ViewCache

Each class has one key characteristic:
* View is the base class, declaring methods for verifying that coins exist (HaveCoins), retrieving coins (GetCoins), etc.
* ViewDB has code to interact with the LevelDB.
* ViewBacked has a pointer to another View; thus it is "backed" by another view (version) of the UTXO set.
* ViewCache has a cache (a map of CCoins).
* ViewMempool associates a mempool with a view.
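
The hierarchy can be sketched with simplified stand-ins. In the sketch below an int replaces the CCoins payload and std::map replaces both LevelDB and the mempool, and the class names follow the shorthand used above; all of that is an assumption made for illustration, and ViewCache is omitted for brevity.

```cpp
#include <map>
#include <string>

typedef std::string TxId;
typedef int Coins;  // placeholder payload; the real CCoins holds outputs and metadata

// Base class: the interface every view of the UTXO set offers.
class View {
public:
    virtual ~View() {}
    virtual bool GetCoins(const TxId&, Coins&) const { return false; }
    bool HaveCoins(const TxId& txid) const { Coins tmp = 0; return GetCoins(txid, tmp); }
};

// Stand-in for the database-backed view (a std::map instead of LevelDB).
class ViewDB : public View {
public:
    std::map<TxId, Coins> db;
    bool GetCoins(const TxId& txid, Coins& coins) const override {
        std::map<TxId, Coins>::const_iterator it = db.find(txid);
        if (it == db.end()) return false;
        coins = it->second;
        return true;
    }
};

// A view that forwards to another view: it is "backed" by it.
class ViewBacked : public View {
public:
    explicit ViewBacked(const View* base) : base_(base) {}
    bool GetCoins(const TxId& txid, Coins& coins) const override {
        return base_->GetCoins(txid, coins);
    }
protected:
    const View* base_;
};

// Overlays unconfirmed (mempool) transactions on top of the backing view.
class ViewMempool : public ViewBacked {
public:
    ViewMempool(const View* base, const std::map<TxId, Coins>* pool)
        : ViewBacked(base), pool_(pool) {}
    bool GetCoins(const TxId& txid, Coins& coins) const override {
        std::map<TxId, Coins>::const_iterator it = pool_->find(txid);
        if (it != pool_->end()) { coins = it->second; return true; }
        return ViewBacked::GetCoins(txid, coins);
    }
private:
    const std::map<TxId, Coins>* pool_;
};
```

Callers hold only a View pointer or reference, so the same lookup code works whether the coins come from the database, a cache, or the mempool.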

Those are the defined classes; whereas the object diagram is:

              Database
             /        \
      MemPool          Blockchain cache (pcoinsTip)
     View/Cache                  \
                            Validation cache

Here is a table summarizing the instantiations of Views:

{| class="wikitable"
|-
! Object !! Type !! Backed By? !! Description / Purpose
|-
| DB view || ViewDB || n/a || Represents the UTXO set according to the /chainstate LevelDB. Retrieves coins and flushes changes to the LevelDB. Creation in code (instantiation): see init.cpp:1131
|-
| pCoinsTip (blockchain cache) || ViewCache || DB view || Holds the UTXO set corresponding to the active chain's tip. Retrieves/flushes to the database view. Creation in code: see init.cpp:1133
|-
| Validation cache || ViewCache || pCoinsTip || This cache's lifetime is within ConnectTip (or DisconnectTip). Its purpose is to keep track of modifications to the UTXO set while processing a block. If the block validates, the cache is flushed to pcoinsTip. If the block fails, the cache is discarded. Creation in code: see main.cpp:2231: CCoinsViewCache view(pcoinsTip);
|-
| Mempool view || ViewMemPool || pCoinsTip || This object brings the mempool into view, meaning it can see both a UTXO set and the mempool. Its purpose is to enable validation of chains of transactions, a.k.a. "zero-confirmation" transactions. (If chains of transactions weren't permitted, the mempool could simply validate against pcoinsTip.) Thus, when queried, it can check if a given input can be found either in the mempool (i.e., "zero-conf") or in the blockchain's utxo set ("confirmed.") Note that this object is not a cache; rather, it is a view that is used by the object below, which does contain a cache. Creation in code: Its lifetime is that of AcceptToMemoryPool in main.cpp.
|-
| Mempool cache || ViewCache || Mempool view || The cache for the mempool. It contains a cache and sets its backend to be the mempool view. Creation in code: Its lifetime is also that of AcceptToMemoryPool in main.cpp.
|}

''Loading the UTXO set''

Access to the coins database is initialized in init.cpp: 1131-1133:

 pcoinsdbview = new CCoinsViewDB(nCoinDBCache, false, fReindex);
 pcoinscatcher = new CCoinsViewErrorCatcher(pcoinsdbview);
 pcoinsTip = new CCoinsViewCache(pcoinscatcher);

The code starts by initializing a CoinsViewDB, which is equipped with methods to load coins from the LevelDB. 
The error catcher is a little hack that can be ignored. 
Next, the code initializes pcoinsTip, which is the cache representing the state of the active chain, and is backed by the database view.

''Cache vs. Database''

The FetchCoins function in coins.cpp demonstrates how the code uses the cache vs. the database:
 1 CCoinsMap::iterator it = cacheCoins.find(txid);
 2 if (it != cacheCoins.end())
 3     return it;
 4 CCoins tmp;
 5 if (!base->GetCoins(txid, tmp))
 6     return cacheCoins.end();
 7 CCoinsMap::iterator ret = cacheCoins.insert(std::make_pair(txid, CCoinsCacheEntry())).first;

First, the code searches the cache for the coins for a given transaction id. (line 1) 
If found, it returns the "fetched" coins. (lines 2-3) 
If not, it searches the database. (line 5) 
If found in the database, it updates the cache. (line 7) 
 
Note: if the cache's backend is another cache, then the term "database" really means "parent cache."
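
The same read-through pattern can be shown as a self-contained sketch. The Backend and Cache structs are hypothetical, an int stands in for the coins, and the lookups counter exists only to make the caching effect visible.

```cpp
#include <map>
#include <string>
#include <utility>

typedef std::map<std::string, int> CoinsMap;  // txid -> placeholder coins value

// Hypothetical backend: either a database view or a parent cache.
struct Backend {
    CoinsMap data;
    int lookups;  // counts how often the backend is consulted
    Backend() : lookups(0) {}
    bool GetCoins(const std::string& txid, int& coins) {
        ++lookups;
        CoinsMap::const_iterator it = data.find(txid);
        if (it == data.end()) return false;
        coins = it->second;
        return true;
    }
};

struct Cache {
    Backend* base;
    CoinsMap cacheCoins;
    explicit Cache(Backend* b) : base(b) {}

    // Cache first, then the backend; a backend hit is copied into the cache.
    CoinsMap::iterator FetchCoins(const std::string& txid) {
        CoinsMap::iterator it = cacheCoins.find(txid);
        if (it != cacheCoins.end()) return it;
        int tmp = 0;
        if (!base->GetCoins(txid, tmp)) return cacheCoins.end();
        return cacheCoins.insert(std::make_pair(txid, tmp)).first;
    }
};
```

Fetching the same txid twice consults the backend only once; the second call is answered from cacheCoins.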
 
 

''Flushing the Validation Cache to the Blockchain Cache''

The validation cache is flushed to the blockchain cache after connecting a block, just before it goes out of scope. The scope is captured in ConnectTip, and specifically, in the code block main.cpp:2231-2243. In that code block, there is a call to ConnectBlock, during which the code stores the new coins in the validation cache. (Specifically, see UpdateCoins() in main.cpp.) At the end of the code block, the validation cache is flushed. Since its "parent view" is also a cache (pcoinsTip, the "blockchain cache") the code will call the parent's ViewCache::BatchWrite, which swaps the updated coin entries into its own cache. (Polymorphism in action: Later, when the blockchain cache flushes to the database view, the code will run CoinsViewDB::BatchWrite, the last line of which writes to the LevelDB.)

In summary, usage of the validation cache is straightforward: it is instantiated, used, flushed, and goes out of scope in the aforementioned code block.
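
That lifecycle can be sketched as follows. CoinsCache and ConnectBlockSketch are hypothetical simplifications: an int stands in for the coins, and a boolean stands in for the outcome of block validation.

```cpp
#include <map>
#include <string>

typedef std::map<std::string, int> CoinsMap;

// Hypothetical cache layer: pending changes live in entries_ until flushed.
class CoinsCache {
public:
    explicit CoinsCache(CoinsCache* parent = nullptr) : parent_(parent) {}
    void Set(const std::string& txid, int coins) { entries_[txid] = coins; }
    bool Has(const std::string& txid) const {
        if (entries_.count(txid)) return true;
        return parent_ && parent_->Has(txid);
    }
    // Hand all pending entries to the parent, then clear this cache.
    bool Flush() {
        if (!parent_) return false;
        parent_->BatchWrite(entries_);
        entries_.clear();
        return true;
    }
private:
    void BatchWrite(const CoinsMap& changes) {
        for (CoinsMap::const_iterator it = changes.begin(); it != changes.end(); ++it)
            entries_[it->first] = it->second;
    }
    CoinsMap entries_;
    CoinsCache* parent_;
};

// Apply a block's new coins in a scratch cache; commit only if the block is valid.
bool ConnectBlockSketch(CoinsCache& tip, const CoinsMap& new_coins, bool block_valid) {
    CoinsCache view(&tip);  // lives only for the duration of this call
    for (CoinsMap::const_iterator it = new_coins.begin(); it != new_coins.end(); ++it)
        view.Set(it->first, it->second);
    if (!block_valid) return false;  // view is discarded; tip is untouched
    return view.Flush();
}
```

A failed block leaves the parent cache exactly as it was, because the scratch cache is destroyed without ever being flushed.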

''Flushing the Blockchain Cache to the Database''

Flushing the validation cache was simple because the code only shuffled items between two caches in memory (of which no one is aware outside of the caching code.) Flushing the blockchain cache to the database is a bit more complicated. At the lowest level, the mechanics of flushing the blockchain cache (pcoinsTip) are the same as for the validation cache: the Flush() method calls BatchWrite on its backend (the "base" pointer), and in this case that means BatchWrite on the database view. Up a level, Flush() is called from FlushStateToDisk (FSTD) - main.cpp:2098. FlushStateToDisk is invoked at a few different points, with a given ''mode'':

{| class="wikitable"
|-
! Flush Mode !! Description !! When called
|-
| IF_NEEDED || Flush only if the cache is over its size limit. || Right after connecting (or disconnecting) a block and flushing the validation cache. See ConnectTip / DisconnectTip.
|-
| ALWAYS || Flush cache. || During initialization only.
|-
| PERIODIC || Here, the code considers other data points to decide whether to flush. Is the code ''almost'' over its size limit? Has it been a long time since the cache was flushed? If so, then proceed.|| At end of ActivateBestChain() (Code comment: "write changes periodically to disk, after relay").
|}

The idea is to flush the block cache frequently (to avoid having to download a large number of blocks if the program crashes), but the coins cache infrequently (in order to maximize the benefit from the coins cache.)

Specifically, the block cache is guaranteed to be flushed once an hour, whereas the coins cache once per day. (See here: [https://github.com/bitcoin/bitcoin/pull/6102#issuecomment-98847663 Sipa comment on PR 6102])
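
As a rough sketch of how the three flush modes could translate into a decision for the coins cache: the function below is illustrative only. The names CacheState and ShouldFlushCoins are hypothetical, and treating "almost over the limit" as "above 90% of the limit" is an assumption made for the example, not a value taken from main.cpp.

```cpp
#include <cassert>
#include <cstdint>

enum class FlushMode { IF_NEEDED, PERIODIC, ALWAYS };

// Hypothetical inputs to the decision; all field names are illustrative.
struct CacheState {
    std::uint64_t cacheBytes;        // current size of the coins cache
    std::uint64_t limitBytes;        // configured size limit
    std::int64_t secondsSinceFlush;  // time since the coins cache was last flushed
};

// Once per day, as stated in the text above.
const std::int64_t COINS_FLUSH_INTERVAL = 24 * 60 * 60;

bool ShouldFlushCoins(FlushMode mode, const CacheState& s) {
    const bool overLimit = s.cacheBytes > s.limitBytes;
    // Assumption: "almost over" means above 90% of the limit.
    const bool nearLimit = s.cacheBytes > s.limitBytes / 10 * 9;
    const bool stale = s.secondsSinceFlush > COINS_FLUSH_INTERVAL;
    switch (mode) {
    case FlushMode::ALWAYS:    return true;
    case FlushMode::IF_NEEDED: return overLimit;
    case FlushMode::PERIODIC:  return overLimit || nearLimit || stale;
    }
    return false;
}
```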

The FlushStateToDisk code is well-commented so for more info, the curious reader can check main.cpp.

==Raw undo data (rev*.dat)==

The undo data contains the information that is necessary to disconnect or "roll back" a block: specifically, the coins that were spent by the block in question.

So, the data being written is essentially a set of CTxOut objects. (A CTxOut is simply an amount and a script - see primitives/transaction.h:107-108).

The matter is complicated slightly by the fact that if the coin is the last one being spent by its transaction, the undo data needs to store the transaction's metadata (the txn's block height, whether it's a coinbase, and its version). So, if you have a transaction T with outputs O1, O2, O3 spent in that order, for O1 and O2 all that will be written to the undo file is the amount and the script. For O3, the undo file will have the amount, the script, plus T's height and version, and whether T is a coinbase.
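
The rule can be illustrated with a small sketch. The types SpentTx and TxOutUndo and the function SpendOutput are hypothetical simplifications for this page and do not match the real serialization format.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical view of a transaction whose outputs are being spent.
struct SpentTx {
    int height;
    bool isCoinbase;
    int version;
    int unspentOutputs; // outputs of this transaction not yet spent
};

// Hypothetical undo record for one spent output.
struct TxOutUndo {
    std::int64_t amount;
    std::string script;
    bool hasMetadata; // true only when this was the transaction's last unspent output
    int height;
    bool isCoinbase;
    int version;
};

// Spends one output of 'tx' and returns the undo record that would be stored.
TxOutUndo SpendOutput(SpentTx& tx, std::int64_t amount, const std::string& script) {
    TxOutUndo undo;
    undo.amount = amount;
    undo.script = script;
    undo.hasMetadata = false;
    undo.height = 0;
    undo.isCoinbase = false;
    undo.version = 0;
    --tx.unspentOutputs;
    if (tx.unspentOutputs == 0) {
        // Last output spent: the transaction's metadata must be kept,
        // otherwise it could not be restored when the block is disconnected.
        undo.hasMetadata = true;
        undo.height = tx.height;
        undo.isCoinbase = tx.isCoinbase;
        undo.version = tx.version;
    }
    return undo;
}
```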

The undo data is written to the raw file with the following code:
fileout << blockundo; (main.cpp:1567 [UndoWriteToDisk])

This line of code calls the serialization function on the CBlockUndo - which is basically just a vector of coins (CTxOuts.) Finally, a checksum is written to the undo file. The checksum is used during initialization to verify that any undo data being checked is intact. See [https://github.com/bitcoin/bitcoin/pull/2145 Pull 2145]

The undo data is used when disconnecting a block. The DisconnectBlock() code is discussed further down this wiki page in The Blockchain: Reorganizations.

==Use of LevelDB==

LevelDB is a key-value store that was introduced to store the block index and UTXO set (chainstate) in 2012 as part of the complex "Ultraprune" pull (PR 1677). See here: [https://github.com/bitcoin/bitcoin/pull/1677/commits the 27 commits on Ultraprune].

On the subject of why LevelDB is used, core developer Greg Maxwell stated the following to the [http://bitcoin-development.narkive.com/XAPoxKZU/patch-switching-bitcoin-core-to-sqlite-db bitcoin-dev mailing list in October 2015]:

:: I think people are falling into a trap of thinking "It's a <database>, I know a <black box> for that!"; but the application and needs are very specialized here. . . It just so happens that on the back of the very bitcoin specific cryptographic consensus algorithim there was a slot where a pre-existing high performance key-value store fit; and so we're using one and saving ourselves some effort...

One might ask whether different nodes could use different databases - as long as they retrieve the same data, what's the difference? The issue here is "bug-for-bug compatibility" - if one database has a bug that causes records to not be returned under certain circumstances, then all other nodes must have the same bug, else the network could fork as a result.

Greg Maxwell stated the following in [http://bitcoin-development.narkive.com/XAPoxKZU/patch-switching-bitcoin-core-to-sqlite-db the same thread referenced above (in response to a proposal to switch to using sqlite)]:

:: ...[D]atabases sometimes have errors which cause them to fail to return records, or to return stale data. And if those exist consistency must be maintained; and "fixing" the bug can cause a divergence in consensus state that could open users up to theft.

:: Case in point, prior to leveldb's use in Bitcoin Core it had a bug that, under rare conditions, could cause it to consistently return not found on records that were really there. . . Leveldb fixed this serious bug in a minor update. But deploying a fix like this in an uncontrolled manner in the bitcoin network would potentially cause a fork in the consensus state; so any such fix would need to be rolled out in an orderly manner.

==See also==
[https://en.bitcoin.it/wiki/Bitcoin_Core_0.11_(ch_1):_Overview Bitcoin Core 0.11 (Ch 1): Overview]
 
[https://en.bitcoin.it/wiki/Bitcoin_Core_0.11_(ch_3):_Initialization_and_Startup Bitcoin Core 0.11 (Ch 3): Initialization and Startup]
 
[https://en.bitcoin.it/wiki/Bitcoin_Core_0.11_(ch_4):_P2P_Network Bitcoin Core 0.11 (Ch 4): P2P Network]
 
[https://en.bitcoin.it/wiki/Bitcoin_Core_0.11_(ch_5):_Initial_Block_Download Bitcoin Core 0.11 (Ch 5): Initial Block Download]
 
[https://en.bitcoin.it/wiki/Bitcoin_Core_0.11_(ch_6):_The_Blockchain Bitcoin Core 0.11 (Ch 6): The Blockchain]

[[Category:Technical]]
[[Category:Developer]]

Bitcoin Core 0.11 (ch 4): P2P Network

2016-01-21T20:29:18Z

Mrbandrews: edit footer

Bitcoin is a peer-to-peer network, so Bitcoin Core has code to discover peers and manage those connections.

Most of the network-handling code is in net.h/cpp.

==Data structures to manage peers==

At any given time, the node is connected with a set of other nodes, i.e. peers. By default the code connects to 8 outbound peers (nodes that our node goes out and finds) and allows up to 125 inbound peers (nodes that find us through the network).

The global variable vNodes (net.h) holds the set of peers. The variable is protected by cs_vNodes.

Each peer is represented by a CNode.

The CNode contains dozens of attributes, most of which have to do with low-level plumbing (sockets, byte streams, etc.)

Some of the key attributes of a CNode include:

'''CNode'''

{| class="wikitable"
|-
! Attribute !! Description
|-
| nServices || Commonly referred to as "the service bits." This is a bitmap of what services the peer provides. As of 0.11 this is still just binary: either the node is NODE_NETWORK (it is a full node and does everything) or not (it is SPV). In future versions, these bits will convey more precise information about what the node can and can't do. For example, with Block Pruning a node may be able to serve recent blocks (say, the last week or two worth of blocks), but not the entire blockchain.
|-
| fClient || Whether this peer is a SPV node. (Here, the term "client" means "merely SPV, not a full node.")
|-
| fOneShot ||
|-
| fInbound || Whether this node is "inbound" or "outbound." Common sense suggests that a node that we discovered through the network (an outbound node) is less likely to attack us than a node that found us. So, for example, when the code looks for a peer to request historical blocks from, it prefers an outbound node, if possible.
|-
| fWhiteListed || A whitelisted node is not subject to being banned for bad behaviour.
|-
| setInventoryKnown ||
|-
| vSendMsg || Messages that we've queued up to send to the peer. This is type <CSerializeData> since we are going to send it over the network.
|-
| vRecvMsg || Messages that we've received from the peer. This is type <CNetMessage> because as soon as the data is received, it is deserialized and packed into a more useful format (an object).
|}

==Peer discovery & connectivity==

===Address manager===

IP addresses of the node's peers are managed by the address manager (see addrman.h).

The code comment explains the address manager (edited here for conciseness):

Design goals:
* Keep the address tables in-memory, and asynchronously dump the entire table to peers.dat.
* Make sure no (localized) attacker can fill the entire table with his nodes/addresses. To that end:
* Addresses are organized into buckets.
* Addresses that have not yet been tried go into 1024 "new" buckets.
* Addresses of nodes that are known to be accessible go into 256 "tried" buckets.
* Bucket selection is based on cryptographic hashing, using a randomly-generated 256-bit key, which should not be observable by adversaries.
* Several indexes are kept for high performance. Defining DEBUG_ADDRMAN will introduce frequent (and expensive) consistency checks for the entire data structure.
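
The idea of keyed bucket selection can be sketched as follows. This is only an illustration: SelectBucket is a hypothetical helper, and std::hash is used as a stand-in for the keyed cryptographic hash mentioned in the design goals (std::hash is not cryptographic and would not be suitable in practice).

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

const std::size_t NEW_BUCKET_COUNT = 1024;  // buckets for untried addresses
const std::size_t TRIED_BUCKET_COUNT = 256; // buckets for addresses known to be reachable

// Maps an address to a bucket using a secret key, so that someone who does
// not know the key cannot predict which bucket an address will land in.
std::size_t SelectBucket(const std::string& secretKey,
                         const std::string& address,
                         std::size_t bucketCount) {
    // Stand-in only: a real implementation would use a cryptographic hash.
    return std::hash<std::string>()(secretKey + "|" + address) % bucketCount;
}
```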

''Timestamps''

The address manager also keeps track of when each peer was last heard from. Timestamps are only updated on an address and saved to the database when the timestamp is over 20 minutes old. By understanding the role of timestamps, it will become more clear why timestamps are kept the way they are for each of the different ways an address is discovered.

===Peer discovery===

The program discovers the IP address and port of nodes in several different ways:
* Address database (peers.dat)
* User-specified (-addnode and -connect)
* DNS seeding
* Hard-coded seeds
* From other peers ("getaddr" and "addr" messages)

'''1) Address database (peers.dat)'''

Nodes store addresses in a database (peers.dat) which is read on startup, and loaded into the address manager.

This method does not work the first time the program is run, since it does not already know about any other nodes on the bitcoin network.

For details on when/how the code stores to the database, see the section above on the address manager.

'''2) User-specified on the command line'''

The user can specify nodes to connect to on the command line with -addnode <ip> or -connect <ip>.

Notes on user-specified IP addresses:
* Multiple nodes may be specified.
* Addresses are initially given a zero timestamp, therefore they are not advertised in response to a "getaddr" request.
* With -connect, the IP addresses will not be added to peers.dat and only the provided addresses will be used.
* With -addnode, the provided addresses will be used as a starting point, but the node will soon learn other peers.

'''3) DNS seeding'''

DNS seeding is only used if the peers.dat database is empty (as it would be when initially running the program) and the user has not specified any nodes with -addnode or -connect.

In this case, the node can issue DNS requests to discover IP addresses of other peers.

As of 0.11, there are six DNS servers hard-coded into the program - see chainparams.cpp.

A DNS reply can contain multiple IP addresses for a requested name.

Addresses discovered via DNS are initially given a zero timestamp, to avoid being advertised in response to a "getaddr" request.

'''4) Hard-coded nodes (last resort)'''

If DNS seeding fails, the client contains hard coded IP addresses that represent bitcoin nodes. See: chainparamsseeds.h

These addresses are only used as a last resort, and a log message will be printed: "Adding fixed seed nodes as DNS doesn't seem to be available." (net.cpp)

The idea is to move away from seed nodes as soon as possible, to avoid overloading those nodes.

Once the local node has enough addresses (presumably learned from the seed nodes), the connection thread will close any seed node connections.

Like the DNS seed addresses, the hard-coded seed addresses are also given a zero timestamp to avoid being advertised in response to a "getaddr" request.

'''5) From other nodes ("getaddr" and "addr" messages)'''

Nodes exchange IP addresses with other nodes via the "getaddr" and "addr" messages. 

Usually, an "addr" message is sent in response to a "getaddr".

However, the "addr" message may also arrive unsolicited, because nodes advertise addresses gratuitously when they:
* Relay addresses (see below)
* Advertise their own address periodically. (Every 24 hours, the node advertises its own address to all connected nodes.)
* When a connection is made (in response to an initial "version" message)

When does a node send a "getaddr" message?
* In response to a "version" message from an outbound node, if we don't yet have 1000 addresses.

Receiving an "addr" message:
* If the sending node is an old version and we have 1000 addresses already, it is ignored.
* If the sending node is a current version and is attempting to send us more than 1000 addresses, the peer is punished for misbehaving.
* If the address has been seen in the last 24 hours and the timestamp is currently over 60 minutes old, then it is updated to 60 minutes ago.
* If the address has NOT been seen in the last 24 hours, and the timestamp is currently over 24 hours old, then it is updated to 24 hours ago.

Responding to a "getaddr" message:
* The node figures out how many addresses it has that have a timestamp in the last 3 hours.
* It sends those addresses, but if there are more than 2500 addresses, it randomly selects 2500.
* It clears the list of the addresses we think the remote node has, which will trigger a refresh of sends to nodes. See SendMessages.
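
The selection step of the "getaddr" response can be sketched like this. AddrEntry and BuildGetAddrReply are hypothetical names, and the sketch covers only the filtering and random sampling described above, not the real address manager.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <string>
#include <vector>

// Hypothetical address record: an address and the time it was last heard from.
struct AddrEntry {
    std::string ip;
    std::int64_t lastSeen;
};

// Picks the addresses to send in reply to "getaddr": only entries seen in the
// last 3 hours, and at most 2500 of them, chosen at random.
std::vector<AddrEntry> BuildGetAddrReply(const std::vector<AddrEntry>& known,
                                         std::int64_t now,
                                         std::mt19937& rng) {
    const std::int64_t maxAge = 3 * 60 * 60;
    const std::size_t maxReply = 2500;
    std::vector<AddrEntry> recent;
    for (const AddrEntry& a : known) {
        if (now - a.lastSeen <= maxAge) recent.push_back(a);
    }
    if (recent.size() > maxReply) {
        std::shuffle(recent.begin(), recent.end(), rng);
        recent.erase(recent.begin() + maxReply, recent.end());
    }
    return recent;
}
```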

''Address Relay''

Once added, the newly received IP addresses may be relayed to other nodes if the following conditions are met:
* The address timestamp is recent (within 10 minutes of the current time)
* The "addr" message contained 10 addresses or fewer
* fGetAddr=false for the sending peer. (See the code for details.)
* The address must be routable.
* Code:
if (addr.nTime > nSince && !pfrom->fGetAddr && vAddr.size() <= 10 && addr.IsRoutable())
* If this test is passed, then the code's next step is (see main.cpp for details):
// Use deterministic randomness to send to the same nodes for 24 hours at a time so the addrKnowns of the chosen nodes prevent repeats.

===Peer connectivity===

The connection thread (ThreadOpenConnections) chooses among the available addresses and makes connections, and disconnects nodes when appropriate.

'''Use of CSemaphore for outbound connections'''

The code uses a semaphore to manage the number of outbound connections (usually 8).

Most of the code dealing with the semaphore is in net.cpp.

When a connection is opened, the semaphore grant is passed to the CNode data structure. This allows the socket thread to release the semaphore when the time comes, with:
pnode->grantOutbound.Release() // see net.cpp

The code CSemaphoreGrant grant(*semOutbound) will wait until there is an outbound connection slot available.
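
A minimal sketch of the pattern, assuming a plain counting semaphore and an RAII "grant" object that gives the slot back when released or destroyed. The class names Semaphore and SemaphoreGrant are hypothetical simplifications and are not the classes in the Bitcoin Core source.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Simple counting semaphore: 'value_' is the number of free slots.
class Semaphore {
    std::mutex mutex_;
    std::condition_variable cond_;
    int value_;
public:
    explicit Semaphore(int initial) : value_(initial) {}
    void wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        cond_.wait(lock, [this] { return value_ > 0; });
        --value_;
    }
    bool try_wait() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (value_ <= 0) return false;
        --value_;
        return true;
    }
    void post() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            ++value_;
        }
        cond_.notify_one();
    }
};

// Holds one slot of the semaphore; the slot is returned on Release()
// or when the grant goes out of scope.
class SemaphoreGrant {
    Semaphore* sem_;
    bool haveGrant_;
public:
    explicit SemaphoreGrant(Semaphore& sem, bool tryOnly = false)
        : sem_(&sem), haveGrant_(false) {
        if (tryOnly) {
            haveGrant_ = sem_->try_wait();
        } else {
            sem_->wait(); // blocks until a slot is free
            haveGrant_ = true;
        }
    }
    SemaphoreGrant(const SemaphoreGrant&) = delete;
    SemaphoreGrant& operator=(const SemaphoreGrant&) = delete;
    ~SemaphoreGrant() { Release(); }
    void Release() {
        if (haveGrant_) {
            sem_->post();
            haveGrant_ = false;
        }
    }
    bool HasGrant() const { return haveGrant_; }
};
```

With a semaphore initialized to 8, at most eight grants can be held at once, which is how a cap on outbound connections could be enforced.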

'''Inbound connections: accepting and disconnecting'''

Inbound connections can be up to 125 total.

ThreadSocketHandler has the code that accepts inbound connections.

The socket thread loop:
* 1) Disconnects sockets that have the fDisconnect flag set on them (and have empty buffers)
* 2) Prepares all sockets for "select"
* 3) Calls "select", which is a system call which waits for activity on a set of sockets.

When the select() call returns, the node accepts any new connections, receives and sends on any ready sockets, and marks any inactive sockets to be disconnected (whether inbound or outbound).

Sockets are disconnected if:
* they are 60 seconds old and have not sent or received data.
* they have not sent or received data in the last 20 minutes (TIMEOUT_INTERVAL = 20*60) (or 90 minutes if peer is an old version)
* the socket overfills the buffer (see CNode::ReceiveMsgBytes- "Oversized message from peer=%i, disconnecting\n" in net.cpp)
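
The two time-based rules above can be sketched as a single predicate. This is a simplification under stated assumptions: PeerSocket and IsInactive are hypothetical names, and sends and receives are folded into one lastActivity timestamp, whereas the real code tracks them separately.

```cpp
#include <cassert>
#include <cstdint>

const std::int64_t TIMEOUT_INTERVAL = 20 * 60; // seconds, as quoted above

// Hypothetical per-socket bookkeeping (times in seconds).
struct PeerSocket {
    std::int64_t connectedAt;
    std::int64_t lastActivity; // 0 if no data has been sent or received yet
    bool oldVersion;           // old peers get the longer 90-minute limit
};

// Returns true if the socket should be marked for disconnection.
bool IsInactive(const PeerSocket& s, std::int64_t now) {
    if (s.lastActivity == 0) {
        // No traffic at all: allow a 60-second grace period after connecting.
        return now - s.connectedAt > 60;
    }
    const std::int64_t limit = s.oldVersion ? 90 * 60 : TIMEOUT_INTERVAL;
    return now - s.lastActivity > limit;
}
```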

==Sockets & Messages==

'''Socket Thread'''

The socket thread operates at the TCP layer.

It goes through an endless loop, reading and writing the sockets. (see net.cpp).

Its loop involves three basic activities:

1) Administrative work: disconnecting unused sockets, checking which sockets have data, adding sockets for new connections.

2) Receiving data: It reads the sockets that have data using the recv() system call and places that data into the peer's queue of CNetMessages. A CNetMessage organizes the data into two data streams - the message header and the message data (vRecv). The socket thread reads the buffer until it has processed all the messages from this particular peer.

3) Sending data: The message thread queued up messages-to-be-sent as vSendMsg objects, so the socket thread deserializes these objects and sends them using send(). (send() is a syscall and any incompatibilities across different operating systems are handled in the compat.h file.)

'''Message Thread'''

This is the program's main thread.

It operates primarily at the "business logic" level - validating transactions, managing the blockchain, etc.

In a sense, all of the node's activities take the form of processing an inbound message or preparing an outbound message.

Like the socket thread, this thread consists of a while(true) loop, processing inbound messages and queuing up outbound messages. Once the program's initialization is complete, this loop (see net.cpp:ThreadMessageHandler) is the program's high-level point of control.

The loop uses signals to notify main.cpp that there are messages waiting to be processed. The signal is picked up by ProcessMessages().

The use of signals has nothing to do with multi-threading; the signal is sent and picked up in the same thread. The use of signals was introduced in version 0.9 for the purpose of decoupling net.cpp from main.cpp. In version 0.8, the loop simply called the ProcessMessages() function. By changing to signals, the net.cpp code no longer needs to be aware of the processing code. Removing that dependency allows the code to avoid circular includes (since main.cpp requires knowledge of net.h).

The pull request introducing signals is [https://github.com/bitcoin/bitcoin/pull/2154 PR 2154.]

The commit removing the "main.h" dependency is [https://github.com/CodeShark/bitcoin/commit/6e68524e95da2bedc21b1d95c4a206b902ab7c22 here].

'''ProcessMessages (main.cpp)'''

ProcessMessages() is the entry point in main.cpp for almost all of the code that processes and validates transactions and blocks, etc.

It attempts to find a message start signature in the vRecv stream. If it finds a message start, it deletes everything prior to the start. Then it reads the header, extracts the message type, and calls ProcessMessage on the message.

ProcessMessage() is basically a large "switch" which takes action based on what type of message it is dealing with.

Often, in the course of processing a message, the code will push messages to the outbound queue. For example, when processing an incoming "getdata" message, the node pushes the outbound data into the queue.
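
The "find the message start and drop everything before it" step can be sketched as below. SkipToMessageStart is a hypothetical helper that works on a plain byte vector rather than the real stream class; the four bytes used are the mainnet message start value.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Mainnet message start ("magic") bytes.
const unsigned char MESSAGE_START[4] = {0xf9, 0xbe, 0xb4, 0xd9};

// Searches the receive buffer for the message start signature.
// If found, everything before it is deleted and true is returned;
// otherwise the buffer is left untouched and false is returned.
bool SkipToMessageStart(std::vector<unsigned char>& buffer) {
    std::vector<unsigned char>::iterator it =
        std::search(buffer.begin(), buffer.end(), MESSAGE_START, MESSAGE_START + 4);
    if (it == buffer.end()) return false;
    buffer.erase(buffer.begin(), it);
    return true;
}
```

After this step the header would be read from the front of the buffer and the message type dispatched, as described above.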

'''SendMessages (main.cpp)'''

SendMessages() creates messages and queues them up in the peer's vSendMsg queue (a double-ended queue, or "deque" in C++). The vSendMsg objects are basically just serialized data.

SendMessages goes through various data structures looking for work to do. When it produces a message it calls the CNode->PushMessage, which queues the outbound data. (Note that there are many other places in the code that produce messages and call CNode->PushMessage; SendMessages() doesn't have any kind of exclusive license on placing messages in the outbound queue.)

Once the data is queued up by PushMessage, it sits and waits for the socket thread to come along.

The socket thread and the message thread use a peer-specific lock (node->cs_vSend) to coordinate access to the socket.

==Locks==

The main locks associated with the P2P aspect of the code are:
* cs_vNodes controls access to the CNode objects.
* cs_vSend controls access to the node's send buffer.
* cs_vRecvMsg controls access to the node's receiving buffer.
* cs_inventory

==Denial-of-Service Prevention==

DoS prevention is implemented by keeping track of misbehaving peers, and if they misbehave, banning them.

The DoS prevention framework was introduced in 2011, in Pull 517.

As summarized there:

<blockquote>
-----
The big idea: if a peer is sending you obviously wrong information, punish it by maybe dropping your connection to it, and ban it's IP address so it cannot immediately re-connect.
 The probability of dropping the connection, and the length of the ban, depend on how wrong, and how potentially wasteful/damaging, the peer is. So sending an extra 'version' message is a minor transgression that is usually tolerated, sending an more than MAX_BLOCK_SIZE block is a major transgression.
 Detailed how-it-works, using "I got a version message I wasn't expecting" as the specific example:
 Getting a version message from a peer increases that peer's 'misbehaving' score by 10, and (assuming that is the peer's first bad behavior) gives it a 10% of being disconnected. If it is disconnected, then that peer's IP address is banned from connecting for a couple of hours. If it is not disconnected, then nothing happens unless the peer misbehaves again; if it does, then its chances of being disconnected go up, and the length of time it will be banned increases.
 Misbehavior/ban information is stored only in memory, and information about misbehaving peers is never broadcast. Also, peers that are disconnected/banned are just dropped, there is no warning or reason sent.
 
 
--Gavin Andresen
 
-----
</blockquote>

Source: https://github.com/bitcoin/bitcoin/pull/517

'''Banned nodes'''

The set of banned nodes is in setBanned in net.cpp.

By default, a node is banned for 24 hours, though this can be configured with the -bantime option.
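
A rough sketch of score-based banning follows. It is a simplification under stated assumptions: BanTracker is a hypothetical class, the sketch bans once an accumulated score reaches a fixed threshold (rather than the probabilistic disconnect described in the quote above), and the threshold of 100 is chosen for illustration. Only the 24-hour default ban time comes from the text.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

class BanTracker {
    std::map<std::string, int> scores_;               // misbehavior score per IP
    std::map<std::string, std::int64_t> bannedUntil_; // ban expiry per IP
    int threshold_;
    std::int64_t banSeconds_;
public:
    explicit BanTracker(int threshold = 100, std::int64_t banSeconds = 24 * 60 * 60)
        : threshold_(threshold), banSeconds_(banSeconds) {}

    // Adds 'howMuch' to the peer's score; returns true if the peer is now banned.
    bool Misbehaving(const std::string& ip, int howMuch, std::int64_t now) {
        int& score = scores_[ip];
        score += howMuch;
        if (score < threshold_) return false;
        bannedUntil_[ip] = now + banSeconds_;
        return true;
    }

    bool IsBanned(const std::string& ip, std::int64_t now) const {
        std::map<std::string, std::int64_t>::const_iterator it = bannedUntil_.find(ip);
        return it != bannedUntil_.end() && now < it->second;
    }
};
```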

==See also==
[https://en.bitcoin.it/wiki/Bitcoin_Core_0.11_(ch_1):_Overview Bitcoin Core 0.11 (Ch 1): Overview]
 
[https://en.bitcoin.it/wiki/Bitcoin_Core_0.11_(ch_2):_Data_Storage Bitcoin Core 0.11 (Ch 2): Data Storage]
 
[https://en.bitcoin.it/wiki/Bitcoin_Core_0.11_(ch_3):_Initialization_and_Startup Bitcoin Core 0.11 (Ch 3): Initialization and Startup]
 
[https://en.bitcoin.it/wiki/Bitcoin_Core_0.11_(ch_5):_Initial_Block_Download Bitcoin Core 0.11 (Ch 5): Initial Block Download]
 
[https://en.bitcoin.it/wiki/Bitcoin_Core_0.11_(ch_6):_The_Blockchain Bitcoin Core 0.11 (Ch 6): The Blockchain]

[[Category:Technical]]
[[Category:Developer]]

Bitcoin Core 0.11 (ch 6): The Blockchain

2016-01-21T20:27:20Z

Mrbandrews: Created page with " This page describes the code that manages the blockchain. ==Block Index and Block Status== The block index database gets loaded into memory when the node starts. This..."

This page describes the code that manages the blockchain.

==Block Index and Block Status==

The block index database gets loaded into memory when the node starts. This means the entire block tree, not just the active chain. See LoadBlockIndexGuts() in src/txdb.cpp.

The block index (block metadata) is well commented in the code: see src/chain.h.

One of the key traits of a block is its "verification status."

Verification status captures the degree to which the code has validated this block, as well as its ancestor blocks.

The block's status is one of the following:

* VALID_HEADER = 1
* VALID_TREE = 2
* VALID_TRANSACTIONS = 3
* VALID_CHAIN = 4
* VALID_SCRIPTS = 5

The precise meaning of each status code is documented in chain.h.

Also stored in the block index are two variables worth mentioning:

: '''nTx:''' Number of transactions in this block.
: nTx > 0 means that the block has a status of at least VALID_TRANSACTIONS.

: '''nChainTx:''' Number of transactions in this block's chain, up to and including this block.
: This value will be set if and only if transactions for this block ''and all its parents'' are available.
: Thus, nChainTx > 0 is shorthand for a chain that is VALID_TRANSACTIONS. This is notable because this information is not available via the block-status enum. Namely, VALID_TRANSACTIONS only implies that its parents are TREE, while VALID_CHAIN implies that its parents are also CHAIN. In a sense, then, the expression (nChainTx !=0) is shorthand for a status that might be said to be "VALID_nChainTx = 3.5" - because it's more than VALID_TRANSACTIONS but less than VALID_CHAIN.
: Note: nChainTx is only stored in memory; there is no corresponding entry in the database.
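
The relationship between nTx and nChainTx can be sketched as follows. BlockIndex here is a hypothetical three-field struct and UpdateChainTx a hypothetical helper; the real CBlockIndex has many more members.

```cpp
#include <cassert>

// Hypothetical, stripped-down block index entry.
struct BlockIndex {
    BlockIndex* prev;      // parent block, or nullptr for the genesis block
    unsigned int nTx;      // 0 until the block's transactions have been received
    unsigned int nChainTx; // 0 until this block and all its parents have transactions
};

// Tries to set nChainTx for 'b'. Succeeds only if the block's own transactions
// are available and the parent's nChainTx is already known.
bool UpdateChainTx(BlockIndex& b) {
    if (b.nTx == 0) return false;
    if (b.prev == nullptr) {
        b.nChainTx = b.nTx;
        return true;
    }
    if (b.prev->nChainTx == 0) return false;
    b.nChainTx = b.prev->nChainTx + b.nTx;
    return true;
}
```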

==Key Variables==

First, a quick C++ reminder:
* map: an ordered <key,value> container, sorted by key (think of it as a balanced tree); its hash-table counterpart is unordered_map
* set: an ordered <key> container (think of it as a sorted linked list)
* multimap: an ordered map <key,value> where duplicate keys are allowed (thus, can hold elements <a,1>,<b,2>,<b,3>)

'''mapBlockIndex (map<block_hash, CBlockIndex*>)'''

This map contains all known blocks (where "block" means "block index"). Since a block index is created and stored in the LevelDB when a header is received, it's possible to have block indexes in the block map without having received the full block yet, let alone having stored it to disk.

mapBlockIndex is not sorted. Just think of it as your blocks/ LevelDB in memory, with the key being the block hash.

It is technically of type BlockMap, which is for readability. BlockMap is an unordered_map<block hash, block index*, hash_function>.

mapBlockIndex is initialized from the database in LoadBlockIndexGuts, which is run at Step 7 of startup. Thereafter, it's updated whenever new blocks are received over the network.

mapBlockIndex only grows; it never shrinks. (Try searching main.cpp for mapBlockIndex.erase.) Observe also that the block index's LevelDB wrapper does not contain functionality for erasing blocks from the database - its writing function (WriteBatchSync) only writes to the database. By comparison, the chainstate wrapper's writing function (BatchWrite) both writes and erases. (See txdb.cpp.)

'''mapBlocksUnlinked'''

Multimap containing "all pairs A->B, where A (or one of its ancestors) misses transactions, but B has transactions." (comment at main.cpp:125).

The purpose of mapBlocksUnlinked is to quickly attach blocks we've already received to the blockchain when we receive a missing, intermediate block.

The alternative would be to search the entire mapBlockIndex; however, it is more efficient to keep track of unlinked blocks in a separate data structure.

Example 1:
Let A be our tip.
We receive block ''headers'' for B, C.
We receive the full block for C.
mapBlocksUnlinked = <B,C>

Upon receiving block B, we can connect C.

Example 2:
Let A be our tip.
We receive block ''headers'' for B, C, D.
We receive the full block for D.
mapBlocksUnlinked = <B,C>, <B,D>, <C,D>

Upon receiving block B, we can connect B as our tip and delete its entries in mapBlocksUnlinked, which would now consist of only one item: <C,D>.
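
Example 2 can be reproduced with a small multimap sketch. Blocks are represented by their names, and TakeWaitingBlocks is a hypothetical helper written for this illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Pairs <A,B>: block B has its transactions but is waiting on block A.
typedef std::multimap<std::string, std::string> UnlinkedMap;

// Called when the full block 'arrived' has been received: returns the blocks
// that were waiting on it and deletes their entries from the map.
std::vector<std::string> TakeWaitingBlocks(UnlinkedMap& unlinked,
                                           const std::string& arrived) {
    std::vector<std::string> waiting;
    std::pair<UnlinkedMap::iterator, UnlinkedMap::iterator> range =
        unlinked.equal_range(arrived);
    for (UnlinkedMap::iterator it = range.first; it != range.second; ++it) {
        waiting.push_back(it->second);
    }
    unlinked.erase(range.first, range.second);
    return waiting;
}
```

The lookup by key is what makes this cheaper than scanning all of mapBlockIndex for blocks that depend on the one that just arrived.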

'''setBlockIndexCandidates'''

Set of block indexes that have more total work than our current tip. (In the normal case where the block extends our current tip, it is easy enough to see that it has more total work than our tip.) Thus, they are "candidates" for extending our current blockchain (or re-organizing from our current chain to the chain that the candidate is on.) We call them "candidates" because we verify the block's proof-of-work when we receive the header, but before we receive the block. Thus, the ''header'' is a candidate for extending our chain, but we can't say for sure until we receive the full block (and if the candidate is more than one block away from our current tip, we also need to receive and verify any intermediate blocks.)

Example 1: Let A be our tip; we then receive, in order, B, C, D, such that:
 A -- B
  \
   C -- D

We verify headers for B, C, D and they all look good. At this point setBlockIndexCandidates contains <B,C,D>. Assume B has more work than C but less work than D.

Now we receive the full block for B and it checks out. At this point, we extend chainActive with B as the new tip, and remove B from setBlockIndexCandidates. We also remove C because it has less work than B. D remains a candidate, because it has more work than B.

Now we receive C and D. C is valid, but D has a bad transaction (double-spend, invalid signature, etc.):
* Store C to disk and keep it in mapBlockIndex - it's a known block
* Discard D (do not store to disk) and delete it from mapBlockIndex - it's a bad block
* Remove D from setBlockIndexCandidates

Now our chainActive is Genesis - - - - - - - A--B, and setBlockIndexCandidates is <empty>.
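
The pruning step in the example (removing B and C once B becomes the tip, while D stays) can be sketched with a set ordered by work. Candidate, ByWork and PruneCandidates are hypothetical names, and chain work is reduced to a small integer for illustration.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical candidate entry: a block name and its total chain work.
struct Candidate {
    std::string name;
    int chainWork;
};

// Orders candidates by work, least work first; the name breaks ties.
struct ByWork {
    bool operator()(const Candidate& a, const Candidate& b) const {
        if (a.chainWork != b.chainWork) return a.chainWork < b.chainWork;
        return a.name < b.name;
    }
};

typedef std::set<Candidate, ByWork> CandidateSet;

// After 'tip' becomes the active tip, drop every candidate that does not have
// more work than it (including the tip itself, as in the example above).
void PruneCandidates(CandidateSet& candidates, const Candidate& tip) {
    CandidateSet::iterator it = candidates.begin();
    while (it != candidates.end() && it->chainWork <= tip.chainWork) {
        it = candidates.erase(it);
    }
}
```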

'''pindexBestHeader'''

This variable holds the best (most work) header that your node has validated.

It's set initially when loading the block index, and updated when adding a header to mapBlockIndex which has more work than the current pindexBestHeader.

A header is added to mapBlockIndex once it has passed the context-dependent and context-independent checks.

'''chainActive (vector<CBlockIndex*>)'''

chainActive is the holy blockchain. It is a linear set of blocks, consisting only of those blocks that comprise the longest chain, beginning with the Genesis block and culminating with the tip.

chainActive is a CChain, which is a vector of block indexes bundled with a few useful methods (see chain.h).

On startup, the chain is initialized in Step 10 (init.cpp), which calls ActivateBestChain (main.cpp).

'''Difference between pindexBestHeader and chainActive.Tip'''

It's important to note that pindexBestHeader and chainActive.Tip are NOT necessarily the same thing.

pindexBestHeader is a pointer to the most-work header our node has received, before having received its transactions. By comparison, chainActive.Tip is only set after we've received and validated the entire block.

==Connecting a block==

When the node receives and accepts a block that has more proof-of-work than the tip of its active blockchain, the node will attempt to make this block the new tip.

In "steady-state", where our node's active chain tip is a block B1, the typical course of events is that every 10 minutes or so, the node receives and accepts a new block B2 which was built on top of B1 (meaning B2's "prev_block" is B1's hash), notices that B2 has more work than B1, and connects B2 to B1, thereby extending the active chain.

ConnectBlock() is the key moment when a "double-spend" would be discovered. The block's transactions are validated against the current UTXO set (the coins database). Specifically, the code in ConnectBlock loops through each transaction, ensuring that all of the transaction's inputs can indeed be found in the coins database. (If a coin had been spent in a prior block, it would not be in the current UTXO set; it would have been in a different, earlier view of the UTXO set, but no longer.) The ConnectBlock code also contains some DoS protection code: for example, it ensures that the transactions do not contain an excessive number of signature operations (which are expensive).
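
The input check can be sketched as follows. This is a toy model under stated assumptions: an outpoint is a (txid, index) pair, the UTXO set is a std::set, coinbase transactions and scripts are ignored, and ConnectTransactions is a hypothetical function rather than the real ConnectBlock.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<std::string, int> OutPoint; // (txid, output index)

// Hypothetical, minimal transaction: what it spends and how many outputs it creates.
struct Tx {
    std::string txid;
    std::vector<OutPoint> inputs;
    int outputCount;
};

// Applies the transactions in order to a temporary copy of the UTXO set.
// If any input is missing (never existed or already spent), nothing is changed
// and false is returned; otherwise the UTXO set is updated and true is returned.
bool ConnectTransactions(std::set<OutPoint>& utxo, const std::vector<Tx>& txs) {
    std::set<OutPoint> view = utxo; // scratch copy, similar in spirit to a validation cache
    for (const Tx& tx : txs) {
        for (const OutPoint& in : tx.inputs) {
            if (view.erase(in) == 0) return false; // double-spend or unknown coin
        }
        for (int i = 0; i < tx.outputCount; ++i) {
            view.insert(std::make_pair(tx.txid, i));
        }
    }
    utxo.swap(view);
    return true;
}
```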

ConnectBlock() is the last step in ProcessNewBlock. Thus, it takes place only after the block has been accepted and stored to disk. The idea is that a block can be valid in the abstract but whether its transactions are valid is dependent upon the UTXO set at the particular point where the block is being added to the chain. This may not be known until a later point in time, even in steady-state. Presume that our current tip is B1, which is then extended by B2 and B3 in rapid succession. For whatever reason, we receive B3 before B2 (perhaps B2 was created by a miner across the globe to whom we are poorly connected, whereas B3 was mined by our next-door neighbor who has a better link to the miner who created B2). At the moment we receive B3, it may (and probably does) contain transactions whose inputs do not exist in our coin database, because they are only created in B2. In any event, no effort is made to validate B3 beyond the basic checks. Later, when we receive B2, we connect it and update the UTXO set, then notice that B3 extends B2. Now B3's transactions can be validated and assuming it checks out, it becomes the new tip of the active chain.

If the entire block checks out, then the block is connected, and the undo data is created and stored to disk.

If there's a problem with one of the block's transactions, the block is not connected, and the code punishes the peer that sent us the bad block. (See InvalidBlockFound.)

==Disconnecting a block (reorganizations)==

A reorganization (re-org) occurs when your node realizes there is a chain with more work (usually a longer one) that does not derive from the current tip, chainActive.Tip().

Most reorganizations are only one-block reorganizations. This is likely to happen whenever competing miners find new blocks (call them A and B) at about the same time. Some of the network will receive A first, others B. Eventually a new block C will be mined on top of either A or B; which one depends on whether the miner who produced C worked on the A or B chain. Assuming C is mined on top of A, then any node which accepted B as the tip of their blockchain will need to detach B and instead attach A and C.

Here is how a reorg happens in the code:
* When we receive a new block, if it has more work than our chain's tip, we add it to the setBlockIndexCandidates (see main.cpp:ReceivedBlockTransactions)
* After the block is processed, ActivateBestChain checks to see if there's a block (such as the one we just processed) that has more work than the current chain's tip.
* ActivateBestChainStep disconnects blocks as necessary to reach the new tip. Obviously, in the usual case it is not necessary to disconnect any blocks.

'''DisconnectTip() does the following:'''
* Returns the UTXOs that were consumed by this block back to the UTXO set (DisconnectBlock).
* Returns the block's transactions to the mempool. (Now they need to be mined by some other block; quite likely in one of the blocks we are about to connect, in which case we'll remove them again!)
* Moves chainActive.Tip back one block.
* Updates the wallet.

Once the code has disconnected blocks back to the fork point, it connects blocks from the fork point to the new tip.
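As a rough illustration of "disconnect back to the fork point, then connect up to the new tip", the following sketch walks two tips back to their common ancestor. BlockNode, ReorgPlan and PlanReorg are hypothetical names; the sketch assumes both tips belong to the same block tree (they share the genesis block).

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-in for a block index entry.
struct BlockNode {
    const BlockNode* prev;  // parent block (nullptr for genesis)
    int height;
};

struct ReorgPlan {
    std::vector<const BlockNode*> disconnect;  // old tip first, walking back
    std::vector<const BlockNode*> connect;     // child of the fork point first
};

// Walk both tips back until they meet at the fork point.
ReorgPlan PlanReorg(const BlockNode* oldTip, const BlockNode* newTip) {
    ReorgPlan plan;
    const BlockNode* a = oldTip;
    const BlockNode* b = newTip;
    while (a != b) {
        assert(a && b);  // assumption: both tips share a common ancestor
        if (a->height >= b->height) {
            // Step the old side back when it is at least as high.
            const bool tie = (a->height == b->height);
            plan.disconnect.push_back(a);
            a = a->prev;
            if (!tie) continue;
        }
        // Step the new side back when it is higher, or on a tie.
        plan.connect.push_back(b);
        b = b->prev;
    }
    // Blocks were collected tip-first; they must be connected in chain order.
    std::reverse(plan.connect.begin(), plan.connect.end());
    return plan;
}
```

With the A/B/C example from above (old tip B, new tip C built on A), the plan is to disconnect B and then connect A followed by C. In the usual case, where the new tip simply extends the old one, the disconnect list is empty.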

'''DisconnectBlock (returning UTXOs to the UTXO set)'''

This code uses the undo data (stored in the rev*.dat files) to "un-spend" the coins that were spent in the block now being disconnected:
* Reads the undo file from disk.
* Processes transactions in reverse order. (The order matters because CreateNewBlock places chained, zero-conf transactions "in order": if a block contains T1 and T2, and T2 spends a T1 coin, then T1 must come before T2 in the block's transaction vector. Therefore, when un-spending the coins, T2 must be un-spent before T1.)
* Looks up the spent coins.
* Un-spends the coins in a helper function, ApplyTxInUndo: coins->vout[out.n] = undo.txout;
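
The steps above can be sketched as follows. This is a simplified model under stated assumptions: the UTXO set is a flat map keyed by (txid, index), and TxUndo and DisconnectBlockSketch are hypothetical names rather than Bitcoin Core's own structures. It shows why the reverse order matters when T2 spends an output of T1 within the same block.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified types; not Bitcoin Core's real classes.
typedef std::pair<std::string, uint32_t> OutPoint;  // (txid, output index)
struct TxOut { int64_t amount; std::string script; };
typedef std::map<OutPoint, TxOut> CoinsMap;         // the UTXO set

struct TxUndo {
    std::string txid;                               // transaction being undone
    uint32_t numOutputs;                            // outputs it created
    std::vector<std::pair<OutPoint, TxOut> > spent; // coins it consumed
};

// Roll a block back: visit transactions last-to-first, drop the outputs each
// one created, then restore ("un-spend") the coins it consumed.
void DisconnectBlockSketch(const std::vector<TxUndo>& blockUndo, CoinsMap& coins) {
    for (std::vector<TxUndo>::const_reverse_iterator it = blockUndo.rbegin();
         it != blockUndo.rend(); ++it) {
        for (uint32_t n = 0; n < it->numOutputs; ++n) {
            coins.erase(OutPoint(it->txid, n));
        }
        for (std::size_t i = 0; i < it->spent.size(); ++i) {
            coins[it->spent[i].first] = it->spent[i].second;
        }
    }
}
```

If the loop ran in forward order, undoing T1 would erase its output first, and undoing T2 would then restore that same output, leaving a coin in the set that should no longer exist.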

==See also==
[https://en.bitcoin.it/wiki/Bitcoin_Core_0.11_(ch_1):_Overview Bitcoin Core 0.11 (Ch 1): Overview]
 
[https://en.bitcoin.it/wiki/Bitcoin_Core_0.11_(ch_2):_Data_Storage Bitcoin Core 0.11 (Ch 2): Data Storage]
 
[https://en.bitcoin.it/wiki/Bitcoin_Core_0.11_(ch_3):_Initialization_and_Startup Bitcoin Core 0.11 (Ch 3): Initialization and Startup]
 
[https://en.bitcoin.it/wiki/Bitcoin_Core_0.11_(ch_4):_P2P_Network Bitcoin Core 0.11 (Ch 4): P2P Network]
 
[https://en.bitcoin.it/wiki/Bitcoin_Core_0.11_(ch_5):_Initial_Block_Download Bitcoin Core 0.11 (Ch 5): Initial Block Download]

[[Category:Technical]]
[[Category:Developer]]

Bitcoin Core 0.11 (ch 5): Initial Block Download

2016-01-21T20:08:48Z

Mrbandrews: Created page with " This page explains how Bitcoin Core downloads the blockchain when your node first joins the network. ==Background== Once a new node joins the network, its first order of..."

Bitcoin Core 0.11 (ch 4): P2P Network

2016-01-21T19:59:10Z

Mrbandrews: Created page with " Bitcoin is a peer-to-peer network, so Bitcoin Core has code to discover peers and manage those connections. Most of the network-handling code is in net.h/cpp. =..."

Bitcoin Core 0.11 (ch 1): Overview

2016-01-21T18:50:44Z

Mrbandrews: /* Design Patterns */

Bitcoin Core 0.11 (ch 2): Data Storage

2016-01-13T21:48:41Z

Mrbandrews: /* The UTXO set (chainstate leveldb) */

Bitcoin Core 0.11 (ch 2): Data Storage

2016-01-13T21:27:08Z

Mrbandrews:

This page describes how & where Bitcoin Core stores blockchain data.

==Overview==

There are basically four pieces of data that are maintained:

: '''blocks/blk*.dat:''' the actual Bitcoin blocks, in network format, dumped in raw on disk. They are only needed for rescanning missing transactions in a wallet, reorganizing to a different part of the chain, and serving the block data to other nodes that are synchronizing.

: '''blocks/index/*:''' this is a LevelDB database that contains metadata about all known blocks, and where to find them on disk. Without this, finding a block would be very slow.

: '''chainstate/*:''' this is a LevelDB database with a compact representation of all currently unspent transaction outputs and some metadata about the transactions they are from. The data here is necessary for validating new incoming blocks and transactions. It can theoretically be rebuilt from the block data (see the -reindex command line option), but this takes a rather long time. Without it, validation would still be possible in theory, but it would require a full scan through the blocks (7 GB as of May 2013) for every output being spent.

: '''blocks/rev*.dat:''' these contain "undo" data. You can see blocks as 'patches' to the chain state (they consume some unspent outputs, and produce new ones), and see the undo data as reverse patches. They are necessary for rolling back the chainstate, which is necessary in case of reorganisations.

Note that the LevelDB databases are redundant in the sense that they can be rebuilt from the block data. But validation and other operations would become intolerably slow without them.

See here: [http://bitcoin.stackexchange.com/questions/11104/what-is-the-database-for?rq=1 StackExchange post by Pieter Wuille (2013)]

==Raw Block data (blk*.dat)==

Block files store the raw blocks as they were received over the network.

Block files are about 128 MB, allocated in 16 MB chunks to prevent excessive fragmentation. As of October 2015, the block chain is stored in about 365 block files, for a total of about 45 GB.

Each block file (blk1234.dat) has a corresponding undo file (rev1234.dat) which contains the data necessary to remove blocks from the blockchain in the event of a reorganization (fork).

Info about the block files is stored in the block index (the LevelDB) in two places:
* General info about the files themselves is held in the "f" records in the block index LevelDB (meaning keys "fxxxx", where "xxxx" is the 4 digit file number), including:
** Number of blocks stored in the file
** File size (and the corresponding undo file size)
** Lowest and highest block in the file
** Timestamps - earliest and latest blocks in the file

* Info about where to find a particular block on disk is in the "b" ("b" = block) record:
** Each block index record contains a pointer to where the block is stored on disk (a file number and an offset)

'''Accessing the block data files from the code'''

The block files are accessed through:

1) DiskBlockPos: a struct that is simply a pointer to a block's location on disk (a file number and an offset.)

2) ''vInfoBlockFiles'': a vector of BlockFileInfo objects. This variable is used to perform such tasks as:
* Determine whether new blocks can fit into the current file or a new file needs to be created
* Calculate the total disk usage by block & undo files
* Iterate through the block files and find ones that can be pruned

Blocks are written to disk as soon as they are received, in AcceptBlock. (The actual disk write operation is in WriteBlockToDisk [main.cpp:1164].) Note that the code that accesses block files overlaps somewhat with the code that reads and writes the coins database (/chainstate), which has a complex system for deciding when to flush state to disk. None of that code affects block files, which are simply written to disk when received. Once they have been received and stored, the block files are only needed for rescanning the wallet, reorganizing to a different part of the chain, and serving blocks to other nodes.

'''More info about block files'''

See here: [https://github.com/sipa/bitcoin/commit/5382bcf8cd23c36a435c29080770a79b5e28af42 the commit that puts multiple blocks in a block file (2012)]

==Block index (leveldb)==

The block index holds metadata about all known blocks, including where the block is stored on disk.

Note that the set of "known blocks" is a superset of the longest chain, because it includes blocks that were received and processed but are not part of the active chain - for example, orphaned blocks that were detached from the active chain in a small reorganization.

'''Terminology'''

The terminology can be a little confusing here, because while people normally think of the "blockchain" as being synonymous with the active chain (an uninterrupted, linear chain of X blocks starting with the genesis block and continuing to the current tip), there are some places in the code where "blockchain" refers to the active chain plus the numerous, mostly short forks off the chain that our node happens to know about.

a) Block Tree

A better term for the set of known blocks stored on disk is "block tree," as this term contemplates a tree structure with numerous branches (albeit small ones) from the main chain. Indeed, the block index LevelDB is accessed through the "CBlockTreeDB" wrapper class, defined in src/txdb.h. Note that it's perfectly fine, indeed it is expected, that different nodes would have slightly different block trees; what matters is that they agree on the active chain.

'''Key-value pairs'''

Inside the actual LevelDB, the key/value pairs used are:

'b' + 32-byte block hash -> block index record. Each record stores:
* The block header
* The height.
* The number of transactions.
* To what extent this block is validated.
* In which file, and where in that file, the block data is stored.
* In which file, and where in that file, the undo data is stored.

'f' + 4-byte file number -> file information record. Each record stores:
* The number of blocks stored in the block file with that number.
* The size of the block file with that number ($DATADIR/blocks/blkNNNNN.dat).
* The size of the undo file with that number ($DATADIR/blocks/revNNNNN.dat).
* The lowest and highest height of blocks stored in the block file with that number.
* The lowest and highest timestamp of blocks stored in the block file with that number.

'l' -> 4-byte file number: the last block file number used.

'R' -> 1-byte boolean ('1' if true): whether we're in the process of reindexing.

'F' + 1-byte flag name length + flag name string -> 1 byte boolean ('1' if true, '0' if false): various flags that can be on or off. Currently defined flags include:
* 'txindex': Whether the transaction index is enabled.

't' + 32-byte transaction hash -> transaction index record. These are optional and only exist if 'txindex' is enabled (see above). Each record stores:
* Which block file number the transaction is stored in.
* Which offset into that file the block the transaction is part of is stored at.
* The offset from the start of that block to the position where that transaction itself is stored.

See here: [http://bitcoin.stackexchange.com/questions/28168/what-are-the-keys-used-in-the-blockchain-leveldb-ie-what-are-the-keyvalue-pair StackExchange post by Pieter Wuille (2014)]

'''Data Access Layer'''

The database is accessed through the CBlockTreeDB wrapper class. See txdb.h.

The wrapper is instantiated in a global variable called pblocktree, defined in main.cpp.

''CBlockIndex''

Blocks stored in the database are represented in memory as CBlockIndex objects. An object of this type is first created after the ''header'' is received; the code does not wait to receive the full block. When headers are received over the network, they are streamed into a vector of CBlockHeaders, which are then checked. Each header that checks out causes a new CBlockIndex to be created, which is stored to the database.

''CBlock / CBlockHeader''

Note that these objects have little to do with the /blocks LevelDB. A CBlock holds the full set of transactions in the block, the data for which is stored in two places - in full, in raw format, in the blk???.dat files, and in pruned format in the UTXO database. The block index database cares not for such details, since it holds only the metadata for the block.

''Loading the block database into memory''

The entire database is loaded into memory on startup. See LoadBlockIndexGuts (txdb.cpp). This only takes a few seconds.

The blocks ('b' keys) are loaded into the global "mapBlockIndex" variable. "mapBlockIndex" is an unordered_map that holds CBlockIndex for each block in the entire block tree; not just the active chain.

mapBlockIndex is described in more detail in Chapter 6 - The Blockchain.

The block file metadata ('f' keys) is loaded into vInfoBlockFiles.

==The UTXO set (chainstate leveldb)==

The UTXO database was introduced in 2012 in [https://github.com/bitcoin/bitcoin/pull/1677 pull request #1677 - "Ultraprune."]

The idea behind "Ultraprune" is to reduce the size of (prune) the set of past transactions, keeping only those parts of past transactions that are necessary to validate later transactions.

Say you have a transaction T1 which takes two inputs and sends to 3 outputs: O1,O2,O3. Two of those outputs (O1, O2) have been used as inputs in a later transaction, T2. Once T2 has been mined, T1 only has one item of interest (O3). There's no reason to keep T1 around in its entirety. Instead, a slimmed-down version of T1 will suffice, consisting only of O3 (locking script and amount) and certain basic information about T1 (height, whether it is a coinbase, etc.)
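
The T1 example can be made concrete with a small sketch. PrunedTx is a hypothetical, simplified record and not the real CCoins class; marking a spent output with an amount of -1 is a convention of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct TxOut {
    int64_t amount;      // -1 marks a spent output in this sketch
    std::string script;  // locking script
    bool IsSpent() const { return amount == -1; }
};

// Hypothetical slimmed-down record for one transaction: basic metadata plus
// whatever outputs are still unspent.
struct PrunedTx {
    int height;
    bool isCoinbase;
    std::vector<TxOut> outputs;

    // Mark output n as spent and discard its data; false if it cannot be spent.
    bool Spend(std::size_t n) {
        if (n >= outputs.size() || outputs[n].IsSpent()) return false;
        outputs[n].amount = -1;
        outputs[n].script.clear();
        return true;
    }

    // Number of outputs still worth keeping. At zero, the record can be dropped.
    std::size_t UnspentCount() const {
        std::size_t count = 0;
        for (std::size_t i = 0; i < outputs.size(); ++i) {
            if (!outputs[i].IsSpent()) ++count;
        }
        return count;
    }
};
```

After T2 spends O1 and O2, only O3's script and amount remain alongside T1's height and coinbase flag; once O3 is spent as well, nothing about T1 needs to stay in the database.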

The description of ultraprune is on the specific "ultraprune" commit within the pull:

: -------------

: This switches bitcoin's transaction/block verification logic to use a "coin database", which contains all unredeemed transaction output scripts, amounts and heights.

: The name ultraprune comes from the fact that instead of a full transaction index, we only (need to) keep an index with unspent outputs. For now, the blocks themselves are kept as usual, although they are only necessary for serving, rescanning and reorganizing.

: The basic data structures are CCoins (representing the coins of a single transaction), and CCoinsView (representing a state of the coins database). There are several implementations for CCoinsView. A dummy, one backed by the coins database (coins.dat), one backed by the memory pool, and one that adds a cache on top of it. FetchInputs, ConnectInputs, ConnectBlock, DisconnectBlock, ... now operate on a generic CCoinsView. The block switching logic now builds a single cached CCoinsView with changes to be committed to the database before any changes are made. This means no uncommitted changes are ever read from the database, and should ease the transition to another database layer which does not support transactions (but does support atomic writes), like LevelDB.

: For the getrawtransaction() RPC call, access to a txid-to-disk index would be preferable. As this index is not necessary or even useful for any other part of the implementation, it is not provided. Instead, getrawtransaction() uses the coin database to find the block height, and then scans that block to find the requested transaction. This is slow, but should suffice for debug purposes.

: -----------------

See: [https://github.com/sipa/bitcoin/commit/450cbb0944cd20a06ce806e6679a1f4c83c50db2 Ultraprune - July 2012]

'''Terminology'''

"UTXO (Unspent Transaction Output):" A transaction output that has not yet been spent. This is colloquially referred to as a "coin." For this reason, the UTXO db is sometimes referred to as the "coins database."

"UTXO set / coins database / chainstate database:" These terms are more or less synonymous and are used interchangeably.

"Provably Unspendable:" A coin is provably unspendable if its scriptPubKey cannot be satisfied - for example, an OP_RETURN. A provably unspendable coin can be eliminated from the utxo database regardless of its amount.

'''Key-value pairs'''

The records in the chainstate levelDB are:

'c' + 32-byte transaction hash -> unspent transaction output record for that transaction. These records are only present for transactions that have at least one unspent output left. Each record stores:
* The version of the transaction.
* Whether the transaction was a coinbase or not.
* Which height block contains the transaction.
* Which outputs of that transaction are unspent.
* The scriptPubKey and amount for those unspent outputs.

'B' -> 32-byte block hash: the block hash up to which the database represents the unspent transaction outputs.

See here: [http://bitcoin.stackexchange.com/questions/28168/what-are-the-keys-used-in-the-blockchain-leveldb-ie-what-are-the-keyvalue-pair StackExchange post by Pieter Wuille (2014)]

'''Data Access Layer and Caching'''

Access to the UTXO database is considerably more complex than the block index. This is because its performance is critical to the overall performance of the Bitcoin system. The block index is not so critical to performance because there are only a few hundred thousand blocks and a node running on decent hardware can retrieve and scroll through them in a few seconds (and does not need to do so very often.) On the other hand, there are millions of coins in the UTXO database and they must be checked and modified for each input of each transaction going into the mempool or included in a block.

As sipa said in the ultraprune commit:
: The basic data structures are CCoins (representing the coins of a single transaction), and CCoinsView (representing a state of the coins database). There are several implementations for CCoinsView. A dummy, one backed by the coins database (coins.dat), one backed by the memory pool, and one that adds a cache on top of it.

This is not stated as clearly as it might have been, however; at least, not for the current state of the code.

In 0.11, the instantiations of the CoinsView are:
* dummy
* database
* pcoinsTip (a cache backed by the database)
* "validation cache" (a cache backed by pcoinsTip, in use while connecting or disconnecting a block)

Separate from that chain of caches is the memory pool's CoinsView, which is backed by pcoinsTip (see the table below).

The class diagram (data types) for the views is:

           CCoinsView (abstract class)
             /              \
         ViewDB           ViewBacked
       (database)          /      \
                  ViewMemPool    ViewCache

Each class has one key characteristic:
* View is the base class, declaring methods for verifying that coins exist (HaveCoins), retrieving coins (GetCoins), etc.
* ViewDB has code to interact with the LevelDB.
* ViewBacked has a pointer to another View; thus it is "backed" by another view (version) of the UTXO set.
* ViewCache has a cache (a map of CCoins).
* ViewMemPool associates a mempool with a view.

Those are the defined classes; whereas the object diagram is:

          Database (DB view)
                 |
     Blockchain cache (pcoinsTip)
          /              \
   Mempool view     Validation cache
         |
   Mempool cache

Here is a table summarizing the instantiations of Views:

{| class="wikitable"
|-
! Object !! Type !! Backed By? !! Description / Purpose
|-
| DB view || ViewDB || n/a || Represents the UTXO set according to the /chainstate LevelDB. Retrieves coins and flushes changes to the LevelDB. Creation in code (instantiation): see init.cpp:1131
|-
| pcoinsTip (blockchain cache) || ViewCache || DB view || Holds the UTXO set corresponding to the active chain's tip. Retrieves/flushes to the database view. Creation in code: see init.cpp:1133
|-
| Validation cache || ViewCache || pcoinsTip || This cache's lifetime is within ConnectTip (or DisconnectTip). Its purpose is to keep track of modifications to the UTXO set while processing a block. If the block validates, the cache is flushed to pcoinsTip. If the block fails, the cache is discarded. Creation in code: see main.cpp:2231: CCoinsViewCache view(pcoinsTip);
|-
| Mempool view || ViewMemPool || pcoinsTip || This object brings the mempool into view, meaning it can see both a UTXO set and the mempool. Its purpose is to enable validation of chains of transactions, a.k.a. "zero-confirmation" transactions. (If chains of transactions weren't permitted, the mempool could simply validate against pcoinsTip.) Thus, when queried, it can check if a given input can be found either in the mempool (i.e., "zero-conf") or in the blockchain's UTXO set ("confirmed"). Note that this object is not a cache; rather, it is a view that is used by the object below, which does contain a cache. Creation in code: Its lifetime is that of AcceptToMemoryPool in main.cpp.
|-
| Mempool cache || ViewCache || Mempool view || The cache for the mempool. It contains a cache and sets its backend to be the mempool view. Creation in code: Its lifetime is also that of AcceptToMemoryPool in main.cpp.
|}

''Loading the UTXO set''

Access to the coins database is initialized in init.cpp:1131-1133:

 pcoinsdbview = new CCoinsViewDB(nCoinDBCache, false, fReindex);
 pcoinscatcher = new CCoinsViewErrorCatcher(pcoinsdbview);
 pcoinsTip = new CCoinsViewCache(pcoinscatcher);

The code starts by initializing a CoinsViewDB, which is equipped with methods to load coins from the LevelDB. 
The error catcher is a little hack that can be ignored. 
Next, the code initializes pcoinsTip, which is the cache representing the state of the active chain, and is backed by the database view.

''Cache vs. Database''

The FetchCoins function in coins.cpp demonstrates how the code uses the cache vs. the database:
 1 CCoinsMap::iterator it = cacheCoins.find(txid);
 2 if (it != cacheCoins.end())
 3     return it;
 4 CCoins tmp;
 5 if (!base->GetCoins(txid, tmp))
 6     return cacheCoins.end();
 7 CCoinsMap::iterator ret = cacheCoins.insert(std::make_pair(txid, CCoinsCacheEntry())).first;

First, the code searches the cache for the coins for a given transaction id. (line 1) 
If found, it returns the "fetched" coins. (lines 2-3) 
If not, it searches the database. (line 5) 
If found in the database, it updates the cache. (line 7) 
 
Note: if the cache's backend is another cache, then the term "database" really means "parent cache."
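
The lookup pattern described above, including the case where a cache's backend is itself a cache, can be sketched in a self-contained form. The View, DbView and CacheView types below are hypothetical stand-ins, and coins are reduced to a plain int; only the "check the cache, fall back to the base view, remember the result" logic mirrors the text.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical, simplified view hierarchy; names are illustrative only.
struct View {
    virtual ~View() {}
    virtual bool GetCoins(const std::string& txid, int& coins) const = 0;
};

// Stands in for the LevelDB-backed view; counts how often it is queried.
struct DbView : View {
    std::map<std::string, int> store;
    mutable int reads = 0;
    bool GetCoins(const std::string& txid, int& coins) const override {
        ++reads;
        std::map<std::string, int>::const_iterator it = store.find(txid);
        if (it == store.end()) return false;
        coins = it->second;
        return true;
    }
};

// A cache layered on top of any other view (a database or another cache).
struct CacheView : View {
    const View* base;
    mutable std::map<std::string, int> cache;
    explicit CacheView(const View* b) : base(b) {}
    bool GetCoins(const std::string& txid, int& coins) const override {
        std::map<std::string, int>::iterator it = cache.find(txid);
        if (it == cache.end()) {
            int tmp = 0;
            if (!base->GetCoins(txid, tmp)) return false;        // not found anywhere
            it = cache.insert(std::make_pair(txid, tmp)).first;  // remember it
        }
        coins = it->second;
        return true;
    }
};
```

Stacking CacheView(&tip) on top of CacheView(&db) gives the same shape as the validation cache on top of the blockchain cache: a second lookup of the same txid is answered from memory and never reaches the bottom view.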
 
 

''Flushing the Validation Cache to the Blockchain Cache''

The validation cache is flushed to the blockchain cache after connecting a block, just before it goes out of scope. The scope is captured in ConnectTip, and specifically, in the code block main.cpp:2231-2243. In that code block, there is a call to ConnectBlock, during which the code stores the new coins in the validation cache. (Specifically, see UpdateCoins() in main.cpp.) At the end of the code block, the validation cache is flushed. Since its "parent view" is also a cache (pcoinsTip, the "blockchain cache"), the code will call the parent's ViewCache::BatchWrite, which swaps the updated coin entries into its own cache. (Polymorphism in action: later, when the blockchain cache flushes to the database view, the code will run CoinsViewDB::BatchWrite, the last line of which writes to the LevelDB.)

In summary, usage of the validation cache is straightforward: it is instantiated, used, flushed, and goes out of scope in the aforementioned code block.
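
The "instantiated, used, flushed, out of scope" life cycle can be sketched as below. All names (Entry, ParentCache, ValidationCache, ConnectTipSketch) are hypothetical, and the block's effect is hard-coded as "spend one coin, create another" purely for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical sketch types; not Bitcoin Core's real classes.
struct Entry { bool spent; int value; };
typedef std::map<std::string, Entry> CoinsMap;

struct ParentCache {  // plays the role of the blockchain cache
    CoinsMap coins;
    // Absorb a child's modifications into this cache.
    void BatchWrite(CoinsMap& changes) {
        for (CoinsMap::iterator it = changes.begin(); it != changes.end(); ++it) {
            if (it->second.spent) coins.erase(it->first);
            else coins[it->first] = it->second;
        }
        changes.clear();
    }
};

struct ValidationCache {
    ParentCache* base;
    CoinsMap changes;  // modifications made while processing one block
    explicit ValidationCache(ParentCache* b) : base(b) {}
    void Add(const std::string& txid, int value) { changes[txid] = Entry{false, value}; }
    void Spend(const std::string& txid) { changes[txid] = Entry{true, 0}; }
    void Flush() { base->BatchWrite(changes); }
};

// Apply one block's changes to `tip`, but only if the block turns out valid.
bool ConnectTipSketch(ParentCache& tip, bool blockIsValid) {
    ValidationCache view(&tip);
    view.Spend("old");                // the block spends an existing coin...
    view.Add("new", 5);               // ...and creates a new one
    if (!blockIsValid) return false;  // view goes out of scope; changes are dropped
    view.Flush();
    return true;
}
```

The parent cache never sees a partially applied block: either all of the block's changes arrive in one Flush(), or none do.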

''Flushing the Blockchain Cache to the Database''

Flushing the validation cache was simple because the code only shuffled items between two caches in memory (of which no one is aware outside of the caching code). Flushing the blockchain cache to the database is a bit more complicated. At the lowest level, the mechanics of flushing the blockchain cache (pcoinsTip) are the same as for the validation cache: the Flush() method calls BatchWrite on its backend (the "base" pointer), and in this case that means BatchWrite on the database view. Up a level, Flush() is called from FlushStateToDisk (FSTD) - main.cpp:2098. FlushStateToDisk is invoked at a few different points, with a given ''mode'':

{| class="wikitable"
|-
! Flush Mode !! Description !! When called
|-
| IF_NEEDED || Flush only if the cache is over its size limit. || Right after connecting (or disconnecting) a block and flushing the validation cache. See ConnectTip / DisconnectTip.
|-
| ALWAYS || Flush cache. || During initialization only.
|-
| PERIODIC || Here, the code considers other data points to decide whether to flush. Is the code ''almost'' over its size limit? Has it been a long time since the cache was flushed? If so, then proceed.|| At end of ActivateBestChain() (Code comment: "write changes periodically to disk, after relay").
|}

The idea is to flush the block cache frequently (to avoid having to download a large number of blocks if the program crashes), but the coins cache infrequently (in order to maximize the benefit from the coins cache.)

Specifically, the block cache is guaranteed to be flushed once an hour, whereas the coins cache is guaranteed to be flushed once per day. (See here: [https://github.com/bitcoin/bitcoin/pull/6102#issuecomment-98847663 Sipa comment on PR 6102])
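
The three flush modes from the table can be summarized as a small decision function for the coins cache. This is only a sketch: CacheState and ShouldFlushCoins are hypothetical names, and the thresholds used for "almost over its size limit" (90% here) and "a long time" (one day, following the paragraph above) are assumptions made for illustration; the real logic lives in FlushStateToDisk.

```cpp
#include <cassert>
#include <cstddef>

enum FlushMode { IF_NEEDED, PERIODIC, ALWAYS };

// Hypothetical snapshot of the coins cache at the moment of the decision.
struct CacheState {
    std::size_t cacheBytes;   // current cache size
    std::size_t limitBytes;   // configured size limit
    long secondsSinceFlush;   // time since the coins cache was last flushed
};

// Decide whether to flush the coins cache for a given mode.
// The 90% and one-day thresholds are illustrative assumptions.
bool ShouldFlushCoins(FlushMode mode, const CacheState& s) {
    const bool overLimit = s.cacheBytes > s.limitBytes;
    const bool nearLimit = s.cacheBytes * 10 > s.limitBytes * 9;
    const bool stale = s.secondsSinceFlush > 24L * 60 * 60;
    switch (mode) {
        case ALWAYS:    return true;
        case IF_NEEDED: return overLimit;
        case PERIODIC:  return overLimit || nearLimit || stale;
    }
    return false;
}
```

IF_NEEDED is the cheap check run after every block, so it only reacts to a cache that is actually over its limit; PERIODIC runs less often and can afford to flush slightly early.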

The FlushStateToDisk code is well-commented so for more info, the curious reader can check main.cpp.

==Raw undo data (rev*.dat)==

The undo data contains the information that is necessary to disconnect or "roll back" a block: specifically, the coins that were spent by the block in question.

So, the data being written is essentially a set of CTxOut objects. (A CTxOut is simply an amount and a script - see primitives/transaction.h:107-108).

The matter is complicated slightly by the fact that if the coin is the last one being spent by its transaction, the undo data needs to store the transaction's metadata (the txn's block height, whether it's a coinbase, and its version). So, if you have a transaction T with outputs O1,O2,O3 spent in that order, for O1 and O2 all that will be written to the undo file is the amount and the script. For O3, the undo file will have the amount, the script, plus T's height and version, and whether T is a coinbase.
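
The O1/O2/O3 case can be sketched as follows. Coins, TxInUndo and SpendWithUndo are hypothetical simplifications (the on-disk serialization is not modeled); the point is only that the transaction's metadata is attached to the undo entry of the last output to be spent.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct TxOut {
    int64_t amount;      // -1 marks a spent output in this sketch
    std::string script;
};

struct Coins {           // hypothetical per-transaction UTXO record
    int height;
    bool isCoinbase;
    int version;
    std::vector<TxOut> outputs;
    bool AllSpent() const {
        for (std::size_t i = 0; i < outputs.size(); ++i) {
            if (outputs[i].amount != -1) return false;
        }
        return true;
    }
};

struct TxInUndo {        // what gets recorded for one spent coin
    TxOut txout;         // always: amount and script
    bool hasMeta;        // true only for the transaction's last unspent output
    int height;
    bool isCoinbase;
    int version;
};

// Spend output n and return the undo entry needed to restore it later.
TxInUndo SpendWithUndo(Coins& coins, std::size_t n) {
    assert(n < coins.outputs.size() && coins.outputs[n].amount != -1);
    TxInUndo undo = {coins.outputs[n], false, 0, false, 0};
    coins.outputs[n].amount = -1;
    if (coins.AllSpent()) {
        // The record is about to disappear, so its metadata must be saved too.
        undo.hasMeta = true;
        undo.height = coins.height;
        undo.isCoinbase = coins.isCoinbase;
        undo.version = coins.version;
    }
    return undo;
}
```

The metadata is only needed in that last case because, as long as any output of T remains unspent, T's record (with its height, version and coinbase flag) is still present in the UTXO set.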

The undo data is written to the raw file with the following code:
fileout << blockundo; (main.cpp:1567 [UndoWriteToDisk])

This line of code calls the serialization function on the CBlockUndo - which is basically just a vector of coins (CTxOuts.) Finally, a checksum is written to the undo file. The checksum is used during initialization to verify that any undo data being checked is intact. See [https://github.com/bitcoin/bitcoin/pull/2145 Pull 2145]

The undo data is used when disconnecting a block. The DisconnectBlock() code is discussed further down this wiki page in The Blockchain: Reorganizations.

==Use of LevelDB==

LevelDB is a key-value store that was introduced to store the block index and UTXO set (chainstate) in 2012 as part of the complex "Ultraprune" pull (PR 1677). See here: [https://github.com/bitcoin/bitcoin/pull/1677/commits the 27 commits on Ultraprune].

On the subject of why LevelDB is used, core developer Greg Maxwell stated the following to the [http://bitcoin-development.narkive.com/XAPoxKZU/patch-switching-bitcoin-core-to-sqlite-db bitcoin-dev mailing list in October 2015]:

:: I think people are falling into a trap of thinking "It's a <database>, I know a <black box> for that!"; but the application and needs are very specialized here. . . It just so happens that on the back of the very bitcoin specific cryptographic consensus algorithm there was a slot where a pre-existing high performance key-value store fit; and so we're using one and saving ourselves some effort...

One might ask whether different nodes could use different databases - as long as they retrieve the same data, what's the difference? The issue here is "bug-for-bug compatibility" - if one database has a bug that causes records to not be returned under certain circumstances, then all other nodes must have the same bug, else the network could fork as a result.

Greg Maxwell stated the following in [http://bitcoin-development.narkive.com/XAPoxKZU/patch-switching-bitcoin-core-to-sqlite-db the same thread referenced above (in response to a proposal to switch to using sqlite)]:

:: ...[D]atabases sometimes have errors which cause them to fail to return records, or to return stale data. And if those exist consistency must be maintained; and "fixing" the bug can cause a divergence in consensus state that could open users up to theft.

:: Case in point, prior to leveldb's use in Bitcoin Core it had a bug that, under rare conditions, could cause it to consistently return not found on records that were really there. . . Leveldb fixed this serious bug in a minor update. But deploying a fix like this in an uncontrolled manner in the bitcoin network would potentially cause a fork in the consensus state; so any such fix would need to be rolled out in an orderly manner.

[[Category:Technical]]
[[Category:Developer]]

Bitcoin Core 0.11 (ch 2): Data Storage

2016-01-13T21:25:08Z

Mrbandrews: added categories

This page describes how & where Bitcoin stores blockchain data.

==Overview==

There are basically four pieces of data that are maintained:

: '''blocks/blk*.dat:''' the actual Bitcoin blocks, in network format, dumped in raw on disk. They are only needed for rescanning missing transactions in a wallet, reorganizing to a different part of the chain, and serving the block data to other nodes that are synchronizing.

: '''blocks/index/*:''' this is a LevelDB database that contains metadata about all known blocks, and where to find them on disk. Without this, finding a block would be very slow.

: '''chainstate/*:''' this is a LevelDB database with a compact representation of all currently unspent transaction outputs and some metadata about the transactions they are from. The data here is necessary for validating new incoming blocks and transactions. It can theoretically be rebuilt from the block data (see the -reindex command line option), but this takes a rather long time. Without it, validation would still be possible in theory, but it would require a full scan through the blocks (7 GB as of May 2013) for every output being spent.

: '''blocks/rev*.dat:''' these contain "undo" data. You can see blocks as 'patches' to the chain state (they consume some unspent outputs, and produce new ones), and see the undo data as reverse patches. They are necessary for rolling back the chainstate, which is necessary in case of reorganisations.

Note that the LevelDB databases are redundant in the sense that they can be rebuilt from the block data. But validation and other operations would become intolerably slow without them.

See here: [http://bitcoin.stackexchange.com/questions/11104/what-is-the-database-for?rq=1 StackExchange post by Pieter Wuille (2013)]

==Raw Block data (blk*.dat)==

Block files store the raw blocks as they were received over the network.

Block files are about 128 MB, allocated in 16 MB chunks to prevent excessive fragmentation. As of October 2015, the block chain is stored in about 365 block files, for a total of about 45 GB.

Each block file (blk1234.dat) has a corresponding undo file (rev1234.dat) which contains the data necessary to remove blocks from the blockchain in the event of a reorganization (fork).

Info about the block files is stored in the block index (the LevelDB) in two places:
* General info about the files themselves is held in the "f" records in the block index LevelDB (meaning keys "fxxxx", where "xxxx" is the 4 digit file number), including:
** Number of blocks stored in the file
** File size (and the corresponding undo file size)
** Lowest and highest block in the file
** Timestamps - earliest and latest blocks in the file

* Info about where to find a particular block on disk is in the "b" ("b" = block) record:
** Each block index record contains a pointer to where the block is stored on disk (a file number and an offset)

'''Accessing the block data files from the code'''

The block files are accessed through:

1) DiskBlockPos: a struct that is simply a pointer to a block's location on disk (a file number and an offset.)

2) ''vInfoBlockFiles'': a vector of BlockFileInfo objects. This variable is used to perform such tasks as:
* Determine whether new blocks can fit into the current file or a new file needs to be created
* Calculate the total disk usage by block & undo files
* Iterate through the block files and find ones that can be pruned

Blocks are written to disk as soon as they are received, in AcceptBlock. (The actual disk write operation is in WriteBlockToDisk [main.cpp:1164].) Note that the code that accesses block files overlaps somewhat with the code that reads and writes the coins database (/chainstate), which has a complex system for deciding when to flush state to disk. None of that code affects block files, which are simply written to disk when received. Once they have been received and stored, the block files are only needed for rescanning the wallet, reorganizing to a different part of the chain, and serving blocks to other nodes.

'''More info about block files'''

See here: [https://github.com/sipa/bitcoin/commit/5382bcf8cd23c36a435c29080770a79b5e28af42 the commit that puts multiple blocks in a block file (2012)]

==Block index (leveldb)==

The block index holds metadata about all known blocks, including where the block is stored on disk.

Note that the set of "known blocks" is a superset of the longest chain, because it includes blocks that were received and processed but are not part of the active chain - for example, orphaned blocks that were detached from the active chain in a small reorganization.

'''Terminology'''

The terminology can be a little confusing here, because while people normally think of the "blockchain" as being synonymous with the active chain (an uninterrupted, linear chain of X blocks starting with the genesis block and continuing to the current tip), there are some places in the code where "blockchain" refers to the active chain plus the numerous, mostly short forks off the chain that our node happens to know about.

a) Block Tree

A better term for the set of known blocks stored on disk is "block tree," as this term contemplates a tree structure with numerous branches (albeit small ones) from the main chain. Indeed, the block index LevelDB is accessed through the "CBlockTreeDB" wrapper class, defined in src/txdb.h. Note that it's perfectly fine, indeed it is expected, that different nodes would have slightly different block trees; what matters is that they agree on the active chain.

'''Key-value pairs'''

Inside the actual LevelDB, the used key/value pairs are:

'b' + 32-byte block hash -> block index record. Each record stores:
* The block header
* The height.
* The number of transactions.
* To what extent this block is validated.
* In which file, and where in that file, the block data is stored.
* In which file, and where in that file, the undo data is stored.

'f' + 4-byte file number -> file information record. Each record stores:
* The number of blocks stored in the block file with that number.
* The size of the block file with that number ($DATADIR/blocks/blkNNNNN.dat).
* The size of the undo file with that number ($DATADIR/blocks/revNNNNN.dat).
* The lowest and highest height of blocks stored in the block file with that number.
* The lowest and highest timestamp of blocks stored in the block file with that number.

'l' -> 4-byte file number: the last block file number used.

'R' -> 1-byte boolean ('1' if true): whether we're in the process of reindexing.

'F' + 1-byte flag name length + flag name string -> 1 byte boolean ('1' if true, '0' if false): various flags that can be on or off. Currently defined flags include:
* 'txindex': Whether the transaction index is enabled.

't' + 32-byte transaction hash -> transaction index record. These are optional and only exist if 'txindex' is enabled (see above). Each record stores:
* Which block file number the transaction is stored in.
* Which offset into that file the block the transaction is part of is stored at.
* The offset from the start of that block to the position where that transaction itself is stored.

See here: [http://bitcoin.stackexchange.com/questions/28168/what-are-the-keys-used-in-the-blockchain-leveldb-ie-what-are-the-keyvalue-pair StackExchange post by Pieter Wuille (2014)]
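As an illustration of the key layout listed above, the following sketch builds the raw key bytes for each record type. The helper names (BlockKey, FileKey, FlagKey, TxKey) and the Hash256 typedef are hypothetical, and the little-endian encoding of the file number is an assumption of this sketch rather than something stated above.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for a 32-byte block or transaction hash.
typedef std::array<unsigned char, 32> Hash256;

// Prefix byte followed by the 32 raw hash bytes.
static std::string PrefixedHashKey(char prefix, const Hash256& hash) {
    std::string key(1, prefix);
    key.append(reinterpret_cast<const char*>(hash.data()), hash.size());
    return key;
}

// 'b' + 32-byte block hash -> block index record
std::string BlockKey(const Hash256& hash) { return PrefixedHashKey('b', hash); }

// 't' + 32-byte transaction hash -> transaction index record
std::string TxKey(const Hash256& txid) { return PrefixedHashKey('t', txid); }

// 'f' + 4-byte file number -> file information record
// (byte order is assumed to be little-endian in this sketch)
std::string FileKey(uint32_t nFile) {
    std::string key(1, 'f');
    for (int i = 0; i < 4; ++i)
        key.push_back(static_cast<char>((nFile >> (8 * i)) & 0xff));
    return key;
}

// 'F' + 1-byte flag name length + flag name -> 1-byte boolean
// (this sketch only handles names shorter than 256 bytes)
std::string FlagKey(const std::string& name) {
    std::string key(1, 'F');
    key.push_back(static_cast<char>(name.size()));
    key += name;
    return key;
}
```

Because every key begins with a one-byte type prefix, all records of one kind sort together in the key space, so a loader can scan just the 'b' range or just the 'f' range.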

'''Data Access Layer'''

The database is accessed through the CBlockTreeDB wrapper class. See txdb.h.

The wrapper is instantiated in a global variable called pblocktree, defined in main.cpp.

''CBlockIndex''

Blocks stored in the database are represented in memory as CBlockIndex objects. An object of this type is first created after the ''header'' is received; the code does not wait to receive the full block. When headers are received over the network, they are streamed into a vector of CBlockHeaders, which are then checked. Each header that checks out causes a new CBlockIndex to be created, which is stored to the database.

''CBlock / CBlockHeader''

Note that these objects have little to do with the /blocks LevelDB. A CBlock holds the full set of transactions in the block, the data for which is stored in two places - in full, in raw format, in the blk???.dat files, and in pruned format in the UTXO database. The block index database cares not for such details, since it holds only the metadata for the block.

''Loading the block database into memory''

The entire database is loaded into memory on startup. See LoadBlockIndexGuts (txdb.cpp). This only takes a few seconds.

The blocks ('b' keys) are loaded into the global "mapBlockIndex" variable. "mapBlockIndex" is an unordered_map that holds CBlockIndex for each block in the entire block tree; not just the active chain.

mapBlockIndex is described in more detail in Chapter 6 - The Blockchain.

The block file metadata ('f' keys) is loaded into vInfoBlockFiles.
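The in-memory block tree can be pictured with a small sketch. It is a simplified model and not the actual mapBlockIndex code: BlockIndexSketch, BlockMapSketch, AddHeader and ActiveChain are hypothetical names, and short strings stand in for 32-byte block hashes.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Simplified stand-in for a block index entry: each entry points to its parent.
struct BlockIndexSketch {
    std::string hash;               // placeholder for the 32-byte block hash
    const BlockIndexSketch* pprev;  // parent entry, nullptr for the genesis block
    int height;
};

// Hash -> entry map holding the whole block tree, not just the active chain.
typedef std::unordered_map<std::string, BlockIndexSketch> BlockMapSketch;

// Record a header whose parent is already known (empty parentHash = genesis).
// Returns nullptr if the parent is unknown.
const BlockIndexSketch* AddHeader(BlockMapSketch& map, const std::string& hash,
                                  const std::string& parentHash) {
    const BlockIndexSketch* parent = nullptr;
    if (!parentHash.empty()) {
        BlockMapSketch::const_iterator it = map.find(parentHash);
        if (it == map.end()) return nullptr;
        parent = &it->second;
    }
    BlockIndexSketch entry{hash, parent, parent ? parent->height + 1 : 0};
    // Pointers to unordered_map elements stay valid when the map grows.
    return &map.emplace(hash, entry).first->second;
}

// Walk the parent pointers back from a tip to list one chain, genesis first.
std::vector<std::string> ActiveChain(const BlockIndexSketch* tip) {
    std::vector<std::string> chain;
    for (const BlockIndexSketch* p = tip; p != nullptr; p = p->pprev)
        chain.push_back(p->hash);
    std::reverse(chain.begin(), chain.end());
    return chain;
}
```

A fork shows up as two entries sharing the same parent. Both stay in the map, but only one of them is reached when walking back from the active tip.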

==The UTXO set (chainstate leveldb)==

The UTXO database was introduced in 2012 in [https://github.com/bitcoin/bitcoin/pull/1677 pull request #1677 - "Ultraprune."]

The idea behind "Ultraprune" is to reduce the size of (prune) the set of past transactions, keeping only those parts of past transactions that are necessary to validate later transactions.

Say you have a transaction T1 which takes two inputs and sends to 3 outputs: O1,O2,O3. Two of those outputs (O1, O2) have been used as inputs in a later transaction, T2. Once T2 has been mined, T1 only has one item of interest (O3). There's no reason to keep T1 around in its entirety. Instead, a slimmed-down version of T1 will suffice, consisting only of O3 (locking script and amount) and certain basic information about T1 (height, whether it is a coinbase, etc.)
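The T1 example can be modelled with a short sketch. It is loosely inspired by the CCoins idea and is not the real class: TxOutSketch, CoinsSketch and their methods are hypothetical, and a negative amount is used here as an arbitrary "spent" marker.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// One output: an amount and a locking script. A negative amount marks "spent".
struct TxOutSketch {
    int64_t amount;
    std::string script;
    bool IsSpent() const { return amount < 0; }
    void SetSpent() { amount = -1; script.clear(); }
};

// Slimmed-down view of a transaction: basic metadata plus remaining outputs.
struct CoinsSketch {
    bool isCoinBase;
    int height;
    std::vector<TxOutSketch> outputs;

    // Mark output n as spent; returns false if it does not exist or is already spent.
    bool Spend(std::size_t n) {
        if (n >= outputs.size() || outputs[n].IsSpent()) return false;
        outputs[n].SetSpent();
        // Trailing spent outputs carry no information, so drop them.
        while (!outputs.empty() && outputs.back().IsSpent()) outputs.pop_back();
        return true;
    }

    std::size_t UnspentCount() const {
        std::size_t n = 0;
        for (const TxOutSketch& o : outputs)
            if (!o.IsSpent()) ++n;
        return n;
    }

    // Once nothing is unspent, the whole record can be removed from the database.
    bool IsPruned() const { return UnspentCount() == 0; }
};
```

After T2 spends O1 and O2, only O3's amount and script plus T1's metadata remain, and once O3 is spent too the record can be dropped entirely.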

The description of ultraprune is on the specific "ultraprune" commit within the pull:

: -------------

: This switches bitcoin's transaction/block verification logic to use a "coin database", which contains all unredeemed transaction output scripts, amounts and heights.

: The name ultraprune comes from the fact that instead of a full transaction index, we only (need to) keep an index with unspent outputs. For now, the blocks themselves are kept as usual, although they are only necessary for serving, rescanning and reorganizing.

: The basic data structures are CCoins (representing the coins of a single transaction), and CCoinsView (representing a state of the coins database). There are several implementations for CCoinsView. A dummy, one backed by the coins database (coins.dat), one backed by the memory pool, and one that adds a cache on top of it. FetchInputs, ConnectInputs, ConnectBlock, DisconnectBlock, ... now operate on a generic CCoinsView. The block switching logic now builds a single cached CCoinsView with changes to be committed to the database before any changes are made. This means no uncommitted changes are ever read from the database, and should ease the transition to another database layer which does not support transactions (but does support atomic writes), like LevelDB.

: For the getrawtransaction() RPC call, access to a txid-to-disk index would be preferable. As this index is not necessary or even useful for any other part of the implementation, it is not provided. Instead, getrawtransaction() uses the coin database to find the block height, and then scans that block to find the requested transaction. This is slow, but should suffice for debug purposes.

: -----------------

See: [https://github.com/sipa/bitcoin/commit/450cbb0944cd20a06ce806e6679a1f4c83c50db2 Ultraprune - July 2012]

'''Terminology'''

"UTXO (Unspent Transaction Output):" An output of a transaction that has not yet been spent. This is colloquially referred to as a "coin." For this reason, the UTXO db is sometimes referred to as the "coins database."

"UTXO set / coins database / chainstate database:" These terms are more or less synonymous and are used interchangeably.

"Provably Unspendable:" A coin is provably unspendable if its scriptPubKey cannot be satisfied - for example, an OP_RETURN. A provably unspendable coin can be eliminated from the utxo database regardless of its amount.

'''Key-value pairs'''

The records in the chainstate levelDB are:

'c' + 32-byte transaction hash -> unspent transaction output record for that transaction. These records are only present for transactions that have at least one unspent output left. Each record stores:
* The version of the transaction.
* Whether the transaction was a coinbase or not.
* Which height block contains the transaction.
* Which outputs of that transaction are unspent.
* The scriptPubKey and amount for those unspent outputs.

'B' -> 32-byte block hash: the block hash up to which the database represents the unspent transaction outputs.

See here: [http://bitcoin.stackexchange.com/questions/28168/what-are-the-keys-used-in-the-blockchain-leveldb-ie-what-are-the-keyvalue-pair StackExchange post by Pieter Wuille (2014)]

'''Data Access Layer and Caching'''

Access to the UTXO database is considerably more complex than the block index. This is because its performance is critical to the overall performance of the Bitcoin system. The block index is not so critical to performance because there are only a few hundred thousand blocks and a node running on decent hardware can retrieve and scroll through them in a few seconds (and does not need to do so very often.) On the other hand, there are millions of coins in the UTXO database and they must be checked and modified for each input of each transaction going into the mempool or included in a block.

As sipa said in the ultraprune commit:
: The basic data structures are CCoins (representing the coins of a single transaction), and CCoinsView (representing a state of the coins database). There are several implementations for CCoinsView. A dummy, one backed by the coins database (coins.dat), one backed by the memory pool, and one that adds a cache on top of it.

This is not stated as clearly as it might have been, however; at least, not for the current state of the code.

In 0.11, the instantiations of the CoinsView are:
* dummy
* database
* pcoinsTip (a cache backed by the database)
* "validation cache" (a cache backed by pcoinsTip, in use while connecting a block)

Separate from that chain of caches is the memory pool's CoinsView, which is backed by pcoinsTip (see the table below).

The class diagram (data types) for the views is:

          CCoinsView (abstract class)
            /              \
       ViewDB            ViewBacked
     (database)          /        \
                 ViewMempool    ViewCache

Each class has one key characteristic:
* View is the base class, declaring methods for verifying that coins exist (HaveCoins), retrieving coins (GetCoins), etc.
* ViewDB has code to interact with the LevelDB.
* ViewBacked has a pointer to another View; thus it is "backed" by another view (version) of the UTXO set.
* ViewCache has a cache (a map of CCoins).
* ViewMempool associates a mempool with a view.

Those are the defined classes; whereas the object diagram is:

              Database
                 |
    Blockchain cache (pcoinsTip)
        /                 \
  MemPool              Validation cache
  View/Cache

Here is a table summarizing the instantiations of Views:

{| class="wikitable"
|-
! Object !! Type !! Backed By? !! Description / Purpose
|-
| DB view || ViewDB || n/a || Represents the UTXO set according to the /chainstate LevelDB. Retrieves coins and flushes changes to the LevelDB. Creation in code (instantiation): see init.cpp:1131
|-
| pCoinsTip (blockchain cache) || ViewCache || DB view || Holds the UTXO set corresponding to the active chain's tip. Retrieves/flushes to the database view. Creation in code: see init.cpp:1133
|-
| Validation cache || ViewCache || pCoinsTip || This cache's lifetime is within ConnectTip (or DisconnectTip). Its purpose is to keep track of modifications to the UTXO set while processing a block. If the block validates, the cache is flushed to pcoinsTip. If the block fails, the cache is discarded. Creation in code: see main.cpp:2231: CCoinsViewCache view(pcoinsTip);
|-
| Mempool view || ViewMemPool || pCoinsTip || This object brings the mempool into view, meaning it can see both a UTXO set and the mempool. Its purpose is to enable validation of chains of transactions, a.k.a. "zero-confirmation" transactions. (If chains of transactions weren't permitted, the mempool could simply validate against pcoinsTip.) Thus, when queried, it can check if a given input can be found either in the mempool (i.e., "zero-conf") or in the blockchain's utxo set ("confirmed.") Note that this object is not a cache; rather, it is a view that is used by the object below, which does contain a cache. Creation in code: Its lifetime is that of AcceptToMemoryPool in main.cpp.
|-
| Mempool cache || ViewCache || Mempool view || The cache for the mempool. It contains a cache and sets its backend to be the mempool view. Creation in code: Its lifetime is also that of AcceptToMemoryPool in main.cpp.
|}

''Loading the UTXO set''

Access to the coins database is initialized in init.cpp: 1131-1133:

 pcoinsdbview = new CCoinsViewDB(nCoinDBCache, false, fReindex);
 pcoinscatcher = new CCoinsViewErrorCatcher(pcoinsdbview);
 pcoinsTip = new CCoinsViewCache(pcoinscatcher);

The code starts by initializing a CoinsViewDB, which is equipped with methods to load coins from the LevelDB.
The error catcher is a little hack that can be ignored.
Next, the code initializes pcoinsTip, which is the cache representing the state of the active chain, and is backed by the database view.

''Cache vs. Database''

The FetchCoins function in coins.cpp demonstrates how the code uses the cache vs. the database:
 1 CCoinsMap::iterator it = cacheCoins.find(txid);
 2 if (it != cacheCoins.end())
 3     return it;
 4 CCoins tmp;
 5 if (!base->GetCoins(txid, tmp))
 6     return cacheCoins.end();
 7 CCoinsMap::iterator ret = cacheCoins.insert(std::make_pair(txid, CCoinsCacheEntry())).first;

* First, the code searches the cache for the coins for a given transaction id (line 1).
* If found, it returns the "fetched" coins (lines 2-3).
* If not, it searches the database (line 5).
* If found in the database, it updates the cache (line 7).
 
Note: if the cache's backend is another cache, then the term "database" really means "parent cache."
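The read-through behaviour, including the case where a cache's backend is another cache, can be sketched as follows. This is a simplified model and not the real CCoinsView classes: ViewSketch, DbViewSketch and CacheViewSketch are hypothetical names, a std::map stands in for the LevelDB, and a read counter is added purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

typedef std::string TxId;             // placeholder for a 32-byte txid
struct CoinsSketch { int height; };   // placeholder for a coins record

// Abstract view of the UTXO set.
class ViewSketch {
public:
    virtual ~ViewSketch() {}
    virtual bool GetCoins(const TxId& txid, CoinsSketch& coins) = 0;
};

// Stand-in for the database view: a std::map replaces the LevelDB.
class DbViewSketch : public ViewSketch {
public:
    std::map<TxId, CoinsSketch> store;
    int reads = 0;  // counts how often the "database" is consulted
    bool GetCoins(const TxId& txid, CoinsSketch& coins) override {
        ++reads;
        std::map<TxId, CoinsSketch>::const_iterator it = store.find(txid);
        if (it == store.end()) return false;
        coins = it->second;
        return true;
    }
};

// A cache backed by another view (a database view or another cache).
class CacheViewSketch : public ViewSketch {
public:
    explicit CacheViewSketch(ViewSketch* baseIn) : base(baseIn) {}
    bool GetCoins(const TxId& txid, CoinsSketch& coins) override {
        std::map<TxId, CoinsSketch>::iterator it = cache.find(txid);
        if (it == cache.end()) {
            CoinsSketch tmp;
            if (!base->GetCoins(txid, tmp)) return false;    // not in the backend either
            it = cache.insert(std::make_pair(txid, tmp)).first;  // remember for next time
        }
        coins = it->second;
        return true;
    }
    std::size_t CacheSize() const { return cache.size(); }
private:
    ViewSketch* base;
    std::map<TxId, CoinsSketch> cache;
};
```

Stacking one CacheViewSketch on another mirrors the validation cache sitting on top of pcoinsTip: a miss in the upper cache falls through to the lower one, and only a miss there reaches the database.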

''Flushing the Validation Cache to the Blockchain Cache''

The validation cache is flushed to the blockchain cache after connecting a block, just before it goes out of scope. The scope is the code block at main.cpp:2231-2243 in ConnectTip. That code block calls ConnectBlock, during which the code stores the new coins in the validation cache (specifically, see UpdateCoins() in main.cpp). At the end of the code block, the validation cache is flushed. Since its "parent view" is also a cache (pcoinsTip, the "blockchain cache"), the code calls the parent's ViewCache::BatchWrite, which swaps the updated coin entries into its own cache. (Polymorphism in action: later, when the blockchain cache flushes to the database view, the code will run CoinsViewDB::BatchWrite, the last line of which writes to the LevelDB.)

In summary, usage of the validation cache is straightforward: it is instantiated, used, flushed, and goes out of scope in the aforementioned code block.
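That lifecycle can be sketched as follows. It is an illustrative model and not the ConnectTip code: CacheSketch, CoinsMapSketch and ConnectTipSketch are hypothetical names, an integer count of unspent outputs stands in for a full coins record, and the validity flag replaces the real block validation.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// txid -> number of unspent outputs (a stand-in for a full coins record)
typedef std::map<std::string, int> CoinsMapSketch;

class CacheSketch {
public:
    explicit CacheSketch(CacheSketch* parentIn) : parent(parentIn) {}

    void Set(const std::string& txid, int unspent) { entries[txid] = unspent; }

    // Look in this cache first, then fall back to the parent.
    bool Get(const std::string& txid, int& unspent) const {
        CoinsMapSketch::const_iterator it = entries.find(txid);
        if (it != entries.end()) { unspent = it->second; return true; }
        return parent != nullptr && parent->Get(txid, unspent);
    }

    // Absorb a child's modified entries into this cache.
    void BatchWrite(CoinsMapSketch& changes) {
        for (CoinsMapSketch::const_iterator it = changes.begin(); it != changes.end(); ++it)
            entries[it->first] = it->second;
        changes.clear();
    }

    // Push this cache's entries to its parent.
    void Flush() {
        if (parent != nullptr) parent->BatchWrite(entries);
    }

    std::size_t Size() const { return entries.size(); }

private:
    CacheSketch* parent;
    CoinsMapSketch entries;
};

// The temporary cache lives only for the duration of this function.
bool ConnectTipSketch(CacheSketch& tip, const CoinsMapSketch& blockChanges, bool blockValid) {
    CacheSketch view(&tip);  // validation cache backed by the blockchain cache
    for (CoinsMapSketch::const_iterator it = blockChanges.begin(); it != blockChanges.end(); ++it)
        view.Set(it->first, it->second);
    if (!blockValid) return false;  // view is discarded, tip is left untouched
    view.Flush();                   // hand the changes to the blockchain cache
    return true;
}
```

Because all modifications land in the temporary cache first, a block that fails validation leaves the blockchain cache exactly as it was.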

''Flushing the Blockchain Cache to the Database''

Flushing the validation cache was simple because the code only shuffled items between two caches in memory (of which nothing outside the caching code is aware). Flushing the blockchain cache to the database is a bit more complicated. At the lowest level, the mechanics of flushing the blockchain cache (pcoinsTip) are the same as for the validation cache: the Flush() method calls BatchWrite on its backend (the "base" pointer), and in this case that means BatchWrite on the database view. Up a level, Flush() is called from FlushStateToDisk (FSTD) - main.cpp:2098. FlushStateToDisk is invoked at a few different points, with a given ''mode'':

{| class="wikitable"
|-
! Flush Mode !! Description !! When called
|-
| IF_NEEDED || Flush only if the cache is over its size limit. || Right after connecting (or disconnecting) a block and flushing the validation cache. See ConnectTip / DisconnectTip.
|-
| ALWAYS || Flush cache. || During initialization only.
|-
| PERIODIC || Here, the code considers other data points to decide whether to flush. Is the code ''almost'' over its size limit? Has it been a long time since the cache was flushed? If so, then proceed.|| At end of ActivateBestChain() (Code comment: "write changes periodically to disk, after relay").
|}

The idea is to flush the block cache frequently (to avoid having to download a large number of blocks if the program crashes), but the coins cache infrequently (in order to maximize the benefit from the coins cache.)

Specifically, the block cache is guaranteed to be flushed once an hour, whereas the coins cache is flushed once per day. (See here: [https://github.com/bitcoin/bitcoin/pull/6102#issuecomment-98847663 Sipa comment on PR 6102])
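The decision described by the flush modes can be sketched as a small function. This is an approximation written for illustration, not the FlushStateToDisk source: FlushInputs, FlushDecision and DecideFlush are hypothetical names, the one-hour and one-day intervals are the figures quoted above, and treating "almost over its size limit" as 90% of the limit is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

enum class FlushMode { IF_NEEDED, PERIODIC, ALWAYS };

struct FlushInputs {
    std::size_t cacheBytes;       // current size of the coins cache
    std::size_t cacheLimitBytes;  // configured size limit
    int64_t now;                  // current time, in seconds
    int64_t lastBlockWrite;       // when block data was last written
    int64_t lastCoinsFlush;       // when the coins cache was last flushed
};

struct FlushDecision {
    bool writeBlockData;  // write block-related state to disk
    bool flushCoins;      // flush the coins cache to the database
};

const int64_t kBlockWriteInterval = 60 * 60;       // once an hour (from the text)
const int64_t kCoinsFlushInterval = 24 * 60 * 60;  // once per day (from the text)

FlushDecision DecideFlush(FlushMode mode, const FlushInputs& in) {
    const bool periodic = (mode == FlushMode::PERIODIC);
    // IF_NEEDED: flush only when the cache is over its size limit.
    const bool cacheOver = (mode == FlushMode::IF_NEEDED) && in.cacheBytes > in.cacheLimitBytes;
    // PERIODIC: flush when almost over the limit (90% is an assumed threshold)...
    const bool cacheNearlyFull = periodic && in.cacheBytes * 10 > in.cacheLimitBytes * 9;
    // ...or when it has been a long time since the last flush.
    const bool coinsStale = periodic && in.now > in.lastCoinsFlush + kCoinsFlushInterval;
    const bool blocksStale = periodic && in.now > in.lastBlockWrite + kBlockWriteInterval;

    FlushDecision d;
    d.flushCoins = (mode == FlushMode::ALWAYS) || cacheOver || cacheNearlyFull || coinsStale;
    d.writeBlockData = d.flushCoins || blocksStale;
    return d;
}
```

The asymmetry between the two intervals expresses the stated goal: block data is written often, so little has to be re-downloaded after a crash, while the coins cache is kept in memory as long as possible.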

The FlushStateToDisk code is well-commented so for more info, the curious reader can check main.cpp.

==Raw undo data (rev*.dat)==

The undo data contains the information that is necessary to disconnect or "roll back" a block: specifically, the coins that were spent by the block in question.

So, the data being written is essentially a set of CTxOut objects. (A CTxOut is simply an amount and a script - see primitives/transaction.h:107-108).

The matter is complicated slightly by the fact that if the coin is the last one being spent by its transaction, the undo data needs to store the transaction's metadata (the txn's block height, whether it's a coinbase, and its version.) So, if you have a transaction T with outputs O1,O2,O3 spent in that order, for O1 and O2 all that will be written to the undo file is the amount and the script. For O3, the undo file will have the amount, the script, plus T's height and version, and whether T is a coinbase.
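The O1, O2, O3 example can be sketched like this. It is a simplified model and not the actual undo classes: TxOutSketch, TxInUndoSketch, CoinsSketch and SpendOutput are hypothetical names, and no serialization is shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct TxOutSketch {
    int64_t amount;
    std::string script;
};

// What gets remembered for one spent output.
struct TxInUndoSketch {
    TxOutSketch txout;  // always stored: amount and script
    bool hasMetadata;   // true only when the transaction's last unspent output was spent
    bool isCoinBase;    // the remaining fields are meaningful only if hasMetadata is true
    int height;
    int version;
};

struct CoinsSketch {
    bool isCoinBase;
    int height;
    int version;
    std::vector<TxOutSketch> outputs;
    std::vector<bool> spent;  // spent[i] tells whether outputs[i] has been spent

    std::size_t Unspent() const {
        std::size_t n = 0;
        for (std::size_t i = 0; i < spent.size(); ++i)
            if (!spent[i]) ++n;
        return n;
    }
};

// Spend output n and return the undo record for it.
// Precondition (not checked in this sketch): n is a valid, unspent output.
TxInUndoSketch SpendOutput(CoinsSketch& coins, std::size_t n) {
    TxInUndoSketch undo = {coins.outputs[n], false, false, 0, 0};
    coins.spent[n] = true;
    if (coins.Unspent() == 0) {
        // The coins record is about to disappear, so keep its metadata for rollback.
        undo.hasMetadata = true;
        undo.isCoinBase = coins.isCoinBase;
        undo.height = coins.height;
        undo.version = coins.version;
    }
    return undo;
}
```

Rolling back reverses the steps: the stored amount and script restore the output, and the metadata, when present, is what allows the removed record to be recreated.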

The undo data is written to the raw file with the following code:
fileout << blockundo; (main.cpp:1567 [UndoWriteToDisk])

This line of code calls the serialization function on the CBlockUndo - which is basically just a vector of coins (CTxOuts.) Finally, a checksum is written to the undo file. The checksum is used during initialization to verify that any undo data being checked is intact. See [https://github.com/bitcoin/bitcoin/pull/2145 Pull 2145]

The undo data is used when disconnecting a block. The DisconnectBlock() code is discussed further down this wiki page in The Blockchain: Reorganizations.

==Use of LevelDB==

LevelDB is a key-value store that was introduced to store the block index and UTXO set (chainstate) in 2012 as part of the complex "Ultraprune" pull (PR 1677). See here: [https://github.com/bitcoin/bitcoin/pull/1677/commits the 27 commits on Ultraprune].

On the subject of why LevelDB is used, core developer Greg Maxwell stated the following to the [http://bitcoin-development.narkive.com/XAPoxKZU/patch-switching-bitcoin-core-to-sqlite-db bitcoin-dev mailing list in October 2015]:

:: I think people are falling into a trap of thinking "It's a <database>, I know a <black box> for that!"; but the application and needs are very specialized here. . . It just so happens that on the back of the very bitcoin specific cryptographic consensus algorithim there was a slot where a pre-existing high performance key-value store fit; and so we're using one and saving ourselves some effort...

One might ask whether different nodes could use different databases - as long as they retrieve the same data, what's the difference? The issue here is "bug-for-bug compatibility" - if one database has a bug that causes records to not be returned under certain circumstances, then all other nodes must have the same bug, else the network could fork as a result.

Greg Maxwell stated the following in [http://bitcoin-development.narkive.com/XAPoxKZU/patch-switching-bitcoin-core-to-sqlite-db the same thread referenced above (in response to a proposal to switch to using sqlite)]:

:: ...[D]atabases sometimes have errors which cause them to fail to return records, or to return stale data. And if those exist consistency must be maintained; and "fixing" the bug can cause a divergence in consensus state that could open users up to theft.

:: Case in point, prior to leveldb's use in Bitcoin Core it had a bug that, under rare conditions, could cause it to consistently return not found on records that were really there. . . Leveldb fixed this serious bug in a minor update. But deploying a fix like this in an uncontrolled manner in the bitcoin network would potentially cause a fork in the consensus state; so any such fix would need to be rolled out in an orderly manner.

[[Category:Technical]]
[[Category:Developer]]

Bitcoin Core 0.11 (ch 3): Initialization and Startup

2016-01-13T21:09:10Z

Mrbandrews: Created page with " This page describes the Bitcoin Core code that manages startup and initialization. ==Program entry point== The program's entry point can be found in bitcoind.cpp...."

Bitcoin Core 0.11 (ch 1): Overview

2016-01-13T21:02:50Z

Mrbandrews: page creation

Bitcoin Core 0.11 (ch 2): Data Storage

2016-01-13T21:00:20Z

Mrbandrews: page creation