Protocol documentation

From Bitcoin Wiki
Revision as of 20:54, 15 January 2011 by Dooglus (talk | contribs) (→‎addr: typo)
Jump to navigation Jump to search

Sources:

Type names used in this documentation are from the C99 standard.

Common standards

Hashes

Usually, when a hash is computed within bitcoin, it is computed twice. Most of the time SHA-256 hashes are used, however RIPEMD-160 is also used when a shorter hash is desireable (for example when creating a bitcoin address).

Example of double-SHA-256 encoding of string "hello":

hello
2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824 (first round of sha-256)
9595c9df90075148eb06860365df33584b75bff782a510c6cd4883a419833d50 (second round of sha-256)

For bitcoin addresses (RIPEMD-160) this would give:

hello
2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824 (first round is sha-256)
b6a9c8c230722b7c748331a8b450f05566dc7d0f (with ripemd-160)

Signatures

Bitcoin uses Elliptic Curve Digital Signature Algorithm (ECDSA) to sign transactions.

For ECDSA the secp256k1 curve from http://www.secg.org/collateral/sec2_final.pdf is used. Public keys (in scripts) are given as 04 <x> <y> where x and y are 32 byte strings representing the coordinates of a point on the curve.

Transaction Verification

The first transaction of a block is usually the generating transaction, which do not include any "in" transaction, and generate bitcoins (from fees for example) usually received by whoever solved the block containing this transaction. Such transactions are called a "coinbase transaction" and are accepted by bitcoin clients without any need to execute scripts, provided there is only one per block.

If a transaction is not a coinbase, it references previous transaction hashes as input, and the index of the other transaction's output used as input for this transaction. The script from the in part of this transaction is executed. Then the script from the out part of the referenced transaction is executed. It is considered valid if the top element of the stack is true.

Addresses

A bitcoin address is in fact the hash of a ECDSA public key, computed this way:

Version = 1 byte of 0 (zero); on the test network, this is 1 byte of 111
Key hash = Version concatenated with RIPEMD-160(SHA-256(public key))
Checksum = 1st 4 bytes of SHA-256(SHA-256(Key hash))
Bitcoin Address = Base58Encode(Key hash concatenated with Checksum)

The Base58 encoding used is home made, and has some differences. Especially, leading zeroes are kept as single zeroes when conversion happens.

Common structures

Almost all integers are encoded in little endian. Only IP or port number are encoded big endian.

Message structure

Field Size Description Data type Comments
4 magic uint32_t Magic value indicating message origin network, and used to seek to next message when stream state is unknown
12 command char[12] ASCII string identifying the packet content, NULL padded (non-NULL padding results in packet rejected)
4 length uint32_t Length of payload
4 checksum uint32_t First 4 bytes of sha256(sha256(payload))
? payload char[] The actual data (can be empty, in which case checksum is excluded, and length is set to 0

Known magic values:

Network Magic value
main F9BEB4D9
testnet FABFB5DA

Variable length integer

Integer can be encoded depending on the represented value to save space. Variable length integers always precede an array/vector of a type of data that may vary in length.

Value Storage length Format
< 0xfd 1 uint8_t
<= 0xffff 3 0xfd + uint16_t
<= 0xffffffff 5 0xfe + uint32_t
- 9 0xff + uint64_t

Variable length string

Variable length string can be stored using a variable length integer followed by the string itself.

Field Size Description Data type Comments
? length var_int Length of the string
? string char[] The string itself (can be empty)

Network address

When a network address is needed somewhere, this structure is used. This protocol and structure supports IPv6, but note that the official client currently only supports IPv4 networking.

Field Size Description Data type Comments
8 services uint64_t same service(s) listed in version?
16 IPv6/4 char[16] IPv6 address. Network byte order. The official client only supports IPv4 and only reads the last 4 bytes to get the IPv4 address. However, the official client writes the IPv4 address into the message as a 16 byte IPv4-mapped IPv6 address

(12 bytes 00 00 00 00 00 00 00 00 00 00 FF FF, followed by the 4 bytes of the IPv4 address).

2 port uint16_t port number, network byte order

Hexdump example of Network address structure

0000   01 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
0010   00 00 FF FF 0A 00 00 01  20 8D                    ........ .

01 00 00 00 00 00 00 00                         - 1 (NODE_NETWORK? see services listed under version command)
00 00 00 00 00 00 00 00 00 00 FF FF 0A 00 00 01 - IPv6: ::ffff:10.0.0.1 or IPv4: 10.0.0.1
20 8D                                           - Port 8333

Inventory Vectors

Inventory vectors are used for notifying other nodes about objects they have or data which is being requested.

Inventory vectors consist of the following data format:

Field Size Description Data type Comments
4 type uint32_t Identifies the object type linked to this inventory
32 hash char[32] Hash of the object


The object type is currently defined as one of the following possibilities:

Value Name Description
0 ERROR Any data of with this number may be ignored
1 MSG_TX Hash is related to a transaction
2 MSG_BLOCK Hash is related to a data block

Other Data Type values are considered reserved for future implementations.

Message types

version

When a node receives an incoming connection, it will immediatly advertise its version. No futher communication is possible until both peers have exchanged their version.

Payload:

Field Size Description Data type Comments
4 version uint32_t Identifies protocol version being used by the node
8 services uint64_t bitfield of features to be enabled for this connection
8 timestamp uint64_t standard UNIX timestamp in seconds
26 addr_me net_addr The network address of the node emitting this message
version >= 106
26 addr_you net_addr The network address seen by the node emitting this message (ie, the address of the receiving node)
8 nonce uint64_t Node random unique id. This id is used to detect connections to self
? sub_version_num var_str Secondary Version information (null terminated?)
version >= 209
4 start_height uint32_t The last block received by the emitting node

If the emitter of the packet has version >= 209, a "verack" packet shall be sent if the version packet was accepted.

The following services are currently assigned:

Value Name Description
1 NODE_NETWORK This node can be asked for full blocks instead of just headers.

Hexdump example of version message (note the message header for this version message does not have a checksum):

0000   F9 BE B4 D9 76 65 72 73  69 6F 6E 00 00 00 00 00   ....version.....
0010   55 00 00 00 9C 7C 00 00  01 00 00 00 00 00 00 00   U....|..........
0020   E6 15 10 4D 00 00 00 00  01 00 00 00 00 00 00 00   ...M............
0030   00 00 00 00 00 00 00 00  00 00 FF FF 0A 00 00 01   ................
0040   DA F6 01 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0050   00 00 00 00 FF FF 0A 00  00 02 20 8D DD 9D 20 2C   .......... ... ,
0060   3A B4 57 13 00 55 81 01  00                        :.W..U...

Message header:
F9 BE B4 D9                                                                   - Main network magic bytes
76 65 72 73 69 6F 6E 00 00 00 00 00                                           - "version" command
55 00 00 00                                                                   - Payload is 85 bytes long
                                                                              - No checksum in version message
Version message:
9C 7C 00 00                                                                   - 31900 (version 0.3.19)
01 00 00 00 00 00 00 00                                                       - 1 (NODE_NETWORK services)
E6 15 10 4D 00 00 00 00                                                       - Mon Dec 20 21:50:14 EST 2010
01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 FF FF 0A 00 00 01 DA F6 - Sender address info - see Network Address
01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 FF FF 0A 00 00 02 20 8D - Recipient address info - see Network Address
DD 9D 20 2C 3A B4 57 13                                                       - Node random unique ID
00                                                                            - "" sub-version string (string is 0 bytes long)
55 81 01 00                                                                   - Last block sending node has is block #98645

verack

The verack message is sent in reply to version for clients >= 209. This message consists of only a message header with the command string "verack".

Hexdump of the verack message:

0000   F9 BE B4 D9 76 65 72 61  63 6B 00 00 00 00 00 00   ....verack......
0010   00 00 00 00                                        ....

Message header:
F9 BE B4 D9                          - Main network magic bytes
76 65 72 61  63 6B 00 00 00 00 00 00 - "verack" command
00 00 00 00                          - Payload is 0 bytes long

addr

Provide information on known nodes of the network. Non-advertised nodes should be forgotten after typically 3 hours

Payload (maximum payload length: 1000 bytes):

Field Size Description Data type Comments
1+ count var_int Number of address entries
26x? addr_list net_addr[] Address of other nodes on the network. version < 209 will only read the first one

Note: Starting version 31402, addresses are prefixed with a timestamp. If no timestamp is present, the addresses should not be relayed to other peers, unless it is indeed confirmed they are up.

Hexdump example of addr message:

0000   F9 BE B4 D9 61 64 64 72  00 00 00 00 00 00 00 00   ....addr........
0010   1F 00 00 00 7F 85 39 C2  01 E2 15 10 4D 01 00 00   ......9.....M...
0020   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 FF   ................
0030   FF 0A 00 00 01 20 8D                               .D(.. .

Message Header:
F9 BE B4 D9                                     - Main network magic bytes
61 64 64 72  00 00 00 00 00 00 00 00            - "addr"
1F 00 00 00                                     - payload is 31 bytes long
7F 85 39 C2                                     - checksum of payload

Payload:
01                                              - 1 address in this message

Address:
E2 15 10 4D                                     - Mon Dec 20 21:50:10 EST 2010 (only when version is >= 31402)
01 00 00 00 00 00 00 00                         - 1 (NODE_NETWORK service - see version message)
00 00 00 00 00 00 00 00 00 00 FF FF 0A 00 00 01 - IPv4: 10.0.0.1, IPv6: ::ffff:10.0.0.1 (IPv4-mapped IPv6 address)
20 8D                                           - port 8333

inv

Allows a node to advertise its knowledge of one or more objects. It can be received unsolicited, or in reply to getblocks.

Payload (maximum payload length: 50000 bytes):

Field Size Description Data type Comments
? count var_int Number of inventory entries
36x? inventory inv_vect[] Inventory vectors

getdata

getdata is used in response to inv, to retrieve the content of a specific object, and is usually sent after receiving an inv packet, after filtering known elements.

Payload (maximum payload length: 50000 bytes):

Field Size Description Data type Comments
? count var_int Number of inventory entries
36x? inventory inv_vect[] Inventory vectors

getblocks

Return an inv packet containing the list of blocks starting at hash_start, up to hash_stop or 500 blocks, whichever comes first. To receive the next blocks hashes, one needs to issue getblocks again with the last known hash.

Payload:

Field Size Description Data type Comments
4 version uint32_t only present if nType has SER_GETHASH set (purpose unknown)
1+ start count var_int number of hash_start entries
32+ hash_start char[32] hash of the last known block of the emitting node
32 hash_stop char[32] hash of the last desired block. Set to zero to get as many blocks as possible (500)

tx

tx describes a bitcoin transaction, in reply to getdata


Field Size Description Data type Comments
4 version uint32_t Transaction data format version
1+ tx_in count var_int Number of Transaction inputs
41+ tx_in tx_in[] A list of 1 or more transaction inputs or sources for coins
1+ tx_out count var_int Number of Transaction outputs
8+ tx_out tx_out[] A list of 1 or more transaction outputs or destinations for coins
4 lock_time uint32_t The block number or timestamp at which this transaction is locked, or 0 if the transaction is always locked. A non-locked transaction must not be included in blocks, and it can be modified by broadcasting a new version before the time has expired (replacement is currently disabled in Bitcoin, however, so this is useless).

TxIn consists of the following fields:

Field Size Description Data type Comments
36 previous_output outpoint The previous output transaction reference, as an OutPoint structure
1+ script length var_int The length of the signature script
? signature script char[] Computational Script for confirming transaction authorization
4 sequence uint32_t Transaction version as defined by the sender. Intended for "replacement" of transactions when information is updated before inclusion into a block.

The OutPoint structure consists of the following fields:

Field Size Description Data type Comments
32 hash char[32] The hash of the referenced transaction.
4 index uint32_t The index of the specific output in the transaction. The first output is 0, etc.

The Script structure consists of a series of pieces of information and operations related to the value of the transaction.

(Structure to be expanded in the future… see script.h and script.cpp for more information)

The TxOut structure consists of the following fields:

Field Size Description Data type Comments
8 value uint64_t Transaction Value
1+ pk_script length var_int Length of the pk_script
? pk_script char[] Usually contains the public key as a Bitcoin script setting up conditions to claim this output.

block

The block message is sent in response to a getdata message which requests transaction information from a block hash.

Field Size Description Data type Comments
4 version uint32_t Block version information, based upon the software version creating this block
32 prev_block char[32] The hash value of the previous block this particular block references
32 merkle_root char[32] The reference to a Merkle tree collection which is a hash of all transactions related to this block
4 timestamp uint32_t A timestamp recording when this block was created (Limited to 2038!)
4 bits uint32_t The calculated difficulty target being used for this block
4 nonce uint32_t The nonce used to generate this block… to allow variations of the header and compute different hashes
? txn_count var_int Number of transaction entries
? txns tx[] Block transactions, in format of "tx" command

getaddr

The getaddr message sends a request to a node asking for information about known active peers to help with identifying potential nodes in the network. The response to receiving this message is to transmit an addr message with one or more peers from a database of known active peers. The typical presumption is that a node is likely to be active if it has been sending a message within the last three hours.

No additional data is transmitted with this message.

Note: Even though this message does not contain a payload, the official client includes a checksum in the header anyway. Possibly a minor bug?

checkorder

This message is used for IP Transactions, to ask the peer if it accepts such transactions and allow it to look at the content of the order.

It contains a CWalletTx object

Payload:

Field Size Description Data type Comments
Fields from CMerkleTx
? hashBlock
? vMerkleBranch
? nIndex
Fields from CWalletTx
? vtxPrev
? mapValue
? vOrderForm
? fTimeReceivedIsTxTime
? nTimeReceived
? fFromMe
? fSpent

submitorder

Confirms an order has been submitted.

Payload:

Field Size Description Data type Comments
32 hash char[32] Hash of the transaction
? wallet_entry CWalletTx Same payload as checkorder

reply

Generic reply for IP Transactions

Payload:

Field Size Description Data type Comments
4 reply uint32_t reply code

Possible values:

Value Name Description
0 SUCCESS The IP Transaction can proceed (checkorder), or has been accepted (submitorder)
1 WALLET_ERROR AcceptWalletTransaction() failed
2 DENIED IP Transactions are not accepted by this node

ping

The ping message is sent primarily to confirm that the TCP/IP connection is still valid. An error in transmission is presumed to be a closed connection and the address is removed as a current peer. No reply is expected as a result of this message being sent nor any sort of action expected on the part of a client when it is used.

alert

An alert is sent between nodes to send a general notification message throughout the network. If the alert can be confirmed with the signature as having come from the the core development group of the Bitcoin software, the message is suggested to be displayed for end-users. Attempts to perform transactions, particularly automated transactions through the client, are suggested to be halted. The text in the Message string should be relayed to log files and any user interfaces.

Payload:

Field Size Description Data type Comments
? message var_str System message which is coded to convey some information to all nodes in the network
? signature var_str A signature which can be confirmed with a public key verifying that it is Satoshi (the originator of Bitcoins) who has "authorized" or created the message

The signature is to be compared to this ECDSA public key:

04fc9702847840aaf195de8442ebecedf5b095cdbb9bc716bda9110971b28a49e0ead8564ff0db22209e0374782c093bb899692d524e9d6a6956e7c5ecbcd68284
(hash) 1AGRxqDa5WjUKBwHB9XYEjmkv1ucoUUy1s

Source: [1]

Scripting

See script.