Thin Client Security: Difference between revisions
Cleanup thin clients |
Waldyrious (talk | contribs) tweak category sort key |
||
(66 intermediate revisions by 14 users not shown) | |||
Line 1: | Line 1: | ||
Recently there have been a number of proposals for bitcoin clients which do not store a complete copy of every block in the entire block chain. This page will refer to all such clients as "thin clients". This page is meant to be a place to try to make sense of the security and trust implications of the various schemes. | Recently there have been a number of proposals for bitcoin clients which do not store a complete copy of every block in the entire block chain. This page will refer to all such clients as "thin clients". This page is meant to be a place to try to make sense of the security and trust implications of the various schemes. | ||
== Full | == Full Node vs. Thin Clients == | ||
It is important to distinguish between block height verification and block depth verification. | |||
A full node client verifies that all preceding blocks are valid in order to guarantee that a transaction is valid. Currently only the Satoshi client, libbitcoin, and btcd do full node verification. Full nodes are the fundamental anchor of trustless security in the Bitcoin system. | |||
A client verifies the depth D of a block by checking that there are D blocks '''after''' it (also called "confirmations"), all of which are well-formed. Thin clients don't verify the preceding blocks, they use the number of confirmations (whether they are valid or not) as a measure of the likelihood of a [[Chain_Reorganization|block chain reorganization]] producing a new longer fork which excludes the transaction. | |||
== | == Full Node Clients == | ||
The "thick" bitcoin client downloads a copy of the entire chain, including all transactions (not just headers). It will be used as the reference point for security comparisons below. | |||
A full-node client uses the [[Protocol_rules#Blocks well-formed|difficultywise-longest]] valid block chain it can find. A transaction's ''depth'' (the number of blocks or confirmations ''after'' it) is used to determine the likelihood of the transaction being double-spent due to the emergence of a longer fork. | |||
==== [[bitcoind|bitcoin-qt]] ==== | |||
==== [https://github.com/conformal/btcd btcd] ==== | |||
==== [[Libbitcoin|libbitcoin-server]] ==== | |||
=== Block Retention === | |||
Once a full-chain client has downloaded the entire chain, it typically retains it (as the Satoshi client did/does). | |||
Satoshi's original paper mentions the possibility of pruning individual transactions, which allows for full nodes which verify the entire transaction history but do not retain it. Because users are required to download and verify the block chain from some other node initially, this change isn't costless. | |||
== Thin Clients == | |||
This client downloads a complete copy of the headers for all blocks in the entire block chain. This means that the download and storage requirements scale linearly with the amount of time since Bitcoin was invented. | |||
This scheme is described in section 8 of the [http://bitcoin.org/bitcoin.pdf original bitcoin whitepaper]. | |||
==== Block Depth Check ==== | |||
As Satoshi writes, "[the thin client] can't check the transaction for himself, but by linking it to a place in the chain, he can see that a network node has accepted it, and blocks added after it further confirm the network has accepted it." If we take "X" to be the "number of blocks added after it", then a thin client essentially trusts that a transaction X blocks deep will be costly to forge. | |||
This is | This is very different from the trust model in the "thick" client: the thick client verifies that a transaction's inputs are unspent by actually checking the whole chain up to that point -- there is no "X blocks deep" involved here. At that point it uses "X blocks deep" to decide how likely it is that a longer fork in the chain will emerge which excludes that transaction. | ||
==== [[BitCoinJ|bitcoinj]] ==== | |||
A security analysis of some of the issues in | A security analysis of some of the issues in bitcoinj can be found [https://bitcoinj.github.io/security-model here]; however: | ||
* The claim that "picking 10 nodes and requiring all of them to be consistent needs much less trust" overlooks the problem of [https://en.bitcoin.it/wiki/Weaknesses#Cancer_nodes "cancer nodes"] and [http://en.wikipedia.org/wiki/Sybil_attack Sybil attacks]. | * The claim that "picking 10 nodes and requiring all of them to be consistent needs much less trust" overlooks the problem of [https://en.bitcoin.it/wiki/Weaknesses#Cancer_nodes "cancer nodes"] and [http://en.wikipedia.org/wiki/Sybil_attack Sybil attacks]. | ||
* Many of the security claims are qualified by some form of "if you don't think an attacker controls your internet connection"; see the previous section for a discussion of why this is problematic. | * Many of the security claims are qualified by some form of "if you don't think an attacker controls your internet connection"; see the previous section for a discussion of why this is problematic. | ||
=== | ==== [https://bitcointalk.org/index.php?topic=128055.0 picocoin] ==== | ||
The library (libccoin) that picocoin is based on includes code for validating scripts and blocks; this could potentially be used to implement a full-chain client. | |||
==== [[Electrum]] ==== | |||
Electrum fetches blockchain information from Electrum servers, bitcoin nodes that index the blockchain by address. | |||
Electrum performs Simple Payment Verification to check the transactions returned by servers. | |||
For this, it fetches blokchain headers from about 10 random servers. | |||
In addition, Electrum servers are authenticated by SSL, in order to protect users from MITM attacks. | |||
=== Unused Output Tree in the Block chain (UOT) === | |||
There have been several proposals (the first appears to be [https://bitcointalk.org/index.php?topic=21995.0 this one] by gmaxwell, who called it an "open transaction tree", although the term "open" is now taken to mean "not yet mined into the block chain" rather than "unspent") to form a tree of unused transaction outputs at each block in the chain, hash it as a Merkle tree, and encode the root hash in the block chain (probably as part of the coinbase input). This will be called an Unused Output Tree (UOT). The first detailed proposal so far appears to be [https://en.bitcoin.it/wiki/User:DiThi/MTUT Alberto Torres' proposal]; etotheipi's [https://bitcointalk.org/index.php?topic=88208.0 ultimate block chain compression] is a variant of this. | |||
If such UOT hashes were included in the block chain, a client which shipped with a [https://en.bitcoin.it/wiki/Vocabulary checkpoint] block that had a UOT would only need to download blocks after the checkpoint. Moreover, once the client had downloaded those blocks and confirmed their UOTs, it could discard all but the most recent block containing a UOT. | |||
Hostile miners may insert blocks into the chain which have what claims to be a UOT, but which is actually invalid. It is unlikely that such blocks could be kept out of the chain because, again, this would require adding a new block validity criterion, and miners implementing this new criterion would risk "mining on the wrong side" of a fork, which could cost them a lot of money. Therefore, any UOT strategy would need to cope with the fact that not every block containing a UOT entry can be trusted. | Hostile miners may insert blocks into the chain which have what claims to be a UOT, but which is actually invalid. It is unlikely that such blocks could be kept out of the chain because, again, this would require adding a new block validity criterion, and miners implementing this new criterion would risk "mining on the wrong side" of a fork, which could cost them a lot of money. Therefore, any UOT strategy would need to cope with the fact that not every block containing a UOT entry can be trusted. | ||
Note that at the present moment no standard format for such Unused Output Tree hashes has been agreed upon, nor | Note that at the present moment no standard format for such Unused Output Tree hashes has been agreed upon, nor do any of the blocks in the chain contain them. The [https://bitcointalk.org/index.php?topic=91954 ultraprune] feature added to bitcoind-0.8 maintains a similar data structure on the client's disk. It does not put this data structure or its hash anywhere in the block chain. | ||
== Server-Trusting Clients == | == Server-Trusting Clients == | ||
These clients involve | These clients involve a high level of trust in the server they rely upon. Mechanisms for authenticating the server, and for confirming that the server has not been compromised, are usually not explained. | ||
All thin clients listed below currently connect to a single server, and are vulnerable to an attack similar to a double-spend. The attack can be run by that single server - the server can just lie to them that they received a Bitcoin transaction, and they, assuming the server does not lie, perform some service, transfer funds or send goods without actually receiving any Bitcoin in exchange. Therefore, they are implicitly trusting it. | |||
Future enhancements have been suggested that will have the client talk to multiple servers and broadcast transactions and query all of them. Unfortunately it is well known to security researchers that this does not actually increase security; it simply makes the exploits more complicated and difficult to find. Security researchers have a name for this phenomenon: it is called a "Sybil attack"<ref>http://en.wikipedia.org/wiki/Sybil_attack</ref>. [https://bitcointalk.org/index.php?topic=88208.msg975201#msg975201 This post] on bitcointalk explains how some governments (notably Iran and China) already perform these sorts of attacks on their own citizens, with the coerced assistance of SSL certificate authorities. | |||
Clients with a checkpoint (even a very old one) that download and validate the headers for the whole block chain are [http://bitcoinmedia.com/the-irc-bootstrap-method-is-flawed/#comment-4243 not vulnerable] to Sybil attacks in the following sense: they can always ensure that an attack would cost more than the amount being stolen. | |||
=== [[BCCAPI]] === | |||
== Other == | == Other == | ||
Line 62: | Line 84: | ||
* A [http://sourceforge.net/mailarchive/message.php?msg_id=28633866 thread] on bitcoin-dev | * A [http://sourceforge.net/mailarchive/message.php?msg_id=28633866 thread] on bitcoin-dev | ||
* A [http://bitcoin.stackexchange.com/questions/2584/is-reclaiming-disk-space-already-implemented-how-effective-will-it-be/2589 question] on bitcoin.stackexchange.com | * A [http://bitcoin.stackexchange.com/questions/2584/is-reclaiming-disk-space-already-implemented-how-effective-will-it-be/2589 question] on bitcoin.stackexchange.com | ||
* The [ | * The [[Weaknesses#Sybil_attack|sybil attack (also known as "cancer nodes")]] paragraph explains some of the issues with thin clients that base security on trusting whatever "a majority of the IP addresses I can see" say. | ||
* [http://bitcoin.stackexchange.com/questions/2613/how-secure-are-various-models-of-bitcoin-clients related discussion on Stack Exchange] | |||
* A hypothesized [https://bitcointalk.org/index.php?topic=134318.msg1441171#msg1441171 intermediate security class] between thin clients and full-chain validation. | |||
<references> | |||
[[Category:Technical]] | [[Category:Technical]] | ||
[[category:Clients]] | [[category:Clients| ]] | ||
[[Category:Security]] |
Latest revision as of 08:20, 22 May 2018
Recently there have been a number of proposals for bitcoin clients which do not store a complete copy of every block in the entire block chain. This page will refer to all such clients as "thin clients". This page is meant to be a place to try to make sense of the security and trust implications of the various schemes.
Full Node vs. Thin Clients
It is important to distinguish between block height verification and block depth verification.
A full node client verifies that all preceding blocks are valid in order to guarantee that a transaction is valid. Currently only the Satoshi client, libbitcoin, and btcd do full node verification. Full nodes are the fundamental anchor of trustless security in the Bitcoin system.
A client verifies the depth D of a block by checking that there are D blocks after it (also called "confirmations"), all of which are well-formed. Thin clients don't verify the preceding blocks, they use the number of confirmations (whether they are valid or not) as a measure of the likelihood of a block chain reorganization producing a new longer fork which excludes the transaction.
Full Node Clients
The "thick" bitcoin client downloads a copy of the entire chain, including all transactions (not just headers). It will be used as the reference point for security comparisons below.
A full-node client uses the difficultywise-longest valid block chain it can find. A transaction's depth (the number of blocks or confirmations after it) is used to determine the likelihood of the transaction being double-spent due to the emergence of a longer fork.
bitcoin-qt
btcd
libbitcoin-server
Block Retention
Once a full-chain client has downloaded the entire chain, it typically retains it (as the Satoshi client did/does).
Satoshi's original paper mentions the possibility of pruning individual transactions, which allows for full nodes which verify the entire transaction history but do not retain it. Because users are required to download and verify the block chain from some other node initially, this change isn't costless.
Thin Clients
This client downloads a complete copy of the headers for all blocks in the entire block chain. This means that the download and storage requirements scale linearly with the amount of time since Bitcoin was invented.
This scheme is described in section 8 of the original bitcoin whitepaper.
Block Depth Check
As Satoshi writes, "[the thin client] can't check the transaction for himself, but by linking it to a place in the chain, he can see that a network node has accepted it, and blocks added after it further confirm the network has accepted it." If we take "X" to be the "number of blocks added after it", then a thin client essentially trusts that a transaction X blocks deep will be costly to forge.
This is very different from the trust model in the "thick" client: the thick client verifies that a transaction's inputs are unspent by actually checking the whole chain up to that point -- there is no "X blocks deep" involved here. At that point it uses "X blocks deep" to decide how likely it is that a longer fork in the chain will emerge which excludes that transaction.
bitcoinj
A security analysis of some of the issues in bitcoinj can be found here; however:
- The claim that "picking 10 nodes and requiring all of them to be consistent needs much less trust" overlooks the problem of "cancer nodes" and Sybil attacks.
- Many of the security claims are qualified by some form of "if you don't think an attacker controls your internet connection"; see the previous section for a discussion of why this is problematic.
picocoin
The library (libccoin) that picocoin is based on includes code for validating scripts and blocks; this could potentially be used to implement a full-chain client.
Electrum
Electrum fetches blockchain information from Electrum servers, bitcoin nodes that index the blockchain by address. Electrum performs Simple Payment Verification to check the transactions returned by servers. For this, it fetches blokchain headers from about 10 random servers. In addition, Electrum servers are authenticated by SSL, in order to protect users from MITM attacks.
Unused Output Tree in the Block chain (UOT)
There have been several proposals (the first appears to be this one by gmaxwell, who called it an "open transaction tree", although the term "open" is now taken to mean "not yet mined into the block chain" rather than "unspent") to form a tree of unused transaction outputs at each block in the chain, hash it as a Merkle tree, and encode the root hash in the block chain (probably as part of the coinbase input). This will be called an Unused Output Tree (UOT). The first detailed proposal so far appears to be Alberto Torres' proposal; etotheipi's ultimate block chain compression is a variant of this.
If such UOT hashes were included in the block chain, a client which shipped with a checkpoint block that had a UOT would only need to download blocks after the checkpoint. Moreover, once the client had downloaded those blocks and confirmed their UOTs, it could discard all but the most recent block containing a UOT.
Hostile miners may insert blocks into the chain which have what claims to be a UOT, but which is actually invalid. It is unlikely that such blocks could be kept out of the chain because, again, this would require adding a new block validity criterion, and miners implementing this new criterion would risk "mining on the wrong side" of a fork, which could cost them a lot of money. Therefore, any UOT strategy would need to cope with the fact that not every block containing a UOT entry can be trusted.
Note that at the present moment no standard format for such Unused Output Tree hashes has been agreed upon, nor do any of the blocks in the chain contain them. The ultraprune feature added to bitcoind-0.8 maintains a similar data structure on the client's disk. It does not put this data structure or its hash anywhere in the block chain.
Server-Trusting Clients
These clients involve a high level of trust in the server they rely upon. Mechanisms for authenticating the server, and for confirming that the server has not been compromised, are usually not explained.
All thin clients listed below currently connect to a single server, and are vulnerable to an attack similar to a double-spend. The attack can be run by that single server - the server can just lie to them that they received a Bitcoin transaction, and they, assuming the server does not lie, perform some service, transfer funds or send goods without actually receiving any Bitcoin in exchange. Therefore, they are implicitly trusting it.
Future enhancements have been suggested that will have the client talk to multiple servers and broadcast transactions and query all of them. Unfortunately it is well known to security researchers that this does not actually increase security; it simply makes the exploits more complicated and difficult to find. Security researchers have a name for this phenomenon: it is called a "Sybil attack"[1]. This post on bitcointalk explains how some governments (notably Iran and China) already perform these sorts of attacks on their own citizens, with the coerced assistance of SSL certificate authorities.
Clients with a checkpoint (even a very old one) that download and validate the headers for the whole block chain are not vulnerable to Sybil attacks in the following sense: they can always ensure that an attack would cost more than the amount being stolen.
BCCAPI
Other
- A thread on bitcoin-dev
- A question on bitcoin.stackexchange.com
- The sybil attack (also known as "cancer nodes") paragraph explains some of the issues with thin clients that base security on trusting whatever "a majority of the IP addresses I can see" say.
- related discussion on Stack Exchange
- A hypothesized intermediate security class between thin clients and full-chain validation.
<references>