This page explains hashcash and how bitcoin uses it.
- 1 Hashcash
- 2 Hashcash function
Bitcoin uses the hashcash Proof_of_work function as the mining core. All bitcoin miners whether CPU, GPU, FPGA or ASICs are expending their effort creating hashcash proofs-of-work which act as a vote in the blockchain evolution and validate the blockchain transaction log.
Like many cryptographic algorithms hashcash uses a hash function as a building block, in the same way that HMAC, or RSA signatures are defined on a pluggable hash-function (commonly denoted by the naming convention of algorithm-hash: HMAC-SHA1, HMAC-MD5, HMAC-SHA256, RSA-SHA1, etc), hashcash can be instantiated with different functions, hashcash-SHA1 (original), hashcash-SHA256^2 (bitcoin), hashcash-Scrypt(iter=1) (litecoin).
The hashcash proof-of-work function was invented in 1997 by Adam Back, and proposed for anti-DoS uses including preventing: anonymous remailer and mail2news gateway abuse, nym name squatting on nymservers (replyable pseudonymous remailer severs), as well as general email anti-spam and general network abuse throttling. Before bitcoin, hashcash was used by SpamAssasin, and (with an incompatible format) by Microsoft (with the name "email postmark") in hotmail, exchange, outlook etc and by i2p anonymity network, mixmaster anonymous remailer components and other systems. Hashcash was also used by Hal Finney's bitcoin precursor RPOW as a way to mine coins. Wei Dai's B-money Proposal, and Nick Szabo's similar Bit_Gold_proposal bitcoin precursors, also were proposed in the context of hashcash mining.
Hash function choices
In the original 1997 algorithm hashcash used SHA1 because at that time, this was the defacto and NIST recommended hash, and the previous defacto hash MD5 had recently started to show signs of weakness. Bitcoin being specified/released in 2008/2009 uses SHA256. There is actually no strong reason SHA1 would not have worked also, hashcash relies only on the hash partial preimage resistance property (security up to hash-size, 160-bit with SHA1) and not birthday collision hardness (security up to 80-bit), so the SHA1 hash is big enough. Bitcoin is anyway built to 128-bit security because 256-bit ECDSA is used, which also offers 128-bit security. Never the less SHA256 is the correct and more conservative choice because even SHA1 has started to show some weakenesses, though only in birthday collision, not in 2nd-preimage.
Bitcoin is using two hash iterations (denoted SHA256^2 ie "SHA256 function squared") and the reason for this relates to a partial attack on the smaller but related SHA1 hash. SHA1's resistance to birthday attacks has been partially broken as of 2005 in O(2^64) vs the design O(2^80). While hashcash relies on pre-image resistance and so is not vulnerable to birthday attacks, a generic method of hardening SHA1 against the birthday collision attack is to iterate it twice. A comparable attack on SHA256 does not exist so far, however as the design of SHA256 is similar to SHA1 it is probably defensive for applications to use double SHA256. And this is what bitcoin does, it is not necessary given hashcash reliance on preimage security, but it is a defensive step against future cryptanalytic developments. The attack on SHA1 and in principle other hashes of similar design like SHA256, was also the motivation for the NIST SHA3 design competition which is still ongoing.
Once the NIST SHA3 contest has finalised, bitcoin might in the future consider adopting hashcash-SHA3 as a security upgrade (eg a single invocation of SHA3 vs a double invocation of SHA256). It seems clear from the SHA1 break, and SHA256 is a similar design, that there was previously a misunderstanding about the security of hash functions against birthday collisions, and SHA3 finalists all aim to fix that issue. One aspect of relevance for hashcash-SHA3 is that there is some debate within the NIST comments process on the proposal of weakening SHA3's resistance to pre-image attacks down to 128-bit (vs the full hash size as with previous hashes). The motivation is a small performance gain, with the rationale that some hash-pluggable algorithms do not rely on full-length pre-image resistance. The proposal has met with significant negative feedback due to it creating a non-standard security assumption (compared to all previous hashes), and therefore it creates risk and all hash-pluggable algorithms (like HMAC, RSA, DSA, hashcash etc) would need to be re-examined on a case by case basis to see if SHA3 is safe to use with them; from the balance of the feedback it seems probable that NIST will accept the feedback and SHA3 will retain the full 256-bit pre-image resistance.
A practical issue with switching to hashcash-SHA3 is that it would invalidate all existing ASIC mining hardware, and so is a change that would unlikely to be made except in the face of security risk; there is no indication that SHA1 or SHA256, or SHA256^2 are vulnerable to pre-image attack so the motivation is missing absent new cryptanalytic developments. In addition even if SHA256^2 became easier due to cryptanalytic attack, and miners started using whatever the new algorithmic approach was, it does not necessarily matter as difficulty would just adapt to it. One likely side-effect however would be that it would introduce more memory or pre-computation tradeoffs which could make ASICs unprofitable, or give advantages to people with large resources to do the pre-computations. Pre-computation advantages would perhaps be enough motivation to replace the hash with SHA3. Anyway this is all speculation if and until any pre-image affecting cryptanalytic attacks are found on SHA256.
The hashcash algorithm is relatively simple to understand. The idea builds on a security property of cryptographic hashes, that they are designed to be hard to invert (so-called one-way or pre-image resistant property). You can compute y from x cheaply y=H(x) but it's very hard to find x given only y. A full hash inversion has a known computationally infeasible brute-force running time, being O(2^k) where k is the hash size eg SHA256, k=256, and if a pre-image was found anyone could very efficiently verify it by computing one hash, so there is a huge asymmetry in full pre-image mining (computationally infeasible) vs verification (a single hash invocation).
A second hash pre-image means given one-preimage x of hash y where y=H(x), the task is to find another pre-image of hash y: x' so that y=H(x'). This is not to be confused with a birthday collision which is to find two values x, x' so that H(x)=H(x'), this can be done in much lower work O(sqrt(2^k))=O(2^(k/2)) because you can proceed by computing many H(x) values and storing them until you find a matching pair. It takes a lot of memory, but there are memory-time tradeoffs.
Version 0 of hashcash protocol (1997) used a partial 2nd pre-image, however the later version 1 (2002) uses partial pre-images of a fairly chosen string, rather than digits of pi or something arbitrary, 0^k (ie all 0 string) is used for convenience, so the work is to find x such that H(x)=0. This is also equally fair and only requires one hash invocation to verify vs two with 2nd partial-pre-images. (This optimisation was proposed by Hal Finney & independently by Thomas Boschloo). To make the work easier the definition of a partial-pre-image is to find x such that H(x)/2^(n-k) = 0 where / is the integer quotient from division, n is the size of the hash output (n=256-bits for SHA256) and k is the work factor, ie, the first k bits of the hash output are 0 . So for example k=20 requires average 1 million tries. It is actually the output that partially matches, not the pre-image, so could perhaps more accurately called a pre-image with a partial output match, however partial pre-image effectively a short-hand for that.
If the partial-pre-image x from y=H(x) is random it is just a disconnected proof-of-work to no purpose, everyone can see you did do the work, but they don't know why, so users could reuse the same work for different services. To make the proof-of-work be bound to a service, or purpose, the hash must include s, a service string so the work becomes to find H(s,c)/2^(n-k)=0. The miner varies counter c until this is true. The service string could be a web server domain name, a recipients email address, or in bitcoin a block of the bitcoin blockchain ledger.
One additional problem is that if multiple people are mining, using the same service string, they must not start with the same x or they may end up with the same proof, and anyone looking at it will not honor a duplicated copy of the same work as it could have been copied without work, the first to present it will be rewarded, and others will find their work rejected. To avoid risking wasting work in this way, there needs to be a random starting point, and so the work becomes to find H(s,x,c)/2^(n-k) = 0 where x is random (eg 128-bits to make it statistically infeasible for two users to maliciously or accidentally start at the same point), and c is the counter being varied, and s is the service string.
This is what hashcash version 1 and bitcoin does. In fact in bitcoin the service string is the coinbase and the coinbase includes the recipients reward address, as well as the transactions to validate in the block. Bitcoin actually does not include a random start point x, reusing the reward address as the randomization factor to avoid collisions for this random start point purpose, which saves 16-bytes of space in the coinbase. For privacy bitcoin expect the miner to use a different reward address on each successful block.
More Precise Work
Hashcash as originally proposed has work 2^k where k is an integer, this means difficulty can only be scaled in powers of 2, this is slightly simpler as you can see and fully measure the difficulty just by counting 0s in hex/binary and was adequate for prior uses. (A lot of hashcash design choices are motivated by simplicity).
But because bitcoin needs more precise and dynamic control of work (to target 10-minute block interval accurately), it changes k to be a fractional (floating-point) so the work becomes to find H(s,x,c) < 2^(n-k) which is equivalent if k is an integer. Bitcoin defines target = 2^(n-k), so the work can be more simply written to find H(s,x,c) < target. Of course because of luck the block time actually has quite high variance, but the average is still more accurately targeted by the introduction of fractional k.
Work, difficulty & cryptographic security
Hashcash expresses security margin in the standard cryptographic security terms O(2^k) where for comparison DES offers k=56-bits of security, ECDSA-256 offers k=128-bits of security, and because its widely used this log2 way of expressing work and security can also be useful for making security comparisons.
Bitcoin rate of work is called the network hashrate in GH/sec. As the target block interval is 10 minutes that can be converted to cryptographic security as log2(hashrate*600), so that of Nov 2013 hashrate is 4 petahash/sec and bitcoin's hashcash-256^2 proofs-of-works are 62-bits (including +1 for double hash).
Bitcoin also defines a new notion of (relative) difficulty which is the work required so that at current network hashrate a block is expected to be found every 10 minutes. It is expressed relative to a minimum work unit of 2^32 iterations (approximately, technically minimum work is 0xFFFF0000 due to bitcoin implementation level details). Bitcoin difficulty is simple to approximately convert to log2 cryptographic security: k=log2(difficulty)+32 (or for high accuracy log2(difficulty*0xFFFF0000)). Difficulty is related to the target simply as difficulty = target / 0xFFFF0000.
It is perhaps easier to deal with high difficulties in log2 scale (a petahash/second is a 16 decimal digit number of hashes per second), and makes them comparable to other cryptographic security statements. For example the EFF "deepcrack" DES cracker project built a hardware brute force machine capable of breaking a DES key in 56 hours to make a political point that 56-bit DES was too weak in 1998 at a cost of $250,000 (plus volunteer design time). By comparison bitcoin network does 62-bits (including +1 for double hash) every 10-minutes and is 537,000 times more powerful than deepcrack, or could if it were focused on DES rather than SHA256 crack a DES key in 9 seconds to deepcracks 56 hours.
In principle a miner should therefore for privacy use a different reward-address for each block (and reset the counter to 0). Why Satoshi's early mined bitcoins were potentially linked, was because while he changed the reward-addresss, he forgot to reset the counter after each successful mine, which is a bitcoin mining privacy bug. In fact with bitcoin the counter also should be obscured otherwise you would reveal your effort level, and if you have a lot of mining power that may imply who the coin belongs to. Bitcoin does this via the nonce and extra-nonce. Nonce starts at 0, but extra nonce is random. Together these form a randomized counter hiding the amount of effort that went into the proof, so no one can tell if it was a powerful but unlucky miner who worked hard, or a weak miner who was very lucky.
Additionally with the introduction of mining pools, if the miner uses the same reward address for all users, which is what the current mining protocols do, then there is risk that users may redo work. To avoid users redoing work, miners hand out defined work for the users to do. However this creates an unnecessary communication round trip and in early protocol versions perhaps was a factor in the decision to have the pool send the actual block to mine, which means the miners are not validating their own blocks, which delegates validation authority, though not work, to the pool operator, reducing the security of the bitcoin network. The more recent mining protocol version allows the user to add their own block definition, but still unnecessarily incur round trips for handing out work allocation. Because the new pooled-mining protocol has a miner chosen extraNonce this acts as a random start factor so there is actually no need to talk to the pool for work allocation, a pool could have a static published address, and miners could just do work of whatever size they chose, and submit it to the pool as a UDP packet. (If privacy is required by the miner, it could use the public derivation method from BIP 32 to allow the node to tell the miner via an encrypted message with the mining work, which factor to multiply the static public key by.)
It is a misunderstanding to talk about the Scrypt proof-of-work. Scrypt is not intended as a proof-of-work function, but a stretched key-derivation function, and while it is by design expensive to compute with high iterations, it can not be used to make an efficiently publicly auditable proof-of-work, as verifying costs the same as creating.
Hashcash with the internal hash function of Scrypt may be denoted hashcash-Scrypt(1). Scrypt, by Colin Percival, is a key-derivation function for converting user chosen passphrases into keys. It is salted (to prevent pre-computation/rainbow table attacks), and the hash is iterated many times to slow down passphrase grinding. Scrypt is similar in purpose to the defacto standard passphrase key-derivation function PBKDF2 (which uses HMAC-SHA1 internally). The differentiator and why people might choose Scrypt rather than PBDF2 is that Scrypt's inner hash uses more memory so the GPU (or theoretical Scrypt ASIC/FPGA) advantage in password grinding is reduced compared to CPUs.
This does not use the key-stretching feature of Scrypt so mining is not actually using Scrypt directly, but only the inner Scrypt hash (accessed by setting the iteration parameter to one iteration). So Scrypt's key-stretching function is not being used at all to contribute to the hardness, unlike its normal use for key protection eg in deriving the encryption key from user passphrase to encrypt bitcoin wallets. The reason Scrypt's key-stretching can not be used for mining is because that simultaneously makes it more expensive to verify by the same factor. This hashcash variant can be denoted hashcash-Scrypt(iter=1,mem=128KB) or shortened to hashcash-Scrypt(1). The other major scrypt parameter denotes the amount of memory used (usually 128kB).
Decentralization: hashcash-Scrypt vs hashcash-SHA256
The 128kB Scrypt memory footprint makes litecoin arguably less vulnerable to centralization of mining power arising from limited access to or ownership of ASIC equipment by users. It's arguable and unclear, because there are counter arguments: that hashcash-SHA256^2 is very simple, so a skilled individual with his personal savings or a small Kickstarter project could design and put in an order with a chip-fabricator. This simplicity ensures that many people will do it and ASICs should become available. Conversely it is somewhat more difficult in comparison to make an hashcash-Scrypt(1) ASIC so perhaps litecoin will prove in the mid-term actually worse for centralization, if a well funded commercial entity corners the market by having faster, but proprietary, not available on the market, hashcash-Scrypt(1) ASICs that render litecoin GPU mining unprofitable.
Note also a mitigating factor is that it is considered that hashcash-Scrypt(1) should offer less speed up from ASIC implementation vs GPUs than hashcash-SHA256^2. This is claimed because of the argument that the die area taken up by 128kB of RAM, which it might be thought must be dedicated to each Scrypt(1) core, would reduce the number of Scrypt(1) cores that fit per chip. Note however that Scrypt(1) is not actually securely memory-hard in that it makes no attempt to prevent time-memory tradeoffs, so it is actually possible to repeat the computation of internal rounds to reduce the memory requirement. In theory therefore it would be possible though more computation expensive to implement Scrypt(iter=1, mem=128kB) with minimal memory, just with more work. In hardware the time-memory tradeoff would be optimized to find the optimal amount of memory to use, and it is quite possible the optimal amount would be less than 128kB.
Hashcash-Scrypt(1) also has a disadvantage relative to hashcash-SHA256^2 in that it is significantly slower to verify, as the verification cost of one iteration of Scrypt(mem=128kB) is far higher than a two SHA256 hashes. This makes validating the litecoin blockchain more CPU and memory intensive for all full nodes. Note however that the dominating CPU work of validation is the verification of the per transaction ECDSA signatures of the multiple transactions in a block. Even one ECDSA signature is slower than one Scrypt(1) verification which is done once per block, and there are many transactions (and so ECDSA signature verifications) to verify within a block.