Hashcash
This page explains hashcash and how bitcoin uses it.
Hashcash
Bitcoin uses the hashcash Proof_of_work function as the mining core. All bitcoin miners whether CPU, GPU, FPGA or ASICs are expending their effort creating hashcash proofs-of-work which act as a vote in the blockchain evolution and validate the blockchain transaction log.
Like many cryptographic algorithms hashcash uses a hash function as a building block, in the same way that HMAC, or RSA signatures are defined on a pluggable hash-function (commonly denoted by the naming convention of algorithm-hash: HMAC-SHA1, HMAC-MD5, HMAC-SHA256, RSA-SHA1, etc), hashcash can be instantiated with different functions, hashcash-SHA1 (original), hashcash-SHA256^2 (bitcoin), hashcash-Scrypt(iter=1) (litecoin).
History
The hashcash proof-of-work function was invented in 1997 by Adam Back, and proposed for anti-DoS uses including preventing: anonymous remailer and mail2news gateway abuse, nym name squatting on nymservers (replyable pseudonymous remailer severs), as well as general email anti-spam and general network abuse throttling. Before bitcoin, hashcash was used by SpamAssasin, and (with an incompatible format) by microsoft (with the name "email postmark") in hotmail, exchange, outlook etc and by i2p anonymity network, mixmaster anonymous remailer components and other systems. Hashcash was also used by Hal Finney's bitcoin precursor RPOW as a way to mine coins. Wei Dai's B-money Proposal, and Nick Szabo's similar Bit_Gold_proposal bitcoin precursors, also were proposed in the context of hashcash mining.
Hash function choices
In the original 1997 algorithm hashcash used SHA1 because at that time, this was the defacto and NIST recommended hash, and the previous defacto hash MD5 had recently started to show signs of weakness. Bitcoin being specified/released in 2008/2009 uses SHA256; and as bitcoin is a high assurance application for added security it actually uses SHA256 twice (denoted SHA256^2 ie "SHA256 function squared"). There is actually no strong reason SHA1 would not have worked also, hashcash relies only on the hash partial preimage resistance property (security up to hash-size, 160-bit with SHA1) and not birthday collision hardness (security up to 80-bit), so the SHA1 hash is big enough. Bitcoin is anyway built to 128-bit security because 256-bit ECDSA is used, which also offers 128-bit security. Never the less SHA256 is the correct and more conservative choice because even SHA1 has started to show some weakenesses, though only in birthday collision, not in 2nd-preimage.
Double Hash
Bitcoin is using two hash iterations (SHA256^2) somewhat similarly to how HMAC uses two hash iterations. The double invocation of SHA256, reduces the reliance on SHA256 security properties and so makes hashcash still secure even if SHA256 is weakened in some ways by new cryptanalytic attacks. This usage pattern offers different protection beyond doubling the number of internal hash rounds. In the HMAC algorithm, which also uses a similar double hash construct, the double hash ensures that even though MD5 as bare hash is significantly weak now, HMAC-MD5 is still considered secure.
Hashcash function
The hashcash algorithm is relatively simple to understand. The idea builds on a security property of cryptographic hashes, that they are designed to be hard to invert (so-called one-way or pre-image resistant property). You can compute y from x cheaply y=H(x) but its very hard to find x given only y. A full hash inversion has a known computationally infeasible brute-force running time, being O(2^k) where k is the hash size eg SHA256, k=256, and if a pre-image was found anyone could very efficiently verify it by computing one hash, so there is a huge asymmetry in full pre-image mining (computationally infeasible) vs verification (a single hash invocation).
A second hash pre-image means given one-preimage x of y where y=H(x), the task is to find another pre-image of y: x' so that y=H(x'). This is not to be confused with a birthday collision which is to find two values x, x' so that H(x)=H(x'), this can be done in much lower work O(sqrt(2^k))=O(2^(k/2)) because you can proceed by computing many H(x) values and storing them until you find a matching pair. It takes a lot of memory, but there are memory-time tradeoffs.
Version 0 of hashcash protocol (1997) used a partial 2nd pre-image, however the later version 1 (2002) uses partial pre-images of a fairly chosen string, rather than digits of pi or something arbitrary, 0^k (ie all 0 string) is used for convenience, so the work is to find x such that H(x)=0. This is also equally fair and only requires one hash invocation to verify vs two with 2nd partial-pre-images. (This optimization was proposed by Hal Finney & independently by Thomas Boschloo). To make the work easier the definition of a partial-pre-image is to find x such that H(x)=0 mod 2^(|n|-k) where n is the size of the hash output (n=256-bits for SHA256) and k is the work factor, ie the first k bits of the hash output are 0. So for example k=20 requires average 1 million tries. It is actually the output that partially matches, not the pre-image, so could perhaps more accurately called a pre-image with a partial output match, however partial pre-image effectively a short-hand for that.
Adding purpose
If the partial-pre-image x from y=H(x) is random it is just a disconnected proof-of-work to no purpose, everyone can see you did do the work, but they dont know why, so users could reuse the same work for different services. To make the proof-of-work be bound to a service, or purpose, the hash must include s, a service string so the work becomes to find H(s,c)=0 mod 2^(|n|-k). The miner varies counter c until this is true. The service string could be a web server domain name, a recipients email address, or in bitcoin a block of the bitcoin blockchain ledger.
One additional problem is that if multiple people are mining, using the same service string, they must not start with the same x or they may end up with the same proof, and anyone looking at it will not honor a duplicated copy of the same work as it could have been copied without work, the first to present it will be rewarded, and others will find their work rejected. To avoid risking wasting work in this way, there needs to be a random starting point, and so the work becomes to find H(s,x,c)=0 mod 2^(|n|-k) where x is random (eg 128-bits to make it statistically infeasible for two users to maliciously or accidentally start at the same point), and c is the counter being varied, and s is the service string.
This is what hashcash version 1 and bitcoin does. In fact in bitcoin the service string is the coinbase and the coinbase includes the recipients reward address, as well as the transactions to validate in the block. Bitcoin actually does not include a random start point x, reusing the reward address as the randomization factor to avoid collisions for this random start point purpose, which saves 16-bytes of space in the coinbase. For privacy bitcoin expect the miner to use a different reward address on each successful block.
Miner privacy
In principle a miner should therefore for privacy use a different reward-address for each block (and reset the counter to 0). Why Satoshi's early mined bitcoins were potentially linked, was because while he changed the reward-addresss, he forgot to reset the counter after each successful mine, which is a bitcoin mining privacy bug. In fact with bitcoin the counter also should be obscured otherwise you would reveal your effort level, and if youhave a lot of mining power that may imply who the coin belongs to. Bitcoin does this via the nonce an extra-nonce. nonce starts at 0, but extra nonce is random. Together these form a randomized counter hiding the amount of effort that went into the work, so no one can tell if it was a powerful but unlucky miner who worked hard, or a weak miner who was very lucky.
Additionally with the introduction of mining pools, if the miner uses the same reward address for all users, which what the current mining protocols do, then there is risk that users may redo work. To avoid users redoing work, miners hand out defined work for the users to do. However this creates unnecessary communication round trip and in early protocol versions perhaps was a factor in the decision to have the miner send the actual block to mine, which means the miners are not validating their own blocks, which delegates validation authority, though not work, to the pool operator, reducing the security of the bitcoin network. The more recent mining protocol version allows the user to add their own block definition, but still unnecessarily incur round trips for handing out work allocation. Because the new pooled-mining protocol has a miner chosen extraNonce this acts as a random start factor so there is actually no need to talk to the pool for work allocation, a pool could have a static published address, and miners could just do work of whatever size they chose, and submit it to the pool as a UDP packet. (If privacy is required by the miner, it could use a deterministic address, with pulic derivation to allow the node to tell the miner via an encrypted message with the mining work, which factor to multiply the static public key by.)
Litecoin proof-of-work
It is a misunderstanding to say litecoin uses the Scrypt proof-of-work. Scrypt is not a proof-of-work function, but a stretched key-derivation function, and while it is by design expensive to compute with high iterations, it can not be used to make an efficiently publicly auditable proof-of-work, as verifying costs the same as creating.
Litecoin uses hashcash with the internal hash function of Scrypt: hashcash-Scrypt(1). Scrypt, by Colin Percival, is a key-derivation function for converting user chosen passphrases into keys. It is salted (to prevent pre-computation/rainbow table attacks), and the hash is iterated many times to slow down passphrase grinding. Scrypt is similar in purpose to the defacto standard passphrase key-derivation function PBKDF2 (which uses HMAC-SHA1 internally). The differentiator and why people might choose Scrypt rather than PBDF2 is that Scrypt's inner hash uses more memory so the GPU (or theoretical Scrypt ASIC/FPGA) advantage in password grinding is reduced compared to CPUs.
Litecoin does not use the key-stretching feature of Scrypt so litecoin mining is not actually using Scrypt directly, but only the inner Scrypt hash (accessed by setting the iteration parameter to one iteration). So Scrypt's key-stretching function is not being used at all to contribute to the hardness, unlike its normal use for key protection eg in deriving the encryption key from user passphrase to encrypt bitcoin wallets. The reason Scrypt's key-stretching can not be used for mining is because that simultaneously makes it more expensive to verify by the same factor. The litecoin hashcash variant can be denoted hashcash-Scrypt(iter=1,mem=128KB) or shortened to hashcash-Scrypt(1). The other major scrypt parameter denotes the amount of memory used (128kB for litecoin).
The 128kB Scrypt memory footprint makes litecoin arguably less vulnerable to centralization of mining power arising from limited access or ownership of ASIC equipment by users. Its arguable and unclear, because there are counter arguments: that hashcash-SHA256^2 is very very simple, so a skilled individual with his personal savings or a small kick-stater project could design and put in an order with a chip-fabricator. This simplicity ensures that many people will do it and ASICs should become available. Conversely it is difficult in comparison to make an hashcash-Scrypt(1) ASIC so perhaps litecoin will prove in the mid-term actually worse for centralization, when a well funded commercial entity corners the market by having 100x faster, but proprietary, not available on the market, hashcash-Scrypt(1) ASICs.
Hashcash-Scrypt(1) also has a disadvantage relative to hashcash-SHA256^2 in that it is significantly slower to verify, as the verification cost of one iteration of Scrypt(mem=128kB) is far higher than a two SHA256 hashes. This makes validating the litecoin blockchain more CPU and memory intensive for all full nodes and may mid-term create a litecoin scaling problem.