Passphrase generation

From Bitcoin Wiki

Latest revision as of 21:08, 12 February 2017

In various places in Bitcoin, it is important to generate secure passwords/passphrases. Security is especially important in Bitcoin because if your BTC is stolen, there is often no recourse. Bitcoin transactions cannot be reversed.

Strength

There are two very different use-cases for passwords in Bitcoin. The first and more common is an "offline passphrase". With an offline passphrase, something prevents an attacker from trying to guess your passphrase as fast as he might like.

Examples include:

  • Website passwords. The website will rate-limit attempts on your password.
  • Wallet passphrases. An attacker needs both the wallet file and your wallet passphrase.
  • Hardware wallet PIN. An attacker needs both the hardware wallet and your PIN.

All of the advice you've ever heard about password security has been for the offline password use-case.

The second use-case is an "online passphrase", where the passphrase is essentially the only thing protecting your BTC. In this case, your passphrase must be massively more secure than usual, and you cannot rely on any password-creation advice you've ever heard.

For use cases that are vulnerable to a global passphrase cracking search, imagine that the entire world could be constantly trying to crack your passphrase, billions of attempts per second, all the time and with no restriction. This is not a normal situation outside of Bitcoin.

Examples of keys that could be cracked by a global search:

  • The seed/mnemonic of an HD wallet.
  • The passphrase on a wallet file that has become public, or that an attacker has otherwise gained access to.
  • The input to a "brain wallet". (Warning: it is very strongly recommended that you not attempt to use a brain wallet.)

Examples of keys that could be cracked by a targeted search:

  • The passphrase on a wallet file that may become public, or that an attacker could gain access to.
  • The input to a secure "brain wallet" with e-mail salting, and an expensive Key Derivation Function.

Standards for offline passphrases

Between 64 and 80 bits of entropy seems reasonable. The password must be totally random (see later sections on generation). Hardware wallets have additional protections, and it's OK that they often allow only a short PIN.

This corresponds to the following password/passphrase lengths:

  • Digits only: 20-25 digits
  • Hexadecimal: 16-20 characters
  • All lowercase or all uppercase letters: 14-18 characters
  • All lowercase or all uppercase letters + numbers: 13-16 characters
  • Mixed case letters: 12-15 characters
  • Mixed case letters + numbers: 11-14 characters
  • All standard keyboard characters: 10-13 characters
  • Diceware words: 5-7 words
  • Random English words (100,000-word wordlist): 4-5 words
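These lengths follow from dividing the target entropy by the entropy per symbol, log_2(alphabet size), and rounding up. A minimal Python sketch of that arithmetic is shown here; the function name symbols_needed is illustrative only, and the Diceware figure assumes the standard 7,776-word list.

```python
import math


def symbols_needed(bits: int, alphabet_size: int) -> int:
    """Smallest count of uniformly random symbols giving at least `bits` of entropy."""
    if bits <= 0 or alphabet_size < 2:
        raise ValueError("need positive bits and an alphabet of at least 2 symbols")
    # Each symbol contributes log2(alphabet_size) bits when chosen uniformly at random.
    return math.ceil(bits / math.log2(alphabet_size))


# Offline range (64 to 80 bits) for a few alphabets.
for name, size in [("digits", 10), ("hexadecimal", 16), ("Diceware words", 7776)]:
    print(name, symbols_needed(64, size), "to", symbols_needed(80, size))
```

The same function covers the 128-bit case by passing 128 as the first argument.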

Standards for online passphrases

Usually you don't have to personally generate an online passphrase. The most common case of an online passphrase in Bitcoin is the mnemonic for an HD wallet seed, but a good wallet should securely generate it for you (this is the several-word mnemonic that most wallets tell you to write down when first run), assuming it has not been tampered with.

In case you need to manually generate an online passphrase, 128 bits of entropy is required. The passphrase must be totally random (see later sections on generation).

This corresponds to the following password/passphrase lengths:

  • Digits only: 39 digits
  • Hexadecimal: 32 characters
  • All lowercase or all uppercase letters: 28 characters
  • All lowercase or all uppercase letters + numbers: 25 characters
  • Mixed case letters: 23 characters
  • Mixed case letters + numbers: 22 characters
  • All standard keyboard characters: 20 characters
  • Diceware words: 10 words
  • Random English words (100,000-word wordlist): 8 words

Risks of automatic seed/passphrase generation

Automatic wallet seed/passphrase generation is only secure using:

  1. Faithful wallet software: software that has not been maliciously tampered with. If you haven't compiled the wallet software yourself on a trustworthy computer running a trustworthy compiler, trustworthy source code is no guarantee of a trustworthy binary. See the Ken Thompson hack for details. Wallets with a deterministic build process (e.g., Bitcoin Core) are more resistant to attack.
  2. Faithful RNG: An RNG (Random Number Generator) may be implemented in software, hardware, or both. Wallet software relies on the security of the RNG to generate your wallet's private keys securely. An insecure RNG may produce pseudo-randomness that seems statistically indistinguishable from true randomness yet is still predictable to an advanced attacker, who can then recreate your wallet keys. An RNG may become insecure as a result of malicious weakening or an unintentional mistake. This failure mode is common to any wallet generation procedure in which the true randomness of the entropy source cannot be verified.
  3. Faithful hardware: software is executed by hardware. Unfaithful hardware may execute faithful software unfaithfully. This would be especially difficult to detect for computations where the final output is non-deterministic, such as an unfaithful hardware execution of Random Number Generator routines. The risk of such an attack is increased by the opaqueness of hardware implementations and the centralized, capital-intensive nature of hardware manufacturing, which makes hardware companies more vulnerable to coercion, especially in undemocratic countries where most hardware manufacturing currently takes place.

For high risk applications, a pair of fair dice can provide a simpler, verifiably secure source of entropy.

How not to generate passphrases

Humans are really, really bad at generating passphrases on their own. Don't try it. If at any step in the passphrase-generation process your brain is being used to choose something at random or randomize something, then you're doing it wrong.

Do not take words out of a book or other work. The words must be absolutely random.

Do not use the xkcd password generation method.

Using dice

Dice can be used as one way to generate random numbers and passwords. However, in order to achieve security, you must use them in a certain way.

Secure dice

Casual dice for board games are not shaped perfectly, and will be somewhat biased toward certain numbers. Special casino dice are available which do not have this flaw.

The extent to which slightly biased dice actually affect real-world security depends on the use-case. For the use-cases on this page, if a die is random enough that you do not notice its bias when playing games, then it is probably good enough.

Generating passwords

To generate passwords, write down a list of acceptable characters, and sequentially number each character starting from 1. For example, if you want to generate a 4-digit PIN, you would create a list like this:

1 0
2 1
...
10 9

If you wanted to generate a password with characters in [a-zA-Z], you would create a list like this:

1 a
2 b
...
52 Z

Now you need to figure out how many dice you're going to need to roll. Your dice should all have the same number of sides. If your character list has C characters, and each of your dice has S sides, then you need to roll log_S(C) dice, rounded up. For example, log_6(52) is about 2.2, which you round up to 3. So if your character list contains 52 characters, then you need to roll 3 6-sided dice.
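The dice count can also be found without a calculator by multiplying S by itself until the result reaches C. A short Python sketch of that rule follows; the name dice_needed is hypothetical, and integer arithmetic is used here to avoid floating-point rounding near exact powers.

```python
def dice_needed(characters: int, sides: int) -> int:
    """Smallest n with sides**n >= characters, i.e. log_S(C) rounded up."""
    if characters < 1 or sides < 2:
        raise ValueError("need at least 1 character and dice with at least 2 sides")
    count, reach = 0, 1
    # `reach` is how many distinct outcomes `count` dice can produce.
    while reach < characters:
        reach *= sides
        count += 1
    return max(count, 1)  # always roll at least one die


print(dice_needed(52, 6))  # 52 characters with 6-sided dice
```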

Here, the dice are assumed to be numbered from 1 to S. If this is not the case, then you must create a system to translate the dice results into a range from 1 to S. For example, if you are dealing with 10-sided dice labeled from 0 to 9, then you can add 1 to the roll.

Roll the required number of dice and put them in a random order. Do not sort the dice from highest to lowest or anything like that. In fact, to prevent any personal bias from entering into the ordering, you may want to roll the dice and then put the dice into a line with your eyes closed.

Say that d_0 is the rightmost die you rolled, d_1 is the second-from-rightmost die you rolled, etc. Then the random number is:

1 + [(d_0 - 1) × S^0] + [(d_1 - 1) × S^1] + [(d_2 - 1) × S^2] + ...

For example, if we rolled 316 with 6-sided dice, this becomes:

1 + [(6 - 1) × 6^0] + [(1 - 1) × 6^1] + [(3 - 1) × 6^2]
= 1 + [5 × 1] + [0 × 6] + [2 × 36]
= 1 + 5 + 0 + 72
= 78

So our random character would be the 78th character in the list.

Important: If your number is larger than you need, then you must totally reroll for this character. Do not try to "wrap the numbers around" or keep only certain dice, as this results in a non-random distribution. (For the adventurous only: You can safely reduce the number of rerolls by pretending that the highest-order die has fewer sides. E.g., on a 6-sided die, say that 1 and 4 = 1, 2 and 5 = 2, 3 and 6 = 3; or that 1, 2, and 3 = 1 and 4, 5, and 6 = 2. Still multiply it by the appropriate power of 6, not 3 or 2. The die must be evenly divided, the maximum possible value must still be greater than or equal to the number of the highest-value character, and you must use the exact same treatment of the high-order die throughout the generation process.)

Important: There are a variety of different ways to get a random number from multiple dice, but they are usually non-random. For example, adding dice would be very non-random. The above method ensures a random distribution if the dice themselves are random and if they are ordered randomly.
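The roll-to-number formula and the reroll rule can be sketched in Python as follows. The function names are illustrative, the dice are assumed to be listed left to right as they lie on the table, and a result of None stands for "reroll every die for this character".

```python
import string
from typing import Optional, Sequence


def rolls_to_number(rolls: Sequence[int], sides: int) -> int:
    """Combine dice (left to right, each showing 1..sides) into a number starting at 1."""
    value = 0
    for roll in rolls:
        if not 1 <= roll <= sides:
            raise ValueError("each roll must be between 1 and the number of sides")
        # Shift earlier dice up one position and add this die as the new lowest digit.
        value = value * sides + (roll - 1)
    return value + 1


def pick(rolls: Sequence[int], sides: int, items: Sequence[str]) -> Optional[str]:
    """Return the selected list entry, or None when the number is too large (reroll all dice)."""
    number = rolls_to_number(rolls, sides)
    if number > len(items):
        return None
    return items[number - 1]


letters = string.ascii_lowercase + string.ascii_uppercase  # the 52-entry [a-zA-Z] list
print(rolls_to_number([3, 1, 6], 6))  # 78, matching the worked example
print(pick([3, 1, 6], 6, letters))    # None: 78 is past entry 52, so reroll
print(pick([1, 1, 1], 6, letters))    # a
```

Returning None instead of wrapping around keeps every list entry equally likely, which is the point of the reroll rule.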

If the password is L characters long, then the password has log_2(C^L) bits of entropy.
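Since log_2(C^L) equals L × log_2(C), the entropy can be computed without forming the huge number C^L. A small Python sketch, with an illustrative function name, assuming every character is chosen uniformly at random:

```python
import math


def password_entropy_bits(length: int, alphabet_size: int) -> float:
    """Entropy in bits of `length` uniformly random symbols from `alphabet_size` choices."""
    if length < 0 or alphabet_size < 1:
        raise ValueError("length must be non-negative and the alphabet non-empty")
    return length * math.log2(alphabet_size)


# A 12-character password over the 52 mixed-case letters.
print(round(password_entropy_bits(12, 52), 1))  # 68.4
```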

Generating passphrases

As above, but use a list of words instead of a list of characters.

Note that there is a risk when acquiring your wordlist of an attacker giving you a wordlist that has duplicated or highly similar words. For example, the wordlist might look like it contains 1 million words, but actually be the same 1000 words repeated over and over again. Or all of the words might have an "o" in the fourth position. Etc. This can cause you to significantly overestimate the security of your passphrase. Therefore, you must acquire your wordlist from a trusted source.

Generating keys, seeds, and random numbers (Advanced)

This section is mainly intended for programmers and advanced users.

Warning: it is considered unsafe to directly handle Bitcoin keys, as doing so is error-prone, and often causes people to send BTC into oblivion.

If you want to generate a large number for use as a key or seed, you can do the following.

First, decide how many bits of security you want. 128 bits is probably secure for most things. We will call this value B.

Next, roll log_S(2^B) dice, rounded up, where S is the number of sides per die. For example, with 6-sided dice you would need to roll 50 dice. Put the results right next to each other in a string of text, so for example if you roll 3, 2, 5, 6, 1, you'd start your string as "32561", and then continue on for a total of 50 digits. If your dice have enough sides to result in two-digit numbers, put a leading zero in front of single-digit numbers.

Then hash your string with a command like echo "32561..." |sha256sum. The resulting hash is your random number. If you want a 512-bit number, use sha512sum instead. If you want fewer bits than a hash provides (for example, 128), use the next larger hash size and then cut off the number where you need it.

You can generate any size of random number by combining the outputs. For example, let's say that you want 768 bits of randomness and for some reason you can only use sha256sum. You can do this like:

$ echo "32561..." |sha256sum
cf6a25b9ef81af3d2b1d6f62a9780637f5e27720cafb07bb0515228ada325ed5 -
$ echo "Last hash: cf6a25b9ef81af3d2b1d6f62a9780637f5e27720cafb07bb0515228ada325ed5 Orig entropy: 32561..." |sha256sum
6d7f302e01da0a7131377d57ee93aaff0b26ebd25e52c7dea0a5eeddabac151c -
$ echo "Last hash: 6d7f302e01da0a7131377d57ee93aaff0b26ebd25e52c7dea0a5eeddabac151c Orig entropy: 32561..." |sha256sum
e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 -

Then concatenate all of those hashes to get your final random number. In this example the result is cf6a25b...52b855.

Important: When generating a stream of random numbers like this, you have to put your source entropy back in at each step.
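The chained hashing described above can be sketched in Python with the standard hashlib module. The function names are hypothetical, the "Last hash: ... Orig entropy: ..." layout simply mirrors the commands shown, and "32561" is a placeholder that is far too short for real use.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Hash text the way `echo "text" | sha256sum` does (echo appends a newline)."""
    return hashlib.sha256((text + "\n").encode("ascii")).hexdigest()


def expand_entropy(dice_string: str, blocks: int) -> str:
    """Return `blocks` chained SHA-256 outputs (256 bits each) as one hex string."""
    if blocks < 1:
        raise ValueError("need at least one block")
    last = sha256_hex(dice_string)
    outputs = [last]
    for _ in range(blocks - 1):
        # The original dice string is fed back in at every step.
        last = sha256_hex(f"Last hash: {last} Orig entropy: {dice_string}")
        outputs.append(last)
    return "".join(outputs)


stream = expand_entropy("32561", 3)  # placeholder dice string
print(len(stream) * 4, "bits")
```

The output is only as unpredictable as the dice string that goes in: three blocks give 768 bits of output, but no more entropy than the original rolls contained.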

If you want to generate a number in a range with a range size that is not a power of two, choose the next power of two above the range size and select the correct number of bits based on that. If the resulting number is too large, discard all of those bits and grab new ones from the endless stream of random bits. Do not use the modulo operator, as that introduces a bias. For example, if you want to generate a number from 0 to 9 (range size = 10), select the next-higher power of 2, 16. If your random stream starts 0xA52F..., grab 4 bits (log_2(16) = 4), giving 0xA = 10. This is outside of the range, so discard those bits and move on to the next 4 bits, 0x5 = 5. This is in the range, so the final result is 5.
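This discard-and-retry procedure (often called rejection sampling) can be sketched in Python as below. The helper names bits_of and uniform_below are illustrative, and the random stream is assumed to be supplied as bytes read most significant bit first.

```python
from typing import Iterator


def bits_of(data: bytes) -> Iterator[int]:
    """Yield the bits of `data`, most significant bit of each byte first."""
    for byte in data:
        for shift in range(7, -1, -1):
            yield (byte >> shift) & 1


def uniform_below(range_size: int, bits: Iterator[int]) -> int:
    """Return a number in 0..range_size-1, discarding any draw that is too large."""
    if range_size < 1:
        raise ValueError("range size must be at least 1")
    width = (range_size - 1).bit_length()  # bits needed for the covering power of two
    while True:
        value = 0
        for _ in range(width):
            value = (value << 1) | next(bits)
        if value < range_size:
            return value
        # Too large: throw all of these bits away and draw a fresh group.


stream = bits_of(bytes.fromhex("A52F"))
print(uniform_below(10, stream))  # 5: 0xA is rejected, then 0x5 is accepted
```

When the bits come from the operating system rather than dice, Python's standard library function secrets.randbelow gives an unbiased number in a range directly.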

When generating raw Bitcoin private keys: Although extremely unlikely to occur with a random 256-bit number, Bitcoin private keys cannot be equal to 0 or greater than 0xFFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFE BAAEDCE6 AF48A03B BFD25E8C D0364140. When generating a Bitcoin private key, you should check that these conditions are met; if not, reroll the private key. Some software might allow you to use these invalid private keys, but doing so would be insecure.
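The range check and reroll can be sketched in Python as follows. The names are illustrative, and as an assumption the sketch draws candidates from the operating system's secure random source via the standard secrets module rather than from hashed dice rolls; it is not a substitute for wallet software.

```python
import secrets

# Largest valid private key, copied from the limit quoted above.
MAX_PRIVATE_KEY = int(
    "FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFE BAAEDCE6 AF48A03B BFD25E8C D0364140".replace(" ", ""),
    16,
)


def is_valid_private_key(candidate: int) -> bool:
    """True when the candidate is neither 0 nor above the maximum."""
    return 1 <= candidate <= MAX_PRIVATE_KEY


def random_private_key() -> int:
    """Draw 256-bit candidates until one passes the range check (reroll otherwise)."""
    while True:
        candidate = int.from_bytes(secrets.token_bytes(32), "big")
        if is_valid_private_key(candidate):
            return candidate


print(is_valid_private_key(0), is_valid_private_key(MAX_PRIVATE_KEY + 1))  # False False
```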

Using a computer

Computers are good at generating secure random numbers, but you have to be careful to use the right commands. A lot of commands, programming language features, and snippets that you'll find online will give insecure random numbers which look random, but are predictable. For example, the C rand function, the Python random module, and the Windows %RANDOM% variable are all insecure random number sources.

Do not get your passwords from anything in a Web browser, even if the page says that it's using purely client-side JavaScript.
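As one illustration of a secure source, Python's standard secrets module draws from the operating system's random number generator, unlike the random module mentioned above. The sketch below is a minimal example; the function name and the choice of the 94 printable ASCII characters as the alphabet are assumptions made for illustration.

```python
import secrets
import string

# Letters, digits, and punctuation: the 94 printable ASCII characters, excluding space.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def random_password(length: int = 20, alphabet: str = ALPHABET) -> str:
    """Build a password by picking each character independently and uniformly."""
    if length < 1 or len(alphabet) < 2:
        raise ValueError("need a positive length and at least 2 characters to choose from")
    return "".join(secrets.choice(alphabet) for _ in range(length))


print(random_password())
```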

Linux

There are many different packages for generating random passwords/passphrases on Linux, but none of them are installed by default on all Linux machines, so we will provide a method that uses more standard commands. If you would like a more concise and easy-to-use command, we recommend installing an actual password generator package.

The following command will generate a 20-character random password:

seq 126 |awk '{printf "%c", $0}' |grep -o '[a-zA-Z0-9]\|[[:punct:]]' | \
shuf --random-source=/dev/urandom --repeat --head-count=20 | tr --delete '\n'

Change --head-count=20 to change the password length, and [a-zA-Z0-9]\|[[:punct:]] to change the character set. Note that this command will never use characters outside of ASCII, even if your grep pattern would select such characters.

To generate a passphrase made of words, create a file called words with one word per line and Unix-style line endings, and run:

shuf --random-source=/dev/urandom --repeat --head-count=7 words | tr '\n' ' '

Change --head-count=7 to change the number of words.

Note that there is a risk when acquiring your wordlist of an attacker giving you a wordlist that has duplicated or highly similar words. For example, the wordlist might look like it contains 1 million words, but actually be the same 1000 words repeated over and over again. Or all of the words might have an "o" in the fourth position. Etc. This can cause you to significantly overestimate the security of your passphrase. Therefore, you must acquire your wordlist from a trusted source.
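A partial safeguard against a padded wordlist is to count only the distinct words before estimating strength. The Python sketch below does this with illustrative function names; it catches exact duplicates only, not highly similar words, so it does not replace getting the list from a trusted source.

```python
import math
import secrets
from typing import List


def clean_wordlist(lines: List[str]) -> List[str]:
    """Strip whitespace, drop blank lines and exact duplicates, keep first-seen order."""
    seen = dict.fromkeys(line.strip() for line in lines)
    return [word for word in seen if word]


def random_passphrase(lines: List[str], count: int = 7) -> str:
    """Pick `count` words independently and uniformly from the distinct words."""
    words = clean_wordlist(lines)
    if len(words) < 2:
        raise ValueError("wordlist needs at least 2 distinct words")
    return " ".join(secrets.choice(words) for _ in range(count))


def passphrase_entropy_bits(lines: List[str], count: int) -> float:
    """Entropy based on the number of distinct words, not the raw line count."""
    return count * math.log2(len(clean_wordlist(lines)))


padded = ["apple", "pear", "fig", "kiwi"] * 250  # looks like 1000 words, is really 4
print(len(padded), len(clean_wordlist(padded)), passphrase_entropy_bits(padded, 7))
```

To use it with the words file from the command above, the lines could be read with open("words").read().splitlines() and passed in as the list.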

Windows

KeePass includes a password generator, though not a word-based passphrase generator.

(If anyone knows of some better ones, edit this page to add it.)

See also

  • zxcvbn - A realistic passphrase strength estimator