Difference between revisions of "BIP 0039"

From Bitcoin Wiki
Jump to: navigation, search
(Expanded documentation, added test vectors)
m (Updated formatting)
Line 41: Line 41:
 
the wordlist and algorithm so it meets the following criteria:
 
the wordlist and algorithm so it meets the following criteria:
  
 +
<pre>
 
a) smart selection of words
 
a) smart selection of words
 
   - wordlist is created in such way that it's enough to type just first four
 
   - wordlist is created in such way that it's enough to type just first four
Line 70: Line 71:
 
==Specification==
 
==Specification==
  
 +
<pre>
 
Our proposal implements two methods - "encode" and "decode".
 
Our proposal implements two methods - "encode" and "decode".
  
Line 119: Line 121:
 
7. If it's not we have invalid mnemonic code.
 
7. If it's not we have invalid mnemonic code.
 
8. Treat B as binary data and return it as output.
 
8. Treat B as binary data and return it as output.
 +
 +
</pre>
  
 
==Test vectors==
 
==Test vectors==
  
 +
<pre>
 
input:    00000000 (32 bits)
 
input:    00000000 (32 bits)
 
mnemonic: abandon abandon abandon (3 words)
 
mnemonic: abandon abandon abandon (3 words)
Line 250: Line 255:
 
input:    ee9e8ce85eadcdf0f5473d7490816d9b1335f7204e7776597ac99f9e29186364 (256 bits)
 
input:    ee9e8ce85eadcdf0f5473d7490816d9b1335f7204e7776597ac99f9e29186364 (256 bits)
 
mnemonic: unlikely victory dentist round swear weapon squeeze traffic insist logo force custom create win liberty snack island skate range display thought merchant migrant nasty (24 words)
 
mnemonic: unlikely victory dentist round swear weapon squeeze traffic insist logo force custom create win liberty snack island skate range display thought merchant migrant nasty (24 words)
 +
</pre>
  
 
==Reference Implementation==
 
==Reference Implementation==
  
 
Reference implementation including wordlists is available from http://github.com/trezor/mnemonic
 
Reference implementation including wordlists is available from http://github.com/trezor/mnemonic

Revision as of 16:14, 10 September 2013

This page describes a BIP (Bitcoin Improvement Proposal).
Please see BIP 2 for more information about BIPs and creating them. Please do not just create a wiki page.

  BIP:     BIP-0039
  Title:   Mnemonic code for generating deterministic keys
  Author:  Pavol Rusnak <stick@gk2.sk>
           Marek Palatinus <info@bitcoin.cz>
           Aaron Voisine <voisine@gmail.com>
  Status:  Draft
  Type:    Standards Track
  Created: 10-09-2013

Abstract

This BIP proposes a scheme for translating binary data (usually master seeds for deterministic keys, but it can be applied to any binary data) into a group of easy to remember words also known as mnemonic code or mnemonic sentence.

Motivation

Such mnemonic code or mnemonic sentence is much easier to work with than working with the binary data directly (or its hexadecimal interpretation). The sentence could be writen down on paper (e.g. for storing in a secure location such as safe), told over telephone or other voice communication method, or memorized in ones memory (this method is called brainwallet).

Backwards Compatibility

As this BIP is written, only one Bitcoin client (Electrum) implements mnemonic codes, but it uses a different wordlist than the proposed one.

For compatibility reasons we propose adding a checkbox to Electrum, which will allow user to indicate if the legacy code is being entered during import or it is a new one that is BIP-0039 compatible. For exporting, only the new format will be used, so this is not an issue.

Rationale

Our proposal is inspired by implementation used in Electrum, but we enhanced the wordlist and algorithm so it meets the following criteria:

a) smart selection of words
   - wordlist is created in such way that it's enough to type just first four
     letters to unambiguously identify the word

b) similar words avoided
   - words as "build" and "built", "woman" and "women" or "quick" or "quickly"
     not only make remembering the sentence difficult, but are also more error
     prone and more difficult to guess (see point below)
   - we avoid these words by carefully selecting them during addition

c) sorted wordlists
   - wordlist is sorted which allow more efficient lookup of the code words
     (i.e. implementation can use binary search instead of linear search)
   - this also allows trie (prefix tree) to be used, e.g. for better compression

d) localized wordlists
   - we would like to allow localized wordlists, so it is easier for users
     to remember the code in their native language
   - by using wordlists with no colliding words among languages, it's easy to
     determine which language was used just by checking the first word of
     the sentence

e) mnemonic checksum
   - this leads to better user experience, because user can be notified
     if the mnemonic sequence is wrong, instead of showing the confusing
     data generated from the wrong sequence. 

==Specification==

<pre>
Our proposal implements two methods - "encode" and "decode".

The first method takes a binary data which have to length (L) in bytes divisable
by four and returns a sentence that consists of (L/4*3) words from the wordlist.

The second method takes sentences generated by first method (number of words in
the sentence has to be divisable by 3) and reconstructs the original binary data.

Words can repeat in the sentence more than one time.

Wordlist contains 2048 words (instead of 1626 words in Electrum), allowing
the code to compute the checksum of the whole mnemonic sequence.
Each 32 bits of input data add 1 bit of checksum.

See the following table for relation between input lengths, output lengths and
checksum sizes for the most common usecases:

+--------+---------+---------+----------+
| input  |  input  | output  | checksum |
| (bits) | (bytes) | (words) |  (bits)  |
+--------+---------+---------+----------+
|   128  |    16   |    12   |     4    |
|   160  |    20   |    15   |     5    |
|   192  |    24   |    18   |     6    |
|   224  |    28   |    21   |     7    |
|   256  |    32   |    24   |     8    |
+--------+---------+---------+----------+

Algorithm:

Encoding:
1. Read input data (I).
2. Make sure its length (L) is divisable by 32 bits.
3. Compute the length of the checkum (LC). LC = L/32
4. Split I into chunks of LC bits (I1, I2, I3, ...).
5. XOR them altogether and produce the checksum C. C = I1 xor I2 xor I3 ... xor In.
5. Concatenate I and C into encoded data (E). Length of E is divisable by 33 bits.
6. Keep taking 11 bits from E until there are none left.
7. Treat them as integer W, add word with index W to the output.

Decoding:
1. Read input mnemonic (M).
2. Make sure its wordcount is divisable by 3.
3. Figure out word indexes in a dictionary and output them as binary stream E.
4. Length of E (L) is divisable by 33 bits.
5. Split E into two parts: B and C, where B are first L/33*32 bits, C are last L/33 bits.
6. Make sure C is the checksum of B (using the step 5 from the above paragraph).
7. If it's not we have invalid mnemonic code.
8. Treat B as binary data and return it as output.

Test vectors

input:    00000000 (32 bits)
mnemonic: abandon abandon abandon (3 words)

input:    7f7f7f7f (32 bits)
mnemonic: legal wing taxi (3 words)

input:    80808080 (32 bits)
mnemonic: lethal adult bundle (3 words)

input:    ffffffff (32 bits)
mnemonic: zoo zoo zone (3 words)

input:    0000000000000000 (64 bits)
mnemonic: abandon abandon abandon abandon abandon abandon (6 words)

input:    7f7f7f7f7f7f7f7f (64 bits)
mnemonic: legal wing taxi yard water salad (6 words)

input:    8080808080808080 (64 bits)
mnemonic: lethal adult bunker absurd also dog (6 words)

input:    ffffffffffffffff (64 bits)
mnemonic: zoo zoo zoo zoo zoo young (6 words)

input:    000000000000000000000000 (96 bits)
mnemonic: abandon abandon abandon abandon abandon abandon abandon abandon abandon (9 words)

input:    7f7f7f7f7f7f7f7f7f7f7f7f (96 bits)
mnemonic: legal wing taxi yard water salmon worry urge lecture (9 words)

input:    808080808080808080808080 (96 bits)
mnemonic: lethal adult bunker absurd also domain achieve aunt lens (9 words)

input:    ffffffffffffffffffffffff (96 bits)
mnemonic: zoo zoo zoo zoo zoo zoo zoo zoo year (9 words)

input:    00000000000000000000000000000000 (128 bits)
mnemonic: abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon (12 words)

input:    7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f (128 bits)
mnemonic: legal wing taxi yard water salmon worry urge legal wing taxi worth (12 words)

input:    80808080808080808080808080808080 (128 bits)
mnemonic: lethal adult bunker absurd also domain achieve aunt lethal adult bunker abandon (12 words)

input:    ffffffffffffffffffffffffffffffff (128 bits)
mnemonic: zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo worth (12 words)

input:    0000000000000000000000000000000000000000 (160 bits)
mnemonic: abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon (15 words)

input:    7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f (160 bits)
mnemonic: legal wing taxi yard water salmon worry urge legal wing taxi yard water salmon winner (15 words)

input:    8080808080808080808080808080808080808080 (160 bits)
mnemonic: lethal adult bunker absurd also domain achieve aunt lethal adult bunker absurd also domain abandon (15 words)

input:    ffffffffffffffffffffffffffffffffffffffff (160 bits)
mnemonic: zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo winner (15 words)

input:    000000000000000000000000000000000000000000000000 (192 bits)
mnemonic: abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon (18 words)

input:    7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f (192 bits)
mnemonic: legal wing taxi yard water salmon worry urge legal wing taxi yard water salmon worry urge legal wave (18 words)

input:    808080808080808080808080808080808080808080808080 (192 bits)
mnemonic: lethal adult bunker absurd also domain achieve aunt lethal adult bunker absurd also domain achieve aunt lethal abandon (18 words)

input:    ffffffffffffffffffffffffffffffffffffffffffffffff (192 bits)
mnemonic: zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo wave (18 words)

input:    00000000000000000000000000000000000000000000000000000000 (224 bits)
mnemonic: abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon (21 words)

input:    7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f (224 bits)
mnemonic: legal wing taxi yard water salmon worry urge legal wing taxi yard water salmon worry urge legal wing taxi yard usage (21 words)

input:    80808080808080808080808080808080808080808080808080808080 (224 bits)
mnemonic: lethal adult bunker absurd also domain achieve aunt lethal adult bunker absurd also domain achieve aunt lethal adult bunker absurd abandon (21 words)

input:    ffffffffffffffffffffffffffffffffffffffffffffffffffffffff (224 bits)
mnemonic: zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo usage (21 words)

input:    0000000000000000000000000000000000000000000000000000000000000000 (256 bits)
mnemonic: abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon (24 words)

input:    7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f (256 bits)
mnemonic: legal wing taxi yard water salmon worry urge legal wing taxi yard water salmon worry urge legal wing taxi yard water salmon worry team (24 words)

input:    8080808080808080808080808080808080808080808080808080808080808080 (256 bits)
mnemonic: lethal adult bunker absurd also domain achieve aunt lethal adult bunker absurd also domain achieve aunt lethal adult bunker absurd also domain achieve abandon (24 words)

input:    ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff (256 bits)
mnemonic: zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo team (24 words)

input:    19458fe8dace41e60617c6667874f6be (128 bits)
mnemonic: bless cloud wheel regular tiny venue birth web grief security dignity language (12 words)

input:    2d8fc7c27a52995bc165a0dcead7c169f301652f0b5db145 (192 bits)
mnemonic: sketch flavor eyebrow height youth rope various pet fish little embryo arena just chicken jump mourn advance unique fork knock general slice short caught (24 words)

input:    ea88c6efbc4028994b8b4ebc0345a474 (128 bits)
mnemonic: try educate robust joke act erosion color hedge rock blow hat tribe (12 words)

input:    bb8e02d2c0fe4c18f60e173bc38724902395b63b1b47bf18 (192 bits)
mnemonic: river ignore recycle like tobacco armor straw season despair bosnian sibling burden defy sunset typical harvest sad shadow (18 words)

input:    0b57161fd701c6c6548db5a71df70166965eec2f16a675acad3ed3460cb3e72f (256 bits)
mnemonic: approach response map protect both glass fabric remember place upset satisfy slave gravity irish romance squad involve grain excuse pioneer gauge flight open white (24 words)

input:    666fbb72dada90d6c8f627d77450f366 (128 bits)
mnemonic: grief last swamp reject pond history car sentence stone patrol diamond slam (12 words)

input:    29932c2887ff2c72dd8f9cff1b7ee498119f330d1dd81659 (192 bits)
mnemonic: chunk oak another audit vendor degree iron very year surprise retreat cook blossom obey crowd riot belt slogan (18 words)

input:    869e2bc8286611bce2159b89c5a8a4029666dc186bbfe8756ac505a4e89f481e (256 bits)
mnemonic: magnet vacuum van exotic gear tactic march ready matrix coat chin after grief huge genuine jet travel prepare race approach evil excuse burial skirt (24 words)

input:    d0263588b1deee8186818319515c6691 (128 bits)
mnemonic: soda country ghost glove unusual dose blouse cope bless medal block car (12 words)

input:    48754ef698334cf9cc5494ccca6a0294bf30224066576a7c (192 bits)
mnemonic: embrace poverty royal cope cruise lake cotton movie slam false letter chuckle venue awesome accident sing hen tight (18 words)

input:    ee9e8ce85eadcdf0f5473d7490816d9b1335f7204e7776597ac99f9e29186364 (256 bits)
mnemonic: unlikely victory dentist round swear weapon squeeze traffic insist logo force custom create win liberty snack island skate range display thought merchant migrant nasty (24 words)

Reference Implementation

Reference implementation including wordlists is available from http://github.com/trezor/mnemonic