|
|
Line 12: |
Line 12: |
| </pre> | | </pre> |
|
| |
|
| ==Abstract== | | ==Draft== |
|
| |
|
| This BIP proposes a scheme for translating binary data (usually master seeds
| | See https://github.com/trezor/python-mnemonic/blob/master/README |
| for deterministic keys, but it can be applied to any binary data) into a group
| of easy-to-remember words, also known as a mnemonic code or mnemonic sentence.
| | |
| ==Motivation==
| | |
| A mnemonic code or mnemonic sentence is much easier to work with than the
| binary data itself (or its hexadecimal representation). The sentence
| can be written down on paper (e.g. for storage in a secure location such as
| a safe), told over the telephone or another voice communication method, or
| memorized (this method is called a brainwallet).
| | |
| ==Backwards Compatibility==
| | |
| At the time of writing, only one Bitcoin client (Electrum) implements mnemonic
| codes, but it uses a different wordlist than the one proposed here.
| | |
| For compatibility reasons we propose adding a checkbox to Electrum that
| lets the user indicate during import whether the code being entered is a
| legacy one or a new, BIP-0039 compatible one. For exporting, only the new format
| will be used, so this is not an issue.
| | |
| ==Rationale==
| | |
| Our proposal is inspired by the implementation used in Electrum, but we enhanced
| the wordlist and algorithm so that they meet the following criteria:
| | |
| a) smart selection of words
|    - the wordlist is created in such a way that it is enough to type just the
|      first four letters to unambiguously identify a word
| | |
| b) similar words avoided
|    - word pairs such as "build" and "built", "woman" and "women" or "quick" and
|      "quickly" not only make remembering the sentence difficult, but are also
|      more error prone and more difficult to guess (see point below)
|    - we avoid such pairs by vetting each word when it is added to the wordlist
| | |
| c) sorted wordlists
|    - the wordlist is sorted, which allows more efficient lookup of the code words
|      (i.e. an implementation can use binary search instead of linear search)
|    - this also allows a trie (prefix tree) to be used, e.g. for better compression
| | |
| d) localized wordlists
|    - we would like to allow localized wordlists, so it is easier for users
|      to remember the code in their native language
|    - by using wordlists with no colliding words among languages, it is easy to
|      determine which language was used just by checking the first word of
|      the sentence
| | |
| e) mnemonic checksum
|    - this leads to a better user experience, because the user can be notified
|      if the mnemonic sequence is wrong, instead of being shown confusing
|      data generated from the wrong sequence
| | |
| f) seed stretching
|    - before encoding and after decoding, the input binary sequence is
|      stretched using a symmetric cipher (Blowfish) in order to prevent
|      brute-force attacks in case some of the mnemonic words are leaked
| | |
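Criteria (a) and (c) above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the four-word list is a toy stand-in for a real 2048-word list, and the helper names `word_index` and `has_unique_prefixes` are hypothetical, not taken from the reference implementation.

```python
from bisect import bisect_left

# Toy, already-sorted stand-in for a real 2048-word list (assumption).
TOY_WORDLIST = ["abandon", "ability", "able", "about"]


def word_index(wordlist, word):
    """Find a word in a sorted wordlist using binary search (criterion c)."""
    i = bisect_left(wordlist, word)
    if i < len(wordlist) and wordlist[i] == word:
        return i
    raise ValueError("word not in wordlist: %r" % word)


def has_unique_prefixes(wordlist, n=4):
    """Check that the first n letters identify every word (criterion a)."""
    prefixes = [w[:n] for w in wordlist]
    return len(set(prefixes)) == len(prefixes)
```

A list containing both "build" and "built" would fail the prefix check, which is one mechanical way to catch the similar pairs named in criterion (b).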
| ==Specification==
| | |
| <pre>
| Our proposal implements two methods - "encode" and "decode".
| | |
| The first method takes binary data whose length (L) in bytes has to be divisible
| by four and returns a sentence that consists of (L/4*3) words from the wordlist.
| | |
| The second method takes a sentence generated by the first method (the number of
| words in the sentence has to be divisible by 3) and reconstructs the original
| binary data.
| | |
| A word can appear in the sentence more than once.
| | |
| The wordlist contains 2048 words (instead of the 1626 words in Electrum), allowing
| the code to compute the checksum of the whole mnemonic sequence: 2048 = 2^11, so
| each word encodes 11 bits and three words carry 32 bits of data plus 1 bit of checksum.
| Every 32 bits of input data add 1 bit of checksum.
| | |
| See the following table for the relation between input lengths, output lengths and
| checksum sizes for the most common use cases:
| | |
| +--------+---------+---------+----------+
| | input  | input   | output  | checksum |
| | (bits) | (bytes) | (words) | (bits)   |
| +--------+---------+---------+----------+
| | 128    | 16      | 12      | 4        |
| | 192    | 24      | 18      | 6        |
| | 256    | 32      | 24      | 8        |
| +--------+---------+---------+----------+
| </pre>
| | |
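The relation in the table above follows from two facts stated in the text: every 32 bits of input add 1 checksum bit, and each word encodes 11 bits. A minimal Python sketch of that arithmetic is shown here; the function name `mnemonic_sizes` is hypothetical and used only for illustration.

```python
def mnemonic_sizes(input_bits):
    """Return (word count, checksum bits) for an input of input_bits bits."""
    if input_bits <= 0 or input_bits % 32 != 0:
        raise ValueError("input length must be a positive multiple of 32 bits")
    checksum_bits = input_bits // 32                # 1 checksum bit per 32 input bits
    words = (input_bits + checksum_bits) // 11      # 11 bits per word (2048 = 2**11)
    return words, checksum_bits


for bits in (128, 192, 256):
    words, checksum = mnemonic_sizes(bits)
    print(bits, bits // 8, words, checksum)
```

The word count equals L/4*3 for an input of L bytes, which matches the formula given for the "encode" method.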
| ===Algorithm:===
| | |
| <pre>
| Encoding:
| 1. Read input data (I).
| 2. Make sure its length (L) is divisible by 64 bits.
| 3. Encrypt input data 1000x with Blowfish (ECB) using the word "mnemonic" as key.
| 4. Compute the length of the checksum (LC). LC = L/32
| 5. Split I into chunks of LC bits (I1, I2, I3, ...).
| 6. XOR them all together to produce the checksum C. C = I1 xor I2 xor I3 ... xor In.
| 7. Concatenate I and C into encoded data (E). Length of E is divisible by 33 bits.
| 8. Keep taking 11 bits from E until there are none left.
| 9. Treat them as integer W, add word with index W to the output.
| | |
| Decoding:
| 1. Read input mnemonic (M).
| 2. Make sure its word count is divisible by 6.
| 3. Figure out word indexes in a dictionary and output them as binary stream E.
| 4. Length of E (L) is divisible by 33 bits.
| 5. Split E into two parts: B and C, where B are first L/33*32 bits, C are last L/33 bits.
| 6. Make sure C is the checksum of B (computed as in steps 4-6 of the encoding above).
| 7. If it is not, the mnemonic code is invalid.
| 8. Treat B as binary data.
| 9. Decrypt this data 1000x with Blowfish (ECB) using the word "mnemonic" as key.
| 10. Return the result as output.
| </pre>
| | |
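The checksum and word-index steps above can be sketched in Python. This is a minimal illustration, not the reference implementation: the Blowfish steps (encoding step 3 and decoding step 9) are left out because the Python standard library does not include Blowfish, so the functions operate on data that is assumed to be already encrypted, and they work with word indexes instead of words. The names `checksum_bits`, `to_indexes` and `from_indexes` are hypothetical.

```python
def checksum_bits(bits):
    """XOR all LC-bit chunks of a bit string together (encoding steps 4-6)."""
    lc = len(bits) // 32                      # checksum length LC = L/32
    acc = 0
    for i in range(0, len(bits), lc):
        acc ^= int(bits[i:i + lc], 2)
    return format(acc, "0{}b".format(lc))


def to_indexes(data):
    """Encoding steps 2 and 4-9, with the Blowfish step omitted."""
    if len(data) == 0 or len(data) % 8 != 0:
        raise ValueError("length must be a non-zero multiple of 64 bits")
    bits = "".join(format(byte, "08b") for byte in data)
    encoded = bits + checksum_bits(bits)      # E = I followed by C
    # Take 11 bits at a time; each group is one wordlist index W.
    return [int(encoded[i:i + 11], 2) for i in range(0, len(encoded), 11)]


def from_indexes(indexes):
    """Decoding steps 2-8, with the Blowfish step omitted."""
    if len(indexes) == 0 or len(indexes) % 6 != 0:
        raise ValueError("word count must be a non-zero multiple of 6")
    encoded = "".join(format(w, "011b") for w in indexes)
    split = len(encoded) // 33 * 32           # B is the first L/33*32 bits
    body, checksum = encoded[:split], encoded[split:]
    if checksum_bits(body) != checksum:
        raise ValueError("invalid mnemonic code")
    return bytes(int(body[i:i + 8], 2) for i in range(0, len(body), 8))
```

For a 16-byte input, `to_indexes` yields 12 indexes in the range 0 to 2047, and `from_indexes` returns the original bytes or raises `ValueError` when the checksum does not match.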
| ==Test vectors==
| |
| | |
| See https://github.com/trezor/python-mnemonic/blob/master/vectors.json | |
|
| |
|
| ==Reference Implementation== | | ==Reference Implementation== |
|
BIP: BIP-0039
Title: Mnemonic code for generating deterministic keys
Author: Pavol Rusnak <stick@gk2.sk>
Marek Palatinus <info@bitcoin.cz>
Aaron Voisine <voisine@gmail.com>
Status: Draft
Type: Standards Track
Created: 10-09-2013
==Draft==
See https://github.com/trezor/python-mnemonic/blob/master/README

==Reference Implementation==
A reference implementation, including wordlists, is available from
http://github.com/trezor/python-mnemonic