|
|
(18 intermediate revisions by 4 users not shown) |
Line 1: |
Line 1: |
| {{bip}} | | {{bip}} |
| | {{BipMoved|bip-0039.mediawiki}} |
|
| |
|
| <pre> | | <pre> |
| BIP: BIP-0039 | | BIP: 39 |
| Title: Mnemonic code for generating deterministic keys | | Layer: Applications |
| Author: Pavol Rusnak <stick@gk2.sk> | | Title: Mnemonic code for generating deterministic keys |
| Marek Palatinus <info@bitcoin.cz>
| | Author: Marek Palatinus <slush@satoshilabs.com> |
| Aaron Voisine <voisine@gmail.com>
| | Pavol Rusnak <stick@satoshilabs.com> |
| Status: Draft | | Aaron Voisine <voisine@gmail.com> |
| Type: Standards Track | | Sean Bowe <ewillbefull@gmail.com> |
| Created: 10-09-2013 | | Comments-Summary: Unanimously Discourage for implementation |
| | Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0039 |
| | Status: Proposed |
| | Type: Standards Track |
| | Created: 2013-09-10 |
| </pre> | | </pre> |
|
| |
|
| ==Abstract== | | ==Abstract== |
|
| |
|
| This BIP proposes a scheme for translating binary data (usually master seeds | | This BIP describes the implementation of a mnemonic code or mnemonic sentence -- |
| for deterministic keys, but it can be applied to any binary data) into a group
| | a group of easy to remember words -- for the generation of deterministic wallets. |
| of easy to remember words also known as mnemonic code or mnemonic sentence. | | |
| | It consists of two parts: generating the mnemonic and converting it into a |
| | binary seed. This seed can be later used to generate deterministic wallets using |
| | BIP-0032 or similar methods. |
|
| |
|
| ==Motivation== | | ==Motivation== |
|
| |
|
| Such mnemonic code or mnemonic sentence is much easier to work with than working
| | A mnemonic code or sentence is superior for human interaction compared to the |
| with the binary data directly (or its hexadecimal interpretation). The sentence
| | handling of raw binary or hexadecimal representations of a wallet seed. The |
| could be writen down on paper (e.g. for storing in a secure location such as | | sentence could be written on paper or spoken over the telephone. |
| safe), told over telephone or other voice communication method, or memorized
| |
| in ones memory (this method is called brainwallet).
| |
| | |
| ==Backwards Compatibility==
| |
| | |
| As this BIP is written, only one Bitcoin client (Electrum) implements mnemonic
| |
| codes, but it uses a different wordlist than the proposed one.
| |
| | |
| For compatibility reasons we propose adding a checkbox to Electrum, which will
| |
| allow user to indicate if the legacy code is being entered during import or
| |
| it is a new one that is BIP-0039 compatible. For exporting, only the new format
| |
| will be used, so this is not an issue.
| |
| | |
| ==Rationale==
| |
| | |
| Our proposal is inspired by implementation used in Electrum, but we enhanced
| |
| the wordlist and algorithm so it meets the following criteria:
| |
| | |
| a) smart selection of words
| |
| - wordlist is created in such way that it's enough to type just first four
| |
| letters to unambiguously identify the word
| |
|
| |
|
| b) similar words avoided
| | This guide is meant to be a way to transport computer-generated randomness with |
| - words as "build" and "built", "woman" and "women" or "quick" or "quickly"
| | a human-readable transcription. It's not a way to process user-created |
| not only make remembering the sentence difficult, but are also more error
| | sentences (also known as brainwallets) into a wallet seed. |
| prone and more difficult to guess (see point below)
| |
| - we avoid these words by carefully selecting them during addition
| |
|
| |
|
| c) sorted wordlists
| | ==Generating the mnemonic== |
| - wordlist is sorted which allow more efficient lookup of the code words
| |
| (i.e. implementation can use binary search instead of linear search)
| |
| - this also allows trie (prefix tree) to be used, e.g. for better compression
| |
|
| |
|
| d) localized wordlists
| | The mnemonic must encode entropy in a multiple of 32 bits. With more entropy |
| - we would like to allow localized wordlists, so it is easier for users
| | security is improved but the sentence length increases. We refer to the |
| to remember the code in their native language
| | initial entropy length as ENT. The allowed size of ENT is 128-256 bits. |
| - by using wordlists with no colliding words among languages, it's easy to
| |
| determine which language was used just by checking the first word of
| |
| the sentence
| |
|
| |
|
| e) mnemonic checksum
| | First, an initial entropy of ENT bits is generated. A checksum is generated by |
| - this leads to better user experience, because user can be notified
| | taking the first <code>ENT / 32</code> bits of its SHA256 hash. This checksum is |
| if the mnemonic sequence is wrong, instead of showing the confusing
| | appended to the end of the initial entropy. Next, these concatenated bits |
| data generated from the wrong sequence.
| | are split into groups of 11 bits, each encoding a number from 0-2047, serving |
| | as an index into a wordlist. Finally, we convert these numbers into words and |
| | use the joined words as a mnemonic sentence. |
|
| |
|
| ==Specification==
| | The following table describes the relation between the initial entropy |
| | length (ENT), the checksum length (CS), and the length of the generated mnemonic |
| | sentence (MS) in words. |
|
| |
|
| <pre> | | <pre> |
| Our proposal implements two methods - "encode" and "decode".
| | CS = ENT / 32 |
| | MS = (ENT + CS) / 11 |
|
| |
|
| The first method takes a binary data which have to length (L) in bytes divisable
| | | ENT | CS | ENT+CS | MS | |
| by four and returns a sentence that consists of (L/4*3) words from the wordlist.
| | +-------+----+--------+------+ |
| | | | 128 | 4 | 132 | 12 | |
| The second method takes sentences generated by first method (number of words in
| | | 160 | 5 | 165 | 15 | |
| the sentence has to be divisable by 3) and reconstructs the original binary data.
| | | 192 | 6 | 198 | 18 | |
| | | | 224 | 7 | 231 | 21 | |
| Words can repeat in the sentence more than one time.
| | | 256 | 8 | 264 | 24 | |
| | |
| Wordlist contains 2048 words (instead of 1626 words in Electrum), allowing
| |
| the code to compute the checksum of the whole mnemonic sequence.
| |
| Each 32 bits of input data add 1 bit of checksum.
| |
| | |
| See the following table for relation between input lengths, output lengths and
| |
| checksum sizes for the most common usecases:
| |
| | |
| +--------+---------+---------+----------+
| |
| | input | input | output | checksum |
| |
| | (bits) | (bytes) | (words) | (bits) |
| |
| +--------+---------+---------+----------+ | |
| | 128 | 16 | 12 | 4 | | |
| | 160 | 20 | 15 | 5 | | |
| | 192 | 24 | 18 | 6 | | |
| | 224 | 28 | 21 | 7 | | |
| | 256 | 32 | 24 | 8 | | |
| +--------+---------+---------+----------+
| |
| </pre> | | </pre> |
|
| |
|
| ===Algorithm:=== | | ==Wordlist== |
|
| |
|
| <pre>
| | An ideal wordlist has the following characteristics: |
| Encoding:
| |
| 1. Read input data (I).
| |
| 2. Make sure its length (L) is divisable by 32 bits.
| |
| 3. Compute the length of the checkum (LC). LC = L/32
| |
| 4. Split I into chunks of LC bits (I1, I2, I3, ...).
| |
| 5. XOR them altogether and produce the checksum C. C = I1 xor I2 xor I3 ... xor In.
| |
| 5. Concatenate I and C into encoded data (E). Length of E is divisable by 33 bits.
| |
| 6. Keep taking 11 bits from E until there are none left.
| |
| 7. Treat them as integer W, add word with index W to the output.
| |
|
| |
|
| Decoding:
| | a) smart selection of words |
| 1. Read input mnemonic (M).
| | - the wordlist is created in such a way that it's enough to type the first four |
| 2. Make sure its wordcount is divisable by 3.
| | letters to unambiguously identify the word |
| 3. Figure out word indexes in a dictionary and output them as binary stream E.
| |
| 4. Length of E (L) is divisable by 33 bits.
| |
| 5. Split E into two parts: B and C, where B are first L/33*32 bits, C are last L/33 bits.
| |
| 6. Make sure C is the checksum of B (using the step 5 from the above paragraph).
| |
| 7. If it's not we have invalid mnemonic code.
| |
| 8. Treat B as binary data and return it as output.
| |
|
| |
|
| </pre>
| | b) similar words avoided |
| | - word pairs like "build" and "built", "woman" and "women", or "quick" and "quickly" |
| | not only make remembering the sentence difficult but are also more error |
| | prone and more difficult to guess |
|
| |
|
| ==Test vectors==
| | c) sorted wordlists |
| | | - the wordlist is sorted which allows for more efficient lookup of the code words |
| <pre>
| | (i.e. implementations can use binary search instead of linear search) |
| input: 00000000 (32 bits)
| | - this also allows trie (a prefix tree) to be used, e.g. for better compression |
| mnemonic: abandon abandon abandon (3 words)
| |
|
| |
|
| input: 7f7f7f7f (32 bits)
| | The wordlist can contain native characters, but they must be encoded in UTF-8 |
| mnemonic: legal wing taxi (3 words)
| | using Normalization Form Compatibility Decomposition (NFKD). |
|
| |
|
| input: 80808080 (32 bits)
| | ==From mnemonic to seed== |
| mnemonic: lethal adult bundle (3 words) | |
|
| |
|
| input: ffffffff (32 bits)
| | A user may decide to protect their mnemonic with a passphrase. If a passphrase is not |
| mnemonic: zoo zoo zone (3 words)
| | present, an empty string "" is used instead. |
|
| |
|
| input: 0000000000000000 (64 bits)
| | To create a binary seed from the mnemonic, we use the PBKDF2 function with a mnemonic |
| mnemonic: abandon abandon abandon abandon abandon abandon (6 words)
| | sentence (in UTF-8 NFKD) used as the password and the string "mnemonic" + passphrase (again |
| | in UTF-8 NFKD) used as the salt. The iteration count is set to 2048 and HMAC-SHA512 is used as |
| | the pseudo-random function. The length of the derived key is 512 bits (= 64 bytes). |
|
| |
|
| input: 7f7f7f7f7f7f7f7f (64 bits)
| | This seed can be later used to generate deterministic wallets using BIP-0032 or |
| mnemonic: legal wing taxi yard water salad (6 words)
| | similar methods. |
|
| |
|
| input: 8080808080808080 (64 bits)
| | The conversion of the mnemonic sentence to a binary seed is completely independent |
| mnemonic: lethal adult bunker absurd also dog (6 words)
| | from generating the sentence. This results in a rather simple code; there are no |
| | constraints on sentence structure and clients are free to implement their own |
| | wordlists or even whole sentence generators, allowing for flexibility in wordlists |
| | for typo detection or other purposes. |
|
| |
|
| input: ffffffffffffffff (64 bits)
| | Although using a mnemonic not generated by the algorithm described in "Generating the |
| mnemonic: zoo zoo zoo zoo zoo young (6 words) | | mnemonic" section is possible, this is not advised and software must compute a |
| | checksum for the mnemonic sentence using a wordlist and issue a warning if it is |
| | invalid. |
|
| |
|
| input: 000000000000000000000000 (96 bits)
| | The described method also provides plausible deniability, because every passphrase |
| mnemonic: abandon abandon abandon abandon abandon abandon abandon abandon abandon (9 words)
| | generates a valid seed (and thus a deterministic wallet) but only the correct one |
| | will make the desired wallet available. |
|
| |
|
| input: 7f7f7f7f7f7f7f7f7f7f7f7f (96 bits)
| | ==Wordlists== |
| mnemonic: legal wing taxi yard water salmon worry urge lecture (9 words)
| |
|
| |
|
| input: 808080808080808080808080 (96 bits)
| | Since the vast majority of BIP39 wallets supports only the English wordlist, |
| mnemonic: lethal adult bunker absurd also domain achieve aunt lens (9 words) | | it is '''strongly discouraged''' to use non-English wordlists for generating |
| | the mnemonic sentences. |
|
| |
|
| input: ffffffffffffffffffffffff (96 bits)
| | If you still feel your application really needs to use a localized wordlist, |
| mnemonic: zoo zoo zoo zoo zoo zoo zoo zoo year (9 words)
| | use one of the following instead of inventing your own. |
|
| |
|
| input: 00000000000000000000000000000000 (128 bits)
| | * [[bip-0039/bip-0039-wordlists.md|Wordlists]] |
| mnemonic: abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon (12 words)
| |
|
| |
|
| input: 7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f (128 bits)
| | ==Test vectors== |
| mnemonic: legal wing taxi yard water salmon worry urge legal wing taxi worth (12 words)
| |
|
| |
|
| input: 80808080808080808080808080808080 (128 bits) | | The test vectors include input entropy, mnemonic and seed. The |
| mnemonic: lethal adult bunker absurd also domain achieve aunt lethal adult bunker abandon (12 words)
| | passphrase "TREZOR" is used for all vectors. |
|
| |
|
| input: ffffffffffffffffffffffffffffffff (128 bits)
| | https://github.com/trezor/python-mnemonic/blob/master/vectors.json |
| mnemonic: zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo worth (12 words) | |
|
| |
|
| input: 0000000000000000000000000000000000000000 (160 bits)
| | Also see https://github.com/bip32JP/bip32JP.github.io/blob/master/test_JP_BIP39.json |
| mnemonic: abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon (15 words)
| |
|
| |
|
| input: 7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f (160 bits)
| | (Japanese wordlist test with heavily normalized symbols as passphrase) |
| mnemonic: legal wing taxi yard water salmon worry urge legal wing taxi yard water salmon winner (15 words)
| |
|
| |
|
| input: 8080808080808080808080808080808080808080 (160 bits)
| | ==Reference Implementation== |
| mnemonic: lethal adult bunker absurd also domain achieve aunt lethal adult bunker absurd also domain abandon (15 words)
| |
|
| |
|
| input: ffffffffffffffffffffffffffffffffffffffff (160 bits)
| | Reference implementation including wordlists is available from |
| mnemonic: zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo winner (15 words)
| |
| | |
| input: 000000000000000000000000000000000000000000000000 (192 bits)
| |
| mnemonic: abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon (18 words)
| |
| | |
| input: 7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f (192 bits)
| |
| mnemonic: legal wing taxi yard water salmon worry urge legal wing taxi yard water salmon worry urge legal wave (18 words)
| |
| | |
| input: 808080808080808080808080808080808080808080808080 (192 bits)
| |
| mnemonic: lethal adult bunker absurd also domain achieve aunt lethal adult bunker absurd also domain achieve aunt lethal abandon (18 words)
| |
| | |
| input: ffffffffffffffffffffffffffffffffffffffffffffffff (192 bits)
| |
| mnemonic: zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo wave (18 words)
| |
| | |
| input: 00000000000000000000000000000000000000000000000000000000 (224 bits)
| |
| mnemonic: abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon (21 words)
| |
| | |
| input: 7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f (224 bits)
| |
| mnemonic: legal wing taxi yard water salmon worry urge legal wing taxi yard water salmon worry urge legal wing taxi yard usage (21 words)
| |
| | |
| input: 80808080808080808080808080808080808080808080808080808080 (224 bits)
| |
| mnemonic: lethal adult bunker absurd also domain achieve aunt lethal adult bunker absurd also domain achieve aunt lethal adult bunker absurd abandon (21 words)
| |
| | |
| input: ffffffffffffffffffffffffffffffffffffffffffffffffffffffff (224 bits)
| |
| mnemonic: zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo usage (21 words)
| |
| | |
| input: 0000000000000000000000000000000000000000000000000000000000000000 (256 bits)
| |
| mnemonic: abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon (24 words)
| |
| | |
| input: 7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f (256 bits)
| |
| mnemonic: legal wing taxi yard water salmon worry urge legal wing taxi yard water salmon worry urge legal wing taxi yard water salmon worry team (24 words)
| |
| | |
| input: 8080808080808080808080808080808080808080808080808080808080808080 (256 bits)
| |
| mnemonic: lethal adult bunker absurd also domain achieve aunt lethal adult bunker absurd also domain achieve aunt lethal adult bunker absurd also domain achieve abandon (24 words)
| |
| | |
| input: ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff (256 bits)
| |
| mnemonic: zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo team (24 words)
| |
| | |
| input: 19458fe8dace41e60617c6667874f6be (128 bits)
| |
| mnemonic: bless cloud wheel regular tiny venue birth web grief security dignity language (12 words)
| |
| | |
| input: 2d8fc7c27a52995bc165a0dcead7c169f301652f0b5db145 (192 bits)
| |
| mnemonic: sketch flavor eyebrow height youth rope various pet fish little embryo arena just chicken jump mourn advance unique fork knock general slice short caught (24 words)
| |
| | |
| input: ea88c6efbc4028994b8b4ebc0345a474 (128 bits)
| |
| mnemonic: try educate robust joke act erosion color hedge rock blow hat tribe (12 words)
| |
| | |
| input: bb8e02d2c0fe4c18f60e173bc38724902395b63b1b47bf18 (192 bits)
| |
| mnemonic: river ignore recycle like tobacco armor straw season despair bosnian sibling burden defy sunset typical harvest sad shadow (18 words)
| |
| | |
| input: 0b57161fd701c6c6548db5a71df70166965eec2f16a675acad3ed3460cb3e72f (256 bits)
| |
| mnemonic: approach response map protect both glass fabric remember place upset satisfy slave gravity irish romance squad involve grain excuse pioneer gauge flight open white (24 words)
| |
| | |
| input: 666fbb72dada90d6c8f627d77450f366 (128 bits)
| |
| mnemonic: grief last swamp reject pond history car sentence stone patrol diamond slam (12 words)
| |
| | |
| input: 29932c2887ff2c72dd8f9cff1b7ee498119f330d1dd81659 (192 bits)
| |
| mnemonic: chunk oak another audit vendor degree iron very year surprise retreat cook blossom obey crowd riot belt slogan (18 words)
| |
| | |
| input: 869e2bc8286611bce2159b89c5a8a4029666dc186bbfe8756ac505a4e89f481e (256 bits)
| |
| mnemonic: magnet vacuum van exotic gear tactic march ready matrix coat chin after grief huge genuine jet travel prepare race approach evil excuse burial skirt (24 words)
| |
| | |
| input: d0263588b1deee8186818319515c6691 (128 bits)
| |
| mnemonic: soda country ghost glove unusual dose blouse cope bless medal block car (12 words)
| |
| | |
| input: 48754ef698334cf9cc5494ccca6a0294bf30224066576a7c (192 bits)
| |
| mnemonic: embrace poverty royal cope cruise lake cotton movie slam false letter chuckle venue awesome accident sing hen tight (18 words)
| |
| | |
| input: ee9e8ce85eadcdf0f5473d7490816d9b1335f7204e7776597ac99f9e29186364 (256 bits)
| |
| mnemonic: unlikely victory dentist round swear weapon squeeze traffic insist logo force custom create win liberty snack island skate range display thought merchant migrant nasty (24 words)
| |
| </pre>
| |
| | |
| ==Reference Implementation==
| |
|
| |
|
| Reference implementation including wordlists is available from http://github.com/trezor/python-mnemonic
| | http://github.com/trezor/python-mnemonic |
|
This page describes a BIP (Bitcoin Improvement Proposal). Please see BIP 2 for more information about BIPs and creating them. Please do not just create a wiki page.
|
|
Please do not modify this page. This is a mirror of the BIP from the source Git repository here.
|
BIP: 39
Layer: Applications
Title: Mnemonic code for generating deterministic keys
Author: Marek Palatinus <slush@satoshilabs.com>
Pavol Rusnak <stick@satoshilabs.com>
Aaron Voisine <voisine@gmail.com>
Sean Bowe <ewillbefull@gmail.com>
Comments-Summary: Unanimously Discourage for implementation
Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0039
Status: Proposed
Type: Standards Track
Created: 2013-09-10
Abstract
This BIP describes the implementation of a mnemonic code or mnemonic sentence --
a group of easy to remember words -- for the generation of deterministic wallets.
It consists of two parts: generating the mnemonic and converting it into a
binary seed. This seed can be later used to generate deterministic wallets using
BIP-0032 or similar methods.
Motivation
A mnemonic code or sentence is superior for human interaction compared to the
handling of raw binary or hexadecimal representations of a wallet seed. The
sentence could be written on paper or spoken over the telephone.
This guide is meant to be a way to transport computer-generated randomness with
a human-readable transcription. It's not a way to process user-created
sentences (also known as brainwallets) into a wallet seed.
Generating the mnemonic
The mnemonic must encode entropy in a multiple of 32 bits. With more entropy
security is improved but the sentence length increases. We refer to the
initial entropy length as ENT. The allowed size of ENT is 128-256 bits.
First, an initial entropy of ENT bits is generated. A checksum is generated by
taking the first ENT / 32
bits of its SHA256 hash. This checksum is
appended to the end of the initial entropy. Next, these concatenated bits
are split into groups of 11 bits, each encoding a number from 0-2047, serving
as an index into a wordlist. Finally, we convert these numbers into words and
use the joined words as a mnemonic sentence.
The following table describes the relation between the initial entropy
length (ENT), the checksum length (CS), and the length of the generated mnemonic
sentence (MS) in words.
CS = ENT / 32
MS = (ENT + CS) / 11
| ENT | CS | ENT+CS | MS |
+-------+----+--------+------+
| 128 | 4 | 132 | 12 |
| 160 | 5 | 165 | 15 |
| 192 | 6 | 198 | 18 |
| 224 | 7 | 231 | 21 |
| 256 | 8 | 264 | 24 |
Wordlist
An ideal wordlist has the following characteristics:
a) smart selection of words
- the wordlist is created in such a way that it's enough to type the first four
letters to unambiguously identify the word
b) similar words avoided
- word pairs like "build" and "built", "woman" and "women", or "quick" and "quickly"
not only make remembering the sentence difficult but are also more error
prone and more difficult to guess
c) sorted wordlists
- the wordlist is sorted which allows for more efficient lookup of the code words
(i.e. implementations can use binary search instead of linear search)
- this also allows trie (a prefix tree) to be used, e.g. for better compression
The wordlist can contain native characters, but they must be encoded in UTF-8
using Normalization Form Compatibility Decomposition (NFKD).
From mnemonic to seed
A user may decide to protect their mnemonic with a passphrase. If a passphrase is not
present, an empty string "" is used instead.
To create a binary seed from the mnemonic, we use the PBKDF2 function with a mnemonic
sentence (in UTF-8 NFKD) used as the password and the string "mnemonic" + passphrase (again
in UTF-8 NFKD) used as the salt. The iteration count is set to 2048 and HMAC-SHA512 is used as
the pseudo-random function. The length of the derived key is 512 bits (= 64 bytes).
This seed can be later used to generate deterministic wallets using BIP-0032 or
similar methods.
The conversion of the mnemonic sentence to a binary seed is completely independent
from generating the sentence. This results in a rather simple code; there are no
constraints on sentence structure and clients are free to implement their own
wordlists or even whole sentence generators, allowing for flexibility in wordlists
for typo detection or other purposes.
Although using a mnemonic not generated by the algorithm described in "Generating the
mnemonic" section is possible, this is not advised and software must compute a
checksum for the mnemonic sentence using a wordlist and issue a warning if it is
invalid.
The described method also provides plausible deniability, because every passphrase
generates a valid seed (and thus a deterministic wallet) but only the correct one
will make the desired wallet available.
Wordlists
Since the vast majority of BIP39 wallets supports only the English wordlist,
it is strongly discouraged to use non-English wordlists for generating
the mnemonic sentences.
If you still feel your application really needs to use a localized wordlist,
use one of the following instead of inventing your own.
Test vectors
The test vectors include input entropy, mnemonic and seed. The
passphrase "TREZOR" is used for all vectors.
https://github.com/trezor/python-mnemonic/blob/master/vectors.json
Also see https://github.com/bip32JP/bip32JP.github.io/blob/master/test_JP_BIP39.json
(Japanese wordlist test with heavily normalized symbols as passphrase)
Reference Implementation
Reference implementation including wordlists is available from
http://github.com/trezor/python-mnemonic