User:Gmaxwell/coin selection

From Bitcoin Wiki
Revision as of 23:24, 28 May 2012 by Gmaxwell (talk | contribs) (example execution)
Jump to: navigation, search

Improved coin selection would lower user transaction fees, speed up transaction processing for users, improve user privacy, reduce blockchain bloat and/or any mixture of these positive results.

The reference client uses a heuristic subset-sum knapsack solver to try to minimize transaction size. It is fee blind, though it is run multiple times excluding less confirmed coins which may sometimes help avoid fees if there are many larger inputs available and minimizing sizes helps with fees (but does not guarantee low fees). Minimizing the current transaction size also does nothing to reduce the amount of data a pruning bitcoin node needs to keep around, in fact it usually does the opposite.

Optimal coin selection is generally a mixed-interger non-linear program. These kinds of problems are often NP-complete, though there are heuristic-based solvers which can usually find excellent solutions quickly.

In these problems most of constraints are linear but the fee calculation is non-linear because the use of an input both contributes the transaction's coins-days-destroyed as well as increases the transaction size.

The problem of fee/priority-blind coin selection can have entirely linear constraints and so should achieve good (or exact) solutions through basic MIP techniques like branch/bound/cut[1].

An interesting variation of the problem is where you first try to minimize linking transactions from multiple anonymity groups, or where some must be linked minimizing the badness of the linkage. One possible baddness metric is the ration of L2 (or higher) to L1 norm for the resulting group sizes (e.g. more smaller groups is better than fewer bigger groups).

Below is an example AMPL[2] input which can be used with solvers like Bonmin[3] to find good coin selections. While big solver systems are not suitable for integration with Bitcoin (due to code size, licensing, computational complexity, memory usage, etc) they can be used as benchmarks to compare more reasonable implementations.

A realistic implementation would probably try several techniques ("use a single input exactly", "build transactions with no change", "build transactions with change but no fees") with solvers specialized for each and then choose the 'best', potentially based on user preferences. In cases where the constraints limit the inputs substantially (E.g. 'use a single input exactly') exact solutions are easy.

AMPL program

This program finds a zero fee mixture of inputs which _minimizes_ the priority, subject to being zero-fee and meeting the required output and not making a txn which is excessively large or has more than a maximum amount of change. By minimizing the priority we avoid burning priority excessively, resulting in paying more fees in the future. The minimizing process also tends to sweep the wallet of small inputs as a side effect.

set coins ordered;

param f_min {coins} >= 0, default 0;
param f_max {j in coins} >= f_min[j], default 1;

param value {coins} >= 0;
param age {coins} >= 0;
param sigsz {coins} >= 0;

param target >= 0;
param target_max >= 0;
param thresh_prio >= 0;
param max_outputs >= 0, default Infinity;
param min_out >= 0;
param basesz >= 0;

# --------------------------------------------------------

var Use {j in coins} integer >= f_min[j], <= f_max[j];

# --------------------------------------------------------

minimize priority: sum {j in coins} value[j] * age[j] * Use[j]
                     / (basesz+(sum {i in coins} sigsz[i] * Use[i]));

# --------------------------------------------------------

subject to Free:
   thresh_prio <=  sum {j in coins} value[j] * age[j] * Use[j]
                     / (basesz+(sum {i in coins} sigsz[i] * Use[i]));

subject to Enough:
   target <= sum {j in coins} value[j] * Use[j];

subject to NotTooMuch:
   target_max >= sum {j in coins} value[j] * Use[j];

subject to size:
   max_outputs >= sum {j in coins} Use[j];

subject to NoDustChange:
   min_out <= sum {j in coins} value[j] * Use[j]
                - target;

#subject to NoChange:
#    target = sum {j in coins} value[j] * Use[j];

Example AMPL problem data

If there are multiple identical inputs they should be only put in once and given a fmax equal to the number of duplicates. The sizes given per input and for the base transaction are just random numbers.

param target      := 150000000;
param target_max  := 500000000;
param thresh_prio := 100;
param min_out     := 1000000;
param max_outputs := 100;
param basesz      := 100;

param:  coins:                  value       age  sigsz f_min  f_max :=
  "aaaa"                    100000001         1   100      .      .  
  "aaab"                     50000000     20000   100      .      .  
  "aaac"                    100000000         2    50      .      .  
  "aaad"                            1         1   100      .      . ;

Example run with AMPL and Bonmin

[gmaxwell@kindle01 bin]$ ./ampl 
ampl: model coins.mod;
ampl: data coins1.dat;
ampl: write gcoins1;
ampl: quit
[gmaxwell@kindle01 bin]$ ls -l
-rw-rw-r--. 1 gmaxwell gmaxwell 937 May 28 19:20
[gmaxwell@kindle01 bin]$ ./bonmin
Bonmin 1.5.0 using Cbc 2.7.1 and Ipopt 3.10.0

This program contains Ipopt, a library for large-scale nonlinear optimization.
 Ipopt is released as open source code under the Eclipse Public License (EPL).
         For more information visit

              Num      Status      Obj             It       time                 Location
NLP0014I             1         OPT 857114.295.5g        9 0.002g
              Num      Status      Obj             It       time                 Location
NLP0014I             1         OPT 857142.865.5g        0 0g
              Num      Status      Obj             It       time                 Location
NLP0014I             1         OPT 857142.865.5g        0 0g
Cbc0012I Integer solution of 857142.86 found by DiveMIPFractional after 0 iterations and 0 nodes (0.00 seconds)
NLP0014I             2         OPT 857142.865.5g        0 0g
NLP0014I             3         OPT 857142.865.5g        0 0g
Cbc0001I Search completed - best objective 857142.8628571429, took 0 iterations and 0 nodes (0.00 seconds)
Cbc0035I Maximum depth 0, 0 variables fixed on reduced cost

bonmin: Optimal

[gmaxwell@kindle01 bin]$ ./ampl 
ampl: model coins.mod;
ampl: data coins1.dat;
ampl: solution coins1.sol;
bonmin: Optimal
ampl: display Use;
Use [*] :=
aaaa  1
aaab  0
aaac  1
aaad  1

ampl: quit