summaryrefslogtreecommitdiff
path: root/src/main/java/org/apache/commons/math/random/package.html
blob: 9978ca0c88689934cadd7def93dd7eb3e45085a5 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
<html>
<!--
   Licensed to the Apache Software Foundation (ASF) under one or more
  contributor license agreements.  See the NOTICE file distributed with
  this work for additional information regarding copyright ownership.
  The ASF licenses this file to You under the Apache License, Version 2.0
  (the "License"); you may not use this file except in compliance with
  the License.  You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
  -->
    <!-- $Revision: 1054186 $ $Date: 2011-01-01 03:28:46 +0100 (sam. 01 janv. 2011) $ -->
    <body>
      <p>Random number and random data generators.</p>
      <p>Commons-math provides a few pseudo random number generators. The top level interface is RandomGenerator.
      It is implemented by three classes:
      <ul>
        <li>{@link org.apache.commons.math.random.JDKRandomGenerator JDKRandomGenerator}
            that extends the JDK provided generator</li>
        <li>AbstractRandomGenerator as a helper for users generators</li>
        <li>BitStreamGenerator which is an abstract class for several generators and
            which in turn is extended by:
            <ul>
              <li>{@link org.apache.commons.math.random.MersenneTwister MersenneTwister}</li>
              <li>{@link org.apache.commons.math.random.Well512a Well512a}</li>
              <li>{@link org.apache.commons.math.random.Well1024a Well1024a}</li>
              <li>{@link org.apache.commons.math.random.Well19937a Well19937a}</li>
              <li>{@link org.apache.commons.math.random.Well19937c Well19937c}</li>
              <li>{@link org.apache.commons.math.random.Well44497a Well44497a}</li>
              <li>{@link org.apache.commons.math.random.Well44497b Well44497b}</li>
            </ul>
          </li>
        </ul>
      </p>

      <p>
      The JDK provided generator is a simple one that can be used only for very simple needs.
      The Mersenne Twister is a fast generator with very good properties well suited for
      Monte-Carlo simulation. It is equidistributed for generating vectors up to dimension 623
      and has a huge period: 2<sup>19937</sup> - 1 (which is a Mersenne prime). This generator
      is described in a paper by Makoto Matsumoto and Takuji Nishimura in 1998: <a
      href="http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/ARTICLES/mt.pdf">Mersenne Twister:
      A 623-Dimensionally Equidistributed Uniform Pseudo-Random Number Generator</a>, ACM
      Transactions on Modeling and Computer Simulation, Vol. 8, No. 1, January 1998, pp 3--30.
      The WELL generators are a family of generators with period ranging from 2<sup>512</sup> - 1
      to 2<sup>44497</sup> - 1 (this last one is also a Mersenne prime) with even better properties
      than Mersenne Twister. These generators are described in a paper by Fran&ccedil;ois Panneton,
      Pierre L'Ecuyer and Makoto Matsumoto <a
      href="http://www.iro.umontreal.ca/~lecuyer/myftp/papers/wellrng.pdf">Improved Long-Period
      Generators Based on Linear Recurrences Modulo 2</a> ACM Transactions on Mathematical Software,
      32, 1 (2006). The errata for the paper are in <a
      href="http://www.iro.umontreal.ca/~lecuyer/myftp/papers/wellrng-errata.txt">wellrng-errata.txt</a>.
      </p>

      <p>
      For simple sampling, any of these generators is sufficient. For Monte-Carlo simulations the
      JDK generator does not have any of the good mathematical properties of the other generators,
      so it should be avoided. The Mersenne twister and WELL generators have equidistribution properties
      proven according to their bits pool size which is directly linked to their period (all of them
      have maximal period, i.e. a generator with size n pool has a period 2<sup>n</sup>-1). They also
      have equidistribution properties for 32 bits blocks up to s/32 dimension where s is their pool size.
      So WELL19937c for exemple is equidistributed up to dimension 623 (19937/32). This means a Monte-Carlo
      simulation generating a vector of n variables at each iteration has some guarantees on the properties
      of the vector as long as its dimension does not exceed the limit. However, since we use bits from two
      successive 32 bits generated integers to create one double, this limit is smaller when the variables are
      of type double. so for Monte-Carlo simulation where less the 16 doubles are generated at each round,
      WELL1024 may be sufficient. If a larger number of doubles are needed a generator with a larger pool
      would be useful.
      </p>

      <p>
      The WELL generators are more modern then MersenneTwister (the paper describing than has been published
      in 2006 instead of 1998) and fix some of its (few) drawbacks. If initialization array contains many
      zero bits, MersenneTwister may take a very long time (several hundreds of thousands of iterations to
      reach a steady state with a balanced number of zero and one in its bits pool). So the WELL generators
      are better to <i>escape zeroland</i> as explained by the WELL generators creators. The Well19937a and
      Well44497a generator are not maximally equidistributed (i.e. there are some dimensions or bits blocks
      size for which they are not equidistributed). The Well512a, Well1024a, Well19937c and Well44497b are
      maximally equidistributed for blocks size up to 32 bits (they should behave correctly also for double
      based on more than 32 bits blocks, but equidistribution is not proven at these blocks sizes).
      </p>

      <p>
      The MersenneTwister generator uses a 624 elements integer array, so it consumes less than 2.5 kilobytes.
      The WELL generators use 6 integer arrays with a size equal to the pool size, so for example the
      WELL44497b generator uses about 33 kilobytes. This may be important if a very large number of
      generator instances were used at the same time.
      </p>

      <p>
      All generators are quite fast. As an example, here are some comparisons, obtained on a 64 bits JVM on a
      linux computer with a 2008 processor (AMD phenom Quad 9550 at 2.2 GHz). The generation rate for
      MersenneTwister was about 27 millions doubles per second (remember we generate two 32 bits integers for
      each double). Generation rates for other PRNG, relative to MersenneTwister:
      </p>

      <p>
        <table border="1" align="center">
          <tr BGCOLOR="#CCCCFF"><td colspan="2"><font size="+2">Example of performances</font></td></tr>
          <tr BGCOLOR="#EEEEFF"><font size="+1"><td>Name</td><td>generation rate (relative to MersenneTwister)</td></font></tr>
          <tr><td>{@link org.apache.commons.math.random.MersenneTwister MersenneTwister}</td><td>1</td></tr>
          <tr><td>{@link org.apache.commons.math.random.JDKRandomGenerator JDKRandomGenerator}</td><td>between 0.96 and 1.16</td></tr>
          <tr><td>{@link org.apache.commons.math.random.Well512a Well512a}</td><td>between 0.85 and 0.88</td></tr>
          <tr><td>{@link org.apache.commons.math.random.Well1024a Well1024a}</td><td>between 0.63 and 0.73</td></tr>
          <tr><td>{@link org.apache.commons.math.random.Well19937a Well19937a}</td><td>between 0.70 and 0.71</td></tr>
          <tr><td>{@link org.apache.commons.math.random.Well19937c Well19937c}</td><td>between 0.57 and 0.71</td></tr>
          <tr><td>{@link org.apache.commons.math.random.Well44497a Well44497a}</td><td>between 0.69 and 0.71</td></tr>
          <tr><td>{@link org.apache.commons.math.random.Well44497b Well44497b}</td><td>between 0.65 and 0.71</td></tr>
        </table>
      </p>

      <p>
      So for most simulation problems, the better generators like {@link
      org.apache.commons.math.random.Well19937c Well19937c} and {@link
      org.apache.commons.math.random.Well44497b Well44497b} are probably very good choices.
      </p>

      <p>
      Note that <em>none</em> of these generators are suitable for cryptography. They are devoted
      to simulation, and to generate very long series with strong properties on the series as a whole
      (equidistribution, no correlation ...). They do not attempt to create small series but with
      very strong properties of unpredictability as needed in cryptography.
      </p>

    </body>
</html>