Random Number Generator


The building block of a simulation study is the ability to generate random numbers, where a random number represents the value of a random variable
uniformly distributed on \((0,1)\).


Linear Congruential Generators.

The great majority of random number generators in use are linear congruential generators, introduced by Lehmer (1951). A sequence of integers \(z_{1},z_{2},\cdots\) is defined by the recursive formula

\begin{equation}{\label{sch1eq1}}\tag{1}
z_{i}=(az_{i-1}+c)\pmod{m}
\end{equation}

where \(m\) is the modulus, \(a\) the multiplier, \(c\) the increment, and \(z_{0}\) the seed. Therefore, each \(z_{i}\) is one of \(0,1,2,\cdots ,m-1\), and the quantity \(u_{i}=z_{i}/m\), called a random number, is taken as an approximation to the value of a uniform \((0,1)\) random variable. In addition to nonnegativity, the integers \(m\), \(a\), \(c\), and \(z_{0}\) should satisfy \(0<m\), \(a<m\), \(c<m\), and \(z_{0}<m\). The \(z_{i}\)’s defined by (\ref{sch1eq1}) are not really random at all. In fact, one can show by mathematical induction that

\[z_{i}=\left [a^{i}z_{0}+\frac{c(a^{i}-1)}{a-1}\right ]\pmod{m}\]

for \(i=1,2,\cdots\) such that every \(z_{i}\) is completely determined by \(m\), \(a\), \(c\), and \(z_{0}\). However, by careful choice of these four parameters we try to introduce behavior in the \(z_{i}\)’s that makes the corresponding \(u_{i}\)’s appear to be i.i.d. \(U(0,1)\) random variates. Since each of the numbers \(z_{i}\) assumes one of the values \(0,1,2,\cdots ,m-1\), it follows that, after some finite number of generated values, a value must repeat itself. Once this happens, the whole sequence will begin to repeat. Therefore, we want to choose the constants \(a\) and \(m\) such that, for any initial seed \(z_{0}\), the number of variables that can be generated before this repetition occurs is large. The length of a cycle is called the period of a generator. The value \(z_{i}\) depends only on the previous integer \(z_{i-1}\). Since \(0\leq z_{i}\leq m-1\), it is clear that the period is at most \(m\). When it is in fact \(m\), the generator is said to have full period.
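The recursion in (\ref{sch1eq1}) is straightforward to sketch in code. The following minimal Python generator uses tiny, purely illustrative parameters (\(m=16\), \(a=5\), \(c=3\), \(z_{0}=7\)), not a recommended choice:

```python
def lcg(m, a, c, z0):
    """Yield z_1, z_2, ... for the recursion z_i = (a*z_{i-1} + c) mod m."""
    z = z0
    while True:
        z = (a * z + c) % m
        yield z

gen = lcg(16, 5, 3, 7)
zs = [next(gen) for _ in range(8)]   # the random numbers are then u_i = z_i / m
```

Because the state is a single integer in \(\{0,\cdots ,m-1\}\), the sequence must eventually revisit a state and then cycle, exactly as described above.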

\begin{equation}{\label{sch1t6}}\tag{2}\mbox{}\end{equation}

Theorem \ref{sch1t6}. (Hull and Dobell (1962)) The linear congruential generator defined in (\ref{sch1eq1}) has full period if and only if the following three conditions hold:

(i) The only positive integer that divides both \(m\) and \(c\) is \(1\); that is, \(c\) is relatively prime to \(m\).

(ii) If \(q\) is a prime number that divides \(m\), then \(q\) divides \(a-1\).

(iii) If \(4\) divides \(m\), then \(4\) divides \(a-1\). \(\sharp\)
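The three conditions are easy to check mechanically. The following sketch tests them for given \((m,a,c)\); the trial-division factoring is adequate for the small moduli used in examples here:

```python
from math import gcd

def has_full_period(m, a, c):
    """Check the Hull-Dobell conditions (i)-(iii) for full period m."""
    # (i) c must be relatively prime to m
    if gcd(c, m) != 1:
        return False
    # (ii) every prime q dividing m must also divide a - 1
    n, q = m, 2
    while q * q <= n:
        if n % q == 0:
            if (a - 1) % q != 0:
                return False
            while n % q == 0:
                n //= q
        q += 1
    if n > 1 and (a - 1) % n != 0:
        return False
    # (iii) if 4 divides m, then 4 must divide a - 1
    if m % 4 == 0 and (a - 1) % 4 != 0:
        return False
    return True
```

For instance, \(m=16\), \(a=5\), \(c=3\) passes all three conditions, while changing \(a\) to \(6\) violates condition (ii).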

Mixed Congruential Generators.

For \(c>0\), condition (i) of Theorem \ref{sch1t6} can be satisfied, so full period \(m\) is attainable. Dividing by \(m\) to obtain the remainder in (\ref{sch1eq1}) is a relatively slow arithmetic operation, and it would be desirable to avoid having to do this division explicitly. A choice of \(m\) that is good in all these respects is \(m=2^{b}\), where \(b\) is the number of bits in a word on the computer. Furthermore, choosing \(m=2^{b}\) does allow us to avoid explicit division by \(m\) on most computers by taking advantage of integer overflow. The largest integer that can be represented is \(2^{b}-1\), and any attempt to store a larger integer \(w\) (with, say, \(h>b\) binary digits) will result in loss of the left \(h-b\) binary digits of this over-sized integer. What remains in the retained \(b\) bits is precisely \(w\pmod{2^{b}}\). The following example illustrates modulo division by overflow for \(m=2^{b}\).

Example. Suppose that the generator

\[z_{i}=(5z_{i-1}+3)\pmod{16}\mbox{ with }z_{0}=7\]

is implemented on a mythical computer with \(b=4\) bits per word. We have \(z_{6}=5\). Now \(5z_{6}+3=28\), which in binary representation is \(11100\). Since our \(4\)-bit computer can store only four binary digits, the leftmost digit of the binary number \(11100\) is dropped, leaving the binary number \(1100\), which is the binary representation of \(12=z_{7}\). \(\sharp\)
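Python integers do not overflow, so the truncation must be imitated explicitly; a bitwise AND with \(2^{b}-1\) keeps exactly the low \(b\) bits, which is the same as reducing modulo \(2^{b}\):

```python
# Overflow arithmetic mimicked with a mask: keeping only the low b bits
# of w is exactly w mod 2^b.
b = 4
w = 5 * 5 + 3               # 28, binary 11100, from z_6 = 5 in the example
z7 = w & ((1 << b) - 1)     # retain the low 4 bits: 1100, i.e. 12
```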

Therefore, \(m=2^{b}\) would appear to be a good choice for the modulus. With this choice, Theorem \ref{sch1t6} says that we shall obtain full period if \(c\) is odd and \(a-1\) is divisible by \(4\). With such a full-period generator, \(z_{0}\) can be any integer between \(0\) and \(m-1\) without affecting the generator’s period.

Multiplicative Congruential Generators.

Multiplicative congruential generators are advantageous in that the addition of \(c\) is not needed, but they cannot have full period since condition (i) of Theorem \ref{sch1t6} cannot be satisfied (because, with \(c=0\), \(m\) itself divides both \(m\) and \(c\)). As we shall see, it is possible to obtain period \(m-1\) if \(m\) and \(a\) are chosen carefully. Hutchinson (1966) proposed that \(m\) be the largest prime number that is less than \(2^{b}\). For example, in the case of \(b=31\), the largest prime that is less than \(2^{31}\) is, very agreeably, \(2^{31}-1=2147483647\). Now for \(m\) prime, it can be shown that the period is \(m-1\) if \(a\) is a {\bf primitive element} modulo \(m\); that is, the smallest integer \(l\) for which \(a^{l}-1\) is divisible by \(m\) is \(l=m-1\). With \(m\) and \(a\) chosen in this way, we obtain each integer \(1,2,\cdots ,m-1\) exactly once in each cycle, so that \(z_{0}\) can be any integer from \(1\) through \(m-1\). Two particular values of the multiplier \(a\) that have been widely used for the modulus \(m=2^{31}-1\) are \(a_{1}=7^{5}=16807\) and \(a_{2}=630360016\), both of which are primitive elements modulo \(m=2^{31}-1\). The multiplier \(a_{1}\) was originally suggested by Lewis, Goodman, and Miller (1969). The multiplier \(a_{2}\), suggested originally by Payne, Rabung, and Bogyo (1969), was found by Fishman and Moore to yield statistical performance much better than does \(a_{1}\). For most cases, we feel comfortable in recommending use of the multiplicative congruential generator with \(m=2^{31}-1\) and \(a=630360016\).
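A sketch of this multiplicative generator, using the multiplier \(a_{1}=16807\) for illustration (the recommended \(a_{2}=630360016\) is used the same way):

```python
M = 2**31 - 1                      # 2147483647, prime

def mult_cong(z, a=16807):
    """One step of the multiplicative generator z_i = a*z_{i-1} mod (2^31 - 1)."""
    return (a * z) % M

z = 1                              # any seed in 1, ..., M - 1 works
stream = []
for _ in range(3):
    z = mult_cong(z)
    stream.append(z)               # the random numbers are u_i = z_i / M
```

Starting from \(z_{0}=1\), the first two values are \(16807\) and \(16807^{2}=282475249\) (no reduction needed yet, since \(16807^{2}<2^{31}-1\)).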

General Congruential Generators.

Although linear congruential generators are by far the most widely used and best understood kind of random number generator, there are many alternative types. Most of these other generators have been developed in an attempt to obtain longer periods and better statistical properties. Often, a simple linear congruential generator with carefully chosen parameters can perform nearly as well as (sometimes better than) these more complicated alternatives.

The linear congruential generators can be thought of as a special case of generators defined by

\[z_{i}=g(z_{i-1},z_{i-2},\cdots )\pmod{m}\]

where \(g\) is a fixed deterministic function of previous \(z_{i}\)’s. The \(z_{i}\)’s lie between \(0\) and \(m-1\), and the \(U(0,1)\) random numbers are given by \(u_{i}=z_{i}/m\). One choice of the function \(g\) is to let \(g(z_{i-1},z_{i-2},\cdots )=az_{i-1}^{2}+bz_{i-1}+c\), which produces a quadratic congruential generator. A special case that has received some attention is when \(a=b=1\), \(c=0\), and \(m\) is a power of \(2\). A different choice of the function \(g\) is to maintain linearity but to use earlier \(z_{i}\)’s, which gives rise to generators defined by

\[g(z_{i-1},z_{i-2},\cdots )=a_{1}z_{i-1}+a_{2}z_{i-2}+\cdots +a_{k}z_{i-k}\]

where \(a_{1},a_{2},\cdots ,a_{k}\) are constants. Huge periods (as high as \(m^{k}-1\)) then become possible if the parameters are chosen properly.
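Such a lagged recursion can be sketched with a short buffer of the last \(k\) values. The Fibonacci-like parameters below are illustrative only, not a recommended generator:

```python
def mrg(coeffs, m, seed):
    """Multiple recursive generator: z_i = (a_1 z_{i-1} + ... + a_k z_{i-k}) mod m.

    coeffs = (a_1, ..., a_k); seed holds the k initial values, oldest first.
    """
    z = list(seed)
    while True:
        # a_1 multiplies the most recent value, a_k the oldest
        zi = sum(a * zj for a, zj in zip(coeffs, reversed(z))) % m
        z = z[1:] + [zi]
        yield zi / m

gen = mrg(coeffs=(1, 1), m=10, seed=(1, 1))
us = [next(gen) for _ in range(5)]
```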

Composite Generators.

Several researchers have developed methods that take two or more separate generators (usually two or three different linear congruential generators) and combine them in some way to generate the final random numbers. It is hoped that this composite generator will exhibit better statistical behavior than any of the simple generators composing it. The disadvantage in using a composite generator is, of course, that the cost of obtaining each \(u_{i}\) is more than that of using one of the simple generators alone.

The best known of the composite generators uses a second linear congruential generator to shuffle the output from the first linear congruential generator. Initially, a vector \({\bf v}=(v_{1},v_{2},\cdots ,v_{k})\) is filled sequentially with the first \(k\) outputs from the first linear congruential generator. Then the second linear congruential generator is used to generate a random integer \(i\) distributed uniformly on the integers \(1,2,\cdots ,k\), and \(v_{i}\) is returned as the first random number; the first linear congruential generator then replaces this \(i\)th location in \({\bf v}\) with its next output, and the second linear congruential generator randomly chooses the next returned random number from this updated \({\bf v}\), and so on.
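The shuffling scheme can be sketched as follows; the two toy LCGs here are illustrative stand-ins for whatever pair of generators one actually combines:

```python
def lcg(m, a, c, z):
    """A plain LCG yielding u_i = z_i / m."""
    while True:
        z = (a * z + c) % m
        yield z / m

def shuffled(gen1, gen2, k=8):
    """Shuffle gen1's output, using gen2 to select table slots."""
    v = [next(gen1) for _ in range(k)]   # fill the table from generator 1
    while True:
        i = int(next(gen2) * k)          # generator 2 picks a slot 0..k-1
        out = v[i]
        v[i] = next(gen1)                # refill the used slot from generator 1
        yield out

comp = shuffled(lcg(2**31 - 1, 16807, 0, 1), lcg(16, 5, 3, 7))
us = [next(comp) for _ in range(10)]
```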


Testing Random Number Generators.

As we have seen, all random number generators currently used in computer simulation are actually completely deterministic. Thus, we can only hope that the \(u_{i}\)’s generated appear as if they were i.i.d. \(U(0,1)\) random variables. Now, we discuss several tests to which a random number generator can be subjected to ascertain how well the generated \(u_{i}\)’s do (or can) resemble values of true i.i.d. \(U(0,1)\) random variables. Most computers have a built-in random number generator as a part of the available software. Before such a generator is actually used in a simulation, we strongly recommend that one identify exactly what kind of generator it is and what its numerical parameters are. Unless a generator is one of the “good” ones identified and tested in the literature, it should be subjected to empirical tests such as those described below before being relied upon.

The first empirical test is designed to check whether the \(u_{i}\)’s appear to be uniformly distributed between \(0\) and \(1\). The goodness-of-fit test will be used. A goodness-of-fit test is a statistical hypothesis test that is used to assess formally whether the observations \(x_{1},x_{2},\cdots , x_{n}\) are an independent sample from a particular distribution with distribution function \(F\). That is, a goodness-of-fit test can be used to test the following null hypothesis:

\[\mbox{$H_{0}:$ The \(X_{i}\)’s are i.i.d. random variables with distribution function \(F\)}\]

Before proceeding with a discussion of several specific goodness-of-fit tests, we feel compelled to comment on the formal structure and properties of these tests. First, failure to reject \(H_{0}\) should not be interpreted as “accepting \(H_{0}\) as being true”. These tests are often not very powerful for small to moderate sample sizes \(n\). On the other hand, if \(n\) is very large, then these tests will almost always reject \(H_{0}\). Since \(H_{0}\) is virtually never exactly true, even a minute departure from the hypothesized distribution will be detected for large \(n\). This is an unfortunate property of these tests, since it is usually sufficient to have a distribution that is “nearly” correct.

The oldest goodness-of-fit test is the chi-square test, which dates back at least to the paper of K. Pearson (1900). We first divide the entire range of the fitted distribution into \(k\) adjacent intervals \([a_{0},a_{1}),[a_{1},a_{2}),\cdots ,[a_{k-1},a_{k})\), where it could be that \(a_{0}=-\infty\), or \(a_{k}=+\infty\), or both. Then we tally

\[\mbox{$N_{j}=$ number of \(x_{i}\)’s in the \(j\)th interval \([a_{j-1},a_{j})\)}\]

for \(j=1,2,\cdots ,k\). Note that \(\sum_{j=1}^{k} N_{j}=n\). Next, we compute the expected proportion \(p_{j}\) of the \(x_{i}\)’s that would fall in the \(j\)th interval if we were sampling from the fitted distribution. In the continuous case,

\[p_{j}=\int_{a_{j-1}}^{a_{j}} f(x)dx\]

where \(f\) is the density of the fitted distribution. For the discrete case,

\[p_{j}=\sum_{a_{j-1}\leq x<a_{j}} p(x)\]

where \(p\) is the mass function of the fitted distribution. Finally, the test statistic is

\[\chi^{2}=\sum_{j=1}^{k} \frac{(N_{j}-np_{j})^{2}}{np_{j}}.\]

Since \(np_{j}\) is the expected number of the \(n\) \(x_{i}\)’s that would fall in the \(j\)th interval if \(H_{0}\) is true, we would expect \(\chi^{2}\) to be small if the fit is good. Therefore, we reject \(H_{0}\) if \(\chi^{2}\) is too large. Suppose that all parameters of the fitted distribution are known; that is, we specified the fitted distribution without making use of the data in any way. Then if \(H_{0}\) is true, \(\chi^{2}\) converges in distribution to a chi-square distribution with \(k-1\) degrees of freedom. Thus, for large \(n\), a test with approximate level \(\alpha\) is obtained by rejecting \(H_{0}\) if \(\chi^{2}>\chi^{2}_{k-1;\alpha}\), where \(\chi^{2}_{k-1;\alpha}\) satisfies \(P\left\{\chi_{k-1}^{2}>\chi_{k-1;\alpha}^{2}\right\}=\alpha\).

Now we divide \([0,1]\) into \(k\) subintervals of equal length and generate \(u_{1},u_{2},\cdots ,u_{n}\). As a general rule, \(k\) should be at least \(100\) here, and \(n/k\) should be at least \(5\). For \(j=1,2,\cdots ,k\), let \(f_{j}\) be the number of the \(u_{i}\)’s that are in the \(j\)th sub-interval, and let (we know \(np_{j}=n/k\))

\[\chi^{2}=\frac{k}{n}\sum_{j=1}^{k}\left (f_{j}-\frac{n}{k}\right )^{2}.\]

Then, for large \(n\), \(\chi^{2}\) will have an approximate chi-square distribution with \(k-1\) degrees of freedom under the null hypothesis that the \(u_{i}\)’s are i.i.d. \(U(0,1)\) random variables. Thus we reject this hypothesis at level \(\alpha\) if \(\chi^{2}>\chi^{2}_{k-1;\alpha}\). For the large values of \(k\) likely to be encountered here, we can use the approximation

\[\chi^{2}_{k-1;\alpha}\approx (k-1)\left [1-\frac{2}{9(k-1)}+z_{\alpha}\sqrt{\frac{2}{9(k-1)}}\right ]^{3}\]

where \(z_{\alpha}\) satisfies \(P\{Z>z_{\alpha}\}=\alpha\) and \(Z\) has the standard normal \(N(0,1)\) distribution.
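The test statistic and the approximate critical point can be sketched as follows. The random numbers below come from Python's built-in generator, which stands in for whatever generator is being tested:

```python
import random

def chi_square_stat(us, k):
    """Chi-square statistic for uniformity of us over k equal sub-intervals."""
    n = len(us)
    f = [0] * k
    for u in us:
        f[min(int(u * k), k - 1)] += 1      # tally the sub-interval counts
    return (k / n) * sum((fj - n / k) ** 2 for fj in f)

def chi2_critical(df, z_alpha):
    """Approximate chi^2_{df; alpha} from the standard normal point z_alpha."""
    t = 2 / (9 * df)
    return df * (1 - t + z_alpha * t ** 0.5) ** 3

rng = random.Random(12345)
stat = chi_square_stat([rng.random() for _ in range(32768)], k=4096)
crit = chi2_critical(df=4095, z_alpha=1.2816)   # alpha = 0.1
```

With \(\alpha =0.1\), \(z_{0.1}\approx 1.2816\) gives a critical point of about \(4211.4\), matching the example below.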

\begin{equation}{\label{sch1ex8}}\tag{3}\mbox{}\end{equation}

Example \ref{sch1ex8}. We applied the chi-square test of uniformity to the following generator

\[z_{i}=630360016\cdot z_{i-1}\pmod{2^{31}-1}.\]

We take \(k=2^{12}=4096\) and let \(n=2^{15}=32768\). We obtain \(\chi^{2}=4141\). Using the approximation for the critical point, we have \(\chi^{2}_{4095;0.1}\approx 4211.4\). Therefore, the null hypothesis of uniformity is not rejected at level \(\alpha =0.1\). Therefore, these particular \(32768\) \(u_{i}\)’s produced by this generator do not behave in a way that is significantly different from what would be expected from truly i.i.d. \(U(0,1)\) random variables, so far as this chi-square test can ascertain. \(\sharp\)

The second empirical test, the serial test, is really just a generalization of the chi-square test to higher dimension. If the \(u_{i}\)’s were really i.i.d. \(U(0,1)\) random variates, the nonoverlapping \(d\)-tuples

\[{\bf u}_{1}=(u_{1},\cdots ,u_{d}),{\bf u}_{2}=(u_{d+1},\cdots ,u_{2d}),\cdots\]

should be i.i.d. random vectors distributed uniformly on the \(d\)-dimensional unit hypercube \([0,1]^{d}\). Divide \([0,1]\) into \(k\) sub-intervals of equal size and generate \({\bf u}_{1},{\bf u}_{2},\cdots ,{\bf u}_{n}\) (requiring \(nd\) \(u_{i}\)’s). Let \(f_{j_{1}j_{2}\cdots j_{d}}\) be the number of \({\bf u}_{i}\)’s having first component in sub-interval \(j_{1}\), second component in sub-interval \(j_{2}\), etc. If we let

\[\chi^{2}(d)=\frac{k^{d}}{n}\sum_{j_{1}=1}^{k}\sum_{j_{2}=1}^{k}\cdots\sum_{j_{d}=1}^{k}\left (f_{j_{1}j_{2}\cdots j_{d}}-\frac{n}{k^{d}}\right )^{2}\]

then \(\chi^{2}(d)\) will have an approximate chi-square distribution with \(k^{d}-1\) degrees of freedom. Again, it is advisable to have \(n/k^{d}\geq 5\).
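For \(d=2\), the statistic amounts to tallying non-overlapping pairs on a \(k\times k\) grid and comparing each cell count with its expectation \(n/k^{2}\). A minimal sketch:

```python
def serial_stat_2d(us, k):
    """Serial-test statistic chi^2(2) for non-overlapping pairs of us."""
    pairs = [(us[2 * i], us[2 * i + 1]) for i in range(len(us) // 2)]
    n = len(pairs)
    f = [[0] * k for _ in range(k)]
    for u, v in pairs:
        f[min(int(u * k), k - 1)][min(int(v * k), k - 1)] += 1
    e = n / k ** 2                     # expected count per cell
    return (k ** 2 / n) * sum((c - e) ** 2 for row in f for c in row)

# A perfectly even layout (each cell of a 2 x 2 grid hit once) gives 0
us = [0.25, 0.25, 0.25, 0.75, 0.75, 0.25, 0.75, 0.75]
stat = serial_stat_2d(us, k=2)
```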

Example. For \(d=2\), we test the null hypothesis that the pairs \((u_{1},u_{2}),(u_{3},u_{4}),\cdots ,(u_{2n-1},u_{2n})\) are i.i.d. random vectors distributed uniformly over the unit square. We use the generator in Example \ref{sch1ex8}, and generate \(32768\) pairs of \(u_{i}\)’s. We take \(k=64\) such that the degrees of freedom are again \(4095=64^{2}-1\) at the level \(\alpha =0.1\) with critical value \(4211.4\). The value of \(\chi^{2}(2)\) is \(4016.5\), indicating acceptable uniformity in two dimensions. For \(d=3\), we take \(k=16\) to keep the degrees of freedom as \(4095=16^{3}-1\) at the level \(\alpha =0.1\) with critical value \(4211.4\). Generate \(n=32768\) non-overlapping triples of \(u_{i}\)’s. The value of \(\chi^{2}(3)\) is \(4174.5\), again indicating acceptable uniformity in three dimensions. \(\sharp\)

Why should we care about this kind of uniformity in higher dimension? If the individual \(u_{i}\)’s are correlated, the distribution of the \(d\)-vectors \({\bf u}_{i}\) will deviate from \(d\)-dimensional uniformity. Therefore, the serial test provides an indirect check on the assumption that the individual \(u_{i}\)’s are independent. For example, if adjacent \(u_{i}\)’s tend to be positively correlated, the pairs \((u_{i},u_{i+1})\) will tend to cluster around the southwest-northeast diagonal in the unit square, and \(\chi^{2}\) should pick this up.

The third empirical test we consider, the runs test (or runs-up test), is a more direct test of the independence assumption. In fact, it is a test of independence only; that is, we are not testing for uniformity in particular. We examine the \(u_{i}\) sequence for unbroken subsequences of maximal length within which the \(u_{i}\)’s increase monotonically; such a subsequence is called a run up. For example, consider the following sequence

\[u_{1},u_{2},\cdots ,u_{10}:0.86, 0.11, 0.23, 0.03, 0.13, 0.06, 0.55, 0.64,0.87, 0.10.\]

The sequence starts with a run up of length \(1\) \((0.86)\), followed by a run up of length \(2\) \((0.11,0.23)\), then another run up of length \(2\) \((0.03,0.13)\), then a run up of length \(4\) \((0.06,0.55,0.64,0.87)\), and finally another run up of length \(1\) \((0.10)\). From a sequence of \(n\) \(u_{i}\)’s, we count the number of runs up of length \(1,2,3,4,5\), and \(\geq 6\), and then define

\[r_{i}=\left\{\begin{array}{ll}
\mbox{number of runs up of length \(i\)} & \mbox{for }i=1,2,3,4,5\\
\mbox{number of runs up of length \(\geq 6\)} & \mbox{for }i=6
\end{array}\right .\]

For the \(10\) \(u_{i}\)’s above, \(r_{1}=2\), \(r_{2}=2\), \(r_{3}=0\), \(r_{4}=1\), \(r_{5}=0\), and \(r_{6}=0\). The test statistic is then

\[R=\frac{1}{n}\sum_{i=1}^{6}\sum_{j=1}^{6} a_{ij}(r_{i}-nb_{i})(r_{j}-nb_{j})\]

where \(a_{ij}\) is the \((i,j)\)th element of the matrix

\[\left [\begin{array}{cccccc}
4529.4 & 9044.9 & 13568 & 18091 & 22615 & 27892\\
9044.9 & 18097 & 27139 & 36187 & 45234 & 55789\\
13568 & 27139 & 40721 & 54281 & 67852 & 83685\\
18091 & 36187 & 54281 & 72414 & 90470 & 111580\\
22615 & 45234 & 67852 & 90470 & 113262 & 139476\\
27892 & 55789 & 83685 & 111580 & 139476 & 172860
\end{array}\right ]\]

and the \(b_{i}\)’s are given by

\[(b_{1},b_{2},b_{3},b_{4},b_{5},b_{6})=\left (\frac{1}{6},\frac{5}{24},
\frac{11}{120},\frac{19}{720},\frac{29}{5040},\frac{1}{840}\right )\]

(see Knuth (1981, pp. 65-68) for a derivation of these constants). For large \(n\) (Knuth recommends \(n\geq 4000\)), \(R\) will have an approximate chi-square
distribution with \(6\) degrees of freedom under the null hypothesis that the \(u_{i}\)’s are i.i.d. random variables.
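The run counting above can be sketched with a small helper; the statistic \(R\) is then formed from these counts together with the \(a_{ij}\) matrix and the \(b_{i}\)’s:

```python
def runs_up_counts(us):
    """Return (r_1, ..., r_6): counts of runs up of length 1..5 and >= 6."""
    r = [0] * 6
    length = 1
    for i in range(1, len(us) + 1):
        if i == len(us) or us[i] <= us[i - 1]:   # the current run up ends here
            r[min(length, 6) - 1] += 1
            length = 1
        else:
            length += 1
    return tuple(r)

sample = [0.86, 0.11, 0.23, 0.03, 0.13, 0.06, 0.55, 0.64, 0.87, 0.10]
counts = runs_up_counts(sample)   # the ten u_i's discussed above
```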

Example. We use the generator in Example \ref{sch1ex8}, and generate \(n=5000\) \(u_{i}\)’s. We obtain \((r_{1},r_{2},\cdots ,r_{6})=(808,1026,448,139,43,4)\), leading to a value of \(R=9.3\). Since \(\chi^{2}_{6;0.1}=10.64\), we do not reject the hypothesis of independence at level \(\alpha =0.1\). \(\sharp\)

The runs-up test can be reversed in the obvious way to obtain a runs-down test; the \(a_{ij}\) and \(b_{i}\) constants are the same. Since runs tests look solely for independence (and not specifically for uniformity), it would probably be a good idea to apply a runs test before performing the chi-square or serial tests, since the latter two tests implicitly assume independence. Knuth feels that the runs test is more powerful than the chi-square tests, in the sense that many generators pass the chi-square and serial tests but fail the runs test. His explanation is that linear congruential generators with small multipliers and increments tend to generate longer runs than would be expected for truly i.i.d. \(u_{i}\)’s.


Numerical Computations Using Random Numbers.

First of all, we introduce the strong law of large numbers.

Theorem. (The Strong Law of Large Numbers). Let \(X_{1},X_{2},\cdots\) be a sequence of independent and identically distributed random variables having mean \(\mu\). Then, we have

\[\lim_{n\rightarrow\infty}\frac{X_{1}+X_{2}+\cdots +X_{n}}{n}=\mu\mbox{ with probability }1.\]

That is, with certainty, the long-run average of a sequence of independent and identically distributed random variables will converge to its mean. \(\sharp\)

Let \(g(x)\) be a function. Suppose that we want to compute the integral \(\theta\), where

\[\theta =\int_{0}^{1} g(x)dx.\]

To compute the value of \(\theta\), note that if \(U\) is uniformly distributed over \((0,1)\), then we can express \(\theta\) as \(\theta =\mathbb{E}[g(U)]\). If \(U_{1},U_{2},\cdots ,U_{k}\) are independent uniform \((0,1)\) random variables, it follows that the random variables \(g(U_{1}),g(U_{2}),\cdots ,g(U_{k})\) are independent and identically distributed random variables having mean \(\theta\). Therefore, using the strong law of large numbers, it follows, with probability \(1\),

\[\frac{g(U_{1})+g(U_{2})+\cdots +g(U_{k})}{k}\longrightarrow\mathbb{E}[g(U)]=\theta =\int_{0}^{1} g(x)dx\mbox{ as }k\rightarrow\infty .\]

Therefore, we can approximate \(\theta\) by generating a large number of random numbers \(u_{i}\) and taking as our approximation the average value of \(g(u_{i})\). This approach to approximating integrals is called the Monte Carlo approach.
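A minimal sketch of the Monte Carlo approach on \((0,1)\), using Python's built-in generator as the source of random numbers and \(g(x)=x^{2}\) (whose integral over \((0,1)\) is \(1/3\)) as an illustrative integrand:

```python
import random

def mc_integral(g, k=100_000, seed=1):
    """Monte Carlo estimate of the integral of g over (0, 1)."""
    rng = random.Random(seed)
    return sum(g(rng.random()) for _ in range(k)) / k

est = mc_integral(lambda x: x * x)   # true value is 1/3
```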

If we want to compute the integral

\[\theta =\int_{a}^{b} g(x)dx\]

then, by making the substitution \(y=(x-a)/(b-a)\) and \(dy=dx/(b-a)\), we have

\[\theta =\int_{0}^{1} g(a+(b-a)y)(b-a)dy\equiv\int_{0}^{1} h(y)dy\]

where \(h(y)=(b-a)g(a+(b-a)y)\). Thus we can approximate \(\theta\) by continually generating random numbers and then taking the average value of function \(h(y)\) evaluated at these random numbers. Similarly, if we want to compute the improper integral

\[\theta =\int_{0}^{\infty} g(x)dx\]

we could apply the substitution \(y=1/(x+1)\) and \(dy=-dx/(x+1)^{2}=-y^{2}dx\) to obtain the identity

\[\theta =\int_{0}^{1} h(y)dy\]

where

\[h(y)=\frac{g(1/y-1)}{y^{2}}.\]
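The substitution can be sketched directly; here \(g(x)=e^{-x}\), whose integral over \((0,\infty )\) is \(1\), serves as an illustrative integrand. We draw \(y=1-u\) so that \(y\) lies in \((0,1]\) and the division by \(y^{2}\) is safe:

```python
import math
import random

def mc_improper(g, k=200_000, seed=2):
    """Estimate the integral of g over (0, infinity) via y = 1/(x+1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(k):
        y = 1.0 - rng.random()          # y in (0, 1], avoiding y = 0
        total += g(1 / y - 1) / y ** 2  # h(y) = g(1/y - 1) / y^2
    return total / k

est = mc_improper(lambda x: math.exp(-x))   # true value is 1
```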

Suppose now that \(g\) is a function with an \(n\)-dimensional argument and that we are interested in computing

\[\theta =\int_{0}^{1}\int_{0}^{1}\cdots\int_{0}^{1}g(x_{1},\cdots ,x_{n})dx_{1}dx_{2}\cdots dx_{n}.\]

The key to the Monte Carlo approach to estimate \(\theta\) lies in the fact that \(\theta\) can be expressed as the following expectation

\[\theta =\mathbb{E}[g(U_{1},\cdots ,U_{n})]\]

where \(U_{1},\cdots ,U_{n}\) are independent uniform \(U(0,1)\) random variables. Therefore, if we generate \(k\) independent sets, each consisting of \(n\) independent uniform \(U(0,1)\) random variables

\[\begin{array}{c}
U_{11},\cdots ,U_{1n}\\
U_{21},\cdots ,U_{2n}\\
\vdots\\
U_{k1},\cdots ,U_{kn}
\end{array}\]

then, since the random variables \(g(U_{i1},\cdots ,U_{in})\) for \(i=1,\cdots ,k\) are all independent and identically distributed random variables with mean \(\theta\), we can estimate \(\theta\) by

\[\frac{\sum_{i=1}^{k} g(U_{i1},\cdots ,U_{in})}{k}.\]
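The multidimensional estimate is a direct extension of the one-dimensional sketch; here \(g(x_{1},x_{2})=x_{1}x_{2}\), whose integral over the unit square is \(1/4\), is used only for illustration:

```python
import random

def mc_multi(g, n, k=50_000, seed=3):
    """Monte Carlo estimate of the integral of g over the n-dimensional unit cube."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(k):
        # one independent set of n uniform (0,1) variates per term
        total += g([rng.random() for _ in range(n)])
    return total / k

est = mc_multi(lambda x: x[0] * x[1], n=2)   # true value is 1/4
```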

Example. (The Estimation of \(\pi\)) Suppose that the random vector \((X,Y)\) is uniformly distributed in the square of area \(4\) centered at the origin, where \(-1\leq X\leq 1\) and \(-1\leq Y\leq 1\). Let us consider the probability that this random point in the square is contained within the inscribed circle of radius \(1\). Note that since \((X,Y)\) is uniformly distributed in the square it follows that

\begin{align*} \mathbb{P}\{(X,Y)\mbox{ is in the circle}\} & =\mathbb{P}\{X^{2}+Y^{2}\leq 1\}\\ & =
\frac{\mbox{Area of the circle}}{\mbox{Area of the square}}=\frac{\pi}{4}.\end{align*}

Therefore, if we generate a large number of random points in the square, the proportion of points that fall within the circle will be approximately \(\pi /4\). Note that if \(X\) and \(Y\) are independent and both are uniformly distributed over \((-1,1)\), their joint density would be

\begin{align*} f(x,y) & =f_{X}(x)f_{Y}(y)\\ & =\frac{1}{2}\cdot\frac{1}{2}=\frac{1}{4}\end{align*}

for \(-1\leq x\leq 1\) and \(-1\leq y\leq 1\). Since the density function of \((X,Y)\) is constant in the square, it follows that \((X,Y)\) is uniformly distributed in the square. Now if \(U\) is uniformly distributed on \((0,1)\), then \(2U\) is uniformly distributed on \((0,2)\), and so \(2U-1\) is uniformly distributed on \((-1,1)\). Therefore, if we generate random numbers \(u_{1}\) and \(u_{2}\), set \(x=2u_{1}-1\) and \(y=2u_{2}-1\), and define

\[I=\left\{\begin{array}{ll}
1 & \mbox{if \(X^{2}+Y^{2}=(2U_{1}-1)^{2}+(2U_{2}-1)^{2}\leq 1\)}\\
0 & \mbox{otherwise}
\end{array}\right .\]

then

\[\mathbb{E}[I]=P\{X^{2}+Y^{2}\leq 1\}=\frac{\pi}{4}.\]

Therefore, we can estimate \(\pi /4\) by generating a large number of pairs of random numbers \(u_{1}\), \(u_{2}\) and estimating \(\pi /4\) by the fraction
of pairs for which

\[(2u_{1}-1)^{2}+(2u_{2}-1)^{2}\leq 1\]

(An unbiased estimator of \(\mathbb{E}[I]\) is \(\bar{I}=\frac{1}{n}\sum_{i=1}^{n} I_{i}\).) \(\sharp\)
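The whole procedure can be sketched in a few lines, again with Python's built-in generator standing in for the random number source:

```python
import math
import random

def estimate_pi(k=200_000, seed=4):
    """Estimate pi as 4 times the fraction of points inside the unit circle."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(k):
        x = 2 * rng.random() - 1      # uniform on (-1, 1)
        y = 2 * rng.random() - 1
        if x * x + y * y <= 1:
            inside += 1
    return 4 * inside / k

est = estimate_pi()
```

With \(k=200000\) pairs, the standard error of the estimate is roughly \(0.004\), so a few correct digits of \(\pi\) can be expected.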

 

Hsien-Chung Wu