Statistical Analysis of Simulated Data

We treat, in turn: terminating and non-terminating simulations, statistical analysis for terminating simulations, statistical analysis for steady-state parameters, and statistical analysis for steady-state cycle parameters.

\begin{equation}{\label{a}}\tag{A}\mbox{}\end{equation}

Terminating and Non-terminating Simulations.

A simulation study is usually undertaken to determine the value of some quantity \(\theta\) connected with a particular stochastic model. A simulation of the relevant system results in the output data \(X\), a random variable whose expected value is the quantity of interest \(\theta\). A second independent simulation, that is, a second simulation run, provides a new, independent random variable having mean \(\theta\). This continues until we have completed a total of \(k\) runs, yielding \(k\) independent random variables \(X_{1},\cdots ,X_{k}\), all of which are identically distributed with mean \(\theta\). The average of these \(k\) values, \(\bar{X}=\sum_{i=1}^{k} X_{i}/k\), is then used as an estimator, or approximator, of \(\theta\).

We now describe more precisely the random nature of simulation output. Let \(Y_{1},Y_{2},\cdots\) be an output stochastic process from a single simulation run. For example, \(Y_{i}\) might be the throughput (production) in the \(i\)th hour for a manufacturing system. The \(Y_{i}\)’s are random variables that will, in general, be neither independent nor identically distributed. Therefore, most classical statistical formulas, which assume independence, do not apply directly.

Let \(y_{11},y_{12},\cdots ,y_{1m}\) be a realization of the random variables \(Y_{1},Y_{2},\cdots ,Y_{m}\) resulting from making a simulation run of length \(m\) observations using the random numbers \(u_{11},u_{12},\cdots\) (the \(i\)th random number used in the \(j\)th run is denoted by \(u_{ji}\)). If we run the simulation with a different set of random numbers \(u_{21},u_{22},\cdots\), then we will obtain a different realization \(y_{21},y_{22},\cdots ,y_{2m}\) of the random variables \(Y_{1},Y_{2},\cdots ,Y_{m}\). (The two realizations are not the same, since the different random numbers used in the two runs produce different samples from the input probability distributions.) In general, suppose that we make \(n\) independent replications (runs) of the simulation of length \(m\), resulting in the observations

\[\begin{array}{ccccc}
y_{11}, & \cdots , & y_{1i}, & \cdots , & y_{1m}\\
y_{21}, & \cdots , & y_{2i}, & \cdots , & y_{2m}\\
\vdots && \vdots && \vdots\\
y_{n1}, & \cdots , & y_{ni}, & \cdots , & y_{nm}
\end{array}\]

The observations from a particular replication (row) are clearly not i.i.d. However, note that \(y_{1i},y_{2i},\cdots ,y_{ni}\) from the \(i\)th column are i.i.d. observations of the random variable \(Y_{i}\) for \(i=1,2,\cdots ,m\). Then, roughly speaking, the goal of output analysis is to use the observations \(y_{ji}\) to draw inferences about the random variables \(Y_{1},Y_{2},\cdots ,Y_{m}\). For example, \(\bar{y}_{i}=(1/n)\sum_{j=1}^{n}y_{ji}\) is an unbiased estimate of \(\mathbb{E}(Y_{i})\).
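
Viewed computationally, the observations form an \(n\times m\) matrix whose rows are independent replications and whose columns collect i.i.d. copies of each \(Y_{i}\). The following minimal Python/NumPy sketch fixes this picture; the exponential generator is merely an assumed stand-in for a real simulator (whose observations within a row would, in general, be correlated).

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
n, m = 10, 100        # n replications (rows), m observations per run (columns)

# Stand-in output process: any simulator producing one row per run would do.
# A real simulator's observations within a row are generally autocorrelated.
y = rng.exponential(scale=4.0, size=(n, m))

# Column i holds n i.i.d. observations of Y_i, so the column average
# ybar_i = (1/n) * sum_j y_ji is an unbiased estimate of E(Y_i).
ybar = y.mean(axis=0)
print(ybar[:5])
\end{verbatim}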

\begin{equation}{\label{sch6ex18}}\tag{1}\mbox{}\end{equation}

Example \ref{sch6ex18}. Consider a bank with five tellers and one queue, which opens its doors at 9 a.m. and closes them at 5 p.m., but stays open until all customers in the bank at 5 p.m. have been served. Assume that customers arrive in accordance with a Poisson process at rate \(1\) per minute (i.e., i.i.d. exponential interarrival times with mean \(1\) minute), that service times are i.i.d. exponential random variables with mean \(4\) minutes, and that the customers are served in a first-in-first-out manner. The following table shows several typical output statistics from \(10\) independent replications of a simulation of the bank, assuming that no customers are present initially.

\[\begin{array}{cccccc}
\hline \mbox{Replication} & \mbox{Number} & \mbox{Finish time} & \mbox{Average delay} & \mbox{Average queue} & \mbox{Proportion of}\\
& \mbox{served} & \mbox{(hours)} & \mbox{in queue (minutes)} & \mbox{length} & \mbox{customers delayed}\\
&&&&& \mbox{\(<5\) minutes}\\
\hline 1 & 484 & 8.12 & 1.53 & 1.52 & 0.917\\
2 & 475 & 8.14 & 1.66 & 1.62 & 0.916\\
3 & 484 & 8.19 & 1.24 & 1.23 & 0.952\\
4 & 483 & 8.03 & 2.34 & 2.34 & 0.822\\
5 & 455 & 8.03 & 2.00 & 1.89 & 0.840\\
6 & 461 & 8.32 & 1.69 & 1.56 & 0.866\\
7 & 451 & 8.09 & 2.69 & 2.50 & 0.783\\
8 & 486 & 8.19 & 2.86 & 2.83 & 0.782\\
9 & 502 & 8.15 & 1.70 & 1.74 & 0.873\\
10 & 475 & 8.24 & 2.60 & 2.50 & 0.779\\
\hline\end{array}\]

Note that results from various replications can be quite different. Thus, one run clearly does not produce the “answers”. \(\sharp\)

Consider the output stochastic process \(Y_{1},Y_{2},\cdots\). Let

\[F_{i}(y|I)=P\{Y_{i}\leq y|I\}\mbox{ for }i=1,2,\cdots ,\]

where \(y\) is a real number and \(I\) represents the initial conditions used to start the simulation at time zero. The conditional probability \(\mathbb{P}\{Y_{i}\leq y|I\}\) is the probability that the event \(\{Y_{i}\leq y\}\) occurs given the initial conditions \(I\). For a manufacturing system, \(I\) might specify the number of jobs present, and whether each machine is busy or idle, at time zero. We call \(F_{i}(y|I)\) the transient distribution of the output process at time \(i\) for initial conditions \(I\). Note that \(F_{i}(y|I)\) will, in general, be different for each value of \(i\) and each set of initial conditions \(I\). For fixed \(y\) and \(I\), the probabilities \(F_{1}(y|I),F_{2}(y|I),\cdots\) are just a sequence of numbers. If \(F_{i}(y|I)\rightarrow F(y)\) as \(i\rightarrow\infty\) for all \(y\) and for all initial conditions \(I\), then \(F(y)\) is called the steady-state distribution of the output process \(Y_{1},Y_{2},\cdots\). Strictly speaking, the steady-state distribution \(F(y)\) is only obtained in the limit as \(i\rightarrow\infty\). In practice, however, there will often be a finite time index, say \(k+1\), such that the distributions from this point on are approximately the same as one another. Note that steady state does not mean that the random variables \(Y_{k+1},Y_{k+2},\cdots\) will take on the same value in a particular simulation run; rather, it means that they will have approximately the same distribution. Furthermore, these random variables will not be independent, but will approximately constitute a covariance-stationary stochastic process. (A discrete-time stochastic process \(X_{1},X_{2},\cdots\) is said to be covariance stationary if \(\mu_{i}=\mu\) and \(\sigma_{i}^{2}=\sigma^{2}\) for \(i=1,2,\cdots\), and \(\mbox{Cov}(X_{i},X_{i+j})\) is independent of \(i\) for \(j=1,2,\cdots\). Thus, for a covariance-stationary process, the mean and variance are stationary over time, and the covariance between two random variables \(X_{i}\) and \(X_{i+j}\) depends only on the separation \(j\), sometimes called the lag, and not on the actual time values \(i\) and \(i+j\).) The steady-state distribution \(F(y)\) does not depend on the initial conditions \(I\); however, the rate of convergence of the transient distributions \(F_{i}(y|I)\) to \(F(y)\) does.

Simulations may be either terminating or non-terminating, depending on whether there is an obvious way of determining run length. Furthermore, measures of performance or parameters for non-terminating simulations may be of several types: steady-state parameters, steady-state cycle parameters, and other parameters.

A terminating simulation is one for which there is a “natural” event \(E\) that specifies the length of each run (replication). Since different runs use independent random numbers and the same initialization rule, this implies that comparable random variables from the different runs are i.i.d. The event \(E\) often occurs at a time point beyond which no useful information is obtained or at a time point when the system is “cleaned out”. It is specified before any runs are made, and the time of occurrence of \(E\) for a particular run may be a random variable. Since the initial conditions for a terminating simulation generally affect the desired measures of performance, these conditions should be representative of those for the actual system.

\begin{equation}{\label{sch6ex13}}\tag{2}\mbox{}\end{equation}

Example \ref{sch6ex13}. A retail/commercial establishment, e.g., a bank, closes each evening. If the establishment is open from 9 to 5, the objective of a simulation might be to estimate some measure of the quality of customer service over the period beginning at 9 a.m. and ending when the last customer who entered before the door closed at 5 p.m. has been served. In this case,

\[E=\{\mbox{at least \(8\) hours of simulated time have elapsed and the system is empty}\},\]

and the initial conditions for the simulation are the number of customers present at time zero. \(\sharp\)

\begin{equation}{\label{sch6ex14}}\tag{3}\mbox{}\end{equation}

Example \ref{sch6ex14}. Consider a military ground confrontation between a blue force and a red force. Relative to some initial force strengths, the goal of a simulation might be to determine the (final) force strengths when the battle ends. In this case,

\[E=\{\mbox{either the blue force or the red force has won the battle}\}.\]

An example of a condition that would end the battle is one side losing \(30\) percent of its force, since this side would no longer be considered viable. The choice of initial conditions, e.g., the number of troops and tanks for each force, for the simulation is generally not a problem here, since they are specified by the military scenario under consideration. \(\sharp\)

Example. An aerospace manufacturer receives a contract to produce \(100\) airplanes, which must be delivered within \(18\) months. The company would like to simulate various manufacturing configurations to see which one can meet the delivery deadline at least cost. In this case

\[E=\{\mbox{$100$ airplanes have been completed}\}.\]

Example. Consider a manufacturing company that operates \(16\) hours a day (two shifts) with work in process carrying over from one day to the next. Would this qualify as a terminating simulation with

\[E=\{\mbox{$16$ hours of simulated time have elapsed}\}?\]

No, since this manufacturing operation is essentially a continuous process, with the ending conditions for one day being the initial conditions for the next day. \(\sharp\)

\begin{equation}{\label{sch6ex15}}\tag{4}\mbox{}\end{equation}

Example \ref{sch6ex15}. A company that sells a single product would like to decide how many items to have in inventory during a planning horizon of \(120\) months. Given some initial inventory level, the objective might be to determine how much to order each month so as to minimize the expected average cost per month of operating the inventory system. In this case

\[E=\{120\mbox{ months have been simulated}\},\]

and the simulation is initialized with the current inventory level. \(\sharp\)

A non-terminating simulation is one for which there is no natural event \(E\) to specify the length of a run. A measure of performance for such a simulation is said to be a steady-state parameter if it is a characteristic of a steady-state distribution of some output stochastic process \(Y_{1},Y_{2},\cdots\). If the random variable \(Y\) has the steady-state distribution, then we might be interested in estimating the steady-state mean \(\mu =\mathbb{E}[Y]\).

\begin{equation}{\label{sch6ex7}}\tag{5}\mbox{}\end{equation}

Example \ref{sch6ex7}. Consider a company that is going to build a new manufacturing system and would like to determine the long-run (steady-state) mean hourly throughput of their system after it has been running long enough for the workers to know their jobs and for mechanical difficulties to have been worked out. Assume that:

  • the system will operate \(16\) hours a day \(5\) days a week.
  • there is negligible loss of production at the end of one shift or at the beginning of the next shift.
  • there are no breaks (e.g. lunch) that shut down production at specified times each day.

This system could be simulated by “pasting together” \(16\)-hour days, thus ignoring the system idle time at the end of each day and on the weekend. Let \(N_{i}\) be the number of parts manufactured in the \(i\)th hour. If the stochastic process \(N_{1},N_{2},\cdots\) has a steady-state distribution with corresponding random variable \(N\), then we are interested in estimating the mean \(\mu =\mathbb{E}(N)\). \(\sharp\)

It should be mentioned that the stochastic processes for most real systems do not have steady-state distributions, since the characteristics of the system change over time. For example, in a manufacturing system the production-scheduling rules and the facility layout (e.g., the number and location of machines) may change from time to time. On the other hand, a simulation model may have steady-state distributions, since the characteristics of the model are often assumed not to change over time. If, in Example \ref{sch6ex7}, the manufacturing company wanted to know the time required for the system to go from startup to operating in a “normal” manner, this would be a terminating simulation with terminating event

\[E=\{\mbox{simulated system is running “normally”}\}\mbox{ (if such can be defined)}.\]

Therefore, a simulation for a particular system might be either terminating or non-terminating, depending on the objectives of the simulation study.

Example. Consider a simulation model for a computer (or communication) system that does not currently exist. Since there are typically no representative data available on the arrival mechanism for jobs, it is common to assume that jobs arrive in accordance with a Poisson process with constant rate equal to the predicted arrival rate of jobs during the period of peak loading. (When the system is actually built, the arrival rate will vary as a function of time, and the period of peak loading may be relatively short.) Since the state of the system during “normal operation” is unknown, initial conditions must be chosen somewhat arbitrarily (e.g., no jobs present at time zero). The goal is then to run the simulation long enough that the arbitrary choice of initial conditions no longer has a significant effect on the estimated measures of performance (e.g., mean response time for a job). In performing the above steady-state analysis of the proposed computer system, we are essentially trying to determine how the system will respond to a peak load of infinite duration. If, however, the peak period in the actual system is short, or if the arrival rate before the peak period is considerably lower than the peak rate, our analysis may overestimate the congestion level during the peak period in the system. This might result in purchasing a computer configuration that is more powerful than actually needed. \(\sharp\)

Consider a stochastic process \(Y_{1},Y_{2},\cdots\) for a non-terminating simulation that does not have a steady-state distribution. Suppose that we divide the time axis into equal-length, contiguous time intervals called cycles. For example, in a manufacturing system a cycle might be an \(8\)-hour shift. Let \(Y_{i}^{C}\) be a random variable defined on the \(i\)th cycle, and assume that \(Y_{1}^{C},Y_{2}^{C},\cdots\) are comparable. Suppose that the process \(Y_{1}^{C},Y_{2}^{C},\cdots\) has a steady-state distribution \(F^{C}\) and that \(Y^{C}\) is a random variable with distribution function \(F^{C}\). Then a measure of performance is said to be a steady-state cycle parameter if it is a characteristic of \(Y^{C}\) such as the mean \(\mu^{C}=\mathbb{E}[Y^{C}]\). Thus, a steady-state cycle parameter is just a steady-state parameter of the appropriate cycle process \(Y_{1}^{C},Y_{2}^{C},\cdots\).

Example. Suppose for the manufacturing system in Example \ref{sch6ex7} that there is a half-hour lunch break at the beginning of the fifth hour in each \(8\)-hour shift. Then the process of hourly throughputs \(N_{1},N_{2},\cdots\) has no steady-state distribution. Let \(N_{i}^{C}\) be the average hourly throughput in the \(i\)th \(8\)-hour shift (cycle). Then, we might be interested in estimating the steady-state expected average hourly throughput over a cycle, \(\mu^{C}=\mathbb{E}(N^{C})\), which is a steady-state cycle parameter. \(\sharp\)

\begin{equation}{\label{sch6ex8}}\tag{6}\mbox{}\end{equation}

Example \ref{sch6ex8}. Consider a national long-distance telephone company for which one must dial a local phone number to gain access to the system. Suppose that the arrival rate of calls to the system varies with the time of day and day of the week, but assume that the pattern of arrival rates is identical
from week to week. Let \(D_{i}\) be the delay experienced by the \(i\)th arriving call between dialing the local phone number and actually gaining access to the system, at which time the desired long-distance number can be entered. The stochastic process \(D_{1},D_{2},\cdots\) does not have a steady-state distribution. Let \(D_{i}^{C}\) be the average delay over the \(i\)th week. Then we might be interested in estimating the steady-state expected average delay over a week, \(\mu^{C}=\mathbb{E}(D^{C})\). \(\sharp\)

For a non-terminating simulation, suppose that the stochastic process \(Y_{1},Y_{2},\cdots\) does not have a steady-state distribution, and that there is no appropriate cycle definition such that the corresponding process \(Y_{1}^{C},Y_{2}^{C},\cdots\) has a steady-state distribution. This can occur, for example, if the parameters for the model continue to change over time. In Example \ref{sch6ex8}, if the arrival rate of calls changes from week to week and from year to year, then steady-state cycle parameters will probably not be well defined. In such cases, however, there is typically a fixed amount of data describing how the model's parameters change over time. This provides, in effect, a terminating event \(E\) for the simulation and, thus, the analysis techniques for terminating simulations are appropriate. Measures of performance or parameters for such simulations, which usually change over time, are included in the category “other parameters”.

Example. Consider a manufacturing system for microcomputers consisting of an assembly line and a test area. There is a \(3\)-month build schedule available from marketing, which describes the types and numbers of computers to be produced each week. The schedule changes from week to week because of changing sales and the introduction of new computers. In this case, weekly or monthly throughputs do not have steady-state distributions. We therefore perform a terminating simulation of length \(3\) months and estimate the actual mean throughput for each week. \(\sharp\)

\begin{equation}{\label{b}}\tag{B}\mbox{}\end{equation}

Statistical Analysis for Terminating Simulations.

Suppose that \(X_{1},\cdots ,X_{n}\) are independent random variables having the same distribution. Let \(\mu\) and \(\sigma^{2}\) denote, respectively, their mean and variance. The quantity

\begin{equation}{\label{sch6eq23}}\tag{7}
\bar{X}=\sum_{i=1}^{n}\frac{X_{i}}{n}
\end{equation}

which is the arithmetic average of the \(n\) data values, is called the sample mean. When the population mean \(\mu\) is unknown, the sample mean is often used to estimate it. Because \(\mathbb{E}[\bar{X}]=\mu\), it follows that \(\bar{X}\) is an unbiased estimator of \(\mu\). To determine the “worth” of \(\bar{X}\) as an estimator of the population mean \(\mu\), we consider its mean square error, that is, the expected value of the squared difference between \(\bar{X}\) and \(\mu\). Now, we have

\begin{align*} \mathbb{E}[(\bar{X}-\mu )^{2}] & =\mbox{Var}(\bar{X})=\mbox{Var}\left (\frac{1}{n}\sum_{i=1}^{n}X_{i}\right )\\ & =\frac{1}{n^{2}}\sum_{i=1}^{n}\mbox{Var}(X_{i})=\frac{\sigma^{2}}{n}.\end{align*}

Therefore \(\bar{X}\), the sample mean of the \(n\) data values \(X_{1},\cdots ,X_{n}\), is a random variable with mean \(\mu\) and variance \(\sigma^{2}/n\). The quantity \(S^{2}\), defined by

\begin{equation}{\label{sch6eq24}}\tag{8}
S^{2}=\frac{1}{n-1}\sum_{i=1}^{n} (X_{i}-\bar{X})^{2}
\end{equation}

is called the sample variance. We can prove that \(\mathbb{E}[S^{2}]=\sigma^{2}\). Therefore \(S^{2}\) is an unbiased estimator of \(\sigma^{2}\). We use the sample variance \(S^{2}\) as the estimator of the population variance \(\sigma^{2}\), and we use \(S=\sqrt{S^{2}}\), the so-called sample standard deviation, as the estimator of \(\sigma\).

Suppose now that, as in a simulation, we have the option of continually generating additional data values \(X_{i}\). If our objective is to estimate the value of \(\theta =\mathbb{E}(X)\), when should we stop generating new data values? The answer is that we should first choose an acceptable value \(d\) for the standard deviation of our estimator. If \(d\) is the standard deviation of the estimator \(\bar{X}\), then we can, for example, be \(95\%\) certain that \(\bar{X}\) will not differ from \(\theta\) by more than \(1.96d\) (\(z_{0.025}=1.96\)). We should then continue to generate new data until we have generated \(n\) data values for which our estimate of \(\sigma /\sqrt{n}\), namely \(S/\sqrt{n}\), is less than the acceptable value \(d\). Since the sample standard deviation may not be a particularly good estimate of \(\sigma\) (nor may the normal approximation be valid) when the sample size is small, we recommend the following procedure to determine when to stop generating new data values.

  1. Choose an acceptable value \(d\) for the standard deviation of the estimator.
  2. Generate at least \(30\) data values.
  3. Continue to generate additional data values, stopping when you have generated \(k\) values and \(S/\sqrt{k}<d\), where \(S\) is the sample standard deviation based on those \(k\) values.
  4. The estimate of \(\theta\) is given by \(\bar{X}=\sum_{i=1}^{k} X_{i}/k\).
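
This stopping rule translates directly into code. Below is a minimal Python sketch; \texttt{generate\_value} is a hypothetical stand-in for a single simulation run producing one output value.

\begin{verbatim}
import math
import random
import statistics

def estimate_mean(generate_value, d, n_min=30):
    data = [generate_value() for _ in range(n_min)]            # steps 1-2
    while statistics.stdev(data) / math.sqrt(len(data)) >= d:  # step 3
        data.append(generate_value())
    return statistics.fmean(data), len(data)                   # step 4

# Usage with an assumed toy "simulation" (exponential output, mean 1):
theta_hat, k = estimate_mean(lambda: random.expovariate(1.0), d=0.05)
print(theta_hat, k)
\end{verbatim}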

In order to use the above technique for determining when to stop generating new values, it would be valuable to have a method for recursively computing the successive sample means and sample variances, rather than having to recompute them from scratch each time a new data value is generated. We now show how this can be done. Consider the sequence of data values \(X_{1},\cdots ,X_{n}\). Let

\[\bar{X}_{j}=\sum_{i=1}^{j}\frac{X_{i}}{j}\]

and

\[S_{j}^{2}=\frac{1}{j-1}\sum_{i=1}^{j} (X_{i}-\bar{X}_{j})^{2}\]

for \(j\geq 2\) denote, respectively, the sample mean and sample variance of the first \(j\) data values. The following recursion should be used to successively compute the current value of the sample mean and sample variance. With \(S_{1}^{2}=0\) and \(\bar{X}_{0}=0\), we have

\[\bar{X}_{j+1}=\bar{X}_{j}+\frac{X_{j+1}-\bar{X}_{j}}{j+1}\]

and

\[S_{j+1}^{2}=\left (1-\frac{1}{j}\right )S_{j}^{2}+(j+1)(\bar{X}_{j+1}-\bar{X}_{j})^{2}.\]
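
These recursions are easy to check numerically. The Python sketch below transcribes them verbatim, with \(\bar{X}_{0}=0\) and \(S_{1}^{2}=0\), and verifies the result against a direct computation.

\begin{verbatim}
import statistics

def update(j, xbar_j, s2_j, x_next):
    """Fold X_{j+1} into the sample mean/variance of the first j values."""
    xbar_next = xbar_j + (x_next - xbar_j) / (j + 1)
    if j == 0:
        return xbar_next, 0.0            # S_1^2 = 0 by convention
    s2_next = (1 - 1 / j) * s2_j + (j + 1) * (xbar_next - xbar_j) ** 2
    return xbar_next, s2_next

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
xbar, s2 = 0.0, 0.0
for j, x in enumerate(data):
    xbar, s2 = update(j, xbar, s2, x)

assert abs(xbar - statistics.fmean(data)) < 1e-9
assert abs(s2 - statistics.variance(data)) < 1e-9
\end{verbatim}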

\begin{equation}{\label{sch6ex5}}\tag{9}\mbox{}\end{equation}

Example \ref{sch6ex5}. Consider a service system in which no new customers are allowed to enter after 5 p.m. Suppose that each day follows the same probability law and that we are interested in estimating the expected time at which the last customer departs the system. Furthermore, suppose we want to be at least \(95\%\) certain that our estimated answer will not differ from the true value by more than \(15\) seconds. To satisfy this requirement it is necessary that we continually generate data values relating to the time at which the last customer departs (each value obtained from a new simulation run) until we have generated a total of \(k\) values, where \(k\) is at least \(30\) and is such that \(1.96S/\sqrt{k}<15\), where \(S\) is the sample standard deviation, measured in seconds, of these \(k\) data values. Our estimate of the expected time at which the last customer departs will be the average of the \(k\) data values. \(\sharp\)

A point estimate alone gives no indication of its precision, so it is often more valuable to specify an interval within which we have a certain degree of confidence that \(\theta\) lies. To obtain such an interval, we need the (approximate) distribution of the estimator \(\bar{X}\). To determine this, first recall that

\[\mathbb{E}(\bar{X})=\mu\mbox{ and }\mbox{Var}(\bar{X})=\frac{\sigma^{2}}{n}\]

From the central limit theorem, it follows that, for large \(n\), \(\frac{\sqrt{n}(\bar{X}-\mu )}{\sigma}\) is approximately distributed as \(N(0,1)\). In addition, if we replace the unknown standard deviation \(\sigma\) by its estimator \(S\), the sample standard deviation, it still remains the case (by a result known as Slutsky’s theorem) that the resulting quantity is approximately \(N(0,1)\). That is, when \(n\) is large, \(\frac{\sqrt{n}(\bar{X}-\mu )}{S}\) is approximately distributed as \(N(0,1)\). Now, for any \(\alpha\), \(0<\alpha <1\), let \(z_{\alpha}\) satisfy \(\mathbb{P}\{Z\geq z_{\alpha}\}=\alpha\), where \(Z\) is \(N(0,1)\). It follows from the symmetry of the density function of \(N(0,1)\) about the origin that \(z_{1-\alpha}\) is equal to \(-z_{\alpha}\). Therefore, we have

\[\mathbb{P}\{-z_{\alpha /2}\leq Z\leq z_{\alpha /2}\}=1-\alpha .\]

It follows

\[\mathbb{P}\left\{-z_{\alpha /2}\leq\frac{\sqrt{n}(\bar{X}-\mu )}{S}\leq z_{\alpha /2}\right\}\approx 1-\alpha\]

or equivalently

\[\mathbb{P}\left\{\bar{X}-z_{\alpha /2}\frac{S}{\sqrt{n}}\leq\theta\leq\bar{X}+z_{\alpha /2}\frac{S}{\sqrt{n}}\right\}\approx 1-\alpha .\]

In other words, with probability approximately \(1-\alpha\) the population mean \(\theta\) will lie within the region \(\bar{X}\pm z_{\alpha /2}S/\sqrt{n}\). If the observed values of the sample mean and the sample standard deviation are \(\bar{X}=\bar{x}\) and \(S=s\), we call the interval \(\bar{x}\pm z_{\alpha /2} s/\sqrt{n}\) an (approximate) \(100(1-\alpha )\) percent confidence interval for \(\theta\).

Consider now the case, as in a simulation study, where additional data values can be generated and the question is when to stop generating new data values. One solution is to initially choose values \(\alpha\) and \(l\) and to continue generating data until the length of the approximate \(100(1-\alpha )\) percent confidence interval estimate of \(\theta\) is less than \(l\). Since the length of this interval will be \(2z_{\alpha /2}S/\sqrt{n}\), we can accomplish this by the following technique.

  1. Generate at least \(30\) data values.
  2. Continue to generate additional data values, stopping when the number of values you have generated, call it \(k\), is such that \(2z_{\alpha /2} S/\sqrt{k}<l\), where \(S\) is the sample standard deviation based on those \(k\) values.
  3. If \(\bar{x}\) and \(s\) are the observed values of \(\bar{X}\) and \(S\), then the \(100(1-\alpha )\) percent confidence interval estimate of \(\theta\), whose length is less than \(l\), is \(\bar{x}\pm z_{\alpha /2} s/\sqrt{k}\).
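
This technique parallels the earlier sketch, with the stopping test replaced by the confidence-interval length \(2z_{\alpha /2}S/\sqrt{k}<l\). A minimal Python version, again with a caller-supplied simulation run standing in as an assumption:

\begin{verbatim}
import math
import statistics

def estimate_with_ci(generate_value, l, alpha=0.1, n_min=30):
    z = statistics.NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    data = [generate_value() for _ in range(n_min)]     # step 1
    while 2 * z * statistics.stdev(data) / math.sqrt(len(data)) >= l:
        data.append(generate_value())                   # step 2
    xbar = statistics.fmean(data)
    half = z * statistics.stdev(data) / math.sqrt(len(data))
    return xbar, (xbar - half, xbar + half)             # step 3
\end{verbatim}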

Suppose that we make \(n\) independent replications of a terminating simulation, where each replication is terminated by the event \(E\) and is begun with
the “same” initial conditions. The independence of replications is accomplished by using different random numbers for each replication. Let \(X_{j}\) be a random variable defined on the \(j\)th replication for \(j=1,2,\cdots ,n\); it is assumed that the \(X_{j}\)’s are comparable for different replications. Then the \(X_{j}\)’s are i.i.d. random variables. For the bank of Examples \ref{sch6ex18} and \ref{sch6ex13}, \(X_{j}\) might be the average delay \(\sum_{i=1}^{N} D_{i}/N\) over a day from the \(j\)th replication, where \(N\), a random variable, is the number of customers served in a day. For the combat model of Example \ref{sch6ex14}, \(X_{j}\) might be the number of red tanks destroyed on the \(j\)th replication. Finally, for the inventory system of Example \ref{sch6ex15}, \(X_{j}\) could be the average cost \(\sum_{i=1}^{120} C_{i}/120\) from the \(j\)th replication.

Suppose that we would like to obtain a point estimate and confidence interval for the mean \(\mu =\mathbb{E}[X]\), where \(X\) is a random variable defined on a replication as described above. Make \(n\) independent replications of the simulation and let \(X_{1},X_{2},\cdots ,X_{n}\) be the resulting i.i.d. random variables. Then, by substituting the \(X_{j}\)’s into (\ref{sch6eq23}), we get that \(\bar{X}\) is an unbiased point estimator for \(\mu\). If the sample size is small and the standard deviation is unknown, then an approximate \(100(1-\alpha )\%\) confidence interval for \(\mu\) is given by

\begin{equation}{\label{sch6eq27}}\tag{10}
\bar{X}\pm t_{n-1;\alpha /2}\sqrt{\frac{S^{2}}{n}}
\end{equation}

where the sample variance is given by (\ref{sch6eq24}). We will call the confidence interval based on (\ref{sch6eq27}) the fixed-sample-size procedure.

\begin{equation}{\label{shc6ex12}}\tag{11}\mbox{}\end{equation}

Example \ref{shc6ex12}. For the bank of Example \ref{sch6ex18}, suppose that we want to obtain a point estimate and an approximate \(90\%\) confidence interval for the expected average delay of a customer over a day, which is given by

\[\mathbb{E}(X)=\mathbb{E}\left (\frac{\sum_{i=1}^{N} D_{i}}{N}\right ).\]

Note that we estimate the expected average delay, since each delay has, in general, a different mean. From the \(10\) replications given in the table we obtain \(\bar{X}=2.03\), \(S^{2}=0.31\) and

\[\bar{X}\pm t_{9;0.05}\sqrt{\frac{S^{2}}{10}}=2.03\pm 0.32.\]

Therefore, subject to the correct interpretation to be given to confidence intervals, we can claim with approximately \(90\%\) confidence that \(\mathbb{E}(X)\) is contained in the interval \([1.71,2.35]\) minutes. Suppose that we would like to obtain a point estimate and an approximate \(90\%\) confidence interval for the expected proportion of customers with a delay less than \(5\) minutes over a day, which is given by

\[\mathbb{E}(X)=\mathbb{E}\left (\frac{\sum_{i=1}^{N} I_{i}(0,5)}{N}\right ),\]

where the indicator function \(I_{i}(0,5)\) is defined as

\[I_{i}(0,5)=\left\{\begin{array}{ll}
1 & \mbox{if \(D_{i}<5\)}\\
0 & \mbox{otherwise}
\end{array}\right .\]

for \(i=1,2,\cdots ,N\). From the last column of the table, we obtain \(\bar{X}=0.853\) and \(S^{2}=0.004\), and the \(90\%\) confidence interval is \(0.853\pm 0.036\), or \([0.817,0.889]\). \(\sharp\)
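
Both intervals can be reproduced directly from the table, assuming SciPy is available for the \(t\) quantile (\(t_{9;0.05}\) is the upper \(5\%\) critical point, \texttt{stats.t.ppf(0.95, 9)}):

\begin{verbatim}
import numpy as np
from scipy import stats

delays = np.array([1.53, 1.66, 1.24, 2.34, 2.00,
                   1.69, 2.69, 2.86, 1.70, 2.60])
props = np.array([0.917, 0.916, 0.952, 0.822, 0.840,
                  0.866, 0.783, 0.782, 0.873, 0.779])

t9 = stats.t.ppf(0.95, df=9)              # t_{9;0.05} ~ 1.833
for x in (delays, props):
    xbar, s2 = x.mean(), x.var(ddof=1)    # sample mean and variance
    half = t9 * np.sqrt(s2 / len(x))
    print(f"{xbar:.3f} +/- {half:.3f}")   # 2.031 +/- 0.322, 0.853 +/- 0.036
\end{verbatim}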

One disadvantage of the fixed-sample-size procedure based on \(n\) replications is that the analyst has no control over the confidence interval half-length; for fixed \(n\), the half-length will depend on \(\mbox{Var}(X)\). In what follows we discuss procedures for determining the number of replications required to estimate the mean \(\mu =\mathbb{E}(X)\) with a specified error or precision. We begin by defining two ways of measuring the error in the estimator \(\bar{X}\). If the estimate \(\bar{X}\) is such that \(|\bar{X}-\mu |=\beta\), then we say that \(\bar{X}\) has an absolute error of \(\beta\). If we make replications of a simulation until the half-length of the \(100(1-\alpha )\%\) confidence interval given by (\ref{sch6eq27}) is less than or equal to \(\beta >0\), then

\begin{align*}
1-\alpha & \approx\mathbb{P}\{\bar{X}-\mbox{half-length}\leq\mu\leq\bar{X}+\mbox{half-length}\}\\
& =\mathbb{P}\{|\bar{X}-\mu |\leq\mbox{half-length}\}\\
& \leq\mathbb{P}\{|\bar{X}-\mu |\leq\beta\}\mbox{ (since half-length$\leq\beta$)}
\end{align*}

Therefore \(\bar{X}\) has an absolute error of at most \(\beta\) with probability approximately \(1-\alpha\). In other words, if we construct \(100\) independent \(90\%\) confidence intervals using the above stopping rule, we would expect \(\bar{X}\) to have an absolute error of at most \(\beta\) in about \(90\) out of the \(100\) cases; in about \(10\) cases the absolute error would be greater than \(\beta\). Suppose that we have constructed a confidence interval for \(\mu\) based on a fixed number of replications \(n\). If we assume that our estimate \(S^{2}(n)\) of the variance \(\mbox{Var}(X)\) will not change as the number of replications increases, an approximate expression for the total number of replications, \(n_{a}^{*}(\beta )\), required to obtain an absolute error of \(\beta\) is given by

\begin{equation}{\label{sch6eq28}}\tag{12}
n_{a}^{*}(\beta )=\min\left\{i\geq n:t_{i-1;\alpha /2}\cdot\sqrt{\frac{S^{2}(n)}{i}}\leq\beta\right\}
\end{equation}

We can determine \(n_{a}^{*}(\beta )\) by iteratively increasing \(i\) by \(1\) until a value of \(i\) is obtained for which

\[t_{i-1;\alpha /2}\cdot\sqrt{S^{2}(n)/i}\leq\beta.\]

Alternatively, \(n_{a}^{*}(\beta )\) can be approximated as the smallest integer satisfying

\[i\geq S^{2}(n)\cdot\left (\frac{z_{\alpha /2}}{\beta}\right )^{2}.\]

If \(n_{a}^{*}(\beta )>n\) and if we make \(n_{a}^{*}(\beta )-n\) additional replications of simulation, then the estimate \(\bar{X}\) based on all \(n_{a}^{*}(\beta )\) replications should have an absolute error of approximately \(\beta\). The accuracy of (\ref{sch6eq28}) depends on how close the variance estimate is to \(\mbox{Var}(X)\).

Example. For the bank of Example \ref{sch6ex18}, suppose that we would like to estimate the expected average delay with an absolute error of \(0.25\) minutes and a confidence level of \(90\%\). From the available replications, we get

\[n_{a}^{*}(0.25)=\min\left\{i\geq 10:t_{i-1;0.05}\sqrt{\frac{0.31}{i}}\leq0.25\right\}=16.\]
Therefore we need \(6\) additional replications of simulation. \(\sharp\)
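
The iteration in (\ref{sch6eq28}) is a short loop. A Python sketch reproducing the value \(16\) of this example:

\begin{verbatim}
from math import sqrt
from scipy import stats

def n_star_abs(n, s2, beta, alpha=0.1):
    """Smallest i >= n with t_{i-1;alpha/2} * sqrt(s2/i) <= beta."""
    i = n
    while stats.t.ppf(1 - alpha / 2, df=i - 1) * sqrt(s2 / i) > beta:
        i += 1
    return i

print(n_star_abs(n=10, s2=0.31, beta=0.25))   # -> 16
\end{verbatim}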

We now discuss another way of measuring the error in \(\bar{X}\). If the estimate \(\bar{X}\) is such that

\[\left |\frac{\bar{X}-\mu}{\mu}\right |=\gamma ,\]

then we say that \(\bar{X}\) has a relative error of \(\gamma\), or that the percentage error in \(\bar{X}\) is \(100\gamma\%\). Suppose that we make replications of a simulation until the half-length of the confidence interval given in (\ref{sch6eq27}), divided by \(|\bar{X}|\), is less than or equal to \(\gamma\). This ratio is an estimate of the actual relative error. Then, we have

\begin{align*}
1-\alpha & \approx\mathbb{P}\left\{\frac{|\bar{X}-\mu |}{|\bar{X}|}\leq\frac{\mbox{half-length}}{|\bar{X}|}\right\}\\
& \leq\mathbb{P}\{|\bar{X}-\mu |\leq\gamma |\bar{X}|\}\mbox{ (since \(\frac{\mbox{half-length}}{|\bar{X}|}\leq\gamma\))}\\
& =\mathbb{P}\{|\bar{X}-\mu |\leq\gamma |\bar{X}-\mu +\mu |\}\\
& \leq\mathbb{P}\{|\bar{X}-\mu |\leq\gamma (|\bar{X}-\mu |+|\mu |)\}\mbox{ (using triangle inequality)}\\
& =\mathbb{P}\{(1-\gamma )|\bar{X}-\mu |\leq\gamma |\mu |\}\\
& =\mathbb{P}\left\{\frac{|\bar{X}-\mu |}{|\mu |}\leq\frac{\gamma}{1-\gamma}\right\}
\end{align*}

Therefore \(\bar{X}\) has a relative error of at most \(\gamma /(1-\gamma )\) with a probability of approximately \(1-\alpha\). In other words, if we construct \(100\) independent \(90\%\) confidence intervals using the above stopping rule, we would expect \(\bar{X}\) to have a relative error of at most \(\gamma /(1-\gamma )\) in about \(90\) out of the \(100\) cases; in about \(10\) cases the relative error would be greater than \(\gamma /(1-\gamma )\). Note that we get a relative error of \(\gamma /(1-\gamma )\) rather than the desired \(\gamma\), since we estimate \(|\mu |\) by \(|\bar{X}|\). Suppose once again that we have constructed a confidence interval for \(\mu\) based on a fixed number of replications \(n\). If we assume that our estimates of both the sample mean and sample variance will not change as the number of replications increases, an approximate expression for the number of replications, \(n_{r}^{*}(\gamma )\), required to obtain a relative error of \(\gamma\) is given by

\begin{equation}{\label{sch6eq29}}\tag{13}
n_{r}^{*}(\gamma )=\min\left\{i\geq n:\frac{t_{i-1;\alpha /2}\cdot\sqrt{S^{2}(n)/i}}{|\bar{X}(n)|}\leq\gamma^{\prime}=\frac{\gamma}{1+\gamma}\right\}
\end{equation}

where \(\gamma^{\prime}\) is the “adjusted” relative error needed to get an actual relative error of \(\gamma\). Again, \(n_{r}^{*}(\gamma )\) is approximated as the smallest integer \(i\) satisfying

\[i\geq S^{2}(n)\cdot\left (\frac{z_{\alpha /2}}{\gamma^{\prime}\bar{X}(n)}\right )^{2}.\]

If \(n_{r}^{*}(\gamma )>n\) and if we make \(n_{r}^{*}(\gamma )-n\) additional replications of the simulation, then the estimate \(\bar{X}\) based on all \(n_{r}^{*}(\gamma )\) replications should have a relative error of approximately \(\gamma\).

\begin{equation}{\label{sch6ex19}}\tag{14}\mbox{}\end{equation}

Example \ref{sch6ex19}. For the bank of Example \ref{sch6ex18}, suppose that we would like to estimate the expected average delay with a relative error of \(0.1\) and a confidence level of \(90\%\). From the \(10\) available replications, we get

\[n_{r}^{*}(0.1)=\min\left\{i\geq 10:\frac{t_{i-1;0.05}\cdot\sqrt{0.31/i}}{2.03}\leq 0.09\right\}=27\]

where \(\gamma^{\prime}=0.1/(1+0.1)=0.09\). \(\sharp\)
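
Equation (\ref{sch6eq29}) can be evaluated the same way; the Python sketch below reproduces the value \(27\), with \(\gamma^{\prime}=\gamma /(1+\gamma )\):

\begin{verbatim}
from math import sqrt
from scipy import stats

def n_star_rel(n, xbar, s2, gamma, alpha=0.1):
    g = gamma / (1 + gamma)   # adjusted relative error gamma'
    i = n
    while stats.t.ppf(1 - alpha/2, df=i-1) * sqrt(s2/i) / abs(xbar) > g:
        i += 1
    return i

print(n_star_rel(n=10, xbar=2.03, s2=0.31, gamma=0.1))   # -> 27
\end{verbatim}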

The difficulty with using (\ref{sch6eq29}) directly to obtain an estimate \(\bar{X}\) with a relative error of \(\gamma\) is that \(\bar{X}(n)\) and \(S^{2}(n)\) may not be precise estimates of their corresponding population parameters. If \(n_{r}^{*}(\gamma )\) is greater than the number of replications actually required, then a significant number of unnecessary replications may be made, resulting in a waste of computer resources. Conversely, if \(n_{r}^{*}(\gamma )\) is too small, then an estimate \(\bar{X}\) based on \(n_{r}^{*}(\gamma )\) replications may not be as precise as we think. We now present a sequential procedure (new replications are added one at a time) for obtaining an estimate of \(\mu\) with a specified relative error that takes only as many replications as are actually needed. The procedure assumes that \(X_{1},X_{2},\cdots\) is a sequence of i.i.d. random variables that need not be normal. The specific objective of the procedure is to obtain an estimate of \(\mu\) with a relative error of \(\gamma\) and a confidence level of \(100(1-\alpha )\%\). Choose an initial number of replications \(n_{0}\geq 2\), and let

\[\delta (n,\alpha )=t_{n-1;\alpha /2}\cdot\sqrt{\frac{S^{2}(n)}{n}}\]

be the usual confidence interval half-length. Then the sequential procedure is as follows:

  1. Step 1. Make \(n_{0}\) replications of the simulation and set \(n=n_{0}\).
  2. Step 2. Compute \(\bar{X}(n)\) and \(\delta (n,\alpha )\) from \(X_{1},X_{2},\cdots ,X_{n}\).
  3. Step 3. If \(\delta (n,\alpha )/|\bar{X}(n)|\leq\gamma^{\prime}\), use \(\bar{X}(n)\) as the point estimate for \(\mu\) and stop. Equivalently, \[I(\alpha ,\gamma )=[\bar{X}(n)-\delta (n,\alpha ),\bar{X}(n)+\delta (n,\alpha )]\] is an approximate \(100(1-\alpha )\%\) confidence interval for \(\mu\) with the desired precision. Otherwise, replace \(n\) by \(n+1\), make an additional replication of the simulation, and go to step 2.

Note that the procedure computes a new estimate of \(\mbox{Var}(X)\) after each replication is obtained, and that the total number of replications required by the procedure is a random variable.
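
A compact Python sketch of the sequential procedure follows; \texttt{replicate} is a hypothetical stand-in for one independent replication of the simulation.

\begin{verbatim}
import math
import statistics
from scipy import stats

def sequential_estimate(replicate, gamma, alpha=0.1, n0=10):
    g = gamma / (1 + gamma)                  # adjusted relative error gamma'
    data = [replicate() for _ in range(n0)]  # step 1
    while True:
        n = len(data)                        # step 2: X_bar(n), delta(n, alpha)
        xbar = statistics.fmean(data)
        delta = stats.t.ppf(1 - alpha / 2, n - 1) * math.sqrt(
            statistics.variance(data) / n)
        if delta / abs(xbar) <= g:           # step 3: stop, or ...
            return xbar, (xbar - delta, xbar + delta), n
        data.append(replicate())             # ... add one more replication
\end{verbatim}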

Example. For the bank of Example \ref{sch6ex18}, suppose that we would like to obtain an estimate of the expected average delay with a relative error of \(\gamma =0.1\) and a confidence level of \(90\%\). Using the previous \(n_{0}=10\) replications as a starting point, we obtain: number of replications at termination \(=74\), \(\bar{X}(74)=1.76\), \(S^{2}(74)=0.67\), and \(90\%\) confidence interval \([1.60,1.92]\). Note that the number of replications actually required, \(74\), is considerably larger than the \(27\) predicted in Example \ref{sch6ex19}, due mostly to the imprecise variance estimate based on \(10\) replications. \(\sharp\)

We now make our recommendations on the use of the fixed-sample-size and sequential procedures for terminating simulations. If one is performing an exploratory experiment where the precision of the confidence interval may not be overwhelmingly important, we recommend using the fixed-sample-size procedure. However, if the \(X_{j}\)’s are highly non-normal and the number of replications \(n\) is too small, the actual coverage of the constructed confidence interval may be somewhat lower than desired. From an exploratory experiment consisting of \(n\) replications, one can estimate the cost per replication and the sample variance of the \(X_{j}\)’s, and then obtain from (\ref{sch6eq28}) a rough estimate of the number of replications, \(n_{a}^{*}(\beta )\), required to estimate \(\mu\) with a desired absolute error \(\beta\). Alternatively, one can obtain from (\ref{sch6eq29}) a rough estimate of the number of replications, \(n_{r}^{*}(\gamma )\), required to estimate \(\mu\) with a desired relative error \(\gamma\). Sometimes the choice of \(\beta\) or \(\gamma\) may have to be tempered by the cost associated with the required number of replications. If it is finally decided to construct a confidence interval with a small relative error \(\gamma\), we recommend use of the sequential procedure with \(\gamma\leq 0.15\) and \(n_{0}\geq 10\). If one wants a confidence interval with a relative error \(\gamma\) greater than \(0.15\), we recommend several successive applications of the fixed-sample-size approach. In particular, one might estimate \(n_{r}^{*}(\gamma )\), collect, say, \((n_{r}^{*}(\gamma )-n)/2\) more replications, and then use (\ref{sch6eq27}) to construct a confidence interval based on the existing \((n+n_{r}^{*}(\gamma ))/2\) replications. If the estimated relative error of the resulting confidence interval is still greater than \(\gamma^{\prime}\), then \(n_{r}^{*}(\gamma )\) can be re-estimated based on a new variance estimate, and some portion of the necessary additional replications may be collected, and so on. To construct a confidence interval with a small absolute error \(\beta\), we once again recommend several successive applications of the fixed-sample-size approach.

\begin{equation}{\label{c}}\tag{C}\mbox{}\end{equation}

Statistical Analysis for Steady-State Parameters.

Let \(Y_{1},Y_{2},\cdots\) be an output stochastic process from a single run of a non-terminating simulation. Suppose that

\[\mathbb{P}\{Y_{i}\leq y\}=F_{i}(y) \rightarrow F(y)=\mathbb{P}\{Y\leq y\}\]

as \(i\rightarrow\infty\). (We have suppressed in our notation the dependence of \(F_{i}\) on the initial conditions \(I\).) Then \(\theta\) is a steady-state parameter if it is a characteristic of \(Y\) such as \(\mathbb{E}[Y]\), \(\mathbb{P}\{Y\leq y\}\), or a quantile of \(Y\). One difficulty in estimating \(\theta\) is that the distribution function of \(Y_{i}\), for \(i=1,2,\cdots\), is different from \(F\), since it will generally not be possible to choose \(I\) to be representative of “steady-state behavior”. This causes an estimator of \(\theta\) based on the observations \(Y_{1},Y_{2},\cdots ,Y_{m}\) not to be “representative”. For example, the sample mean \(\bar{Y}(m)\) will be a biased estimator of \(\mu =\mathbb{E}(Y)\) for all finite values of \(m\). The problem we have just described is called the problem of the initial transient, or the startup problem, in the simulation literature. Suppose that we want to estimate the steady-state mean \(\mu =\mathbb{E}(Y)\), which is also generally defined by

\[\mu =\lim_{i\rightarrow\infty}\mathbb{E}(Y_{i}).\]

Therefore, the transient means converge to the steady-state mean. The most serious consequence of the problem of the initial transient is probably that \(\mathbb{E}(\bar{Y}(m))\neq\mu\) for any \(m\). The technique most often suggested for dealing with this problem is called warming up the model or initial-data deletion. The idea is to delete some number of observations from the beginning of a run and to use only the remaining observations to estimate \(\mu\). For example, given the observations \(Y_{1},Y_{2},\cdots ,Y_{m}\), it is often suggested to use

\[\bar{Y}(m,l)=\frac{\sum_{i=l+1}^{m} Y_{i}}{m-l}\]

rather than \(\bar{Y}(m)\) as an estimator of \(\mu\). In general, one would expect \(\bar{Y}(m,l)\) to be less biased than \(\bar{Y}(m)\), since the observations near the “beginning” of the simulation may not be very representative of steady-state behavior due to the choice of initial conditions. The question naturally arises as to how to choose the warm-up period \(l\). We would like to pick \(l\) and \(m\) such that \(\mathbb{E}[\bar{Y}(m,l)]\approx\mu\). If \(l\) and \(m\) are chosen too small, then \(\mathbb{E}[\bar{Y}(m,l)]\) may be significantly different from \(\mu\). On the other hand, if \(l\) is chosen larger than necessary, then \(\bar{Y}(m,l)\) will probably have an unnecessarily large variance. The simplest and most general technique for determining \(l\) is a graphical procedure due to Welch (1981, 1983). Its specific goal is to determine a time index \(l\) satisfying \(\mathbb{E}(Y_{i})\approx\mu\) for \(i>l\), where \(l\) is the warm-up period. This is equivalent to determining when the transient mean curve \(\mathbb{E}(Y_{i})\) “flattens out” at level \(\mu\). In general, it is very difficult to determine \(l\) from a single replication due to the inherent variability of the process \(Y_{1},Y_{2},\cdots\). As a result, Welch’s procedure is based on making \(n\) independent replications of the simulation and employing the following four steps:

  1. Step 1. Make \(n\) replications of the simulation \((n\geq 5)\), each of length \(m\) (where \(m\) is large). Let \(Y_{ji}\) be the \(i\)th observation from the \(j\)th replication.
  2. Step 2. Let \(\bar{Y}_{i}=\sum_{j=1}^{n} Y_{ji}/n\) for \(i=1,2,\cdots ,m\). The averaged process \(\bar{Y}_{1},\bar{Y}_{2},\cdots\) has means \(\mathbb{E}(\bar{Y}_{i})=\mathbb{E}(Y_{i})\) and variances \(\mbox{Var}(\bar{Y}_{i})=\mbox{Var}(Y_{i})/n\). Thus, the averaged process has the same transient mean curve as the original process, but its plot has only \((1/n)\)th the variance.
  3. Step 3. To smooth out the high-frequency oscillations in \(\bar{Y}_{1}, \bar{Y}_{2},\cdots\), we further define the moving average \(\bar{Y}_{i}(w)\),
    where \(w\) is the {\bf window} and is a positive integer satisfying \(w\leq\lfloor m/2\rfloor\), as follows \[\bar{Y}_{i}(w)=\left\{\begin{array}{ll} {\displaystyle \frac{\sum_{s=-(i-1)}^{i-1} \bar{Y}_{i+s}}{2i-1}} & \mbox{if \(i=1,\cdots ,w\)}\\ {\displaystyle \frac{\sum_{s=-w}^{w} \bar{Y}_{i+s}}{2w+1}} & \mbox{if \(i=w+1,w+2,\cdots , m-w\)}\end{array}\right .\] Thus, if \(i\) is not too close to the beginning of the replications, then \(\bar{Y}_{i}(w)\) is just the simple average of \(2w+1\) observations of the averaged process centered at observation \(i\). It is called a moving average since \(i\) moves through time.
  4. Step 4. Plot \(\bar{Y}_{i}(w)\) for \(i=1,2,\cdots ,m-w\) and choose \(l\) to be that value of \(i\) beyond which \(\bar{Y}_{1}(w),\bar{Y}_{2}(w),\cdots\) appears to have converged.

Example. For simplicity, assume that \(m=10\), \(w=2\), \(\bar{Y}_{i}=i\) for \(i=1,2,3,4,5\), and \(\bar{Y}_{i}=6\) for \(i=6,7,8,9,10\). Then

\[\begin{array}{cccc}
\bar{Y}_{1}(2)=1 & \bar{Y}_{2}(2)=2 & \bar{Y}_{3}(2)=3 & \bar{Y}_{4}(2)=4\\
\bar{Y}_{5}(2)=4.8 & \bar{Y}_{6}(2)=5.4 & \bar{Y}_{7}(2)=5.8 &
\bar{Y}_{8}(2)=6
\end{array}\]
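
This worked example makes a convenient test case for an implementation of the moving average of step 3:

\begin{verbatim}
def welch_moving_average(ybar, w):
    """Moving average of the averaged process; ybar[0] holds Y_bar_1."""
    m = len(ybar)
    out = []
    for i in range(1, m - w + 1):            # i = 1, ..., m - w
        if i <= w:                           # growing window near the start
            vals = ybar[0:2 * i - 1]
        else:                                # centered window of 2w + 1 values
            vals = ybar[i - w - 1:i + w]
        out.append(sum(vals) / len(vals))
    return out

ybar = [1, 2, 3, 4, 5, 6, 6, 6, 6, 6]        # m = 10
print(welch_moving_average(ybar, w=2))
# -> [1.0, 2.0, 3.0, 4.0, 4.8, 5.4, 5.8, 6.0]
\end{verbatim}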

Before giving examples of applying Welch’s procedure to actual stochastic models, we make the following recommendations on choosing the parameters \(n\), \(m\), and \(w\).

  • Initially, make \(n=5\) or \(10\) replications (depending on model execution cost), with \(m\) as large as practical. In particular, \(m\) should be much larger than the anticipated value of \(l\) and also large enough to allow infrequent events (e.g., machine breakdowns) to occur a reasonable number of times.
  • Plot \(\bar{Y}_{i}(w)\) for several values of the window \(w\) and choose the smallest value of \(w\) (if any) for which the corresponding plot is “reasonably smooth”. Use this plot to determine the length of the warm-up period \(l\). If \(w\) is too small, the plot of \(\bar{Y}_{i}(w)\) will be ragged. If \(w\) is too large, then the \(\bar{Y}_{i}\) observations will be over-aggregated and we will not have a good idea of the shape of the transient mean curve \(\mathbb{E}[Y_{i}]\) for \(i=1,2,\cdots\).
  • If no value of \(w\) in step 3 is satisfactory, make \(5\) or \(10\) additional replications of length \(m\), and repeat step 2 using all available replications. For a fixed value of \(w\), the plot of \(\bar{Y}_{i}(w)\) will get “smoother” as the number of replications increases. The major difficulty in applying Welch’s procedure in practice is that the required number of replications, \(n\), may be relatively large if the process \(Y_{1},Y_{2},\cdots\) is highly variable.

\begin{equation}{\label{sch6ex30}}\tag{15}\mbox{}\end{equation}

Example \ref{sch6ex30}. A small factory consists of a machining center and an inspection station in series. Unfinished parts arrive to the factory with exponential interarrival times having a mean of \(1\) minute. Processing times at the machine are uniform on the interval \([0.65,0.70]\) minute, and subsequent inspection times at the inspection station are uniform on the interval \([0.75,0.80]\) minute. Ninety percent of inspected parts are “good” and are sent to shipping; \(10\) percent of the parts are “bad” and are sent back to the machine for rework. Both queues are assumed to have infinite capacity. The machining center is subject to randomly occurring breakdowns. In particular, a new (or freshly repaired) machine will break down after an exponential amount of calendar time with a mean of \(6\) hours. Repair times are uniform on the interval \([8,12]\) minutes. Assume that the factory is initially empty and idle. Consider the stochastic process \(N_{1},N_{2},\cdots\), where \(N_{i}\) is the number of parts produced in the \(i\)th hour. Suppose that we want to determine the warm-up period \(l\) so that we can eventually estimate the steady-state mean hourly throughput \(\mu =\mathbb{E}(N)\). We made \(10\) independent replications of the simulation, each of length \(m=160\) hours (or \(20\) days). We plot the averaged process \(\bar{N}_{i}\) for \(i=1,2,\cdots ,160\), and see that it is quite “ragged”. It is clear that further smoothing of the plot is necessary, and that one replication, in general, is not sufficient to estimate \(l\). We now plot the moving average \(\bar{N}_{i}(w)\) for \(w=20\) and \(30\). From the plot for \(w=30\) (which is smoother), we choose a warm-up period of \(l=24\) hours. (The plot is almost “flat” for \(i\geq 24\).) Note that it is probably better to choose \(l\) too large rather than too small, since the goal is to have \(\mathbb{E}[Y_{i}]\) close to \(\mu\) for \(i>l\). \(\sharp\)

Suppose that we want to estimate the steady-state mean \(\mu =\mathbb{E}(Y)\) of the process \(Y_{1},Y_{2},\cdots\). We now present the replication-deletion approach to obtaining a point estimate and confidence interval for \(\mu\). The analysis is similar to that for terminating simulations, except that now only those observations beyond the warm-up period \(l\) in each replication are used to form the estimates. Specifically, suppose that we make \(n'\) replications of the simulation, each of length \(m'\) observations, where \(m'\) is much larger than the warm-up period \(l\) determined by Welch’s procedure. Let \(Y_{ji}\) be as defined before and let \(X_{j}\) be given by

\[X_{j}=\frac{\sum_{i=l+1}^{m'}Y_{ji}}{m'-l}\]

for \(j=1,2,\cdots ,n'\). Note that \(X_{j}\) uses only those observations from the \(j\)th replication corresponding to “steady state”, namely \(Y_{j,l+1},Y_{j,l+2},\cdots ,Y_{j,m'}\). Then the \(X_{j}\)’s are i.i.d. random variables with \(\mathbb{E}(X_{j})\approx\mu\), \(\bar{X}(n')\) is an approximately unbiased point estimator for \(\mu\), and an approximate \(100(1-\alpha )\%\) confidence interval for \(\mu\) is given by

\begin{equation}{\label{sch6eq31}}\tag{16}
\bar{X}(n')\pm t_{n'-1;\alpha /2}\cdot\sqrt{\frac{S^{2}(n')}{n'}}.
\end{equation}

One legitimate objection that might be levied against the replication-deletion approach is that it uses one set of \(n\) replications (the pilot runs) to determine the warm-up period \(l\), and then uses only the last \(m'-l\) observations from a different set of \(n'\) replications to perform the actual analyses. However, this is often not a problem, due to the relatively low cost of computer time. In some situations, it should be possible to use the initial \(n\) pilot runs of length \(m\) observations both to determine \(l\) and to construct a confidence interval. In particular, if \(m\) is substantially larger than the selected value of the warm-up period \(l\), then it is probably safe to use the “initial” runs for both purposes. Since Welch’s procedure is only approximate, a “small” number of observations beyond the warm-up period \(l\) might contain significant bias relative to \(\mu\). However, if \(m\) is much larger than \(l\), these biased observations will have little effect on the overall quality (i.e., lack of bias) of \(X_{j}\) (based on \(m-l\) observations) or \(\bar{X}(n)\). Strictly speaking, however, it is more correct statistically to base the replication-deletion approach on two independent sets of replications.

\begin{equation}{\label{sch6ex33}}\tag{17}\mbox{}\end{equation}

Example \ref{sch6ex33}. For the manufacturing system of Example~\ref{sch6ex30}, suppose that we would like to obtain a point estimate and \(90\%\) confidence interval for the steady-state mean hourly throughput \(\mu =\mathbb{E}(N)\). From the \(10\) replications of length \(m=160\) hours used there, we specified a warm-up period of \(l=24\) hours. Since \(m=160\) is much larger than \(l=24\), we will use these same replications to construct a confidence interval. Let

\[X_{j}=\frac{\sum_{i=25}^{160}N_{ji}}{136}\]

for \(j=1,2,\cdots ,10\). Then, a point estimate and \(90\%\) confidence interval for \(\mu\) are given by

\[\widehat{\mu}=\bar{X}(10)=59.97\mbox{ and }\bar{X}(10)\pm t_{9;0.05}\cdot\sqrt{\frac{0.62}{10}}=59.97\pm 0.46.\]

Thus, in the long run we would expect the small factory to produce an average of about \(60\) parts per hour. The half-length of the replication-deletion confidence interval given by (\ref{sch6eq31}) depends on the variance of \(X_{j}\), \(\mbox{Var}(X_{j})\), which will be unknown when the first \(n\) replications are made. Therefore, if we make a fixed number of replications of the simulation, the resulting confidence-interval half-length may or may not be small enough for a particular purpose. We know, however, that the half-length can be decreased by a factor of approximately \(2\) by making four times as many replications. \(\sharp\)

We present a more comprehensive discussion of procedures for constructing a point estimate and a confidence interval for the steady-state mean \(\mu =\mathbb{E}(Y)\) of a simulation output process \(Y_{1},Y_{2},\cdots\). The following definitions of \(\mu\) are usually equivalent:

\[\mu =\lim_{i\rightarrow\infty}\mathbb{E}(Y_{i})\]

and

\[\mu =\lim_{m\rightarrow\infty}\frac{\sum_{i=1}^{m} Y_{i}}{m}\mbox{ with probability \(1\)}.\]

Two general strategies have been suggested in the simulation literature for constructing a point estimate and confidence interval for \(\mu\):

  • Fixed-sample-size procedures. A single simulation run of an arbitrary fixed length is made, and then one of a number of available procedures is used to construct a confidence interval from the available data.
  • Sequential procedures. The length of a single simulation run is sequentially increased until an “acceptable” confidence interval can be constructed. There are several techniques for deciding when to stop the simulation run.

Many fixed-sample-size procedures have been suggested in the literature: the replication-deletion approach discussed above, the method of batch means, the autoregressive method, the method of spectrum analysis, the regenerative method, and the standardized time series method.

\begin{equation}{\label{d}}\tag{D}\mbox{}\end{equation}

Statistical Analysis for Steady-State Cycle Parameters.

Suppose that the output process \(Y_{1},Y_{2},\cdots\) does not have a steady-state distribution. Assume, on the other hand, that there is an appropriate cycle definition so that the process \(Y_{1}^{C},Y_{2}^{C},\cdots\) has a steady-state distribution \(F^{C}\), where \(Y_{i}^{C}\) is the random variable defined on the \(i\)th cycle. If \(Y^{C}\) has distribution function \(F^{C}\), then we are interested in estimating some characteristic of \(Y^{C}\) such as the mean \(\mu^{C}=\mathbb{E}(Y^{C})\) or the probability \(\mathbb{P}\{Y^{C}\leq y\}\). Clearly, estimating a steady-state cycle parameter is just a special case of estimating a steady-state parameter, so all of the techniques discussed for steady-state parameters apply, except that they are applied to the cycle random variables \(Y_{i}^{C}\) rather than to the original \(Y_{i}\)’s. For example, we could use Welch’s method to determine a warm-up period and then apply the replication-deletion approach to obtain a point estimate and confidence interval for \(\mu^{C}\).

Example. Consider once again the small factory of Example \ref{sch6ex30}, but suppose that there is a half-hour lunch break that starts \(4\) hours into each \(8\)-hour shift. This break stops the inspection process, but unfinished parts continue to arrive and to be processed by the unmanned machine. If \(N_{i}\) is the throughput in the \(i\)th hour, then the process \(N_{1},N_{2},\cdots\) does not have a steady-state distribution. We might, however, expect it to be periodic with a cycle length of \(8\) hours. To substantiate this, we made \(n=10\) replications of length \(m=160\) hours (\(20\) shifts). From the plot of the averaged process \(\bar{N}_{i}\) for \(i=1,2,\cdots ,160\), we see that the process \(N_{1},N_{2},\cdots\) does indeed appear to have a cycle of length \(8\) hours. Let \(N_{i}^{C}\) be the average production in the \(i\)th \(8\)-hour cycle and assume that \(N_{1}^{C},N_{2}^{C},\cdots\) has a steady-state distribution. Suppose that we want to obtain a point estimate and a \(99\%\) confidence interval for the steady-state expected average production over a shift, \(\mu^{C}=\mathbb{E}(N^{C})\), using the replication-deletion approach. Let \(N_{ji}^{C}\) be the average production in the \(i\)th cycle of our \(j\)th available replication for \(j=1,2,\cdots ,10\) and \(i=1,2,\cdots ,20\), and let \(\bar{N}_{i}^{C}\) for \(i=1,2,\cdots ,20\) be the corresponding averaged process; that is, \(\bar{N}_{i}^{C}=\sum_{j=1}^{10}N_{ji}^{C}/10\), which is plotted. We conclude from this plot that further smoothing is desirable. As a result, we plot the moving average \(\bar{N}_{i}^{C}(w)\) from Welch’s procedure for both \(w=3\) and \(w=6\) shifts. From the plot for \(w=6\) (which is smoother), we choose a warm-up period of \(l=5\) shifts, or \(40\) hours. Let

\[X_{j}^{C}=\frac{\sum_{i=6}^{20} N_{ji}^{C}}{15}\mbox{ for }j=1,2,\cdots ,10.\]

Then a point estimate and \(99\%\) confidence interval for \(\mu^{C}\) are given by

\[\widehat{\mu}^{C}=\bar{X}^{C}(10)=60.24\]

and

\[\bar{X}^{C}(10)\pm t_{9;0.005}\cdot\sqrt{\frac{0.79}{10}}=60.24\pm 0.91\]

which also contains \(60\). \(\sharp\)
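The arithmetic of this example is easy to mechanize. The following minimal Python sketch is illustrative only, not the text’s own computation: the helper name replication_deletion_ci is an assumption, and the array of cycle averages \(N_{ji}^{C}\) is replaced by hypothetical random data.

\begin{verbatim}
import numpy as np
from scipy import stats

def replication_deletion_ci(data, warmup, alpha):
    """Point estimate and 100(1-alpha)% CI from n replications (rows),
    deleting the first `warmup` observations of each replication."""
    X = data[:, warmup:].mean(axis=1)     # X_j: mean of retained cycles
    n = len(X)
    half = stats.t.ppf(1 - alpha / 2, n - 1) * np.sqrt(X.var(ddof=1) / n)
    return X.mean(), half

# Hypothetical stand-in for the 10 x 20 array of cycle averages N_{ji}^C.
cycle_avgs = np.random.default_rng(0).normal(60.0, 1.0, size=(10, 20))
est, half = replication_deletion_ci(cycle_avgs, warmup=5, alpha=0.01)
print(f"{est:.2f} +/- {half:.2f}")        # 99% CI for mu^C
\end{verbatim}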

In the above discussion, we presented procedures for constructing a confidence interval for a single measure of performance. However, for most real-world simulations several measures of performance are of interest simultaneously. Suppose that \(I_{s}\) is a \(100(1-\alpha _{s})\%\) confidence interval for the measure of performance \(\mu_{s}\), where \(s=1,2,\cdots ,k\). (The \(\mu_{s}\)’s may all be measures of performance for a terminating simulation or may all be measures for a non-terminating simulation). Then the probability that all \(k\) confidence intervals simultaneously contain their respective true measures satisfies

\[\mathbb{P}\{\mu_{s}\in I_{s}\mbox{ for all }s=1,2,\cdots ,k\}\geq 1-\sum_{s=1}^{k}\alpha_{s}\]

whether or not the \(I_{s}\)’s are independent. This result, known as the Bonferroni inequality, has serious implications for a simulation study. For example, suppose that one constructs \(90\%\) confidence intervals, that is, \(\alpha_{s}=0.1\) for all \(s\), for \(10\) different measures of performance. Then the probability that each of the \(10\) confidence intervals contains its true measure can only be claimed to be greater than or equal to zero. Thus one cannot have much overall confidence in any conclusions drawn from such a study. The difficulty we have just described is known in the statistics literature as the multiple-comparisons problem. We now describe a practical solution to the above problem when the value of \(k\) is small. If one wants the overall confidence level associated with \(k\) confidence intervals to be at least \(100(1-\alpha )\%\), choose the \(\alpha_{s}\)’s to satisfy \(\sum_{s=1}^{k}\alpha_{s}=\alpha\). (Note that the \(\alpha_{s}\)’s do not have to be equal; the \(\alpha_{s}\)’s corresponding to more important measures could be chosen smaller). For instance, one could construct ten \(99\%\) confidence intervals and have the overall confidence level be at least \(90\%\). The difficulty with this solution is that the confidence intervals will be larger than they were originally if a fixed-sample-size procedure is used, or more data will be required for a specified set of \(k\) relative errors if a sequential procedure is used. For this reason, we recommend that \(k\) be no larger than about \(10\). If one has a very large number of measures of performance, the only recourse available is to construct the usual \(90\%\) or \(95\%\) confidence intervals, but to be aware that one or more of these intervals probably does not contain its true measure.
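To make the allocation of the \(\alpha_{s}\)’s concrete, here is a minimal Python sketch, assuming each measure is estimated from i.i.d. replication averages; the helper name bonferroni_cis is an assumption. With \(k=3\) and overall \(\alpha =0.1\), each individual interval comes out at the \(96.667\%\) level, as in the bank example below.

\begin{verbatim}
import numpy as np
from scipy import stats

def bonferroni_cis(samples, overall_alpha):
    """One CI per measure, with the individual alphas summing to
    overall_alpha, so all intervals hold jointly with probability
    >= 1 - overall_alpha (Bonferroni).  An equal split is used here;
    unequal alphas for more important measures also work."""
    alpha_s = overall_alpha / len(samples)
    cis = []
    for x in map(np.asarray, samples):
        n = len(x)
        half = stats.t.ppf(1 - alpha_s / 2, n - 1) * x.std(ddof=1) / np.sqrt(n)
        cis.append((x.mean() - half, x.mean() + half))
    return cis
\end{verbatim}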

Example. Consider the bank of Example \ref{sch6ex18} with five tellers and one queue. The following table gives the results of using the \(10\) replications of the (terminating) simulation and (\ref{sch6eq27}) to construct \(96.667\%\) confidence intervals for each of the measures of performance

\[\mathbb{E}\left [\frac{\int_{0}^{T} Q(t)dt}{T}\right ],\mathbb{E}\left [\frac{\sum_{i=1}^{N} D_{i}}{N}\right ],\mathbb{E}\left [\frac{\sum_{i=1}^{N} I_{i}(0,5)}{N}\right ]\]

such that the overall confidence level is at least \(90\%\).

\[\begin{array}{ccc}
\hline \mbox{Measure of performance} & \mbox{Point estimate} & \mbox{\(96.667\%\) confidence interval}\\
\hline {\displaystyle \mathbb{E}\left [\frac{\int_{0}^{T} Q(t)dt}{T}\right ]} & 1.97 & [1.55,2.40]\\
{\displaystyle \mathbb{E}\left [\frac{\sum_{i=1}^{N} D_{i}}{N}\right ]} & 2.03 & [1.59,2.47]\\
{\displaystyle \mathbb{E}\left [\frac{\sum_{i=1}^{N} I_{i}(0,5)}{N}\right ]} & 0.85 & [0.80,0.90]\\ \hline\end{array}\]

Example. Suppose for the small factory of Example \ref{sch6ex30} that we would like to obtain point estimates and confidence intervals for both the steady-state mean hourly throughput \(\mu_{N}\) and the steady-state mean time in system of a part \(\mu_{T}\), with the overall confidence level being at least \(90\%\). Therefore, we will make the confidence level of each individual interval \(95\%\). Using the \(10\) replications from Example \ref{sch6ex33}, we plotted the moving average \(\bar{T}_{i}(w)\) for the time-in-system process \(T_{1},T_{2},\cdots\) in order to determine its warm-up period. Here \(T_{i}\) is the time in system of the \(i\)th departing part. Since this plot was highly variable, we made an additional \(10\) replications of length \(160\) hours and used the entire \(20\) replications for our analysis. We plot the hourly throughput moving average \(\bar{N}_{i}(w)\) for \(w=30\), and we plot the time-in-system moving average \(\bar{T}_{i}(w)\) for \(w=1200\). (Note that the number of \(T_{i}\) observations in a \(160\)-hour simulation run is a random variable with mean \(9600\). Therefore, for our analysis we used the minimum number of observations for any one of the \(20\) runs, which was \(9407\)). From the plots we decided on warm-up periods of \(l_{N}=24\) hours and \(l_{T}=2286\) observations, respectively. Note, however, that \(2286\) observations correspond to approximately \(38\) hours. Since \(24\) and \(2286\) are much smaller than \(160\) and \(9407\), respectively, we will use these same replications to construct the confidence intervals. Let

\[X_{j}=\frac{\sum_{i=25}^{160} N_{ji}}{136}\]

and

\[Y_{j}=\frac{\sum_{i=2287}^{9407} T_{ji}}{7121}\mbox{ for }j=1,2,\cdots ,20.\]

Then point estimates and \(95\%\) confidence intervals for \(\mu_{N}\) and \(\mu_{T}\) are given by

\[\widehat{\mu}_{N}=\bar{X}(20)=60.03,\quad \widehat{\mu}_{T}=\bar{Y}(20)=6.16\mbox{ minutes}\]

and

\[\bar{X}(20)\pm t_{19;0.025}\sqrt{\frac{0.7}{20}}=60.03\pm 0.39,
\bar{Y}(20)\pm t_{19;0.025}\sqrt{\frac{0.55}{20}}=6.16\pm 0.35.\]

Thus, we are at least \(90\%\) confident that \(\mu_{N}\) and \(\mu_{T}\) are in the intervals \([59.64,60.42]\) and \([5.81,6.51]\), respectively. \(\sharp\)
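In code, this is the same replication-deletion computation sketched after the cycle-parameter example above: calling the hypothetical replication_deletion_ci helper with warmup=24 on the \(20\times 160\) throughput array and with warmup=2286 on the (truncated to \(9407\) columns) time-in-system array, each with alpha=0.05, reproduces the arithmetic of the two intervals.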

The Bootstrapping Technique for Estimating Mean Square Errors.

Suppose now that \(X_{1},X_{2},\cdots ,X_{n}\) are independent random variables having a common distribution \(F\), and suppose we are interested in using them to estimate some parameter \(\theta (F)\) of the distribution \(F\). For example, \(\theta (F)\) could be the mean of \(F\), or it could be the median or the variance of \(F\), or any other parameter of \(F\). Suppose further that an estimator of \(\theta (F)\), call it \(g(X_{1},\cdots ,X_{n})\), has been proposed, and that in order to judge its worth as an estimator of \(\theta (F)\) we are interested in estimating its mean square error. That is, we are interested in estimating the value of

\[\mbox{MSE}(F)=\mathbb{E}_{F}[(g({\bf X})-\theta (F))^{2}]\]

where we have used the notation \(\mathbb{E}_{F}\) to indicate that the expectation is to be taken under the assumption that the random variables all have distribution \(F\). Now whereas there is an immediate estimator of the above MSE, namely \(S^{2}/n\), when \(\theta (F)=\mathbb{E}(X_{i})\) and \(g({\bf X})=\bar{X}\), it is not at all apparent how it can be estimated otherwise.

Note that if the distribution function \(F\) were known, then we could theoretically compute the expected square of the difference between \(\theta\) and its estimator; that is, we could compute the mean square error. However, after we observe the values of the \(n\) data points, we have a pretty good idea of what the underlying distribution looks like. Indeed, suppose that the observed values of the data are \(X_{i}=x_{i}\), \(i=1,\cdots , n\). We can now estimate the underlying distribution function \(F\) by the so-called empirical distribution function \(F_{e}\), where \(F_{e}(x)\), the estimate of \(F(x)\) (the probability that a datum value is less than or equal to \(x\)), is just the proportion of the \(n\) data values that are less than or equal to \(x\). That is,

\[F_{e}(x)=\frac{\mbox{number of \(i\): \(X_{i}\leq x\)}}{n}.\]
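For what follows it may help to see \(F_{e}\) computed explicitly. A minimal Python sketch, with hypothetical data values:

\begin{verbatim}
import numpy as np

def empirical_cdf(data):
    """F_e(x) = (number of i with x_i <= x) / n, as a callable."""
    xs = np.sort(np.asarray(data, dtype=float))
    return lambda x: np.searchsorted(xs, x, side="right") / len(xs)

F_e = empirical_cdf([1.0, 1.0, 2.0])
print(F_e(1.0), F_e(2.0))   # 2/3 and 1, matching the n = 3 case below
\end{verbatim}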

Another way of thinking about \(F_{e}\) is that it is the distribution function of a random variable \(X_{e}\) which is equally likely to take on any of the \(n\) values \(x_{i}\), \(i=1,\cdots ,n\). If the values \(x_{i}\) are not all distinct, then the above is to be interpreted to mean that \(X_{e}\) will equal the value \(x_{i}\) with a probability equal to the number of \(j\) such that \(x_{j}=x_{i}\) divided by \(n\); that is, if \(n=3\) and \(x_{1}=x_{2}=1\), \(x_{3}=2\), then \(X_{e}\) is a random variable that takes on the value \(1\) with probability \(2/3\) and \(2\) with probability \(1/3\). Now if \(F_{e}\) is “close” to \(F\), as it should be when \(n\) is large (indeed, the strong law of large numbers implies that with probability \(1\), \(F_{e}(x)\) converges to \(F(x)\) as \(n\rightarrow\infty\), and another result, known as the Glivenko-Cantelli theorem, states that this convergence will, with probability one, be uniform in \(x\)), then \(\theta (F_{e})\) will probably be close to \(\theta (F)\), assuming that \(\theta\) is, in some sense, a continuous function of the distribution, and \(\mbox{MSE}(F)\) should then be approximately equal to

\[\mbox{MSE}(F_{e})=\mathbb{E}_{F_{e}}[(g({\bf X})-\theta (F_{e}))^{2}].\]

In the above expression the \(X_{i}\) are to be regarded as being independent random variables having distribution function \(F_{e}\). The quantity \(\mbox{MSE}(F_{e})\) is called the bootstrap approximation to the mean square error \(\mbox{MSE}(F)\).

Example. Suppose we are interested in estimating \(\theta (F)=\mathbb{E}[X]\) by using the sample mean \(\bar{X}=\sum_{i=1}^{n} X_{i}/n\). If the observed data are \(x_{i}\) for \(i=1,\cdots ,n\), then the empirical distribution \(F_{e}\) puts weight \(1/n\) on each of the points \(x_{1},\cdots ,x_{n}\) (combining weights if the \(x_{i}\) are not all distinct). Hence the mean of \(F_{e}\) is \(\theta (F_{e})=\bar{x}=\sum_{i=1}^{n} x_{i}/n\), and thus the bootstrap estimate of the mean square error, call it \(\mbox{MSE}(F_{e})\), is given by

\[\mbox{MSE}(F_{e})=\mathbb{E}_{F_{e}}\left [\left (\frac{1}{n}\sum_{i=1}^{n} X_{i}-\bar{x}\right )^{2}\right ]\]

where \(X_{1},\cdots ,X_{n}\) are independent random variables each distributed according to \(F_{e}\). Since

\[\mathbb{E}_{F_{e}}\left [\frac{1}{n}\sum_{i=1}^{n} X_{i}\right ]=\mathbb{E}_{F_{e}}[X]=\bar{x}\]

it follows that

\begin{align*} \mbox{MSE}(F_{e}) & =\mbox{Var}_{F_{e}}\left (\frac{1}{n}\sum_{i=1}^{n} X_{i}\right )\\ & =\frac{\mbox{Var}_{F_{e}}(X)}{n}\end{align*}

Now

\begin{align*} \mbox{Var}_{F_{e}}(X) & =\mathbb{E}_{F_{e}}[(X-\mathbb{E}_{F_{e}}[X])^{2}]\\ & =\mathbb{E}_{F_{e}}[(X-\bar{x})^{2}]\\ & =\frac{1}{n}\left [\sum_{i=1}^{n} (x_{i}-\bar{x})^{2}\right ]\end{align*}

Therefore, we have

\[\mbox{MSE}(F_{e})=\frac{\sum_{i=1}^{n} (x_{i}-\bar{x})^{2}}{n^{2}}\]

which compares quite nicely with \(S^{2}/n\), the usual estimate of the mean square error. Indeed, because the observed value of \(S^{2}/n\) is \(\sum_{i=1}^{n} (x_{i}-\bar{x})^{2}/(n(n-1))\), the bootstrap approximation is almost identical. \(\sharp\)
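A quick numerical check of this comparison, with hypothetical data:

\begin{verbatim}
import numpy as np

x = np.array([2.0, 4.0, 9.0, 5.0, 1.0])          # hypothetical sample
n = len(x)
mse_boot = ((x - x.mean()) ** 2).sum() / n**2    # MSE(F_e) for X-bar
mse_usual = x.var(ddof=1) / n                    # observed S^2 / n
print(mse_boot, mse_usual)                       # differ by factor (n-1)/n
\end{verbatim}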

If the data values are \(X_{i}=x_{i}\), \(i=1,\cdots ,n\), then as the empirical distribution function \(F_{e}\) puts weight \(1/n\) on each of the points \(x_{i}\), it is usually easy to compute the value of \(\theta (F_{e})\); for example, if the parameter of interest \(\theta (F)\) was the variance of the distribution \(F\), then

\[\theta (F_{e})=\mbox{Var}_{F_{e}}(X)=\sum_{i=1}^{n}(x_{i}-\bar{x})^{2}/n.\]

To determine the bootstrap approximation to the mean square error we then have to compute

\[\mbox{MSE}(F_{e})=\mathbb{E}_{F_{e}}[(g({\bf X})-\theta (F_{e}))^{2}]\]

However, since the above expectation is to be computed under the assumption that \(X_{1},\cdots ,X_{n}\) are independent random variables distributed according to \(F_{e}\), it follows that the vector \((X_{1},\cdots ,X_{n})\) is equally likely to take on any of the \(n^{n}\) possible values \((x_{i_{1}},x_{i_{2}},\cdots ,x_{i_{n}})\), \(i_{j}\in\{1,2,\cdots ,n\}\) for \(j=1,\cdots ,n\). Therefore, we obtain

\[\mbox{MSE}(F_{e})=\sum_{i_{n}}\cdots\sum_{i_{1}}\frac{(g(x_{i_{1}},x_{i_{2}},\cdots ,x_{i_{n}})-\theta (F_{e}))^{2}}{n^{n}}\]

where each \(i_{j}\) goes from \(1\) to \(n\), and so the computation of \(\mbox{MSE}(F_{e})\) requires, in general, summing \(n^{n}\) terms, an impossible task when \(n\) is large. However, as we know, there is an effective way to approximate the average of a large number of terms, namely, by using simulation. Indeed, we could generate a set of \(n\) independent random variables \(X_{11},\cdots ,X_{1n}\) each having distribution function \(F_{e}\) and then set

\[Y_{1}=(g(X_{11},\cdots ,X_{1n})-\theta (F_{e}))^{2}\]

Next, we generate a second set \(X_{21},\cdots ,X_{2n}\) and compute

\[Y_{2}=(g(X_{21},\cdots ,X_{2n})-\theta (F_{e}))^{2}\]

and so on, until we have collected the variables \(Y_{1},Y_{2},\cdots ,Y_{r}\). Because these \(Y_{i}\) are independent random variables having mean \(\mbox{MSE}(F_{e})\), it follows that we can use their average \(\sum_{i=1}^{r}Y_{i}/r\) as an estimate of \(\mbox{MSE}(F_{e})\).

  • It is quite easy to generate a random variable \(X\) having distribution \(F_{e}\). Because such a random variable should be equally likely to be \(x_{1},\cdots ,x_{n}\), just generate a random number \(U\) and set \(X=x_{I}\), where \(I=Int(nU)+1\). It is easy to check that this will still work even when the \(x_{i}\) are not all distinct.
  • The above simulation allows us to approximate \(\mbox{MSE}(F_{e})\), which is itself only an approximation to the desired \(\mbox{MSE}(F)\). As such, it has been reported that roughly \(100\) simulation runs, that is, choosing \(r=100\), are usually sufficient.
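Putting the two remarks above together gives the following Python sketch of the whole procedure. The helper name bootstrap_mse and the choice of the sample median for \(g\) are purely illustrative assumptions.

\begin{verbatim}
import numpy as np

def bootstrap_mse(x, g, theta_e, r=100, seed=None):
    """Estimate MSE(F_e) = E_{F_e}[(g(X) - theta(F_e))^2] by r runs,
    each drawing n values from F_e, i.e. resampling x with replacement
    (the zero-based analogue of I = Int(nU) + 1)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = len(x)
    ys = [(g(x[rng.integers(0, n, size=n)]) - theta_e) ** 2
          for _ in range(r)]
    return np.mean(ys)

# e.g. the sample median as an estimator of the median of F; note that
# theta(F_e) is then the sample median itself.
x = np.random.default_rng(1).exponential(4.0, size=50)
print(bootstrap_mse(x, np.median, theta_e=np.median(x), r=100))
\end{verbatim}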

Example. Suppose in Example \ref{sch6ex5} that we are interested in estimating the long-run average amount of time a customer spends in the system. That is, letting \(W_{i}\) be the amount of time the \(i\)th entering customer spends in the system for \(i\geq 1\), we are interested in

\[\theta =\lim_{n\rightarrow\infty}\frac{W_{1}+W_{2}+\cdots +W_{n}}{n}.\]

To show that the above limit does indeed exist (note that the random variables \(W_{i}\) are neither independent nor identically distributed), let \(N_{i}\) denote the number of customers that arrive on day \(i\), and let

\begin{align*}
D_{1} & =W_{1}+\cdots +W_{N_{1}}\\
D_{2} & =W_{N_{1}+1}+\cdots +W_{N_{1}+N_{2}}\\
& \vdots\\
D_{i} & =W_{N_{1}+\cdots +N_{i-1}+1}+\cdots +W_{N_{1}+\cdots +N_{i}}
\end{align*}

In words, \(D_{i}\) is the sum of the times in the system of all arrivals on day \(i\). We can now express \(\theta\) as

\[\theta =\lim_{m\rightarrow\infty}\frac{D_{1}+D_{2}+\cdots +D_{m}}{N_{1}+N_{2}+\cdots +N_{m}}\]

where the above follows because the ratio is just the average time in the system of all customers arriving in the first \(m\) days. Upon dividing numerator and denominator by \(m\), we obtain

\[\theta =\lim_{m\rightarrow\infty}\frac{(D_{1}+D_{2}+\cdots +D_{m})/m}{(N_{1}+N_{2}+\cdots +N_{m})/m}\]

Now as each day follows the same probability law, it follows that the random variables \(D_{1},\cdots ,D_{m}\) are all independent and identically distributed, as are the random variables \(N_{1},\cdots ,N_{m}\). Hence, by the strong law of large numbers, the average of the first \(m\) of the \(D_{i}\) will, with probability \(1\), converge to their common expectation, with a similar statement being true for the \(N_{i}\). Therefore, we have \(\theta =\frac{\mathbb{E}(D)}{\mathbb{E}(N)}\), where \(\mathbb{E}(N)\) is the expected number of customers to arrive in a day, and \(\mathbb{E}(D)\) is the expected sum of the times those customers spend in the system. To estimate \(\theta\) we can thus simulate the system over \(k\) days, collecting on the \(i\)th run the data \(N_{i}\), \(D_{i}\), where \(N_{i}\) is the number of customers arriving on day \(i\) and \(D_{i}\) is the sum of the times they spend in the system, \(i=1,\cdots ,k\). Because the quantity \(\mathbb{E}(D)\) can then be estimated by

\[\bar{D}=\frac{D_{1}+\cdots +D_{k}}{k}\]

and \(\mathbb{E}(N)\) by

\[\bar{N}=\frac{N_{1}+\cdots +N_{k}}{k}\]

it follows that \(\theta =\mathbb{E}(D)/\mathbb{E}(N)\) can be estimated by

\[\widehat{\theta}=\frac{\bar{D}}{\bar{N}}=\frac{D_{1}+\cdots +D_{k}}{N_{1}+\cdots +N_{k}}\]

which, it should be noted, is just the average time in the system of all arrivals during the first \(k\) days. To estimate

\[\mbox{MSE}=\mathbb{E}\left [\left (\frac{\sum_{i=1}^{k} D_{i}}{\sum_{i=1}^{k} N_{i}}-\theta\right )^{2}\right ]\]

we employ the bootstrap approach. Suppose that the observed value of \((D_{i},N_{i})\) is \((d_{i},n_{i})\) for \(i=1,\cdots ,k\). That is, suppose that the simulation resulted in \(n_{i}\) arrivals on day \(i\) spending a total time \(d_{i}\) in the system. Thus the empirical joint distribution function of the random vector \((D,N)\) puts equal weight on the \(k\) pairs \((d_{i},n_{i})\) for \(i=1,\cdots ,k\). That is, under the empirical distribution function we have

\[\mathbb{P}_{F_{e}}\{D=d_{i},N=n_{i}\}=\frac{1}{k}\mbox{ for }i=1,\cdots ,k.\]

Therefore, we have

\[\mathbb{E}_{F_{e}}[D]=\bar{d}=\frac{d_{1}+\cdots +d_{k}}{k}\]

and

\[\mathbb{E}_{F_{e}}[N]=\bar{n}=\frac{n_{1}+\cdots +n_{k}}{k},\]

which implies

\[\theta (F_{e})=\frac{\bar{d}}{\bar{n}}.\]

Therefore, we also obtain

\[\mbox{MSE}(F_{e})=\mathbb{E}_{F_{e}}\left [\left (\frac{\sum_{i=1}^{k} D_{i}}{\sum_{i=1}^{k} N_{i}}-\frac{\bar{d}}{\bar{n}}\right )^{2}\right ]\]

where the above is to be computed under the assumption that the \(k\) pairs of random vectors \((D_{i},N_{i})\) are independent, each distributed according to \(F_{e}\). Since an exact computation of \(\mbox{MSE}(F_{e})\) would require computing the sum of \(k^{k}\) terms, we now perform a simulation experiment to approximate it. We generate \(k\) independent pairs of random vectors \((D_{1i},N_{1i})\), \(i=1,\cdots ,k\), according to the empirical distribution function \(F_{e}\), and then compute

\[Y_{1}=\left (\frac{\sum_{i=1}^{k} D_{1i}}{\sum_{i=1}^{k} N_{1i}}-\frac{\bar{d}}{\bar{n}}\right )^{2}\]

We then generate a second set \((D_{2i},N_{2i})\), \(i=1,\cdots ,k\), and compute the corresponding \(Y_{2}\). This continues until we have generated the \(r\) values \(Y_{1},\cdots ,Y_{r}\) (where \(r=100\) should suffice). The average of these \(r\) values, \(\sum_{i=1}^{r} Y_{i}/r\), is then used to estimate \(\mbox{MSE}(F_{e})\), which is itself our estimate of \(\mbox{MSE}\), the mean square error of our estimate of the average amount of time a customer spends in the system. \(\sharp\)
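The pair-resampling version of this scheme can be sketched as follows; the helper name bootstrap_mse_ratio is an assumption, and the daily totals \((d_{i},n_{i})\) are generated here from hypothetical distributions in place of a real simulation run.

\begin{verbatim}
import numpy as np

def bootstrap_mse_ratio(d, counts, r=100, seed=None):
    """Bootstrap MSE of sum(D_i)/sum(N_i): resample the k pairs
    (d_i, n_i) together, with replacement, under F_e."""
    rng = np.random.default_rng(seed)
    d = np.asarray(d, dtype=float)
    counts = np.asarray(counts, dtype=float)
    k = len(d)
    theta_e = d.mean() / counts.mean()        # = d-bar / n-bar
    ys = []
    for _ in range(r):
        idx = rng.integers(0, k, size=k)      # keeps each pair intact
        ys.append((d[idx].sum() / counts[idx].sum() - theta_e) ** 2)
    return np.mean(ys)

# Hypothetical k = 20 days: N_i arrivals, D_i total time in system.
rng = np.random.default_rng(2)
n_days = rng.poisson(80, size=20)
d_days = rng.gamma(n_days, 4.0)
print(bootstrap_mse_ratio(d_days, n_days, r=100))
\end{verbatim}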

 
